SDL Web 8.5 broker performance woes.

With the advent of SDL Web 8.5, we’ve seen many nice improvements to the software. We’re able to move items in the blueprint hierarchy, we can now set up a whole content delivery environment with a number of powershell scripts, … but in our most recent upgrade, we’ve found a big problem.

Let’s start at the beginning. A few years back, I helped a client build a website which was heavily relying on dynamic data. Each page in their website was constructed using broker queries and dynamic component presentations.

We knew that, when we load a lot of dynamic data, we needed to get our code right in order to get fast page load times (we don’t want our pages to take minutes to load now do we).

Using DD4T running on a Tridion 2013 SP1 environment, we managed to pull it off : blazing fast load times, a fully personalized website, loads of broker queries, life was good.

Over time, we migrated from DD4T 1 to DD4T 2 without too much problems, but recently, we hit some serious problems.

Since support for SDL Tridion 2013 was stopping, we wanted to move everything over to a new and shiny SDL Web 8.5 installation. We installed the new CM, upgraded our databases, installed some microservices, updated the dlls of our DD4T website, and everything compiled, pages still looked the same. Job well done (we thought)…

After some initial testing, we found that pages would load correctly, and once caching kicked in, pages loaded fast. The big problem we noticed were the initial load times. Where in our 2013 environment, initial load times were also relatively fast (with an IIS worker process startup and first .NET compilations, which takes around 15 seconds, the webpages started to show up after 20 seconds on a first load. Subsequent requests were handled in milliseconds (since then most of the content was served from cache).

loading-screen

In our new setup, we noticed that our initial page load for some data intensive pages were taking over a minute to load. After that first request, pages were served in milliseconds again.

Since this load time was unacceptable, and the previous setup (using Tridion 2013) proved we could get quick load times, we started our search for a bottleneck. The codebase hadn’t changed, the only big change we had at first was moving from an in-process broker API to using the Microservices to get the broker content.

When trying to find what caused such load times, we noticed that in some cases (when we were loading multiple pages at once for the first time, so little or no content was cached), our DB servers CPU spiked to 100 %, and a bunch of broker DB queries were starting to get queued, until pages started to timeout. Using SQL Profiler and the execution plans, we were able to find a number of expensive queries. The SQL activity monitor also suggested some index optimizations that would make the queries quicker.

activity monitor

 

execution planWith the approval of SDL Support, we added those indexes to the Broker DB, and after that we did notice some improvements. CPU spikes were gone, pages started to load quicker, life looked a little better again, until we tested one more page.

This page alone had again a startup time of over one minute. So something else must be wrong. Again we analysed the queries, and now also compared the queries and execution times of our old environment.

We finally found this huge broker query (basically, the query selects one or more components, based on some taxonomy value criteria and a schema id)

declare @p1 int;
exec sp_prepexec @p1 output,N'@P0 int,@P1 int,@P2 int,@P3 int,@P4 int,
@P5 nvarchar(4000),@P6 int,@P7 int,@P8 nvarchar(4000),@P9 int,@P10 int,
@P11 nvarchar(4000),@P12 nvarchar(4000),@P13 int,@P14 int,@P15 int,
@P16 int,@P17 int,@P18 nvarchar(4000),@P19 float,@P20 int,@P21 int,
@P22 nvarchar(4000)',N'select distinct TOP(@P0) 
itemmeta0_.PUBLICATION_ID as col_0_0_, itemmeta0_.ITEM_REFERENCE_ID as 
col_1_0_, itemmeta0_.ITEM_TYPE as col_2_0_, itemmeta0_.LAST_PUBLISHED_DATE 
as col_3_0_ from ITEMS itemmeta0_ cross joinITEM_CATEGORIES_AND_KEYWORDS 
relatedkey1_ cross join TAXFACETS taxonomyit2_ 
cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey3_ 
cross join TAXFACETS taxonomyit4_ 
cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey5_ 
cross join TAXFACETS taxonomyit6_ 
cross joinCOMPONENT componentm7_ 
inner join ITEMS componentm7_1_ on 
componentm7_.PUBLICATION_ID=componentm7_1_.PUBLICATION_ID and 
componentm7_.NAMESPACE_ID=componentm7_1_.NAMESPACE_ID and 
componentm7_.ITEM_REFERENCE_ID=componentm7_1_.ITEM_REFERENCE_ID 
cross join TAXFACETStaxonomyit8_ cross join TAXFACETS taxonomyit9_ 
cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey10_ 
cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey11_ 
cross join TAXFACETS taxonomyit12_ 
cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey13_ 
cross join CUSTOM_METAcustommeta14_ 
where itemmeta0_.PUBLICATION_ID=@P1 and 
itemmeta0_.ITEM_REFERENCE_ID=componentm7_.ITEM_REFERENCE_ID and 
itemmeta0_.PUBLICATION_ID=componentm7_.PUBLICATION_ID and 
componentm7_.SCHEMA_ID=@P2 and 
itemmeta0_.ITEM_REFERENCE_ID=relatedkey5_.ITEM_REFERENCE_ID and 
itemmeta0_.PUBLICATION_ID=relatedkey5_.PUBLICATION_ID and 
relatedkey5_.KEYWORD_ID=taxonomyit6_.FACET_ID and 
relatedkey5_.PUBLICATION_ID=taxonomyit6_.PUBLICATION_ID and 
relatedkey5_.PUBLICATION_ID=@P3 and taxonomyit6_.TAXONOMY_ID=@P4 
and taxonomyit6_.FACET_NAME=@P5 and 
itemmeta0_.ITEM_REFERENCE_ID=relatedkey3_.ITEM_REFERENCE_ID and 
itemmeta0_.PUBLICATION_ID=relatedkey3_.PUBLICATION_ID and 
relatedkey3_.KEYWORD_ID=taxonomyit4_.FACET_ID and 
relatedkey3_.PUBLICATION_ID=taxonomyit4_.PUBLICATION_ID and 
relatedkey3_.PUBLICATION_ID=@P6 and taxonomyit4_.TAXONOMY_ID=@P7 and 
taxonomyit4_.FACET_KEY=@P8 and 
itemmeta0_.ITEM_REFERENCE_ID=relatedkey1_.ITEM_REFERENCE_ID and 
itemmeta0_.PUBLICATION_ID=relatedkey1_.PUBLICATION_ID and 
relatedkey1_.KEYWORD_ID=taxonomyit2_.FACET_ID 
and relatedkey1_.PUBLICATION_ID=taxonomyit2_.PUBLICATION_ID and 
relatedkey1_.PUBLICATION_ID=@P9 and taxonomyit2_.TAXONOMY_ID=@P10 and 
taxonomyit2_.FACET_KEY=@P11 and 
taxonomyit8_.PUBLICATION_ID=itemmeta0_.PUBLICATION_ID and 
taxonomyit9_.TAXONOMY_ID=taxonomyit8_.TAXONOMY_ID and 
taxonomyit9_.PUBLICATION_ID=taxonomyit8_.PUBLICATION_ID and 
itemmeta0_.ITEM_REFERENCE_ID=relatedkey10_.ITEM_REFERENCE_ID and 
relatedkey10_.KEYWORD_ID=taxonomyit9_.FACET_ID and 
taxonomyit8_.FACET_KEY=@P12 and taxonomyit8_.TAXONOMY_ID=@P13 and 
taxonomyit8_.PUBLICATION_ID=@P14 and taxonomyit9_.FACET_LEFT>=taxonomyit8_.FACET_LEFT 
and taxonomyit9_.FACET_RIGHT<=taxonomyit8_.FACET_RIGHT and
itemmeta0_.ITEM_REFERENCE_ID=relatedkey13_.ITEM_REFERENCE_ID and 
itemmeta0_.PUBLICATION_ID=relatedkey13_.PUBLICATION_ID and 
relatedkey13_.KEYWORD_ID=@P15 and relatedkey13_.TAXONOMY_ID=@P16 and 
relatedkey13_.PUBLICATION_ID=@P17 and 
itemmeta0_.PUBLICATION_ID=custommeta14_.PUBLICATION_ID and 
itemmeta0_.ITEM_REFERENCE_ID=custommeta14_.ITEM_ID and 
itemmeta0_.ITEM_TYPE=custommeta14_.ITEM_TYPE and 
custommeta14_.KEY_NAME=@P18 and custommeta14_.KEY_FLOAT_VALUE=@P19 and 
itemmeta0_.ITEM_REFERENCE_ID=relatedkey11_.ITEM_REFERENCE_ID and 
itemmeta0_.PUBLICATION_ID=relatedkey11_.PUBLICATION_ID and 
relatedkey11_.KEYWORD_ID=taxonomyit12_.FACET_ID and 
relatedkey11_.PUBLICATION_ID=taxonomyit12_.PUBLICATION_ID and 
relatedkey11_.PUBLICATION_ID=@P20 and taxonomyit12_.TAXONOMY_ID=@P21 and
 taxonomyit12_.FACET_KEY=@P22 
order by itemmeta0_.LAST_PUBLISHED_DATE DESC        
',6,20,1815,20,79,N'2017',20,121,N'09',20,85,N'XXXX',N'1',183,20,2045,77,20,N'xxxxx',1,20,76,N'xxxxx'

We could also retrieve the similar query of our old environment, doing the exact same thing.

exec sp_prepexec @p1 output,N'@P0 int,@P1 int,@P2 int,@P3 int,        
@P4 nvarchar(4000),@P5 int,@P6 int,@P7 nvarchar(4000),@P8 int,@P9 int,
@P10 nvarchar(4000),@P11 nvarchar(4000),@P12 int,@P13 int,@P14 int,
@P15 int,@P16 int,@P17 nvarchar(4000),@P18 float,@P19 int,@P20 int,
@P21 nvarchar(4000)',
N'select distinct top 6    
itemmeta0_.PUBLICATION_ID as col_0_0_, 
itemmeta0_.ITEM_REFERENCE_ID as col_1_0_, 
itemmeta0_.ITEM_TYPE as col_2_0_, itemmeta0_.LAST_PUBLISHED_DATE 
as col_3_0_ from ITEMS itemmeta0_
cross join TAXFACETS taxonomyit1_
cross join TAXFACETS taxonomyit2_
cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey3_
cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey4_
cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey5_
cross join TAXFACETS taxonomyit6_
cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey7_
cross join TAXFACETS taxonomyit8_
cross join COMPONENT componentm9_
inner join ITEMS componentm9_1_ on 
componentm9_.PUBLICATION_ID=componentm9_1_.PUBLICATION_ID and 
componentm9_.ITEM_REFERENCE_ID=componentm9_1_.ITEM_REFERENCE_ID
cross join CUSTOM_META custommeta10_
cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey11_
cross join TAXFACETS taxonomyit12_
cross join ITEM_CATEGORIES_AND_KEYWORDS relatedkey13_
cross join TAXFACETS taxonomyit14_
where itemmeta0_.PUBLICATION_ID=@P0 and 
itemmeta0_.ITEM_REFERENCE_ID=componentm9_.ITEM_REFERENCE_ID and 
itemmeta0_.PUBLICATION_ID=componentm9_.PUBLICATION_ID and 
componentm9_.SCHEMA_ID=@P1 and 
itemmeta0_.ITEM_REFERENCE_ID=relatedkey13_.ITEM_REFERENCE_ID and 
itemmeta0_.PUBLICATION_ID=relatedkey13_.PUBLICATION_ID and 
relatedkey13_.KEYWORD_ID=taxonomyit14_.FACET_ID and 
relatedkey13_.PUBLICATION_ID=taxonomyit14_.PUBLICATION_ID and 
relatedkey13_.PUBLICATION_ID=@P2 and taxonomyit14_.TAXONOMY_ID=@P3 and 
taxonomyit14_.FACET_NAME=@P4 and 
itemmeta0_.ITEM_REFERENCE_ID=relatedkey11_.ITEM_REFERENCE_ID and 
itemmeta0_.PUBLICATION_ID=relatedkey11_.PUBLICATION_ID and 
relatedkey11_.KEYWORD_ID=taxonomyit12_.FACET_ID and 
relatedkey11_.PUBLICATION_ID=taxonomyit12_.PUBLICATION_ID and 
relatedkey11_.PUBLICATION_ID=@P5        and 
taxonomyit12_.TAXONOMY_ID=@P6 and taxonomyit12_.FACET_KEY=@P7 and 
itemmeta0_.ITEM_REFERENCE_ID=relatedkey7_.ITEM_REFERENCE_ID and 
itemmeta0_.PUBLICATION_ID=relatedkey7_.PUBLICATION_ID and 
relatedkey7_.KEYWORD_ID=taxonomyit8_.FACET_ID and 
relatedkey7_.PUBLICATION_ID=taxonomyit8_.PUBLICATION_ID and 
relatedkey7_.PUBLICATION_ID=@P8 and taxonomyit8_.TAXONOMY_ID=@P9 and 
taxonomyit8_.FACET_KEY=@P10 and 
taxonomyit1_.PUBLICATION_ID=itemmeta0_.PUBLICATION_ID and 
taxonomyit2_.TAXONOMY_ID=taxonomyit1_.TAXONOMY_ID and 
taxonomyit2_.PUBLICATION_ID=taxonomyit1_.PUBLICATION_ID and 
itemmeta0_.ITEM_REFERENCE_ID=relatedkey3_.ITEM_REFERENCE_ID and 
relatedkey3_.KEYWORD_ID=taxonomyit2_.FACET_ID and 
taxonomyit1_.FACET_KEY=@P11 and taxonomyit1_.TAXONOMY_ID=@P12 and 
taxonomyit1_.PUBLICATION_ID=@P13 and 
taxonomyit2_.FACET_LEFT>=taxonomyit1_.FACET_LEFT and 
taxonomyit2_.FACET_RIGHT<=taxonomyit1_.FACET_RIGHT and 
itemmeta0_.ITEM_REFERENCE_ID=relatedkey4_.ITEM_REFERENCE_ID and 
itemmeta0_.PUBLICATION_ID=relatedkey4_.PUBLICATION_ID and 
relatedkey4_.KEYWORD_ID=@P14 and relatedkey4_.TAXONOMY_ID=@P15 and 
relatedkey4_.PUBLICATION_ID=@P16 and 
itemmeta0_.PUBLICATION_ID=custommeta10_.PUBLICATION_ID and 
itemmeta0_.ITEM_REFERENCE_ID=custommeta10_.ITEM_ID and 
itemmeta0_.ITEM_TYPE=custommeta10_.ITEM_TYPE and 
custommeta10_.KEY_NAME=@P17 and custommeta10_.KEY_FLOAT_VALUE=@P18 and 
itemmeta0_.ITEM_REFERENCE_ID=relatedkey5_.ITEM_REFERENCE_ID and 
itemmeta0_.PUBLICATION_ID=relatedkey5_.PUBLICATION_ID and 
relatedkey5_.KEYWORD_ID=taxonomyit6_.FACET_ID and 
relatedkey5_.PUBLICATION_ID=taxonomyit6_.PUBLICATION_ID and 
relatedkey5_.PUBLICATION_ID=@P19 and taxonomyit6_.TAXONOMY_ID=@P20 and 
taxonomyit6_.FACET_KEY=@P21 order by itemmeta0_.LAST_PUBLISHED_DATE DESC'
,20,1815,20,79,N'2017',20,121,N'09',20,85,N'XXXX',N'1',183,20,2045,
77,20,N'XXXXX',1,20,76,N'XXXXX'

rtcwz

And then, we compared execution times (both queries running on the same DB server).

quick tip: to do that, just add this before and after your query

set statistics time on
<your query here>
set statistics time off

Using this, we got back these numbers:

SDL Web 8.5 query :
SQL Server Execution Times:
   CPU time = 2219 ms,  elapsed time = 3547 ms.

SDL Tridion 2013 SP1 query :
SQL Server Execution Times:
   CPU time = 172 ms,  elapsed time = 204 ms.

So, our query was performing over 10 times slower than the original one, this wasn’t right.

Analysing both queries, we found two major differences:

  • The order of the joins
  • The extra condition in the inner join
    “…and componentm7_.NAMESPACE_ID=componentm7_1_.NAMESPACE_ID..”

Since variations of this query were executed several times (since we needed to load different blocks of content dynamically from the broker, this explained most of the load times issues we were seeing.

So, now the task was to find out the problem with this query.

Since these queries are generated by the Broker API using Hibernate, we can’t really change them to optimize them, so we forwarded all this information to SDL support, so hopefully they could provide us with an answer.

Off course, we took it one step further.
We couldn’t change those queries, but we could still play with them to see if they can be improved. After some playing around with the query, we found that the extra condition was causing all of the delay. Removing the namespace_ID check in the inner join seemed to speed things up. Instead of 3 seconds to run, the query now also ran in milliseconds (yay!!!).

Now, this didn’t really help us, since we still couldn’t really change the broker API generating that sql statement, so we looked a bit further.

We couldn’t change those queries, but we could try to change the DB (at least the indexes). If we could add an index to optimize that query, and SDL Support would approve this fix (or come up with an alternative fix), our problems would be solved.

So, after some more testing, we found that adding this index to the Component table

CREATE NONCLUSTERED INDEX [index_name_here] 
ON [dbo].[COMPONENT]
(
[ITEM_REFERENCE_ID] ASC,
[PUBLICATION_ID] ASC,
[NAMESPACE_ID] ASC
)WITH (PAD_INDEX = OFF, 
STATISTICS_NORECOMPUTE = OFF, 
SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, 
ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) 
ON [PRIMARY]
GO

Adding this made the performance of the query as fast as the Tridion 2013 version.

As a result, suddenly our problematic page also loaded in 20 seconds (on first load, without anything cached, and iis starting up), instead of in a minute.

I’m not sure if this is the final fix, as we’re still waiting for SDL support to come back with a final answer. Once I get more information, i’ll update this post.

 

Disclaimer :

Changing any of the Tridion CM or Broker databases is not a good practice and is not recommended, and voids your support contract with SDL. Any of the techniques described in this post were done on a development environment, with a backup of the original broker DB.

After we found any improvements by adding indexes, we first consulted SDL Support to get approval to install these. If you ever want to do this as well, first contact SDL support.

One thought on “SDL Web 8.5 broker performance woes.

  1. Probably worth noting that whilst the indexing improves performance it doesn’t address the root cause; the root cause is _likely_ to be some rogue taxonomy sub-queries and SDL Support are looking to resolve this at the core product level.

    Also I believe that we are OK to update the broker database (within reason) but the CM database is a great big no no.

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>