Analyzing CXPACKET waits
One of the most common wait types you will see on a SQL Server is the CXPACKET wait type.
This wait type is usually at the top of the list when you query the sys.dm_os_wait_stats on servers where you configured the MDOP setting to be anything but ‘1’. CXPACKET waits have everything to do with how your SQL Server handles parallelism.
Does this mean your server is running into problems concerning parallelism? Well as most things concerning SQL Server, it depends…
This article will explain how CXPACKET waits are collected and help you get a better understanding of this wait type.
What is CXPACKET and how does it work
The way Microsoft describes CXPACKET waits in BOL is
Occurs with parallel query plans when trying to synchronize the query processor exchange iterator
To make this a bit easier to understand take a look at the image bellow, this is a case where a query (for instance a table scan) generated a parallel plan and the load on each of the 4 threads is divided equally:
Notice that next to the 4 threads that actually did the work there is a Thread 0. This thread is the actually time it takes the complete query to run and is the thread that records the CXPACKET wait if a parallel plan is used to execute the query.
In the example above we can see the load is divided equally on the 4 available threads. A lot of times though this load will not be divided equally and some threads have more work to do then other threads. In that case the image will look a little different as illustrated by the image below:
In this case we can see that Thread 1 and Thread 4 are completed while Thread 2 and Thread 3 are still running with Thread 2 finishing last. The Thread 0 records the time between the start of the threads till the last thread is finished, in this case your CXPACKET wait time will be the time Thread 2 ran.
Behavior like this can indicate parallelism problems since the work load was not distributed equally between the available 4 threads. To use parallelism optimally it is very important that the work load is distributed equally between the threads!
You can find out how your workload is balanced between the available threads by viewing the properties of your execution plan. The image below shows the distribution for a query I ran on my test server:
In this case you can see that 69236 rows where processed by thread 1 and 73398 by thread 2.
A way to avoid getting a skewed distribution between your threads in parallel plans, is making sure your statistics are up-to-date.
As you hopefully noticed the CXPACKET doesn’t necessarily indicate a problem, it just shows you parallelism is being used. It can be a problem if executing the query serially (using no parallelism) is actually faster then using a parallel plan or the amount of work is not distributed equally as you can see in the image above.
How to reduce CXPACKET waits
Rule number one of fixing CXPACKET waits is don’t freak out and turn of parallelism! Setting MDOP to 1 to just disable parallelism altogether is generally not a good idea, maybe a lot of queries do benefit from using a parallel plan and turning of parallelism will slow down those queries.
A good start is to take a look at your MDOP setting. By default SQL Server will set your MDOP value to ‘0’ which means every available scheduler can be used by parallel plans. This was fine when we only had 4 processors available in our servers but, nowadays it can be overkill to use all your schedulers in a parallel plan. This is because when using more then 8 schedulers diminishing returns will kick in. Microsoft actually recommends the following configuration in kb2806535 :
- Servers with more then 8 cores -> MAXDOP 8
- Servers with 8 or fewer cores -> MAXDOP 0 to N (where N is the number of cores in your machine)
There are a lot of discussions on the internet discussing the optimal MDOP setting depending on your database type. For OLTP workloads you should set your MDOP to ‘1’ to disable parallelism completely, while for datawarehouse workloads you should set it to a higher value to use parallelism. Do not blindly follow this advice! Even in most OLTP databases huge queries are being run from time to time that would have benefited from parallelism if it was enabled. The key to setting your MDOP value is knowing your workload!
Next to setting your MDOP option you should take a look at the “Cost threshold for parallelism”. This means the cost of a query needs to be X or higher before it has a chance to generate a parallel plan. The default value of the “Cost threshold for parallelism” is ‘5’. This is actually a pretty low value by default which means a lot of queries that are actually better of running a serial plan will receive a parallel plan.
You can view the cost of your query plan by enabling the “Include actual execution plan” option in SSMS and mouse over the first step in the Execution Plan window after execution your query.
In this case the cost of this serial query is 12,4859 and with the default setting of ‘5’ at the “Cost threshold for parallelism” this will generate a parallel plan. Keep in mind a parallel plan will only be generated when the cost of a serial (non-parallelized) plan is higher then the “Cost threshold for parallelism” setting!
Again it is very important to understand your workload before changing this setting. A good way to do this is taking a look at the plan cache for parallel plans. There are various scripts you can use to do this, for example Pinal Dave has a great query you can use over at his blog: //blog.sqlauthority.com/2010/07/24/sql-server-find-queries-using-parallelism-from-cached-plan/
Take a look at the plans and try to find out what your optimal “Cost threshold for parallelism” should be.
CXPACKET waits on your SQL Server doesn’t necessarily mean you are running into trouble so avoid freaking out and turning of parallelism completely! Try to analyze your workload and change your “MDOP” and “Cost threshold for parallelism” for optimized parallelism usage for your system!