Cost Threshold for Parallelism

Cost Threshold for Parallelism is a server wide setting which specifies the lower bound over which plans will execute in parallel.

The default value on installing SQL Server is 5. This means any query with a cost deemed to be greater than 5 will execute in parallel. There are no units for this measure and none are required. Just know that SQL Server is a cost based engine and determines a cost for each operator in a query plan and an overall cost. We just need to know that 6 is greater than 5 and 50 is less than 199 etc.

You can view and specify this setting like so


EXEC sys.sp_configure @configname = 'cost threshold for parallelism',
 @configvalue = 1000;

RECONFIGURE;

SELECT *
FROM sys.configurations
WHERE name = 'cost threshold for parallelism';

If we look at the execution plan for the following query


DBCC FREEPROCCACHE
DBCC SETCPUWEIGHT(1000)
GO

USE AdventureWorks2012;
SELECT * FROM Sales.SalesOrderHeader AS h
INNER JOIN Sales.SalesOrderDetail AS d ON d.SalesOrderID = h.SalesOrderID
GO

DBCC SETCPUWEIGHT(1)
DBCC FREEPROCCACHE
go

We see that the overall cost is 538.914 (DBCC SETCPUWEIGHT is an undocumented command allowing me to artificially increase the cost here for demo purposes).

Cost for query

This means we will have a serial plan as our cost threshold is 1000 as set above. If I change the cost threshold to 538, then the new plan looks like this

Parallel plan

We know this is a parallel plan because of the double yellow arrows on the operators.

The big question remains however, what is the optimal value for cost threshold for parallelism? There is no definitive correct answer to this question. There is overhead for SQL Server to split a workload over multiple cores, then merge the output back together. That is why everything shouldn’t be parallel, for low cost queries it is detrimental for performance.

The range of valid values are 0 – 32,767 and trying the full range would probably result in a bell curve, too low and everything is parallel, too high and the majority of queries will remain serial and you will lose the potential performance benefit.

The answer is to test. Run a production like workload with different cost threshold values on a test server and monitor all metrics using a third party tool and compare to your baseline. If you were feeling very brave, you could change the value in live and monitor similarly. I have heard wiser people than me say they change the default to 25 or 50 as a first sensible change.

Some further points before wrapping up. When threads sharing the load in a parallel plan are waiting for the last to complete, the wait type is CXPACKET. Similarly when the repartition operator is waiting on all threads to complete and deliver their results, the wait type is the same. This is why CXPACKET waits are a perfectly normal symptom of parallelism and are not necessarily something to worry about when performance tuning.

Lastly, the whole post above depends on you having more than one CPU core of course. Also that the number of available CPU cores for use has not been throttled by setting max degree of parallelism to be 1 at either the instance or query level.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s