Cost Threshold for Parallelism is a server wide setting which specifies the lower bound over which plans will execute in parallel.
The default value on installing SQL Server is 5. This means any query with a cost deemed to be greater than 5 will execute in parallel. There are no units for this measure and none are required. Just know that SQL Server is a cost based engine and determines a cost for each operator in a query plan and an overall cost. We just need to know that 6 is greater than 5 and 50 is less than 199 etc.
You can view and specify this setting like so
EXEC sys.sp_configure @configname = 'cost threshold for parallelism', @configvalue = 1000; RECONFIGURE; SELECT * FROM sys.configurations WHERE name = 'cost threshold for parallelism';
If we look at the execution plan for the following query
DBCC FREEPROCCACHE DBCC SETCPUWEIGHT(1000) GO USE AdventureWorks2012; SELECT * FROM Sales.SalesOrderHeader AS h INNER JOIN Sales.SalesOrderDetail AS d ON d.SalesOrderID = h.SalesOrderID GO DBCC SETCPUWEIGHT(1) DBCC FREEPROCCACHE go
We see that the overall cost is 538.914 (DBCC SETCPUWEIGHT is an undocumented command allowing me to artificially increase the cost here for demo purposes).
This means we will have a serial plan as our cost threshold is 1000 as set above. If I change the cost threshold to 538, then the new plan looks like this
We know this is a parallel plan because of the double yellow arrows on the operators.
The big question remains however, what is the optimal value for cost threshold for parallelism? There is no definitive correct answer to this question. There is overhead for SQL Server to split a workload over multiple cores, then merge the output back together. That is why everything shouldn’t be parallel, for low cost queries it is detrimental for performance.
The range of valid values are 0 – 32,767 and trying the full range would probably result in a bell curve, too low and everything is parallel, too high and the majority of queries will remain serial and you will lose the potential performance benefit.
The answer is to test. Run a production like workload with different cost threshold values on a test server and monitor all metrics using a third party tool and compare to your baseline. If you were feeling very brave, you could change the value in live and monitor similarly. I have heard wiser people than me say they change the default to 25 or 50 as a first sensible change.
Some further points before wrapping up. When threads sharing the load in a parallel plan are waiting for the last to complete, the wait type is CXPACKET. Similarly when the repartition operator is waiting on all threads to complete and deliver their results, the wait type is the same. This is why CXPACKET waits are a perfectly normal symptom of parallelism and are not necessarily something to worry about when performance tuning.
Lastly, the whole post above depends on you having more than one CPU core of course. Also that the number of available CPU cores for use has not been throttled by setting max degree of parallelism to be 1 at either the instance or query level.