Parameter sniffing

A previous blog post regarding querying the plan cache, referred to a concept called parameter sniffing, here I will examine this further. As an example, I will create the following table and stored procedure: –

USE DBA;

SET ANSI_NULLS ON;
SET QUOTED_IDENTIFIER ON;
SET ANSI_PADDING ON;
SET NOCOUNT ON;

IF EXISTS (SELECT 1 FROM sys.tables WHERE [object_id] = OBJECT_ID(N'dbo.Sale'))
BEGIN;
DROP TABLE dbo.Sale;
END;

CREATE TABLE dbo.Sale (
SaleID INT IDENTITY(1,1) NOT NULL,
Dt DATETIME NOT NULL CONSTRAINT df_Sale_Dt DEFAULT (CURRENT_TIMESTAMP),
CountryID INT NOT NULL
);

INSERT INTO dbo.Sale (CountryID)
SELECT 1
FROM master.dbo.spt_values AS a
CROSS JOIN master.dbo.spt_values AS b;

INSERT INTO dbo.Sale (CountryID) VALUES (2);

ALTER TABLE dbo.Sale ADD CONSTRAINT pkSale PRIMARY KEY CLUSTERED (SaleID);
CREATE NONCLUSTERED INDEX ix_Sale_CountryID ON dbo.Sale(CountryID);
GO

USE DBA;

SET ANSI_NULLS ON;
SET QUOTED_IDENTIFIER ON;
SET ANSI_PADDING ON;

IF EXISTS (SELECT 1 FROM sys.procedures WHERE [object_id] = OBJECT_ID(N'dbo.LatestOrder'))
BEGIN;
DROP PROCEDURE dbo.LatestOrder;
END;

GO

CREATE PROCEDURE dbo.LatestOrder (@CountryID INT) AS

SELECT MaxDt = MAX(Dt)
FROM dbo.Sale
WHERE CountryID = @CountryID;

GO

The data inserted was for 6,325,225 sales in CountryID=1 but only 1 sale in CountryID=2. The stored procedure returns the maximum order date for the CountryID. If I then execute the dbo.LatestOrder stored procedure three times like so: –


SET STATISTICS IO ON;
EXECUTE sp_recompile N'dbo.LatestOrder';
EXECUTE dbo.LatestOrder @CountryID = 2;
EXECUTE dbo.LatestOrder @CountryID = 1;
EXECUTE sp_recompile N'dbo.LatestOrder';
EXECUTE dbo.LatestOrder @CountryID = 1;

We see the following three plans: –

LatestOrder query plans

And logical reads of: –

Table ‘Sale’. Scan count 1, logical reads 6, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Sale’. Scan count 1, logical reads 18986644, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Sale’. Scan count 5, logical reads 19743, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

A plan is compiled for the first execution of the stored procedure that correctly estimates that one row will result from the initial seek. That plan is then placed into cache and is subsequently used for the second execution. The second execution again estimates one row will be returned from the index seek where actually 6,325,225 rows are returned. This is parameter sniffing – the parameters used for the first execution are those used to create the cached execution plan and are used in estimates for subsequent executions regardless of the actual parameter values in those subsequent executions. We can see in the third example the optimal query plan for @CountryID = 1 after all previous plans for this procedure are removed from cache. This results in 19,743 logical reads or 0.1% of the 18,986,644 logical reads from the second cached plan.

This example also demonstrates query costs in actual execution plans are in fact estimated. These % cost estimates should never be used as a basis of query tuning as they are always estimated, even in actual plans. They simply corroborate the erroneous estimates that caused the performance issue in the first place.

There are different fixes that could be applied here. We could perform some index tuning to add an included column: –

CREATE NONCLUSTERED INDEX ix_Sale_CountryID ON dbo.Sale(CountryID) INCLUDE (Dt) WITH DROP_EXISTING;

That will not remove parameter sniffing, but will reduce the differential of logical reads between queries two and three. Further work would probably be required.

We could also add the WITH RECOMPILE statement to the stored procedure definition, add WITH RECOMPILE to an individual statement within a stored procedure or lastly to the invocation of the stored procedure. This will cause the execution plan to be compiled for each invocation. This may be suitable for longer running stored procedures where the plan compilation cost is negligible but certainly will not be appropriate in all scenarios and should be used with caution.

Another option would be to add sub-procedures within the main procedure and logically flow execution to the appropriate sub-procedure depending on some initial user-defined logic. Those sub-procedures will each then have a plan in cache with their own set of sniffed parameters with an appropriate plan for each. This extra development overhead might be suitable in certain circumstances where parameter sniffing is a real issue causing performance pain.

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s