Caching Mechanisms

Artigo
01/24/2014

SQL Server can avoid compilations of previously executed queries by using four mechanisms to make plan caching accessible in a wide set of situations.

Adhoc query caching
Autoparameterization
Prepared queries, using either sp_executesql or the prepare and execute method invoked through your API
Stored procedures or other compiled objects (triggers, TVFs, etc.)

To determine which mechanism is being used for each plan in cache, we need to look at the values in the cacheobjtype and objtype columns in the sys.dm_exec_cached_plans view. The cacheobjtype column can have one of five possible values:

Compiled Plan
Parse Tree
Extended Proc
CLR Compiled Func
CLR Compiled Proc

In this section, the only value we’ll be looking at is Compiled Plan. Notice that we filter usecount query to limit the results to rows with this value.

There are 11 different possible values for the objtype column:

Proc (Stored procedure)
Prepared (Prepared statement)
Adhoc (Adhoc query)
ReplProc (Replication-filter-procedure)
Trigger ()
View
Default (Default constraint or default object)
UsrTab (User table)
SysTab (System table)
Check (CHECK constraint)
Rule (Rule object)

We’ll be mainly examining the first three values, but many caching details that apply to stored procedures also apply to replication filter procedures and triggers.

Adhoc Query Caching

If the caching metadata indicates a cacheobjtype value of Compiled Plan and an objtype value of Adhoc, the plan is considered to be an adhoc plan. Prior to SQL Server 2005, adhoc plans were occasionally cached, but it was not something on which you could depend. However, even when SQL Server caches your adhoc queries, you might not be able to depend on their reuse. When SQL Server caches the plan from an adhoc query, the cached plan will be used only if a subsequent batch matches exactly. This feature requires no extra work to use, but it is limited to exact textual matches. For example, if the following three queries are executed in the Northwind2 database (which can be found on the companion website), the first and third queries will use the same plan, but the second one will need to generate a new plan:

SELECT * FROM Orders WHERE CustomerID = 'HANAR'
SELECT * FROM Orders WHERE CustomerID = 'CHOPS'
SELECT * FROM Orders WHERE CustomerID = 'HANAR'

You can verify this by first clearing out plan cache, and then running the three queries above, in separate batches. Then run the usecount query referred to above:

USE Northwind2;
DBCC FREEPROCCACHE;
GO
SELECT * FROM Orders WHERE CustomerID = 'HANAR';
GO
SELECT * FROM Orders WHERE CustomerID = 'CHOPS';
GO
SELECT * FROM Orders WHERE CustomerID = 'HANAR';
GO
SELECT usecounts, cacheobjtype, objtype, [text]
FROM sys.dm_exec_cached_plans P
    CROSS APPLY sys.dm_exec_sql_text (plan_handle)
WHERE cacheobjtype = 'Compiled Plan'
    AND [text] NOT LIKE '%dm_exec_cached_plans%';

You should get two rows back, because the NOT LIKE condition filters out the row for the usecount query itself. The two rows are shown here and indicate that one plan was used only once, and the other was used twice:

Cc293623.table_C05621961_a(en-us,TechNet.10).jpg

The results show that with a change of the actual data value, the same plan cannot be reused. However, to take advantage of reuse of adhoc query plans, you need to make sure that not only are the same data values used in the queries, but that the queries are identical, character for character. If one query has a new line or an extra space that another one doesn’t have, they will not be treated as the same query. If one query contains a comment that the other doesn’t have, they will not be identical. In addition, if one query uses a different case for either identifiers or keywords, even in a database with a case-insensitive collation, the queries will not be the same. If you run the code below, you will see that none of the queries can reuse the same plan.

USE Northwind2; 
DBCC FREEPROCCACHE; 
GO
SELECT * FROM orders WHERE customerID = 'HANAR' 
GO
-- Try it again
SELECT * FROM orders WHERE customerID = 'HANAR' 
GO
SELECT * FROM orders
WHERE customerID = 'HANAR'; 
GO
SELECT * FROM Orders WHERE CustomerID = 'HANAR' 
GO 
select * from orders where customerid = 'HANAR' 
GO
SELECT usecounts, cacheobjtype, objtype, [text] 
FROM sys.dm_exec_cached_plans P
CROSS APPLY sys.dm_exec_sql_text (plan_handle) 
WHERE cacheobjtype = 'Compiled Plan'
AND [text] NOT LIKE '%dm_exec_cached_plans%';

Your results should show five rows in sys.dm_exec_cached_plans, each with a usecounts value of 1.

Note Note that the SELECT statements are all in their own batch, separated by GO. If there were not GOs, there would just be one batch, and each batch has its own plan containing the execution plan for each individual query within the batch. For reuse of adhoc query plans, the entire batch must be identical.

Autoparameterization

For certain queries, SQL Server can decide to treat one or more of the constants as parameters. When this happens, subsequent queries that follow the same basic template can use the same plan. For example, these two queries run in the Northwind2 database can use the same plan:

SELECT FirstName, LastName, Title FROM Employees 
WHERE EmployeeID = 6; 
SELECT FirstName, LastName, Title FROM Employees 
WHERE EmployeeID = 2;

Internally, SQL Server parameterizes these queries as follows:

SELECT FirstName, LastName, Title FROM Employees WHERE EmployeeID = @1

You can observe this behavior by running the following code, and observing the output of the usecount query:

USE Northwind2 
GO
DBCC FREEPROCCACHE; 
GO
SELECT FirstName, LastName, Title FROM Employees WHERE EmployeeID = 6; 
GO
SELECT FirstName, LastName, Title FROM Employees WHERE EmployeeID = 2; 
GO
SELECT usecounts, cacheobjtype, objtype, [text] 
FROM sys.dm_exec_cached_plans P
CROSS APPLY sys.dm_exec_sql_text (plan_handle) 
WHERE cacheobjtype = 'Compiled Plan'
AND [text] NOT LIKE '%dm_exec_cached_plans%'; 
GO

You should get three rows returned, similar to the following:

Cc293623.table_C05621961_b(en-us,TechNet.10).jpg

You should notice that the two individual queries with their distinct constants do get cached as adhoc queries. However, these are only considered shell queries and are only cached to make it easier to find the autoparameterized version of the query if the exact same query with the same constant is reused at a later time. These shell queries do not contain the full execution plan but only a pointer to the full plan in the corresponding prepared plan. The third row returned from sys.dm_exec_cached_plans has an objtype of prepared. The query plan is associated with the prepared plan, and you can observe that the plan was used twice. In addition, the text now shows a parameter in place of a constant.

By default, SQL Server is very conservative about deciding when to autoparameterize. SQL Server will autoparameterize queries only if the query template is considered to be safe. A template is safe if the plan selected will not change even if the actual parameters change. This ensures that autoparameterization won’t degrade a query’s performance. The employees table used in the queries above has a unique index, so any query that has an equality comparison on employeeID is guaranteed to never find more than one row, so a plan using a seek on that unique index can be useful no matter what actual value is used.

However, consider a query that has either an inequality comparison or an equality comparison on a nonunique column. In those situations, some actual values may return many rows, and others return no rows, or only one. A nonclustered index seek might be a good choice when only a few rows are returned, but a terrible choice when many rows are returned. So a query for which there is more than one possible best plan, depending on the value used in the query, is not consider safe, and it will not be autoparameterized. By default, the only way for SQL Server to reuse a plan for such a query is to use the adhoc plan caching described in the previous section.

In addition to requiring that there only be one possible plan for a query template, there are many query constructs that normally disallow autoparameterization. Such constructs include any statements with the following elements:

JOIN
BULK INSERT
IN lists
UNION
INTO
FOR BROWSE
OPTION <query hints>
DISTINCT
TOP
WAITFOR statements
GROUP BY, HAVING, COMPUTE
Full-text predicates
Subqueries
FROM clause of a SELECT statement has table valued method or full-text table or OPENROWSET or OPENXML or OPENQUERY or OPENDATASOURCE
Comparison predicate of the form EXPR <> a non-null constant

Autoparameterization is also disallowed for data modification statements that use the following constructs:

DELETE/UPDATE with FROM CLAUSE
UPDATE with SET clause that has variables

Forced Parameterization

If your application uses many similar queries that you know will benefit from the same plan, but are not autoparameterized, either because SQL Server doesn’t consider the plans safe or because they use one of the disallowed constructs, SQL Server 2005 provides an alternative. A new database option called FORCED PARAMETERIZATION can be enabled with the following command:

ALTER DATABASE <database_name> SET PARAMETERIZATION FORCED;

Once this option is enabled, SQL Server will always treat constants as parameters, with only a very few exceptions. These exceptions as listed in the SQL Server 2005 Books Online include:

INSERT . . . EXECUTE statements.
Statements inside the bodies of stored procedures, triggers, or user-defined functions. SQL Server already reuses query plans for these routines.
Prepared statements that have already been parameterized on the client-side application.
Statements that contain XQuery method calls, in which the method appears in a context in which its arguments would typically be parameterized, such as a WHERE clause. If the method appears in a context in which its arguments would not be parameterized, the rest of the statement is parameterized.
Statements inside a Transact-SQL cursor. (SELECT statements inside API cursors are parameterized.)
Deprecated query constructs.
Any statement that is run in the context of ANSI_PADDING or ANSI_NULLS set to OFF.
Statements that contain more than 2,097 literals.
Statements that reference variables, such as WHERE T.col2 >= @p.
Statements that contain the RECOMPILE or OPTIMIZE FOR query hints.
Statements that contain a COMPUTE clause.
Statements that contain a WHERE CURRENT OF clause.

You need to be careful when setting this option on for the entire database, because assuming that all constants should be treated as parameters during optimization, and then reusing existing plans, frequently gives very poor performance. An alternative that allows only selected queries to be autoparameterized is to use plan guides, which will be discussed at the end of this chapter. In addition, plan guides can also be used to override forced parameterization for selected queries, if the database has been set to FORCED PARAMETERIZATION.

Drawbacks of Autoparameterization

A feature of autoparameterization that you might have noticed in the output from the use-count query above is that SQL Server makes its own decision as to the datatype of the parameter, which might not be the datatype you think should be used. In the earlier example, looking at the employees table, SQL Server chose to assume a parameter of type tinyint. If we rerun the batch, and use a value that doesn’t fit into the tinyint range (that is, a value less than 0 or larger than 255), SQL Server will not be able to use the same autoparameterized query. The batch below autoparameterizes both SELECT statements, but it is not able to use the same plan for both queries. The output from the usecount query should show two adhoc shell queries, and two prepared queries. One prepared query will have a parameter of type tinyint and the other will be smallint. As strange as it may seem, even if you switch the order of the queries, and use the bigger value first, you will get two prepared queries with two different parameter datatypes.

USE Northwind2 
GO
DBCC FREEPROCCACHE; 
GO
SELECT FirstName, LastName, Title FROM Employees WHERE EmployeeID = 6; 
GO
SELECT FirstName, LastName, Title FROM Employees WHERE EmployeeID = 622; 
GO
SELECT usecounts, cacheobjtype, objtype, [text] 
FROM sys.dm_exec_cached_plans P
CROSS APPLY sys.dm_exec_sql_text (plan_handle) 
WHERE cacheobjtype = 'Compiled Plan'
AND [text] NOT LIKE '%dm_exec_cached_plans%'; 
GO

The only way to force SQL Server to use the same datatype for both queries is to enable FORCED PARAMETERIZATION for the database.

As mentioned, autoparameterization is not always appropriate, which is why SQL Server is so conservative in choosing to use it. Consider the following example. The BigOrders table in the Northwind2 database has 4,150 rows and 105 pages, so we might expect that a table scan reading 105 pages would be the worst possible performance for any query accessing the BigOrders table. There is a nonclustered nonunique index on the CustomerID column. If we enabled forced parameterization for the Northwind2 database, the plan used for the first SELECT will also be used for the second SELECT, even though the constants are different. The first query returns 5 rows and the second returns 155. Normally, a nonclustered index would be chosen for the first SELECT, and a clustered index scan for the second, because the number of qualifying rows exceeds the number of pages in the table. However, with PARAMETERIZATION FORCED, that’s not what we get, as you can see when you run the code below.

USE Northwind2
GO
ALTER DATABASE Northwind2 SET PARAMETERIZATION FORCED 
GO
SET STATISTICS IO ON 
GO
DBCC FREEPROCCACHE 
GO
SELECT * FROM BigOrders WHERE CustomerID = 'CENTC' 
GO
SELECT * FROM BigOrders WHERE CustomerID = 'SAVEA' 
GO
SELECT usecounts, cacheobjtype, objtype, [text] 
FROM sys.dm_exec_cached_plans P
CROSS APPLY sys.dm_exec_sql_text (plan_handle) 
WHERE cacheobjtype = 'Compiled Plan'
AND [text] NOT LIKE '%dm_exec_cached_plans%'; 
GO
ALTER DATABASE Northwind2 SET PARAMETERIZATION SIMPLE 
GO

When we run the above code, we see that the first SELECT required 12 logical reads and the second required 312, almost three times as many reads as would have been required if scanning the table. The output of the usecount query shows that forced parameterization was applied and the parameterized prepared plan was used twice:

Cc293623.table_C05621961_c(en-us,TechNet.10).jpg

In this example, forcing SQL Server to treat the constant as a parameter is not a good thing, and the batch sets the database back to SIMPLE PARAMETERIZATION (the default) as the last step. Note also that while we are using PARAMETERIZATION FORCED, the datatype chosen for the autoparameterized query is the largest possible regular character datatype.

So what can you do if you have many queries that should not be autoparameterized, and many others that should be? As we’ve seen, the SQL Server query processor is much more conservative about deciding whether a template is safe than an application can be. SQL Server guesses which values are really parameters, whereas your application should actually know. Rather than rely on autoparameterization in all cases, you can use one of the prepared query mechanisms to mark values as parameters when they are known.

The SQL Server Performance Monitor includes a counter called SQLServer:SQL Statistics that has several counters dealing with autoparameterization. You can monitor these counters to determine whether there are many unsafe or failed autoparameterization attempts. If these numbers are high, you can inspect your applications for situations in which the application can take responsibility for explicitly marking the parameters.

Prepared Queries

As we saw previously, a query that is autoparameterized by SQL Server shows an objtype of prepared in the cached plan metadata. There are two other constructs that also show up as having prepared plans. Both of these constructs allow the programmer to take control over which values are parameters and which aren’t and in addition, unlike simple autoparameterization, the programmer also determines the datatype that will be used for the parameters. One construct is the SQL Server stored procedure sp_executesql, that is called from within a Transact-SQL batch, and the other is to use the prepare/execute method from the client application.

The sp_executesql Procedure

The stored procedure sp_executesql is halfway between adhoc caching and stored procedures. Using sp_executesql requires that you identify the parameters and their datatypes, but doesn’t require all the persistent object management needed for stored procedures and other programmed objects.

Here’s the general syntax for the procedure:

sp_executesql @batch_text, @batch_parameter_definitions, 
    param1,...paramN

Repeated calls with the same values for @batch_text and @batch_parameter_definitions use the same cached plan, with the new parameter values specified. The plan will be reused as long as the plan has not been removed from cache, and as long as it is still valid. The section “Causes of Recompilation,” later in this chapter, discusses those situations in which SQL Server determines that a plan is no longer valid. The same cached plan can be used for all the following queries:

EXEC sp_executesql N'SELECT FirstName, LastName, Title
FROM Employees
WHERE EmployeeID = @p', N'@p tinyint', 6
EXEC sp_executesql N'SELECT FirstName, LastName, Title
FROM Employees
WHERE EmployeeID = @p', N'@p tinyint', 2
EXEC sp_executesql N'SELECT FirstName, LastName, Title
FROM Employees
WHERE EmployeeID = @p', N'@p tinyint', 6

Just like forcing autoparameterization, using sp_executesql to force reuse of a plan is not always appropriate. If we take the same example used earlier when we set the database to PARAMETERIZATION FORCED, we can see that using sp_executesql is just as inappropriate.

USE Northwind2; 
GO
SET STATISTICS IO ON; 
GO
DBCC FREEPROCCACHE; 
GO
EXEC sp_executesql N'SELECT * FROM BigOrders
WHERE CustomerID = @p', N'@p nvarchar(10)', 'CENTC'; 
GO
EXEC sp_executesql N'SELECT * FROM BigOrders
WHERE CustomerID = @p', N'@p nvarchar(10)', 'SAVEA'; 
GO
SELECT usecounts, cacheobjtype, objtype, [text] 
FROM sys.dm_exec_cached_plans P
CROSS APPLY sys.dm_exec_sql_text (plan_handle) 
WHERE cacheobjtype = 'Compiled Plan'
AND [text] NOT LIKE '%dm_exec_cached_plans%'; 
GO
SET STATISTICS IO OFF; 
GO

Again, we can see that the first SELECT required 12 logical reads and the second required 312. The output of the usecount query shows the parameterized query being used twice. Note that with sp_executesql, we do not have any entries for the adhoc shell query (unparameterized) queries.

Cc293623.table_C05621961_d(en-us,TechNet.10).jpg

The Prepare and Execute Method

This last mechanism is like sp_executesql in that parameters to the batch are identified by the application, but there are some key differences. The prepare and execute method does not require the full text of the batch to be sent at each execution. Rather, the full text is sent once at prepare time; a handle that can be used to invoke the batch at execute time is returned.

ODBC and OLE DB expose this functionality via SQLPrepare/SQLExecute and ICommandPrepare. You can also use this mechanism via ODBC and OLE DB when cursors are involved. When you use these functions, SQL Server is informed that this batch is meant to be used repeatedly.

Caching Prepared Queries

If your queries have been parameterized at the client using the Prepare/Execute method, the metadata will show you prepared queries, just as for queries that are parameterized at the server, either automatically or by using sp_executesql. However, queries that are not autoparameterized (either under SIMPLE or FORCED parameterization) will not have any corresponding adhoc shell queries in cache, containing the unparameterized actual values; they will only have the prepared plans. There is no guaranteed way to detect whether a prepared plan was prepared through autoparameterization or through client-side parameterization. If you see a corresponding shell query, you can know that the query was autoparameterized, but the opposite is not always true. Because the shell queries have a zero cost, they are among the first candidates to be removed when SQL Server is under memory pressure. So a lack of a shell query might just mean that adhoc plan was already removed from cache, not that there never was a shell query.

Compiled Objects

When looking at the metadata in sys.dm_exec_cached_plans, we’ve seen compiled plans with objtype values of adhoc and prepared. The third objtype value that we will be discussing is Proc, and you will see this type used when executing stored procedures, triggers, user-defined scalar functions, and multistatement table-valued functions. With the exception of triggers, which are never called directly, you have full control over what values are parameters and what their datatypes are when executing these objects.

Stored Procedures

Stored procedures and user-defined scalar functions are treated almost identically. The meta-data indicates that a compiled plan with an objtype value of Proc is cached and can be reused repeatedly. By default, the cached plan will be reused for all successive executions, and as we’ve seen with the sp_executesql, this is not always desirable. However, unlike the plans cached and reused with sp_executesql, you have an option with stored procedures and user-defined scalar functions to force recompilation when the object is executed. In addition, for stored procedures, you can create the object so that a new plan is created every single time it is executed.

To force recompilation for a single execution, you can use the EXECUTE . . . WITH RECOMPILE option. Here is an example in the Northwind2 database of forcing recompilation for a stored procedure:

USE Northwind2; 
GO
CREATE PROCEDURE P_Customers
@cust nvarchar(10) 
AS
SELECT RowNum, CustomerID, OrderDate, ShipCountry 
FROM BigOrders 
WHERE CustomerID = @cust;
GO
DBCC FREEPROCCACHE; 
GO
SET STATISTICS IO ON; 
GO
EXEC P_Customers 'CENTC'; 
GO
EXEC P_Customers 'SAVEA'; 
GO
EXEC P_Customers 'SAVEA' WITH RECOMPILE;

If you look at the output from STATISTICS IO, you’ll see that the second execution used a suboptimal plan that required more pages to be read than would be needed by a table scan. This is the situation that you may have seen referred to as parameter sniffing. SQL Server is basing the plan for the procedure on the first actual parameter, in this case, CENTC, and then subsequent executions assume the same or a similar parameter is used. The third execution uses the WITH RECOMPILE option to force SQL Server to come up with a new plan, and you should see that the number of logical page reads is equal to the number of pages in the table.

If you look at the results from running the usecounts query, you should see that the cached plan for P_Customers procedure has a usecounts value of 2, instead of 3. The plan developed for a procedure executed with the WITH RECOMPILE option is just considered valid for the current execution, and is never kept in cache for reuse.

Cc293623.table_C05621961_e(en-us,TechNet.10).jpg

Functions

User-defined scalar functions can behave exactly the same way as procedures. If you execute them using the EXECUTE statement, instead of as part of an expression, you can also force recompilation. Here is an example of a function that masks part of a Social Security number. We will create it in the pubs sample database, because the authors table contains a Social Security number in the au_id column.

USE pubs; 
GO
CREATE FUNCTION dbo.fnMaskSSN (@ssn char(11)) 
RETURNS char(11) 
AS
BEGIN
SELECT @SSN = 'xxx-xx-' + right (@ssn,4) 
RETURN @SSN
END; 
GO
DBCC FREEPROCCACHE; 
GO
SET STATISTICS IO ON; 
GO
DECLARE @mask char(11);
EXEC @mask = dbo.fnMaskSSN '123-45-6789'; 
SELECT @mask; 
GO
DECLARE @mask char(11);
EXEC @mask = dbo.fnMaskSSN '123-66-1111'; 
SELECT @mask; 
GO
DECLARE @mask char(11);
EXEC @mask = dbo.fnMaskSSN '123-66-1111' WITH RECOMPILE; 
SELECT @mask; 
GO

If you run the usecounts query you should notice the cached plan for the function has an objtype of proc and has a usecounts value of 2. If a scalar function is used within an expression, as in the example below, there is no way to request recompilation.

SELECT dbo.fnMaskSSN(au_id), au_lname, au_fname, au_id FROM authors;

TVFs may or may not be treated like procedures depending on how you define them. You can define a TVF as an inline function, or as a multistatement function. Neither method allows you to force recompilation when the function is called. Here are two functions that do the same thing:

CREATE FUNCTION Fnc_Inline_Customers (@cust nvarchar(10)) 
RETURNS TABLE
AS
RETURN
(SELECT RowNum, CustomerID, OrderDate, ShipCountry 
FROM BigOrders 
WHERE CustomerID = @cust);
GO
CREATE FUNCTION Fnc_Multi_Customers (@cust nvarchar(10))
RETURNS @T TABLE (RowNum int, CustomerID nchar(10), OrderDate datetime,
ShipCountry nvarchar(30)) 
AS
BEGIN
 INSERT INTO @T
      SELECT RowNum, CustomerID, OrderDate, ShipCountry 
   FROM BigOrders 
   WHERE CustomerID = @cust
 RETURN 
 END; 
 GO

Here are the calls to the functions:

DBCC FREEPROCCACHE 
GO
SELECT * FROM Fnc_Multi_Customers('CENTC') 
GO
SELECT * FROM Fnc_Inline_Customers('CENTC') 
GO
SELECT * FROM Fnc_Multi_Customers('SAVEA') 
GO
SELECT * FROM Fnc_Inline_Customers('SAVEA') 
GO

If you run the usecounts query you will see that only the multistatement function has its plan reused. The inline function is actually treated like a view, and the only way the plan can be reused would be if the exact same query were reexecuted, that is, if the same SELECT statement called the function with the exact same parameter.

Causes of Recompilation

Up to this point, we’ve been discussing the situations in which SQL Server automatically reuses a plan, and the situation in which a plan may be reused inappropriately so that you need to force recompilation. However, there are also situations in which an existing plan will not be reused because of changes to the underlying objects or the execution environment. The reasons for these unexpected recompilations fall into one of two different categories, which we call correctness-based recompiles and optimality-based recompiles.

Correctness-Based Recompiles

SQL Server may choose to recompile a plan if it has a reason to suspect that the existing plan may no longer be correct. This can happen when there are explicit changes to the underlying objects, such as changing a datatype or dropping an index. Obviously, any existing plan that referenced the column assuming its former datatype, or that accessed data using the now nonexistent index, would not be correct. Correctness-based recompiles fall into two general categories: schema changes and environmental changes. The following changes mark an object’s schema as changed:

Adding or dropping columns to/from a table or view.
Adding or dropping constraints, defaults, or rules to or from a table.
Adding an index to a table or an indexed view.
Dropping an index defined on a table or an indexed view if the index is used by the plan.
Dropping a statistic defined on a table will cause a correctness-related recompilation of any query plans that use that table.
Adding or dropping a trigger from a table.

In addition, running the procedure sp_recompile on a table or view will change the modification date for the object. Which you can observe in the modify_date column in sys.objects. This will make SQL Server believe that a schema change has occurred so that recompilation will take place at the next execution of any stored procedure, function, or trigger that accesses the table or view. Running sp_recompile on a procedure, trigger, or function will clear all the plans for the executable object out of cache, to guarantee that the next time it is executed, it will be recompiled.

Other correctness-based recompiles are invoked when the environment changes by changing one of a list of SET options. Changes in certain SET options can cause a query to return different results, so when one of these values changes, SQL Server wants to make sure a plan is used that was created in a similar environment. SQL Server keeps track of which SET options are set when a plan is executed, and you have access to a bitmap of these SET options in the DMF called sys.dm_exec_plan_attributes. This function is called by passing in a plan handle value that you can obtain from the sys.dm_exec_cached_plans view and returns one row for each of a list of plan attributes. You’ll need to make sure you include plan_handle in the list of columns to be retrieved, not just the few columns we used earlier in the usecounts query. Here’s an example of retrieving all the plan attributes when we supply a plan_handle value. Table 5-1 shows the results.

SELECT * FROM sys.dm_exec_plan_attributes
(0x06001200CF0B831CB821AA05000000000000000000000000)

Later in the chapter, when we explore cache management and caching internals, you’ll learn about some of these values in which the meaning is not obvious. To get the attributes to be returned in a row along with each plan_handle, you can use the SQL Server 2005 PIVOT operator, and list each of the attributes that you want to turn into a column. In this next query, we want to retrieve the set_options, the object_id, and the sql_handle from the list of attributes.

SELECT plan_handle, pvt.set_options, pvt.object_id, pvt.sql_handle 
FROM (SELECT plan_handle, epa.attribute, epa.value
FROM sys.dm_exec_cached_plans
OUTER APPLY sys.dm_exec_plan_attributes(plan_handle) AS epa 
WHERE cacheobjtype = 'Compiled Plan'
 ) AS ecpa
PIVOT (MAX(ecpa.value) FOR ecpa.attribute
IN (“set_options”, “object_id”, “sql_handle”)) AS pvt;

We get a value of 4,347 for set_options which is equivalent to the bit string 1000011111011. To see which bit refers to which SET options, we could change one option and then see how the bits have changed. For example, if we clear the plan cache and change ANSI_NULLS to OFF, the set_options value will change to 4315, or binary 1000011011011. The difference is the 6th bit from the right, which has a value of 32, the difference between 4,347 and 4,315. If we didn’t clear the plan cache, we would end up with two plans for the same batch, one for each set_options value.

Table 5-1 Attributes Corresponding to a Particular plan_handle

Cc293623.table_C05621961_1(en-us,TechNet.10).jpg

Not all changes to SET options will cause a recompile, although many of them will. The following is a list of the SET options that will cause a recompile when changed:

ANSI_NULL_DFLT_OFF
ANSI_NULL_DFLT_ON
ANSI_NULLS
ANSI_PADDING
ANSI_WARNINGS
ARITHABORT
CONCAT_NULL_YIELDS_NULL
DATEFIRST
DATEFORMAT
FORCEPLAN
LANGUAGE
NO_BROWSETABLE
NUMERIC_ROUNDABORT
QUOTED_IDENTIFIER

Two of the SET options in the list above have a special behavior in relationship to objects, including stored procedures, functions, views, and triggers. The SET option settings for ANSI_NULLS and QUOTED_IDENTIFIER are actually saved along with the object definition and the procedure or function will always execute with the SET values as they were when the object was first created. You can determine what values these two SET options had for your objects by selecting from the OBJECTPROPERTY function, as shown:

SELECT OBJECTPROPERTY(object_id('<object name>'), 'ExecIsQuotedIdentOn'); 
SELECT OBJECTPROPERTY(object_id('<object name>'), 'ExecIsAnsiNullsOn');

A returned value of 0 means the SET option is OFF, a value of 1 means the option is ON, and a value of NULL means that you typed something incorrectly or that you don’t have appropriate permissions. However, even though changing the value of either of these options does not cause any difference in execution of the objects, SQL Server may still recompile the statement that accesses the object. The only objects for which RECOMPILE will be avoided is for cached plans with an objtype value of Proc, namely stored procedures, multistatement TVFs, and triggers. For these compiled objects, the usecounts query will show you the same plan being reused, and will not show additional plans with different set_options values. Inline TVFs and views will create new plans if these options are changed, and the set_options value will indicate a different bitmap. However, the behavior of the underlying SELECT statement will not change.

Optimality-Based Recompiles

SQL Server may also choose to recompile a plan if it has reason to suspect that the existing plan is no longer optimal. The primary reasons for suspecting a nonoptimal plan deal with changes to the underlying data. If any of the statistics used to generate the query plan have been updated since the plan was created, or if any of the statistics are considered stale, SQL Server will recompile the query plan.

Updated Statistics Statistics can be updated either manually or automatically. Manual updates happen when someone runs sp_updatestats or the UPDATE STATISTICS command.

Automatic updates happen when SQL Server determines that existing statistics are out of date, or stale, and these updates only happen when the database has the option AUTO_UPDATE_STATISTICS set to ON. This could happen if another batch had tried to use one of the same tables or indexes used in the current plan, detected the statistics were stale, and initiated an UPDATE STATISTICS operation.

Stale Statistics SQL Server will detect out-of-date statistics when it is first compiling a batch that has no plan in cache. It will also detect stale statistics for existing plans. Figure 5-1 shows a flowchart of the steps involved in finding an existing plan and checking to see if recompilation is required. You can see that SQL Server checks for stale statistics after first checking to see if there already are updated statistics available. If there are stale statistics, the statistics will be updated, and then a recompile will begin on the batch. If AUTO_UPDATE_STATISTICS_ASYNC is ON for the database, SQL Server will not wait for the update of statistics to complete, it will just recompile based on the stale statistics.

Statistics are considered to be stale if a sufficient number of modifications have occurred on the column supporting the statistics. Each table has a recompilation threshold, or RT, that determines how many changes can take place before any statistics on that table are marked as stale. The RT values for all of the tables referenced in a batch are stored with the query plans of that batch.

The RT values depend on the type of table, that is, whether it is permanent or temporary, and on the current number of rows in the table at the time a plan is compiled. The exact algorithms for determining the RT values are subject to change with each service pack, so I will show you the algorithm for the RTM release of SQL Server 2005. The formulas used in the various service packs will be similar to this, but are not guaranteed to be exactly the same. N indicates the cardinality of the table.

For both permanent and temporary tables, if N is less or equal to 500, the RT value is 500. This means that for a relatively small table, you must make at least 500 changes to trigger recompilation. For larger tables, at least 500 changes must be made, plus 20 percent of the number of rows.
For temporary tables, the algorithm is the same, with one exception. If the table is very small or empty (N is less than 6 prior to any data modification operations), all we need are 6 changes to trigger a recompile. This means that a procedure that creates a temporary table, which is empty when created, and then inserts 6 or more rows into that table, will have to be recompiled as soon as the temp table is accessed.
You can get around this frequent recompilation of batches that create temporary tables by using the KEEP PLAN query hint. Use of this hint changes the recompilation thresholds for temporary tables and makes them identical to those for permanent tables. So if changes to temporary tables are causing many recompilations, and you suspect that the recompilations are affecting overall system performance, you can use this hint and see if there is a performance improvement. The hint can be specified as shown in this query:

Cc293623.figure_C05621961_1(en-us,TechNet.10).jpg

Figure 5-1 Checking an existing plan to see if recompilation is necessary

SELECT <column list>
FROM dbo.PermTable A INNER JOIN #TempTable B ON A.col1 = B.col2 
WHERE <filter conditions> 
OPTION (KEEP PLAN)

For table variables, there is no RT value. This means that you will not get recompilations caused by changes in the number of rows in a table variable.

Modification Counters The RT values discussed above are the number of changes required for SQL Server to recognize that statistics are stale. In versions of SQL Server prior to SQL Server 2005, the sysindexes system table keeps track of the number of changes that had actually occurred in a table in a column called rowmodctr. These counters keep track of any changes in any row of the table or index, even if the change was to a column that was not involved in any index or useful statistics. SQL Server 2005 now uses a set of Column Modification Counters or colmodctr values, with a separate count being maintained for each column in a table, except for computed nonpersisted columns. These counters are not transactional, which means that if a transaction starts, inserts thousands of rows into a table, and then is rolled back, the changes to the modification counters will not be rolled back. Unlike the rowmodctr values in sysindexes, the colmodctr values are not visible to the user. They are only available internally to the query processor.

Tracking Changes to Tables and Indexed Views Using colmodctr Values The colmodctr values that SQL Server keeps track of are continually modified as the table data changes. Table 5-2 describes when and how the colmodctr values are modified based on changes to your data, including INSERT, UPDATE, DELETE, bulk insert, and TRUNCATE TABLE operations. Although we are only mentioning table modifications specifically, keep in mind the same colmodctr values are kept track of for indexed views.

Table 5-2 Factors Affecting Changes to the Internal colmodctr Values

Cc293623.table_C05621961_2(en-us,TechNet.10).jpg

Skipping the Recompilation Step

There are several situations in which SQL Server will bypass recompiling a statement for plan optimality reasons. These include:

When the plan is a trivial plan. A trivial plan is one for which there are no alternative plans, based on the tables referenced by the query, and the indexes (or lack of indexes) on those tables. In these cases, where there really is only one way to process a query, any recompilation would be a waste of resources, no matter how much the statistics had changed. Keep in mind that there is no assurance that a query will continue to have a trivial plan just because it originally had a trivial plan. If new indexes have been added since the query was last compiled, there may now be multiple possible ways to process the query.
If the query contains the OPTION hint KEEPFIXED PLAN, SQL Server will not recompile the plan for any optimality-related reasons.
If automatic updates of statistics for indexes and statistics defined on a table or indexed view are disabled, all plan optimality-related recompilations caused by those indexes or statistics will stop. Note Turning off the auto-statistics feature is usually not a good idea because the query optimizer will no longer be sensitive to data changes in those objects, and suboptimal query plans could easily result. You can consider using this technique only as a last resort after exhausting all of the other alternative ways to avoid recompilation. Make sure you thoroughly test your applications after changing the auto-statistics options to verify that you are not hurting performance in other areas.
If all of the tables referenced in the query are read-only, SQL Server will not recompile the plan.

Multiple Recompilations

In the previous discussion of unplanned recompilation, we primarily described situations in which a cached plan would be recompiled prior to execution. However, even if SQL Server decides it can reuse an existing plan, there may be cases where stale statistics or schema changes are discovered after the batch begins execution, and then a recompile will occur after execution starts. Each batch or stored procedure can contain multiple query plans, one for each optimizable statement. Before SQL Server begins executing any of the individual query plans, it checks for correctness and optimality of that plan. If one of the checks fails, the corresponding statement is compiled again, and a possibly different query plan is produced.

In some cases, query plans may be recompiled even if the plan for the batch was not cached. For example, if a batch contains a literal larger than 8 KB, it will never be cached. However, if this batch creates a temporary table, and then inserts multiple rows into that table, the insertion of the seventh row will cause a recompilation because of passing the recompilation threshold for temporary tables. Because of the large literal, the batch was not cached, but the currently executing plan needs to be recompiled.

In SQL Server 2000, when a batch was recompiled, all of the statements in the batch were recompiled, not just the one that initiated the recompilation. SQL Server 2005 introduces statement-level recompilation, which means that only the statement that causes the recompilation has a new plan created, not the entire batch. This means that SQL Server 2005 will spend less CPU time and memory during recompilations.

Removing Plans from Cache

In addition to needing to recompile a plan based on schema or statistics changes, SQL Server will need to compile plans for batches if all previous plans have been removed from the plan cache. Plans are removed from cache based on memory pressure, which we’ll talk about in the section on “Cache Management.” However, other operations can cause plans to be removed from cache. Some of these operations remove all the plans from a particular database, and others remove all the plans for the entire SQL Server instance.

The following operations flush the entire plan cache so that all batches submitted afterwards will need a fresh plan. Note that although some of these operations only affect a single database, the entire plan cache is cleared.

Detaching any database.
Upgrading any database to SQL Server 2005 (on SQL Server 2005 server).
Running the DBCC FREEPROCCACHE command.
Running the ALTER DATABASE . . . MODIFY FILEGROUP command for any database.
Modifying a collation for any database using ALTER DATABASE . . . COLLATE command.

The following operations clear all plans associated with a particular database and cause new compilations the next time a batch is executed.

Running the DBCC FLUSHPROCINDB command.
Altering a database with any of the following options:
1. ALTER DATABASE . . . MODIFY NAME=command
2. ALTER DATABASE . . . SET ONLINE command
3. ALTER DATABASE . . . SET OFFLINE command
4. ALTER DATABASE . . . SET EMERGENCY command
5. Dropping a DATABASE
6. When a database auto-closes

Note that there is no way to force SQL Server to remove just a single query plan from cache.

< Back Next >

Compartilhar via