UPDATE STATISTICS undocumented options

Artikel
07/21/2006

If you read the Books Online page describing the UPDATE STATISTICS command, you will see that there are some undocumented options.

UPDATE STATISTICS table | view
[
{
{ index | statistics_name }
| ( { index |statistics_name } [ ,...n ] )
}
]
[ WITH
[
[ FULLSCAN ]
| SAMPLE number { PERCENT | ROWS } ]
| RESAMPLE
| <update_stats_stream_option> [ ,...n ]
]
[ [ , ] [ ALL | COLUMNS | INDEX ]
[ [ , ] NORECOMPUTE ]
] ;

<update_stats_stream_option> ::=
[ STATS_STREAM = stats_stream ]
[ ROWCOUNT = numeric_constant ]
[ PAGECOUNT = numeric contant ]

<update_stats_stream_option>

This syntax is for internal use only and is not supported. Microsoft reserves the right to change this syntax at any time.

There is a very good reason why these options are undocumented. They are meant for testing and debugging purposes, and should never ever be used on production systems.

However, these options can also be extremely helpful to create examples and sample scripts to demonstrate various features and behaviors of the Query Optimizer and the related query plan shapes. We decided to explain what ROWCOUNT and PAGECOUNT do, so that to be able to use these commands in examples we’ll be posting in some future. Feel free to use these options on development systems for experimental and educational purposes, but please do not play with fire and do not use them in production!

As the name of these options suggest, ROWCOUNT and PAGECOUNT alter the internal metadata of the specified table or index by overriding the counters containing the row and page counts of the object. These counters are in turn read by the Query Optimizer when processing queries that access the table and/or index in question. These commands can basically cheat the Optimizer into thinking that a table or index is extremely large.

SQL Server’s Query Optimization process is structured in multiple stages. Further optimization stages consider a progressively larger and more sophisticated set of possible tree transformations and query optimizations. Later stages of optimization are only entered when the estimated cost of the query is sufficiently high, in order to avoid wasting precious CPU cycles against simple queries that do not need that level of sophistication anyway. The multiple optimization stages are a mean to produce efficient query plans without consuming excessive amounts of CPU. Typically, in order to make the Optimizer “think” a lot and enter these later stages it is necessary to have big tables with a large number of rows and pages, which in turn take time and space to populate. Using ROWCOUNT and PAGECOUNT allows us to exercise these code paths with relatively simple scripts that do not require an extremely complex setup phase.

Here is an example. When running this simple script on your SQL 2005 instance you will likely see a different query plan for the two selects before and after updating the statistics. The recompile option is used to ensure that the query plans are regenerated. From the statistics profile, you’ll also see very different estimated row counts and consequently costs.

use tempdb

create table t1(i int, j int)

create table t2(h int, k int)

set statistics profile on

select distinct(i) from t1

select * from t1, t2 where i = k order by j + k

update statistics t1 with rowcount = 10000, pagecount = 10000

update statistics t2 with rowcount = 100000, pagecount = 100000

select distinct(i) from t1 option (recompile)

select * from t1, t2 where i = k order by j + k option (recompile)

Enjoy!

Stefano

Comments

Anonymous
July 24, 2006
These kind of posts are exactly why I love blogs. This could save us hours of data generation effort. Most of the DBA's here tend to 'forget' testing with a realistic amount of data.

DTA (ITW) also allows you to fool SQL Server into having more rows, is this based on the same principle?
Anonymous
July 25, 2006
Hi Wesley - thank you for the interest and feedback.

I believe that DTA uses these same interfaces, but am not 100% sure.

About testing with realistic amounts of data: i want to be perfectly clear about the fact that UPDATE STATISTICS WITH ROWCOUNT and PAGECOUNT will only fool the Optimizer into believing that the table is actually bigger (or smaller for that matter) than in reality. However, the content of the actual tables and indexes will remain perfectly intact. So while it is true that these commands will let you see what kind of query plans the Optimizer would generate against larger amounts of data, actually running these plans would not provide particularly interesting information. All query plans will complete extremely quickly when run against empty or very small tables.

What you can do, however, is to look at the progression of the estimated query cost based on the size of the tables. If the cost grows significantly with the size – i.e. the query performances do not seem to scale with the amount of data - it might be a hint that maybe some tuning is required.

Stefano
Anonymous
July 25, 2006
Thanks for the extra explanation.

I do understand that nothing can compete with realistic data. But I'd rather see people try some first optimizations with artificial statistics than not optimizing at all because they have no time to generate test data, especially since I'm typically the one who has to solve it once it goes to production :-)

I still love the black art of query tuning though ;-)

Wesley
Anonymous
July 30, 2006
The comment has been removed
Anonymous
July 31, 2006
Unfortunately, at the moment there is no way to change the statistics for internal hidden tables like those used for Full Text Search. This is a limitation we should definitely attempt to remove in some future release.

Thanks you for the great feedback!
Stefano
Anonymous
November 30, 2006
PingBack from http://www.dosql.com/blog/?p=87
Anonymous
March 09, 2007
these are undocumented for a reason as they will affect performance and you will break your sql production
Anonymous
March 12, 2007
This is very cool - an undocumented option to UPDATE STATISTICS that allows you to fool the server into
Anonymous
March 17, 2007
In SQL Server designing efficient indexes is paramount to achieving good database and application performance.
Anonymous
June 09, 2009
PingBack from http://quickdietsite.info/story.php?id=4048
Anonymous
June 15, 2009
PingBack from http://edebtsettlementprogram.info/story.php?id=24748
Anonymous
June 18, 2009
PingBack from http://thestoragebench.info/story.php?id=9514

Freigeben über

UPDATE STATISTICS undocumented options

Comments

Zusätzliche Ressourcen