SAP Applications leveraging SQL Server Automatic Update Statistics
Hi folks, new blog, new author. This blog will give information about running SAP applications on SQL Server. It will not only be me, Juergen, contributing; we will also have colleagues of mine contributing who either work in our IT department running an SAP landscape on SQL Server or work with SAP on SAP BI. The intent is to give you information on simply running SAP applications on the Windows/SQL Server platform. But let's start with the first article. Please note the corrections made to this article in October 2010 in regard to asynchronous update statistics.
Lately I spent a week in Japan at customer round tables. We got a lot of questions, which we will go through in subsequent articles here. One of the questions asked repeatedly was how well SQL Server's automatic update statistics works underneath SAP applications. There are a lot of recommendations and rumors out there about what to do. So let's go through the experience we collected in this area over the last nine years. We introduced automatic update statistics with SQL Server 7.0 and subsequently improved the feature in more recent releases. More details on how update statistics exactly works, what the different statistics are and how to read them can be found in:
https://www.microsoft.com/technet/prodtechnol/sql/2005/qrystats.mspx
With the introduction of SQL Server automatic update statistics, SAP applications relied on SQL Server to update statistics and to automatically create column statistics. From the SAP side, no additional maintenance functionality to schedule statistics updates has been delivered since SQL Server 7.0. The experience in thousands of productive SAP systems running on SQL Server was overall great. The basic experience is that the mechanisms of SQL Server automatic update statistics and automatic creation of statistics work very well for SAP applications.
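To see whether these mechanisms are actually switched on for a given SAP database, the database options can be checked directly. The small sketch below assumes 'PRD' as the database name, which is only a placeholder for your actual SAP database:

-- Check whether automatic create/update of statistics is enabled on the SAP database
SELECT name,
       is_auto_create_stats_on,
       is_auto_update_stats_on,
       is_auto_update_stats_async_on
FROM sys.databases
WHERE name = 'PRD'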
No Rule without exceptions
However, there usually is no rule without exceptions, and this also holds true for the automatic update of statistics. Reading through the whitepaper referred to above, the following restrictions of automatic update statistics become apparent:
· In order to trigger an update of statistics, 20% of the rows need to be modified
· The table needs to have at least 500 rows
For SAP ERP type workloads these restrictions usually cause hardly any problems. However, in SAP BW and SAP SCM it became visible that the second restriction had some impact on query performance. Analyzing the problem, it became apparent that a lot of the SAP dimension and sub-dimension tables simply never reach the threshold of 500 rows. Think about the 'time' dimension or the dimensions for 'countries' or 'sales regions'. Those tables will hardly ever grow as big as 500 rows. In SQL Server 2000 this resulted in not having any statistics on those tables at all. In SQL Server 2005 a first set of statistics is generated automatically in case there are no statistics yet (independent of how many rows are in the table). However, in both releases statistics will not get updated anymore before the number of rows goes beyond 500. The result of having no statistics or outdated statistics usually is query plans which are not optimal in their join order. This impacts the run time negatively; in some cases we saw run times increase by factors of 10 to 100. In order to avoid such problems with BW and SCM, logic got implemented in SAP BW/SCM which updates statistics manually after delta loads got uploaded into an InfoCube. Please check out OSS notes #542468, #849062 and #1009511, which describe what to implement and configure to avoid these problems. A small sketch of how to check and refresh statistics on such a table manually follows below.
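Just to illustrate what such a manual refresh boils down to on the SQL Server side, here is a minimal sketch. The table name [/BIC/DSALESCUBE1] is purely made up (it only follows the typical BW dimension table naming pattern); the supported way for BW/SCM remains the logic described in the OSS notes above.

-- Show when each statistic on the (hypothetical) dimension table was last refreshed
SELECT s.name AS statistic_name,
       STATS_DATE(s.object_id, s.stats_id) AS last_updated
FROM sys.stats AS s
WHERE s.object_id = OBJECT_ID('[/BIC/DSALESCUBE1]')

-- The table is tiny, so a full scan is cheap; refresh all its statistics completely
UPDATE STATISTICS [/BIC/DSALESCUBE1] WITH FULLSCAN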
Investigating cases in other SAP products
Nevertheless, a lot of myth got created around manually updating statistics helping to improve this or that case of a slow running query. Usually the improvement gets attributed to the wrong reason. Here is why. If you update statistics on a table manually, two things will happen:
· SQL Server indeed will update the statistics with the default sample (unless specified differently)
· SQL Server will also recompile all statements in the statement cache which reference the table the update of statistics was executed against.
Very often the fact that a query afterwards runs faster with a different query plan than before cannot be attributed to the update of the statistics, but to the fact that the new query plan got compiled with different parameter values, namely those submitted in the manual test case or the current operation. The parameter values the query originally got compiled with might have been completely different and hence produced a different query plan.
Therefore, when investigating problems, one needs to distinguish between these two cases before taking the step of updating statistics manually. The first step in finding out whether it really is an issue of outdated statistics is to execute the T-SQL command:
sp_recompile <table/view name or stored procedure name>
You only need to do this against one of the tables touched by the query or view. Then you run your scenario again. Or, if you have a SQL trace in ST05 with the long running statement, simply get an explain of the statement again. SQL Server will now generate a new plan with the parameters submitted. In the case of an ST05 SQL trace, the parameters will be the values of the scenario the trace was executed with. In the case of ST04 query statistics, they will be the parameter values of the slowest execution. If the statement now gets a different plan, then the issue is not around updating the statistics, but a case of vastly different sets of parameter values which can demand different access plans (we will discuss this in later publications).
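As a concrete, made-up example, assume the slow statement references the table VBAK. Marking the table with sp_recompile invalidates all cached plans referencing it, without touching any statistics:

-- Mark all cached plans referencing the table for recompilation; statistics stay untouched,
-- so the next execution only picks up the currently submitted parameter values
EXEC sp_recompile 'VBAK'

If the next execution or the ST05 explain now returns the better plan, a recompile with the current parameter values was all that was needed, and a manual update of statistics would have 'fixed' the problem only as a side effect of the recompile it triggers.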
In case you get the same non-optimal plan despite the fact that a better index is available, the next step is to check SQL Server's cardinality estimations for the query. The outcome will help you determine why SQL Server is not using the "ideal" index: a) SQL Server estimates a disadvantage in taking a non-clustered index because of the number of rows coming back, or b) the statistics are outdated. Investigating deeper in this direction is also something we will go through in a later publication.
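A simple way to look at the cardinality estimations is to compare estimated and actual row counts per operator. The statement, table and index names below are again only placeholders for the slow statement you took from ST05:

-- Returns the executed plan including EstimateRows vs. Rows for every operator;
-- large discrepancies point towards stale statistics or atypical parameter values
SET STATISTICS PROFILE ON
SELECT * FROM VBAK WHERE MANDT = '100' AND ERDAT = '20081001'
SET STATISTICS PROFILE OFF

-- The histogram behind a specific statistic or index can be inspected directly as well
DBCC SHOW_STATISTICS ('VBAK', 'VBAK~0')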
Synchronous vs. Asynchronous Update Statistics
With SQL Server 2005, asynchronous update statistics got introduced, as described in the whitepaper referred to earlier. This should be the default setting for all SAP products. The former recommendation not to use asynchronous update statistics for SAP products like SAP BW and SAP SCM got reverted; the issue we were afraid to encounter didn't hold true in real life. Therefore asynchronous update statistics as established in SQL Server 2005 is the default for all SAP products and can be used with all of them.
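Switching the asynchronous behavior on is a single database option. 'PRD' below is again just a placeholder for the SAP database name; note that AUTO_UPDATE_STATISTICS itself needs to stay ON for the asynchronous variant to have any effect.

-- Enable asynchronous automatic update statistics for the SAP database
ALTER DATABASE [PRD] SET AUTO_UPDATE_STATISTICS_ASYNC ON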
Another well-known exception
Another scenario which caused some confusion in the past is one where you are running a system with multiple clients (MANDT) and you just imported a small client which makes up only a small percentage of the volume of most tables (less than 20% of the rows). In such a case, SQL Server does not see any reason to update statistics automatically, since in most of the tables the changes probably stayed below 20%. What can happen now when you operate against the newly imported small client is that SQL Server will tend to prefer the clustered index on tables where non-clustered indexes would be preferable. Due to the fact that query execution plans are cached in the statement cache, the unfavorable plan also impacts the users of the other clients in your SAP system. This means that after client copies within a system it can make sense to run a complete update statistics over the whole database as well. This can be done with the T-SQL command 'sap_updatestats forcesample'.
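In case the SAP-delivered procedure should not be available on a system, the closest generic equivalent (shown only as an illustration, not as the SAP-recommended route) is the standard SQL Server procedure sp_updatestats, which walks all user tables and refreshes the statistics that recorded row modifications:

-- Run in the SAP database after the client copy; 'PRD' is a placeholder for the database name
USE [PRD]
EXEC sp_updatestats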