ABAP Select For All Entries - what is happening in SQL Server?
One of the query types one gets confronted with when running SAP Netweaver based applications is a special form of Open SQL statement in ABAP called 'For All Entries' (FAE). Standard ANSI SQL does not contain a SELECT FOR ALL ENTRIES command, and therefore this command can be translated in a number of different ways into native, database specific SQL by the Database Dependent Shared Library (DBSL). But even within the port to one specific DBMS like SQL Server, we changed the way such a FAE statement is generated several times. The last time we introduced a major change was when moving from the 6.40 kernel releases to the 7.00 kernel releases.
How does a FAE statement look in ABAP?
A typical FAE Statement in ABAP would look like:
SELECT (t_fsel_makt) FROM makt
INTO CORRESPONDING FIELDS OF TABLE yanlb
FOR ALL ENTRIES IN obj
WHERE matnr = obj-matnr
AND spras = obj-spras.
Compared to a non-FAE statement, the important differences are the FOR ALL ENTRIES IN clause and the references to the internal table in the Where clause. In principle the FAE statement works in a pretty simple way: values for parts of the Where clause of the statement (in the example above the matnr and spras columns) are taken out of an internal table (in this case 'obj'). This internal table has an arbitrary number of rows which usually result from another query against the database. Hence the number of rows in that internal table and the values are determined at runtime and cannot be foreseen. Take as an example an internal table 'obj' with 4 rows, containing the matnr values SD000000 through SD000003, each paired with the spras value 'D' (the values used in the queries further down).
Now the task is to query the database in a way that the Where clause checks for the value of the client (added automatically) and the 4 pairs of matnr and spras values listed above.
Translation into Database Queries
The way the SQL Server specific porting layer of SAP generates such FAE statements is excellently documented in SAP OSS note #652634. At first we took the simplest solution of batching single statements: for each row of the internal table, one database query was created. Those queries were batched up and sent to SQL Server as one batch of statements. Since these internal tables can contain thousands of rows, SAP introduced profile parameters to limit the batch size and avoid overly large statements. Profile parameters like:
- rsdb/max_blocking_factor (current default of 50 with SQL Server)
- rsdb/min_blocking_factor (current default of 5 with SQL Server)
- rsdb/max_in_blocking_factor (current default of 255 with SQL Server)
- rsdb/min_in_blocking_factor (current default of 10 with SQL Server)
- rsdb/prefer_in_itab_opt (current default of 0 with SQL Server)
- rsdb/prefer_fix_blocking (current default of 0 with SQL Server)
became common as parameters to shape the number of rows of these internal tables that go into each batch sent to the DBMS, and to restrict the number of different statement variants resulting from the varying number of rows in the internal table. The first parameter is especially notable since it defines that we try to use the column values of 50 rows of the internal ABAP table to create one single SQL Server query. If there are more than 50 rows in the internal table, a second query will be executed taking the column values of the next 50 rows, and so on. If fewer than 50 rows remain, those remaining column values will be used for a last statement; an internal table with 120 rows, for example, results in three statements covering 50, 50 and 20 rows. With the current SQL Server defaults, no IN-clause is used in the Where clause, not even in the case where the internal ABAP table is only one column wide.
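For illustration only, here is a minimal sketch of what such an IN-clause variant could look like if it were preferred (via rsdb/prefer_in_itab_opt) for a FAE statement that takes only MATNR values from the internal table; the column list, the client value '100' and the exact statement text are assumptions and not necessarily what the DBSL generates:
-- Hedged sketch of the IN-clause variant for a single-column FAE statement.
-- Column list, client value '100' and exact form are assumptions for illustration.
SELECT MANDT, MATNR, SPRAS, MAKTX
FROM MAKT
WHERE MANDT = '100'
  AND SPRAS = 'D'
  AND MATNR IN ('SD000000', 'SD000001', 'SD000002', 'SD000003')
With the defaults listed above, such an IN list would contain the values of at most 255 rows (rsdb/max_in_blocking_factor) per statement.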
So the first solution, referred to as the FAE Batch method, would simply have constructed four SQL statements, each specifying the MANDT, MATNR and SPRAS values of one of the rows of the internal table. In the example above, it would have looked like:
SELECT <column list> FROM MAKT WHERE MANDT = … AND MATNR = 'SD000000' AND SPRAS = 'D'
SELECT <column list> FROM MAKT WHERE MANDT = … AND MATNR = 'SD000001' AND SPRAS = 'D'
SELECT <column list> FROM MAKT WHERE MANDT = … AND MATNR = 'SD000002' AND SPRAS = 'D'
SELECT <column list> FROM MAKT WHERE MANDT = … AND MATNR = 'SD000003' AND SPRAS = 'D'
GO
Please note that these SQL queries are meant for illustration and as such might not pass a syntax check. They are also not exactly in the form SAP issues them, since SAP issues parameterized queries. Hence you would not see the parameter values in the queries; instead you would see variables like @P0, @P1, etc. replacing the values in the SQL text when you use SQL Server Profiler to trace the queries.
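To illustrate this, here is a hedged sketch of roughly how one of the batched statements might show up in a SQL Server Profiler trace; the column list, the parameter data types and the client value are assumptions for illustration only, the real statement text generated by the DBSL can differ:
-- Hedged sketch: roughly how one of the batched FAE statements might appear
-- in a SQL Server Profiler trace. Parameter names/types and the client value
-- are assumptions, not the exact statement text SAP generates.
exec sp_executesql
  N'SELECT "MANDT","MATNR","SPRAS","MAKTX" FROM "MAKT" WHERE "MANDT" = @P0 AND "MATNR" = @P1 AND "SPRAS" = @P2',
  N'@P0 nvarchar(3), @P1 nvarchar(18), @P2 nvarchar(1)',
  @P0 = N'100', @P1 = N'SD000000', @P2 = N'D'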
Over the years we retired this way of executing FAE statements in favor of a more efficient type of query, the one which OSS note #652634 names the 'UNION ALL' method. A slight change only, but one that resulted in more efficient execution in terms of CPU resource consumption, while keeping the query complexity moderate. The query issued against SQL Server for the example above would look like:
SELECT <column list> FROM MAKT WHERE MANDT = … AND MATNR = 'SD000000' AND SPRAS = 'D'
UNION ALL
SELECT <column list> FROM MAKT WHERE MANDT = … AND MATNR = 'SD000001' AND SPRAS = 'D'
UNION ALL
SELECT <column list> FROM MAKT WHERE MANDT = … AND MATNR = 'SD000002' AND SPRAS = 'D'
UNION ALL
SELECT <column list> FROM MAKT WHERE MANDT = … AND MATNR = 'SD000003' AND SPRAS = 'D'
GO
From the SQL Server side it is now one query instead of 4 queries, and from a query optimizer point of view the query parts are on the same level of complexity as before. This way of executing a FAE statement was used as the default for quite a few SAP kernel releases, including all the 6.xx SAP kernel releases.
For the 7.00 SAP kernels, the default was changed to a method referenced in OSS note #652634 as 'FAE JOIN'. This new method promised even more efficient execution of FAE statements on the database side, however at the cost that the query resulting from the ABAP statement became quite a bit more complex. Also, forming the SQL statement for this new method was moved out of the database dependent part into the database independent part of the database access layer in the ABAP stack.
The generated SQL code for this new method would look like:
SELECT <column_list> FROM MAKT T_00, (
SELECT "C_01" = 'SD000000', "C_02" = 'D'
UNION
SELECT 'SD000001', 'D'
UNION
SELECT 'SD000002', 'D'
UNION
SELECT 'SD000003', 'D'
) T_01
WHERE T_00."MANDT" = … AND T_00."MATNR" = T_01.C_01 AND T_00."SPRAS" = T_01.C_02
What we are basically looking at is one statement where the column values of the Where clause are joined to the queried table. Using SAP's SQL Explain functionality, a good query plan resulting from such a statement looks like:
You get a few merge joins and constant scans ordered in the way displayed above.
In SQL Server Management Studio, the plan would look like:
Back in the SAP Statement Analysis, we can double click on the last line which is the clustered index seek against MAKT. We then get a display like:
Looking at the right-hand side, it tells us that a Clustered Index Seek is performed checking the values of MANDT, MATNR and SPRAS. You can do the same thing in SSMS by moving the mouse over the lowest left node, which also represents the access to the clustered index. As seek columns (predicates) it would also show those columns, as seen here:
So what is executed essentially represents nothing more than a few seeks into the clustered index where all the values of the Where clause are evaluated as we look through the index pages. Only rows which match the value combinations of the complete Where clause are read. One could not do any better.
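If you want to verify this yourself, the following is a minimal sketch of how one could reproduce the FAE JOIN pattern against MAKT directly in SSMS and capture the actual plan; the column list, the client value '100' and the use of SET STATISTICS XML are assumptions for illustration, not the exact statement the SAP kernel generates:
-- Hedged sketch: reproducing the FAE JOIN pattern against MAKT in SSMS and
-- capturing the actual plan so the seek predicates of the Clustered Index Seek
-- can be inspected. Column list and client value '100' are assumptions.
SET STATISTICS XML ON;

SELECT T_00.MANDT, T_00.MATNR, T_00.SPRAS, T_00.MAKTX
FROM MAKT T_00,
     ( SELECT "C_01" = 'SD000000', "C_02" = 'D'
       UNION SELECT 'SD000001', 'D'
       UNION SELECT 'SD000002', 'D'
       UNION SELECT 'SD000003', 'D' ) T_01
WHERE T_00."MANDT" = '100'
  AND T_00."MATNR" = T_01.C_01
  AND T_00."SPRAS" = T_01.C_02;

SET STATISTICS XML OFF;
Hovering over the Clustered Index Seek in the resulting actual plan should again show MANDT, MATNR and SPRAS as seek predicates.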
Things which could go wrong.
In the case above we looked at a plan where only a few rows got handled. We also looked at a data distribution over the column MATNR which was very even and didn't have any values occurring millions of times while others showed up only a few times or not at all. Hence it was an easy task for SQL Server to find an efficient plan for the query.
Things can change dramatically when columns have a very uneven distribution of their values. Uneven distribution of data is quite common in some SAP columns. A typical example would be BUKRS, where the vast majority (~95%) of the sales orders come from two or three offices, while the remaining rows in the table are sales orders from possibly 10-20 small offices, each accounting for less than 1% of the total rows. SQL Server can then suddenly choose a very different plan which is not as efficient, and especially in the case where that plan gets reused it can hurt badly.
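To get a feeling for how skewed such a column really is, a minimal sketch like the following can be run against the affected table; ANLB and its BUKRS column are taken from the customer example that follows, and this is of course not an SAP-delivered check:
-- Hedged sketch: checking how unevenly the values of BUKRS are distributed in ANLB.
-- Table and column names are taken from the customer example below.
SELECT MANDT, BUKRS, COUNT(*) AS row_count
FROM ANLB
GROUP BY MANDT, BUKRS
ORDER BY row_count DESC;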
A less optimal plan which came from a customer query against the table ANLB is shown below:
In this case we seem to access the clustered index of the table ANLB with a Where clause specifying MANDT, BUKRS, ANLN1 and ANLN2. These 4 columns represent the first 4 columns of the clustered index on ANLB and in combination are very selective, so they are good fields for the Where clause. However, the disturbing part of the plan is that it is using a Filter together with the Constant Scan. The expectation would have been to get the same plan structure as shown before with MAKT, where we use the clustered index to evaluate all values of the columns in the Where clause, and not a separate Filter which evaluates values after a row has been read from a data page.
When moving the mouse over one of those filter symbols, we get this:
This means the columns ANLN1 and ANLN2 are filtered after the rows have been looked up. Moving the mouse over the node in the plan which represents the clustered index seek, it looks like:
Looking at the columns of the index used for eliminating rows, it turns out that we only use MANDT and BUKRS, which certainly are not very selective. This means that in the end, for each row in the internal ABAP table, we compare the MANDT and BUKRS values against the clustered index, read all the rows which fit those MANDT and BUKRS values, and only then check whether the values of the ANLN1 and ANLN2 columns match our Where condition. Despite the fact that these columns are #3 and #4 in the clustered index ANLB~0, we are not evaluating the values specified in the Where clause against the index, but against the rows already read. Not a good plan.
As mentioned earlier, the statement we call FAE_JOIN is more complex than a number of relatively simple statements combined with a UNION ALL operator. Hence SQL Server will evaluate very different plan options, which are more sensitive to extreme data distributions. As a result, plans like the one shown above might suddenly occur.
Several customers reported issues like this after moving to a 7.0x based SAP ERP. One area which seems to be very sensitive to this are FAE queries against the tables ANLA, ANLB, etc.
Also, the problem does not always express itself in exactly the same type of query plan as described above, but often in similar ways: either an index is not fully used, despite the fact that all the columns in the Where clause are restricted with '=', or the plan simply evaluates the client value on the index and then evaluates all other columns after the rows have already been read.
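If you suspect this kind of problem in your own system, the following is a minimal sketch of how one might locate the affected statements in the SQL Server plan cache; the LIKE filter on ANLB and the TOP 20 limit are arbitrary illustrative choices, not an SAP-provided procedure:
-- Hedged sketch: locating expensive cached statements that touch ANLB,
-- e.g. FAE JOIN statements running with an inefficient plan.
SELECT TOP 20
       qs.execution_count,
       qs.total_logical_reads,
       qs.total_logical_reads / qs.execution_count AS avg_logical_reads,
       st.text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE '%ANLB%'
ORDER BY qs.total_logical_reads DESC;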
Solutions
There are three possible solutions to this.
Query specific solution 1:
As documented in SAP OSS note #652634, there is a query hint which reverses the resolution of a FAE query from FAE_JOIN back to the UNION ALL method, where the simple statements are combined with UNION ALL. This hint looks like:
%_HINTS MSSQLNT '&prefer_join 0&'
How to apply query hints in ABAP is covered in several OSS notes which we listed in an earlier blog article: https://blogs.msdn.com/b/saponsqlserver/archive/2011/08/31/how-to-integrate-sql-server-specific-hints-in-abap.aspx which also includes more detailed explanations around query hints in ABAP.
Query specific solution 2:
Another possibility is to override the max_blocking_factor default setting (explained at the beginning) with a parameter in the instance profile. This would change the setting system wide. However, one can also change the max_blocking_factor for one specific query with an ABAP query hint, which would look like:
%_HINTS MSSQLNT '&max_blocking_factor 5&'
The number at the end of the hint (5 in this example) would be the new blocking factor for this specific query. We usually had good experiences taking the max_blocking_factor down into the range of 5-15.
System wide method:
The system wide method would be to revert the whole SAP system back to the default behavior of the older releases (the UNION ALL method) by setting this parameter in the instance profiles of the different instances:
rsdb/prefer_join = 0
This is how several customers, including Microsoft, resolved the instability caused by the new way of issuing FAE queries. Simply setting this parameter disables the usage of the FAE_JOIN method throughout the system and results in a fallback to the older method, which in the past proved less sensitive to skewed data distributions.
Conclusion
- FAE queries which perform slowly after upgrading from a pre-7.xx SAP release to a 7.xx SAP Netweaver release usually behave differently because a new default for creating such queries was introduced with 7.00. This problem can be resolved by returning to the old behavior with the system wide solution.
- Individual slow FAE queries with a less optimal query plan can be resolved with query specific hints, either by returning to the old behavior on a per-query basis or by reducing the default blocking factor.
We wish you all a great start into the new year and a successful 2012.