ABAP Select For All Entries what is happening in SQL Server
One of the query types one gets confronted with when running SAP Netweaver based
applications is a special form of Open SQL statement in ABAP which is called ‘For All Entries’
(FAE). Standard ANSI SQL does not contain the command SELECT FOR ALL ENTRIES, and
therefore this command can be translated in a number of different ways into native, database
specific SQL by the Database Dependent Shared Library (DBSL). But even within the port to
one specific DBMS like SQL Server, the way such a FAE statement is generated has changed
several times. The last major change was introduced when moving from the 6.40 kernel
releases to the 7.00 kernel releases.
How does a FAE statement look in ABAP?
A typical FAE Statement in ABAP would look like:
SELECT (t_fsel_makt) FROM makt
INTO CORRESPONDING FIELDS OF TABLE yanlb
FOR ALL ENTRIES IN obj
WHERE matnr = obj-matnr
AND spras = obj-spras.
Compared to a non-FAE statement, the important differences are the FOR ALL ENTRIES
clause and the references to the internal table in the Where clause. In principle the FAE
statement works quite simply. Values for parts of the Where clause of the statement (in the
example above the matnr and spras columns) are taken out of an internal table (in this case
‘obj’). This internal table has an arbitrary number of rows, usually the result of another query
against the database. Hence the number of rows in that internal table and their values are
determined at runtime and are not foreseeable. Take this example of 4 rows:

MATNR      SPRAS
SD000000   D
SD000001   D
SD000002   D
SD000003   D
Now the task is to query the database in a way that the Where clause checks for the value
of the client (automatically added) and the 4 pairs of matnr and spras values shown above.
Translation into Database Queries
The way the SQL Server specific porting layer of SAP solved the generation of such
FAE statements is documented in detail in SAP OSS note #652634. At first we took the
easiest solution of batching single statements: for each row of the internal table, one
database query was created. Those queries were batched up and sent to SQL Server as one
batch of statements. In order not to generate overly large statements, since those internal
tables can contain thousands of rows, SAP introduced profile parameters that define a batch factor.
Profile parameters like:
rsdb/max_blocking_factor (current default of 50 with SQL Server)
rsdb/min_blocking_factor (current default of 5 with SQL Server)
rsdb/max_in_blocking_factor (current default of 255 with SQL Server)
rsdb/min_in_blocking_factor (current default of 10 with SQL Server)
rsdb/prefer_in_itab_opt (current default of 0 with SQL Server)
rsdb/prefer_fix_blocking (current default of 0 with SQL Server)
became common as parameters to shape the number of rows of the internal table that go into
each batch sent to the DBMS, and also to restrict the number of different statements resulting
from the varying number of rows in the internal table. The first parameter is especially
notable: it defines that we try to use the column values of 50 rows of the internal ABAP
table to create one single SQL Server query. If there are more than 50 rows in the internal
table, a second query is executed taking the column values out of the next 50 rows, and
so on. If fewer than 50 rows remain, those remaining column values are used for a last
statement. With the current defaults for SQL Server, no IN-clause is used in the Where
clause even when the internal ABAP table is only one column wide.
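The splitting described above can be sketched in a few lines. This is an illustrative model only, not SAP's actual DBSL code; the function name `split_into_batches` is invented:

```python
# Sketch of how an internal table's rows could be split into batches,
# assuming a max blocking factor of 50 (the SQL Server default named above).
def split_into_batches(rows, max_blocking_factor=50):
    """Yield successive slices of at most max_blocking_factor rows."""
    for start in range(0, len(rows), max_blocking_factor):
        yield rows[start:start + max_blocking_factor]

# 120 internal-table rows -> three queries built from 50, 50 and 20 rows.
batches = list(split_into_batches(list(range(120))))
print([len(b) for b in batches])  # [50, 50, 20]
```

The last, smaller batch is why the min_blocking_factor parameters exist: they keep the number of distinct statement shapes (and thus cached plans) from growing with every possible remainder size.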
So the first solution, referred to as the FAE Batch method, would simply have constructed
four SQL statements, each specifying the MANDT, MATNR and SPRAS values taken from one
of the rows of the internal table. In the example above, it would have looked like:
SELECT <column list> FROM MAKT WHERE MANDT = …. AND MATNR = 'SD000000' AND SPRAS = 'D'
SELECT <column list> FROM MAKT WHERE MANDT = …. AND MATNR = 'SD000001' AND SPRAS = 'D'
SELECT <column list> FROM MAKT WHERE MANDT = …. AND MATNR = 'SD000002' AND SPRAS = 'D'
SELECT <column list> FROM MAKT WHERE MANDT = …. AND MATNR = 'SD000003' AND SPRAS = 'D'
GO
Please note that these SQL queries are meant for illustration and as such might not pass a
syntax check. They are also not exactly in the form SAP issues them, since SAP issues
parameterized queries. Hence you would not see the parameter values in the queries;
instead you would see variables like @P0, @P1, etc. replacing the values in the SQL text
when you use SQL Server Profiler to trace the queries.
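As a sketch of how such a parameterized batch could be assembled (illustrative only, not SAP's actual DBSL code; the function name `build_fae_batch` and the @MANDT placeholder are invented for this example):

```python
# FAE Batch method sketch: one parameterized SELECT per internal-table row,
# concatenated into a single batch with @P0, @P1, ... placeholders, roughly
# as a profiler trace would show them.
def build_fae_batch(keys):
    """keys: list of (matnr, spras) tuples taken from the internal table."""
    statements, params = [], []
    for matnr, spras in keys:
        p0, p1 = f"@P{len(params)}", f"@P{len(params) + 1}"
        statements.append(
            f"SELECT <column list> FROM MAKT "
            f"WHERE MANDT = @MANDT AND MATNR = {p0} AND SPRAS = {p1}"
        )
        params.extend([matnr, spras])
    return "\n".join(statements), params

sql, params = build_fae_batch([("SD000000", "D"), ("SD000001", "D")])
```

Because the values travel separately as parameters, two batches built from different internal-table contents (but the same row count) share one SQL text and therefore one cached plan.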
Over the years we retired this way of executing FAE statements in favor of a more efficient
type of query, the one that OSS note #652634 names the ‘UNION ALL’ method. It is only a
slight change, but one which resulted in a more efficient execution in terms of CPU resource
consumption, while keeping the query complexity moderate. The query issued against SQL
Server for the example above would look like:
SELECT <column list> FROM MAKT WHERE MANDT = …. AND MATNR = 'SD000000' AND SPRAS = 'D'
UNION ALL
SELECT <column list> FROM MAKT WHERE MANDT = …. AND MATNR = 'SD000001' AND SPRAS = 'D'
UNION ALL
SELECT <column list> FROM MAKT WHERE MANDT = …. AND MATNR = 'SD000002' AND SPRAS = 'D'
UNION ALL
SELECT <column list> FROM MAKT WHERE MANDT = …. AND MATNR = 'SD000003' AND SPRAS = 'D'
GO
From the SQL Server side it is now one query instead of 4 queries, and from the query
optimizer's point of view, the queries are on the same level of complexity as before. This way
of executing a FAE statement was used as the default for quite a few releases of SAP kernels,
including all the 6.xx SAP kernel releases.
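The construction of the UNION ALL form differs from the batch form only in the glue between the per-row SELECTs. A sketch under the same assumptions as before (invented helper name, parameter markers as a profiler would show them):

```python
# UNION ALL method sketch: the same per-row parameterized SELECTs, but
# joined into one statement so SQL Server sees a single query.
def build_union_all(keys):
    selects = [
        f"SELECT <column list> FROM MAKT "
        f"WHERE MANDT = @MANDT AND MATNR = @P{2 * i} AND SPRAS = @P{2 * i + 1}"
        for i in range(len(keys))
    ]
    params = [value for key in keys for value in key]
    return "\nUNION ALL\n".join(selects), params

keys = [("SD00000%d" % i, "D") for i in range(4)]
sql, params = build_union_all(keys)
print(sql.count("UNION ALL"))  # 3
```

UNION ALL (rather than UNION) is the right operator here: the branches select disjoint key values, so no duplicate elimination is needed and the sort it would require is avoided.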
For the 7.00 SAP kernels, the default was changed to a method referenced in OSS note
#652634 as ‘FAE JOIN’. This new method promised even more efficient execution of FAE
statements on the database side, however at the cost that the query resulting from the
ABAP statement became quite a bit more complex. Also, forming the SQL statement for this
new method was moved out of the database dependent part into the database independent
part of the database access layer in the ABAP stack.
The generated SQL Code for this new method would look like:
SELECT <column_list> FROM MAKT T_00, (
SELECT "C_01" = 'SD000000', "C_02" = 'D'
UNION
SELECT 'SD000001', 'D'
UNION
SELECT 'SD000002', 'D'
UNION
SELECT 'SD000003', 'D'
) T_01
WHERE T_00."MANDT" = … AND T_00."MATNR" = T_01.C_01 AND T_00."SPRAS" = T_01.C_02
What we are basically looking at is one statement where the columns from the Where clause
are added via a join to the queried table. Using SAP's SQL Explain functionality, a good
query plan resulting from such a statement shows a few merge joins over constant scans;
in SQL Server Management Studio the plan looks equivalent.
Back in the SAP Statement Analysis, we can double-click on the last line, which is the
clustered index seek against MAKT. The resulting display tells us that a Clustered Index
Seek is performed, checking the values for MANDT, MATNR and SPRAS. You can see the same
thing in SSMS by moving the mouse over the lowest left node, which also represents the
access to the clustered index; as seek columns (predicates) it would likewise show those
three columns.
So what is executed essentially represents nothing more than a few seeks into the clustered
index, where all the values in the Where clause are evaluated as we move through the index
pages. Only rows which match the value combinations of the complete Where clause are
read. One could not do better.
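The FAE_JOIN statement shown earlier can also be sketched as a generator, under the same assumptions as the earlier sketches (invented helper name, parameter markers instead of literal values):

```python
# FAE JOIN method sketch: the internal-table values become a derived table
# (a UNION of constant SELECTs) that is joined to the base table.
def build_fae_join(keys):
    """keys: list of (matnr, spras) tuples from the internal table."""
    # First branch names the derived-table columns; the rest just list values.
    rows = ['SELECT "C_01" = @P0, "C_02" = @P1']
    rows += [f"SELECT @P{2 * i}, @P{2 * i + 1}" for i in range(1, len(keys))]
    sql = (
        "SELECT <column_list> FROM MAKT T_00, (\n"
        + "\nUNION\n".join(rows)
        + "\n) T_01\n"
        'WHERE T_00."MANDT" = @MANDT '
        'AND T_00."MATNR" = T_01.C_01 AND T_00."SPRAS" = T_01.C_02'
    )
    params = [value for key in keys for value in key]
    return sql, params

sql, params = build_fae_join([("SD00000%d" % i, "D") for i in range(4)])
print(sql.count("UNION"))  # 3
```

Note the plain UNION here, matching the generated statement above: FOR ALL ENTRIES semantics require duplicate key combinations in the internal table to be collapsed, and the derived table is where that deduplication happens.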
Things that could go wrong
In the case above we looked at a plan where only a few rows were handled. We also looked at
a data distribution over the column MATNR which was very even and didn't have values
occurring millions of times while others showed up only a few times or not at all. Hence it
was an easy task for SQL Server to find an efficient plan for the query.
Things can change dramatically when columns have a very uneven distribution of their
values. Uneven distribution of data is quite common in some SAP columns. A typical
example would be BUKRS, where the vast majority (~95%) of sales orders come from two or
three sales offices, while the remaining 5% of the rows in a table are sales orders from
possibly 10-20 small sales offices, each accounting for less than 1% of the total rows.
SQL Server can then suddenly take a very different plan which is not as efficient, and
especially when that plan is re-used, it can hurt badly.
A less optimal plan of this kind came from a customer query against the table ANLB.
In this case we seem to access the clustered index of the table ANLB with a Where clause
specifying MANDT, BUKRS, ANLN1 and ANLN2. These 4 columns represent the first 4
columns of the clustered index on ANLB and in combination are very selective, so they are
good fields for the Where clause. However, the disturbing factor of the plan is that it uses a
Filter together with the Constant Scan. The expectation would have been to get the same
plan structure as shown before with MAKT, where we would use the clustered index to
evaluate all values of the columns in the Where clause, and not a separate Filter which
evaluates values after a row has been read from a data page.
Moving the mouse over one of those Filter symbols shows that the columns ANLN1 and
ANLN2 are filtered after the rows are looked up. Moving the mouse over the node in the plan
which characterizes the clustered index seek reveals the index columns used for eliminating
rows: only MANDT and BUKRS, which certainly are not very selective. This means that in the
end, for each row in the internal ABAP table, we compare the MANDT and BUKRS values
against the clustered index, read all the rows which fit those values, and only then check
whether the values of the ANLN1 and ANLN2 columns match our Where condition. Despite
the fact that these columns are #3 and #4 in the clustered index ANLB~0, we do not
evaluate the values specified in the Where clause against the index, but against the rows
read. Not a good plan.
As mentioned earlier, the plan we call FAE_JOIN is more complex than a number of relatively
simple statements combined with a UNION ALL operator. Hence SQL Server will evaluate
very different plan options, which are more sensitive to extreme data distributions, and
plans like the one shown above might suddenly occur.
Several customers reported issues like this after moving to 7.0x based SAP ERP. One area
that appears particularly sensitive is FAE queries against the tables ANLA, ANLB, etc.
The problem also does not always express itself in exactly the same types of query plans as
described above, but often in similar ones where either an index is not fully used, despite
the fact that all the columns in the Where clause are restricted with ‘=’, or in plans which
only evaluate the client value on the index and then evaluate all other columns after the
rows have already been read.
Solutions
Query specific solution:
One possibility is to overwrite the max_blocking_factor default setting (explained at the
beginning) by a parameter in the instance profile; this would change the setting system
wide. However, one can also change the max_blocking_factor for one specific query with an
ABAP query hint, which would look like:
%_HINTS MSSQLNT '&max_blocking_factor 5&'
The number at the end of the hint (here 5) is the new blocking factor for this specific query.
We usually had good experiences with taking the max_blocking_factor down into the range
of 5-15.
System wide method:
The system wide method is to revert the whole SAP system to the default behavior of the
older releases (the UNION_ALL case) by setting this parameter in the instance profiles of
the different instances:
rsdb/prefer_join = 0
This is how several customers, including Microsoft, resolved the instability issues caused by
the new way of issuing FAE queries. Setting this parameter disables the usage of the
FAE_JOIN method throughout the system and results in a fallback to the older method,
which in the past proved less sensitive to skewed data distributions.
Conclusion
FAE queries which perform slowly after an upgrade from pre-7.xx SAP releases to 7.xx
SAP Netweaver releases usually behave differently because the new default for creating
such queries was introduced with 7.00. This can be resolved by returning to the old
behavior with the system wide solution.
If only some slow FAE queries show up here and there with a less optimal query plan,
this can be resolved with query specific hints, either returning to the old behavior on a
query basis or reducing the default blocking factor.