[SAP ABAP] Databases Part 2
Reading data from several tables
We have several possibilities for reading data from several tables in SQL and merging that data into one internal table. I will present three of them: FOR ALL ENTRIES, SELECT...ENDSELECT, and JOIN. As the mechanism and principle of the FOR ALL ENTRIES operation have been discussed previously, I will only show how to get data from several tables with it.
We need to split the query into two SELECT statements. First of all, I will prepare the structure and table that hold fields from the two tables sflight and spfli:

TYPES:
  BEGIN OF s_for_all_entries,
    mandt TYPE mandt,
    carrid TYPE s_carr_id,
    connid TYPE s_conn_id,
    fldate TYPE s_date,
    countryfr TYPE land1,
    cityfrom TYPE s_from_cit,
    airpfrom TYPE s_fromairp,
    countryto TYPE land1,
    cityto TYPE s_to_city,
    airpto TYPE s_toairp,
  END OF s_for_all_entries.

DATA: gt_for_all_entries TYPE TABLE OF s_for_all_entries,
      gt_for_all_entries_spfli TYPE TABLE OF spfli.
I have the fields mandt, carrid, connid, and fldate from sflight, as well as countryfr, cityfrom, airpfrom, countryto, cityto, and airpto from spfli. In the preceding code snippet, I have used a local definition of the structure and declared the table as a variable.
The first SELECT statement gets the data from the sflight table. It might look as follows:

SELECT mandt carrid connid fldate
  INTO CORRESPONDING FIELDS OF TABLE gt_for_all_entries
  FROM sflight.
The result of SELECT from sflight is as follows:
When I use this data to read from the spfli table, the code looks as follows:

IF gt_for_all_entries IS NOT INITIAL.
  SELECT mandt carrid connid countryfr cityfrom
         airpfrom countryto cityto airpto
    INTO CORRESPONDING FIELDS OF TABLE gt_for_all_entries_spfli
    FROM spfli
    FOR ALL ENTRIES IN gt_for_all_entries
    WHERE carrid = gt_for_all_entries-carrid
      AND connid = gt_for_all_entries-connid.
ENDIF.
The result of SELECT from spfli is as follows. To merge both tables, we loop over the first table and read the matching row from the second:

LOOP AT gt_for_all_entries ASSIGNING FIELD-SYMBOL(<gs_for_all_entries>).
  READ TABLE gt_for_all_entries_spfli ASSIGNING FIELD-SYMBOL(<gs_for_all_entries_spfli>)
    WITH KEY carrid = <gs_for_all_entries>-carrid connid = <gs_for_all_entries>-connid.
  IF sy-subrc = 0.
    MOVE-CORRESPONDING <gs_for_all_entries_spfli> TO <gs_for_all_entries>.
  ENDIF.
ENDLOOP.
The result of this piece of code is presented in the following screenshot. All fields will be filled with data from two tables:
The next possibility is the SELECT...ENDSELECT statement. For this method, we need two more variables: SELECT...ENDSELECT needs a work-area structure, and we need another structure for the SELECT SINGLE inside the SELECT...ENDSELECT loop. It is also worth noting that I renamed the structure type to s_2_tables:

DATA: gt_loop TYPE TABLE OF s_2_tables,
      gs_loop TYPE s_2_tables,
      gs_spfli TYPE spfli.
SELECT...ENDSELECT is presented in the following code snippet. In this loop, we use two SELECT queries: first we get a row from sflight, then we get the corresponding row from spfli, move the needed fields from the spfli structure to our target structure, and finally append the result to the table:

SELECT mandt carrid connid fldate
  INTO CORRESPONDING FIELDS OF gs_loop
  FROM sflight.

  SELECT SINGLE mandt carrid connid countryfr cityfrom
         airpfrom countryto cityto airpto
    INTO CORRESPONDING FIELDS OF gs_spfli
    FROM spfli
    WHERE carrid = gs_loop-carrid
      AND connid = gs_loop-connid.

  MOVE-CORRESPONDING gs_spfli TO gs_loop.
  APPEND gs_loop TO gt_loop.
ENDSELECT.
Of course, the result of this query is identical to FOR ALL ENTRIES. The main difference between them is that, with FOR ALL ENTRIES, we make two SELECT statements and merge the tables on the application server, as we already have all of the required data there. In contrast, with SELECT...ENDSELECT, we access the database every time we need data. Let's assume that the sflight table has 400 rows. With FOR ALL ENTRIES, we always have two independent SELECT statements, regardless of how many rows are in the table. With SELECT...ENDSELECT, the inner SELECT SINGLE is executed once per row, resulting in hundreds of separate SELECT statements on the database, which increases the database workload.
The third option for getting data from more than one table is JOIN. For this option, I have renamed the main table and deleted the rest of the tables and structures, as we do not need them:

DATA: gt_join TYPE TABLE OF s_2_tables.

The SELECT statement used for JOIN is shown as follows:

SELECT sf~mandt sf~carrid sf~connid sf~fldate
       sp~countryfr sp~cityfrom sp~airpfrom
       sp~countryto sp~cityto sp~airpto
  INTO CORRESPONDING FIELDS OF TABLE gt_join
  FROM sflight AS sf
  JOIN spfli AS sp
  ON sf~carrid = sp~carrid
  AND sf~connid = sp~connid.
In the preceding code snippet, you may see constructs that were not used before, for example, sf~mandt and sflight AS sf. In a JOIN statement, every field in the result set needs an alias indicating the table it belongs to. This is because the SQL engine needs to know which table each field is to be selected from.
Furthermore, ON is a new addition here: it specifies the JOIN condition. It is worth noting that inner and outer joins require a JOIN condition. We can use three types of join, which are listed as follows:

- JOIN or INNER JOIN: An INNER JOIN combines rows from the left-hand side and the right-hand side of the result set only if the rows meet the JOIN condition, and the statement creates all combinations of matching keys. Rows that have no equivalent in the other table are not included.
- LEFT/RIGHT JOIN or LEFT/RIGHT OUTER JOIN: An OUTER JOIN is pretty much the same as an INNER JOIN, but with one difference: a LEFT OUTER JOIN selects all rows from the left-hand side and includes the matching rows from the right-hand side in the result set, even if no corresponding row exists on the right-hand side. A RIGHT OUTER JOIN works identically, but selects all rows from the right-hand side and matches them with the left-hand side.
- CROSS JOIN: A CROSS JOIN creates every possible combination of the rows of both tables, without any condition.
We can also create a JOIN over more than two tables. For this example, I have created a new type with fields from three tables:

TYPES:
  BEGIN OF s_3_tables,
    mandt TYPE mandt,
    carrid TYPE s_carr_id,
    connid TYPE s_conn_id,
    fldate TYPE s_date,
    countryfr TYPE land1,
    cityfrom TYPE s_from_cit,
    airpfrom TYPE s_fromairp,
    countryto TYPE land1,
    cityto TYPE s_to_city,
    airpto TYPE s_toairp,
    bookid TYPE s_book_id,
    customid TYPE s_customer,
  END OF s_3_tables.

DATA: gt_3join TYPE TABLE OF s_3_tables.
The code to join three tables looks like this:

SELECT sf~mandt sf~carrid sf~connid sf~fldate
       sp~countryfr sp~cityfrom sp~airpfrom
       sp~countryto sp~cityto sp~airpto
       sb~bookid sb~customid
  INTO CORRESPONDING FIELDS OF TABLE gt_3join
  FROM sflight AS sf
  JOIN spfli AS sp
  ON sf~carrid = sp~carrid
  AND sf~connid = sp~connid
  JOIN sbook AS sb
  ON sb~carrid = sf~carrid
  AND sb~connid = sf~connid
  AND sb~fldate = sf~fldate.
To extend a JOIN to three tables, we just need to expand the result set and join the next table with its own JOIN condition. If we want to select from three tables using the previous two methods (FOR ALL ENTRIES and SELECT...ENDSELECT), we need three separate SELECT statements with FOR ALL ENTRIES. With SELECT...ENDSELECT, the situation is worse: as the sbook table can contain more than one combination of carrid and connid, the number of SELECT statements will increase significantly.
A comparison of the efficiency and execution time of these methods will be made in the explanation of the SQL Trace tool.

Identifying and saving the changes
Before you save the data from an application, you should reflect on which data needs to be stored. In most cases, you don't need to save all the data you've read. Some actions can be executed to identify whether it's an insertion, an update, or a deletion of data.
Let's have a short look at the meanings of these terms:

- INSERT: A new database table entry is created
- UPDATE: Existing data is changed
- DELETE: Existing data is deleted from the database table
In the case of implementing any logic for reading and maintaining data, you should follow the performance rules from the beginning. As a best practice, you can compare the data changed by the application with the unchanged data read in the beginning of your logical unit of work (LUW).
For this, you can hold a copy of the original data within a data object that mirrors your workspace. Through this comparison, you can identify all actions, such as INSERT, UPDATE, or DELETE. Depending on the amount of accessed data (single row or multiple rows), this is either a simple comparison of structures, or a slightly more complex comparison of tables and their entries. When it comes to multiple rows, you should compare each row of your workspace table with the corresponding row of the original data. This logic is a bit more complex because some entries can be added, removed, or changed.
You have two possibilities for identifying the action taken by the user. The first option is to record the database action with the entry within your program. You can achieve this by extending the data structure with a database action flag and setting it to I, U, or D for the INSERT, UPDATE, or DELETE actions, respectively. You then have to maintain this flag in every action you undertake with the data, and some dependencies have to be handled.
The second option is to make the comparison just before saving the data and to pass just the required changes to the saving methods. This is a bit more convenient and gives you more flexibility within the program, as well as reducing its complexity.
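A minimal sketch of such a comparison, assuming a workspace table gt_work and a copy of the original data gt_orig, both of the s_2_tables type used earlier (all names are illustrative):

```abap
" Sketch: classify changes by comparing the workspace with the original data.
" gt_work and gt_orig are assumed to share the key fields carrid and connid.
DATA: gt_insert TYPE TABLE OF s_2_tables,
      gt_update TYPE TABLE OF s_2_tables,
      gt_delete TYPE TABLE OF s_2_tables.

LOOP AT gt_work ASSIGNING FIELD-SYMBOL(<gs_work>).
  READ TABLE gt_orig ASSIGNING FIELD-SYMBOL(<gs_orig>)
    WITH KEY carrid = <gs_work>-carrid connid = <gs_work>-connid.
  IF sy-subrc <> 0.
    APPEND <gs_work> TO gt_insert.   " not in the original data: INSERT
  ELSEIF <gs_work> <> <gs_orig>.
    APPEND <gs_work> TO gt_update.   " found, but the content changed: UPDATE
  ENDIF.
ENDLOOP.

LOOP AT gt_orig ASSIGNING <gs_orig>.
  READ TABLE gt_work TRANSPORTING NO FIELDS
    WITH KEY carrid = <gs_orig>-carrid connid = <gs_orig>-connid.
  IF sy-subrc <> 0.
    APPEND <gs_orig> TO gt_delete.   " removed from the workspace: DELETE
  ENDIF.
ENDLOOP.
```

The three result tables can then be passed directly to the mass INSERT, UPDATE, and DELETE statements described in this section.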
In the next sections, we will have a deeper look at the different actions on the database.
Datasets are created by using the INSERT statement. You can create either a single row or multiple rows with it.

To add a single row, you can use one of the following variants, which act in an equal manner on the database:

INSERT INTO dbtab VALUES wa.
Alternatively, you can use the following:
INSERT dbtab FROM wa.
The structure of wa should be identical to the structure of dbtab. This operation will only be executed if there is no entry with the same primary key. If there is an entry with the same key, SY-SUBRC will be set to 4 instead of 0.
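As a concrete sketch (the key values here are illustrative), a single insert with a check of the return code could look like this:

```abap
" Sketch: insert one row into sflight and check whether the key already existed.
DATA: gs_flight TYPE sflight.

gs_flight-carrid = 'AA'.
gs_flight-connid = '0017'.
gs_flight-fldate = sy-datum.

INSERT sflight FROM gs_flight.
IF sy-subrc = 4.
  " An entry with the same primary key already exists; nothing was inserted.
ENDIF.
```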
To add multiple datasets, you can use the following statement:
INSERT dbtab FROM TABLE itab [ACCEPTING DUPLICATE KEYS]
All entries from itab are inserted into the dbtab table. If at least one entry with the same key exists on the database, an exception will be raised and no data will be inserted.
You can avoid this exception by using the ACCEPTING DUPLICATE KEYS addition. In this case, the duplicated datasets are ignored and SY-SUBRC will be set to 4. All datasets without a duplicate will be inserted into the database. The SY-DBCNT field contains the number of inserted entries.
Following the five golden performance rules, you should execute the operation with internal tables whenever possible instead of executing loops and single rows.
To change existing database entries, you can use the UPDATE statement. Three different options are available for it.
To change specific columns, you can use the following syntax:
UPDATE dbtab
  SET [col1 = f1 col2 = f2 ...]
      [col1 = col1 + f1 col2 = col2 + f2 ...]
      [col1 = col1 - f1 col2 = col2 - f2 ...]
  WHERE ...
With this statement, you change the columns named in the SET clause for all rows of the dbtab table for which the conditions of the WHERE clause apply. Without the WHERE clause, all rows of the table are changed.
You can either override the existing value or add to or subtract from those values.
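For example, a sketch that increases the occupied seats for all flights of one connection (the carrier and connection values are illustrative):

```abap
" Sketch: add one occupied seat to every flight of connection AA 0017.
UPDATE sflight
  SET seatsocc = seatsocc + 1
  WHERE carrid = 'AA'
    AND connid = '0017'.

IF sy-subrc = 0.
  " sy-dbcnt holds the number of changed rows.
ENDIF.
```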
To change the entire row, you can use the following statement:
UPDATE dbtab FROM wa
To change multiple entries, the syntax is as follows:
UPDATE dbtab FROM TABLE itab
As usual, the work area should be the same type as the database table.
The optimization of reading big datasets
The performance of a program is often determined by how efficiently the database and the operations on it are used. Downloading only the necessary data to the application server can be critical to the overall speed of a program, so the load on the database should be kept as low as possible. To keep database operations performing well, we must follow these rules:
- Get only the required rows from a table: For example, if we need to get the American Airlines flights from the sflight table, we must use the proper conditions in WHERE. Everything we know about the data that lets us narrow the condition is really important here.
- Get only the required columns from a table: In every SELECT statement, the result set should only contain the columns that you really need. If we need data about the flight, the country, and the airport we will fly from, we cannot use *, because we do not need all columns.
- Do not use more reads than necessary: To keep the number of database reads low, use a mass operation instead of single ones. For example, we should not execute reads in a loop; instead, use JOIN, a view, or a subquery.
- Use local buffers and indexes: In all cases where a secondary index can improve selection performance, it should be used. When the same data is read more than once, we can save this data to the local SAP buffer. This can save significant time, since reading data from a local buffer is faster than getting it from the database.
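Putting the first two rules together, a sketch of a selection that reads only the needed rows and columns from sflight might look like this:

```abap
" Sketch: restrict both the rows (WHERE) and the columns (field list).
TYPES: BEGIN OF s_flight_key,
         carrid TYPE s_carr_id,
         connid TYPE s_conn_id,
         fldate TYPE s_date,
       END OF s_flight_key.
DATA: gt_aa_flights TYPE TABLE OF s_flight_key.

SELECT carrid connid fldate
  INTO TABLE gt_aa_flights
  FROM sflight
  WHERE carrid = 'AA'.
```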
Even if we perform all of these actions, the amount of data may sometimes be too big and can cause dumps, resulting in a notification that the connection to the database took too long.
In this case, we need to use the CURSOR statement. A CURSOR is a way to split a SELECT into smaller partitions. If we want to use a CURSOR, we need to declare it as a variable. Working with a CURSOR consists of three pieces. The first of these is OPEN CURSOR, where we define which SELECT statement will be executed by this cursor.
When we use OPEN CURSOR, we create a database cursor that points to the result set of a database selection. A database cursor is always assigned to a line in the result set. Cursor handling is usually implicit, but with the CURSOR statement, we can control the database cursor ourselves.
We then need to use FETCH to fetch the next rows, where we can decide how many rows FETCH should select.
Finally, we need to use CLOSE CURSOR to close the database cursor. An example of using the CURSOR statement is as follows:

DATA: c_cursor TYPE cursor,
      gt_cursor TYPE TABLE OF sflight.

OPEN CURSOR c_cursor FOR
  SELECT carrid connid
    FROM sflight.

DO.
  FETCH NEXT CURSOR c_cursor APPENDING CORRESPONDING FIELDS OF TABLE gt_cursor PACKAGE SIZE 100.
  IF sy-subrc <> 0.
    EXIT.
  ENDIF.
ENDDO.

CLOSE CURSOR c_cursor.
In this example, we get data from the sflight table in packages of at most 100 rows per call.
We need to use the DO loop to get all of these records. Of course, the SELECT statement of the CURSOR can be more extensive, and in the DO loop, we can execute further operations on each package. It is also really important to have the EXIT statement; otherwise, we would create an infinite loop.

The new SQL syntax
The most important changes include the following:

- Inline declarations
- SQL expressions

There are more changes, but here the focus is on those that are most important and most helpful in daily work.
The main change compared to the old SQL is the data declaration: previously, we needed to declare all of the fields that would be selected. With an inline declaration, this is no longer needed; during the SELECT from the database, a structure or table is created automatically. This is really helpful, as when you need to select a new field or fields, you just add them to the field list.
The following are three SELECT statements. The first of these is written in the old SQL, while the second and third use the inline declaration. All three SELECT statements deliver identical results.
The first SELECT statement is presented together with the declaration of the table:

TYPES:
  BEGIN OF t_spfli,
    mandt TYPE s_mandt,
    carrid TYPE s_carr_id,
    connid TYPE s_conn_id,
    countryfr TYPE land1,
    countryto TYPE land1,
  END OF t_spfli.
DATA: lt_spfli TYPE TABLE OF t_spfli.

SELECT mandt carrid connid countryfr countryto
  FROM spfli
  INTO TABLE lt_spfli.
In the following code snippet, the same selection is written with an inline declaration (a separate declaration of the variable is not needed):

SELECT mandt, carrid, connid, countryfr, countryto
  FROM spfli
  INTO TABLE @DATA(lt_spfli).
The following is the second version in the new SQL:

SELECT FROM spfli
  FIELDS mandt, carrid, connid, countryfr, countryto
  INTO TABLE @DATA(lt_spfli).
With an explicit declaration, the structure can also contain fields that are not selected, whereas the inline declaration creates a structure with only the fields that were selected.
In the new SQL, it is also possible to add a field that is not selected from the database. To add such a field, we add a host variable to the list of SELECT fields. The following example shows the first inline-declaration SELECT statement with an added field named flag, filled from a variable of the Boolean type (note that the host variable must be escaped with @):

DATA: lv_flag TYPE abap_bool.

SELECT mandt, carrid, connid,
       countryfr, countryto, @lv_flag AS flag
  FROM spfli
  INTO TABLE @DATA(lt_spfli).

In this case, only one field is added, but it is also possible to add more fields, structures, tables, and so on.
If the new SQL is used, it is also necessary to mark every variable as a host variable, for example, when WHERE is used in SELECT. The following SELECT statement uses the variable named lv_carrid in WHERE, and this variable needs to be escaped with @:

DATA: lv_carrid TYPE s_carr_id VALUE 'AA'.

SELECT mandt, carrid, connid
  FROM sflight
  INTO TABLE @DATA(lt_spfli)
  WHERE carrid = @lv_carrid.
SQL expressions introduced the ability to add, for example, arithmetic calculations or CASE expressions to SELECT. The first example involves a CASE expression. In the code, the CASE is added after the list of fields, followed by the list of conditions and the name of the field in which the result is to be shown.
In the following example, a CASE expression is used on the carrid field. When the field is equal to AA, the field named flag will be X; if carrid is not equal to AA, the flag field is empty:

SELECT mandt, carrid, connid, countryfr, countryto,
       CASE carrid
         WHEN 'AA' THEN 'X'
         ELSE ' '
       END AS flag
  FROM spfli
  INTO TABLE @DATA(lt_spfli).
To create a calculation, you create a new field holding the result of the calculation. In the following example, we add the values of two fields, seatsocc_b and seatsocc_f:

SELECT mandt, carrid, connid,
       ( seatsocc_b + seatsocc_f ) AS occupy
  FROM sflight
  INTO TABLE @DATA(lt_spfli).
The addition of the fields in brackets results in the occupy field.

Using the SQL Trace tool for performance analysis
We discussed how to exercise care in relation to the performance of database reads, but how can we measure it? For this, we can use the SQL Trace tool and the Runtime Analysis tool. To start the SQL trace, we open the ST05 transaction; Runtime Analysis is available in the SAT transaction.
The main window of the SQL trace looks like this:
Here, we have several options for analyzing traces, but right now, we will focus on SQL Trace. To start tracing, we need to click on the Activate Trace button. Right after clicking on that button, tracing begins, and we can execute our program. After the program ends, click on Deactivate Trace. When we want to see the trace, click on Display Trace.
After doing this, we get a window where we can select which Trace Types will be shown, as well as the time period of the trace. Once we have selected our values, we click on RUN. At this point, we will get the values of the selected trace:
Let's check how time-consuming it is to join two tables. To do this, we will follow these steps:
- Run the ST05 transaction
- Activate the trace
- Execute our program with JOIN SELECT
- Deactivate the trace
- Display the trace
We now need to locate interesting rows. In the following window, we can see trace information about this call:
From this window, we can get values such as the following:
- The duration of the SQL statement
- The number of selected records
- The object name that is used for this call
- The program
- The SELECT statement that is used for this call
To measure performance, we can also use Runtime Analysis, which can be called via the SAT transaction.
The main window of this transaction appears as follows:
In this transaction, we can run the chosen program in a similar manner to running it in the SQL Trace tool. The other measured values can also be useful, such as the time needed to execute a loop over entries.
To run a program, transaction, or function module, we mark the corresponding radio button, enter the name, and click on Execute.
The window that lists the values looks like this:
On the left-hand side of the screen, we can see the profile of the trace result, while in the Hit List on the right-hand side, we can see the execution time of each statement, as well as how long the entire statement needs.
We can now run an experiment to determine which method of selecting data from more than one table is the most efficient. The SAT transaction is the best option for this, as only the join variant gets its data directly in a single SELECT; the other variants also need additional statements, which SAT can measure as well. The rules for the experiment are as follows:

- The same result (same data and number of entries) must land in the internal table in every version
- Only the SELECT and the requisite statements are run in all versions of the code
- The same code as in the Reading data from several tables section is used
- All versions are run three times
- All times are given in microseconds
The first version used in this comparison is the one with FOR ALL ENTRIES. The whole program takes the following amount of time:

The following is the execution time table of FOR ALL ENTRIES:

First run | Second run | Third run | Average
5,167 | 5,525 | 5,814 | 5,502
Now, we will look at the time required to execute SELECT...ENDSELECT:

The following is the execution time table of SELECT...ENDSELECT:

First run | Second run | Third run | Average
5,983 | 6,647 | 6,700 | 6,443.3
The following is the execution time table of JOIN:

First run | Second run | Third run | Average
5,011 | 5,039 | 5,081 | 5,043
As you can see, JOIN is the most efficient in this case. The small differences here are caused by the small number of rows in the tables.