[SAP ABAP] Databases Part 2

Reading data from several tables

We have several possibilities for reading data from several tables in SQL and merging that data into one internal table. I will present three of them: FOR ALL ENTRIES, SELECT...ENDSELECT, and JOIN. As the mechanism and principle of the FOR ALL ENTRIES operation has been discussed previously, here I will only show how to use it to get data from several tables.
We need to split the query into two SELECT statements. First of all, however, I will prepare a structure and a table that contain fields from both tables, sflight and spfli:
TYPES:
BEGIN OF s_for_all_entries,
     mandt     TYPE mandt,
     carrid    TYPE s_carr_id,
     connid    TYPE s_conn_id,
     fldate    TYPE s_date,
     countryfr TYPE land1,
     cityfrom  TYPE s_from_cit,
     airpfrom  TYPE s_fromairp,
     countryto TYPE land1,
     cityto    TYPE s_to_city,
     airpto    TYPE s_toairp,
END OF s_for_all_entries.

DATA: gt_for_all_entries       TYPE TABLE OF s_for_all_entries,
      gt_for_all_entries_spfli TYPE TABLE OF spfli.
I have the fields mandt, carrid, connid, and fldate from sflight, as well as countryfr, cityfrom, airpfrom, countryto, cityto, and airpto from spfli. In the preceding code snippet, I have used a local definition of the structure, and I have declared the internal tables as variables.
The first SELECT statement is to be used to get data from the sflight table. This might look as follows:
SELECT mandt carrid connid fldate
   INTO CORRESPONDING FIELDS OF TABLE gt_for_all_entries
   FROM sflight.
The result of SELECT from sflight is as follows:
To read the corresponding data from the spfli table, the code should look as follows:
IF gt_for_all_entries IS NOT INITIAL.

SELECT mandt carrid connid countryfr cityfrom
          airpfrom countryto cityto airpto
INTO CORRESPONDING FIELDS OF TABLE gt_for_all_entries_spfli
FROM spfli
FOR ALL ENTRIES IN gt_for_all_entries
WHERE carrid = gt_for_all_entries-carrid
AND connid = gt_for_all_entries-connid.

ENDIF.
The result of SELECT from spfli is as follows:
We then need to merge the tables:
LOOP AT gt_for_all_entries ASSIGNING FIELD-SYMBOL(<gs_for_all_entries>).

READ TABLE gt_for_all_entries_spfli ASSIGNING FIELD-SYMBOL(<gs_for_all_entries_spfli>)
WITH KEY carrid = <gs_for_all_entries>-carrid connid = <gs_for_all_entries>-connid.

IF sy-subrc = 0.
MOVE-CORRESPONDING <gs_for_all_entries_spfli> TO <gs_for_all_entries>.
ENDIF.

ENDLOOP.
The result of this piece of code is presented in the following screenshot. All fields will be filled with data from two tables:
The next possibility is to use the SELECT...ENDSELECT statement.
For this method, we need to create two more variables: as we know, SELECT...ENDSELECT needs a structure, and we need another structure for the SELECT SINGLE inside the SELECT...ENDSELECT loop. It is also worth noting that I renamed the table, and the structure type is now called s_2_tables:
DATA: gt_loop  TYPE TABLE OF s_2_tables,
      gs_loop  TYPE s_2_tables,
      gs_spfli TYPE spfli.
SELECT...ENDSELECT is presented in the following code snippet. In this loop, we use two SELECT queries: the outer one reads the data from sflight row by row; for each row, we get the corresponding data from spfli and move the needed fields from the spfli structure to our target structure. As a final step, we append our result to the table:
SELECT mandt carrid connid fldate
INTO CORRESPONDING FIELDS OF gs_loop
FROM sflight.

SELECT SINGLE mandt carrid connid countryfr cityfrom
                 airpfrom countryto cityto airpto
INTO CORRESPONDING FIELDS OF gs_spfli
FROM spfli
WHERE carrid = gs_loop-carrid
AND connid = gs_loop-connid.

MOVE-CORRESPONDING gs_spfli TO gs_loop.
APPEND gs_loop TO gt_loop.

ENDSELECT.
Of course, the result of this query is identical to FOR ALL ENTRIES.
The main difference between them is that, with FOR ALL ENTRIES, we make two SELECT statements and merge the tables on the application server, as we already have all of the required data there. In contrast, with SELECT...ENDSELECT, we have to access the database every time we need data. Let's assume that the sflight table has 400 rows. With FOR ALL ENTRIES, we always have two independent SELECT statements, regardless of how many rows are in the table. With SELECT...ENDSELECT, there are roughly 800 separate database accesses (400 row-by-row fetches of the outer SELECT plus 400 SELECT SINGLE statements), which significantly increases the database workload.
The third option for getting data from more than one table is JOIN. For this option, I have renamed the main table and deleted the remaining tables and structures, as we do not need them:
DATA: gt_join TYPE TABLE OF s_2_tables.
The SELECT statements used for JOIN are shown as follows:
SELECT sf~mandt sf~carrid sf~connid sf~fldate
       sp~countryfr sp~cityfrom sp~airpfrom
       sp~countryto sp~cityto sp~airpto
INTO CORRESPONDING FIELDS OF TABLE gt_join
FROM sflight AS sf
JOIN spfli AS sp
ON sf~carrid = sp~carrid
AND sf~connid = sp~connid.
In the preceding code snippet, you may see pieces of code that were not used previously; for example, sf~mandt and sflight AS sf.
With this statement, every field in the result set needs to be prefixed with the alias of the table it comes from. This is because the SQL engine needs to know which table each field is to be selected from.
Furthermore, ON may be a new statement for you: it specifies the join condition. It is worth noting that both inner and outer joins require a join condition. We can use three types of join, which are listed as follows:
  • JOIN or INNER JOIN: An INNER JOIN combines rows from the left-hand and right-hand tables into the result set only if they meet the join condition, and it creates all combinations of matching keys. Rows that have no counterpart in the other table do not appear in the result.
  • LEFT/RIGHT JOIN or LEFT/RIGHT OUTER JOIN: An OUTER JOIN is pretty much the same as an INNER JOIN, but with one important difference. In the case of LEFT OUTER JOIN, all rows from the left-hand table are selected and the matching rows from the right-hand table are added to the result set; rows from the left-hand table appear even if there is no corresponding row on the right-hand side. RIGHT OUTER JOIN works identically, but keeps all rows from the right-hand table and matches them with the left-hand side (a sketch follows this list).
  • CROSS JOIN: A CROSS JOIN creates every possible combination of the rows of both tables, without any join condition.
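To illustrate the difference, the following is a minimal sketch (the table name gt_left_join and the reuse of the s_2_tables type are my assumptions) that uses LEFT OUTER JOIN, so every connection from spfli appears in the result even if there is no matching flight in sflight; in that case, the sflight fields remain initial:
DATA: gt_left_join TYPE TABLE OF s_2_tables.

* every spfli connection appears; fldate stays initial if no flight exists
SELECT sp~mandt sp~carrid sp~connid sf~fldate
       sp~countryfr sp~cityfrom sp~airpfrom
       sp~countryto sp~cityto sp~airpto
  INTO CORRESPONDING FIELDS OF TABLE gt_left_join
  FROM spfli AS sp
  LEFT OUTER JOIN sflight AS sf
    ON sf~carrid = sp~carrid
   AND sf~connid = sp~connid.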
We can also create JOIN from more than two tables. For this example, I have created a new type with three tables:
TYPES:
BEGIN OF s_3_tables,
     mandt     TYPE mandt,
     carrid    TYPE s_carr_id,
     connid    TYPE s_conn_id,
     fldate    TYPE s_date,
     countryfr TYPE land1,
     cityfrom  TYPE s_from_cit,
     airpfrom  TYPE s_fromairp,
     countryto TYPE land1,
     cityto    TYPE s_to_city,
     airpto    TYPE s_toairp,
     bookid    TYPE s_book_id,
     customid  TYPE s_customer,
END OF s_3_tables.

DATA: gt_3join TYPE TABLE OF s_3_tables.
The code to join three tables looks like this:
SELECT sf~mandt sf~carrid sf~connid sf~fldate
        sp~countryfr sp~cityfrom sp~airpfrom
        sp~countryto sp~cityto sp~airpto
        sb~bookid sb~customid
INTO CORRESPONDING FIELDS OF TABLE gt_3join
FROM sflight AS sf
JOIN spfli AS sp
ON sf~carrid = sp~carrid
AND sf~connid = sp~connid
JOIN sbook AS sb
ON sb~carrid = sf~carrid
AND sb~connid = sf~connid
AND sb~fldate = sf~fldate.
To create a JOIN over three tables, we just need to extend the field list and add the next table with its own JOIN condition.
If we want to select from three tables in the previous two ways, FOR ALL ENTRIES needs three separate SELECT statements. For SELECT...ENDSELECT, since the sbook table can contain more than one entry per combination of carrid and connid, the number of SELECT statements increases significantly.
A comparison of the efficiency and execution time of these methods will be made in the explanation of the SQL Trace tool.

Identifying and saving the changes

Before you save the data from an application, you should reflect on which data needs to be stored. In most cases, you don't need to save all the data you've read. Some actions can be executed to identify whether it's an insertion, an update, or a deletion of data.
Let's have a short look at the different meanings of these terms:
  • INSERT: A new database table entry is created
  • UPDATE: Existing data is changed
  • DELETE: Existing data is deleted from the database table
In the case of implementing any logic for reading and maintaining data, you should follow the performance rules from the beginning. As a best practice, you can compare the data changed by the application with the unchanged data read in the beginning of your logical unit of work (LUW).
For this, you can hold a copy of the original data within a data object that is similar to your workspace. By comparing the two, you can identify all actions, such as INSERT, UPDATE, or DELETE. Depending on the amount of accessed data (a single row or multiple rows), this is a simple comparison of structures, or a slightly more complex comparison of tables and their entries. When it comes to multiple rows, you should compare each row of your workspace table with the corresponding row of the original data. This logic is a bit more complex because entries can be added, removed, or changed.
You have two possibilities for identifying the action taken by the user. The first option is to flag the database action on each entry within your program. You can achieve this by extending the data structure with a database action flag and setting it to I, U, or D for the INSERT, UPDATE, or DELETE actions, for example. You have to maintain this flag in every action you undertake with the data, and some dependencies have to be handled.
The second option is to make the comparison just before saving the data and to pass only the required changes to the saving methods. This is a bit more convenient and gives you more flexibility within the program, as well as reducing its complexity.
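The following is a minimal sketch of this second option, assuming a work table gt_work of type sflight and a copy gt_original taken right after the initial read (all names are illustrative); each work row is compared with its original to decide between INSERT and UPDATE, and originals that no longer exist in the work table are collected for DELETE:
DATA: gt_original TYPE TABLE OF sflight, " copy taken right after the first SELECT
      gt_work     TYPE TABLE OF sflight, " data changed by the application
      gt_insert   TYPE TABLE OF sflight,
      gt_update   TYPE TABLE OF sflight,
      gt_delete   TYPE TABLE OF sflight.

FIELD-SYMBOLS: <ls_work> TYPE sflight,
               <ls_orig> TYPE sflight.

* new or changed rows
LOOP AT gt_work ASSIGNING <ls_work>.
  READ TABLE gt_original ASSIGNING <ls_orig>
       WITH KEY carrid = <ls_work>-carrid
                connid = <ls_work>-connid
                fldate = <ls_work>-fldate.
  IF sy-subrc <> 0.
    APPEND <ls_work> TO gt_insert. " no original entry -> INSERT
  ELSEIF <ls_orig> <> <ls_work>.
    APPEND <ls_work> TO gt_update. " entry exists but differs -> UPDATE
  ENDIF.
ENDLOOP.

* rows removed by the user
LOOP AT gt_original ASSIGNING <ls_orig>.
  READ TABLE gt_work TRANSPORTING NO FIELDS
       WITH KEY carrid = <ls_orig>-carrid
                connid = <ls_orig>-connid
                fldate = <ls_orig>-fldate.
  IF sy-subrc <> 0.
    APPEND <ls_orig> TO gt_delete. " no longer in the work table -> DELETE
  ENDIF.
ENDLOOP.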
In the next sections, we will have a deeper look at the different actions on the database.

Creating datasets

Datasets are created by using the INSERT statement. You can either create a single row or multiple rows with this statement.
To add a single row, you can use one of the following variants, which act identically on the database:
INSERT INTO dbtab VALUES wa.
Alternatively, you can use the following:
INSERT dbtab FROM wa.
The structure of wa should be identical to the structure of dbtab. This operation will only be executed if there is no entry with the same primary key. If there is an entry with the same key, SY-SUBRC will be set to 4 instead of 0.
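A short sketch of this variant on the sflight table, using an illustrative work area named ls_flight:
DATA: ls_flight TYPE sflight.

ls_flight-carrid = 'AA'.
ls_flight-connid = '0017'.
ls_flight-fldate = sy-datum.

INSERT sflight FROM ls_flight.
IF sy-subrc <> 0.
  " sy-subrc = 4: a row with this primary key already exists
ENDIF.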
To add multiple datasets, you can use the following statement: 
INSERT dbtab FROM TABLE itab [ACCEPTING DUPLICATE KEYS].
All entries from itab are inserted into the dbtab table. If at least one entry with the same key exists on the database, an exception will be raised and no data will be inserted.
You can avoid this exception by using the ACCEPTING DUPLICATE KEYS addition. In this case, the duplicated datasets are ignored and SY-SUBRC will be set to 4. All datasets without a duplicate will be inserted into the database. The SY-DBCNT field contains the number of inserted entries.
Following the five golden performance rules, you should execute the operation with internal tables whenever possible, instead of looping over single rows.
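A minimal sketch of such a mass insert, assuming an already filled internal table gt_flights (an illustrative name) of type sflight:
DATA: gt_flights TYPE TABLE OF sflight.

* ... fill gt_flights ...

INSERT sflight FROM TABLE gt_flights ACCEPTING DUPLICATE KEYS.
IF sy-subrc = 4.
  " some rows already existed and were skipped
ENDIF.
WRITE: / 'Inserted rows:', sy-dbcnt.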

Updating datasets

To change existing database entries, you can use the UPDATE statement. Three different options are available for the UPDATE statement.
To change specific columns, you can use the following syntax:
UPDATE dbtab
      SET [col1 = f1 col2 = f2 ...]
          [col1 = col1 + f1 col2 = col2 + f2 ...]
          [col1 = col1 - f1 col2 = col2 - f2 ...]
      WHERE ...
With this statement, you change the columns named in the SET clause for all rows of the dbtab table for which the WHERE condition is fulfilled. Without the WHERE clause, all rows of the table are changed.
You can either override the existing value or add to or subtract from those values.
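For example, a small sketch (assuming the standard price field of sflight) that raises the price of all flights of one airline by a flat amount:
UPDATE sflight
   SET price = price + 10
 WHERE carrid = 'AA'.

IF sy-subrc = 0.
  " sy-dbcnt contains the number of updated rows
ENDIF.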
To change the entire row, you can use the following statement:
UPDATE dbtab FROM wa.
To change multiple entries, the syntax is as follows:
UPDATE dbtab FROM TABLE itab.
As usual, the work area should be the same type as the database table.
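A brief sketch of both variants, assuming ls_flight and gt_flights (illustrative names) are filled with rows whose primary keys already exist on the database:
DATA: ls_flight  TYPE sflight,
      gt_flights TYPE TABLE OF sflight.

* single row: the primary key of ls_flight identifies the row to change
UPDATE sflight FROM ls_flight.

* multiple rows in one database operation
UPDATE sflight FROM TABLE gt_flights.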

Deleting datasets

Use the DELETE statement to remove one or more rows. Deletion can be done in two variants: you can delete rows identified by the primary key of a work area or an internal table, or delete all rows that satisfy a WHERE condition.
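A minimal sketch of both variants on sflight (ls_flight and gt_flights are illustrative names; in the first variant, only the primary key of the supplied rows is evaluated):
DATA: ls_flight  TYPE sflight,
      gt_flights TYPE TABLE OF sflight.

* Variant 1: delete rows identified by the primary key of a work area or an internal table
DELETE sflight FROM ls_flight.
DELETE sflight FROM TABLE gt_flights.

* Variant 2: delete all rows that satisfy a WHERE condition
DELETE FROM sflight WHERE carrid = 'AA' AND fldate < sy-datum.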


The optimization of reading big datasets

The performance of a program is often determined by how efficiently the database is used and how operations are executed on it.
Using the database efficiently and downloading only the necessary data to the application server can be critical to the general speed of a program, so the number of database operations should be kept as low as possible.
To keep database operations performing well, we should follow these rules:
  • Get only the required rows from a table: For example, if we need to get the flights of American Airlines from the sflight table, we must use the proper conditions in WHERE. Using everything we know about the data to narrow down the condition is really important.
  • Get only the required columns from a table: In every SELECT statement, the result set should only contain the columns that you really need. If we only need data about the flight, the country, and the airport we will fly from, we should not use *, because we do not need all columns (see the sketch after this list).
  • Do not use more reads than necessary: To keep the number of database reads low, use a mass operation instead of single ones. For example, we should not perform reads inside a loop; instead, use a JOIN, a view, or a subquery.
  • Use local buffers and indexes: Whenever a secondary index can improve selection performance, it should be used. When the same data is read more than once, we can keep this data in the local SAP buffer. This can save significant time, since reading data from a local buffer is faster than getting it from the database.
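As an illustration of the first two rules, the following is a small sketch (the type and table names are illustrative) that reads only the required rows and columns instead of selecting everything:
TYPES: BEGIN OF s_flight_key,
         carrid TYPE s_carr_id,
         connid TYPE s_conn_id,
         fldate TYPE s_date,
       END OF s_flight_key.

DATA: gt_aa_flights TYPE TABLE OF s_flight_key.

* only the columns we need, only the rows of one airline
SELECT carrid connid fldate
  INTO TABLE gt_aa_flights
  FROM sflight
  WHERE carrid = 'AA'.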
Even if we follow all of these rules, the amount of data may sometimes be too large and can cause dumps, resulting in a notification that the connection to the database took too long.
In this case, we need the CURSOR statement. A cursor is a way to split a SELECT into smaller packages. If we want to use a cursor, we need to declare it as a variable of type cursor. Working with a cursor consists of three steps. The first of these is OPEN CURSOR, where we define which SELECT statement will be executed by the cursor.
When we use OPEN CURSOR, we create a database cursor, which points to the result set of a database selection. A database cursor is always positioned on a line of the result set. Cursor handling is usually implicit, but with these statements, we can control the database cursor ourselves.
We then need to use FETCH to read the next rows, where we can decide how many rows each FETCH should return.
Finally, we need to use CLOSE CURSOR to close the database cursor. An example of using the CURSOR statement is as follows:
DATA: c_cursor TYPE cursor,
       gt_cursor TYPE TABLE OF sflight.

OPEN CURSOR c_cursor FOR
SELECT carrid connid
FROM sflight.

DO.
FETCH NEXT CURSOR c_cursor APPENDING CORRESPONDING FIELDS OF TABLE gt_cursor PACKAGE SIZE 100.
IF sy-subrc <> 0.
EXIT.
ENDIF.
ENDDO.

CLOSE CURSOR c_cursor.
In this example, we get data from the sflight table in packages of up to 100 rows per call.
We need to use the DO loop to get all of the records. Of course, the SELECT statement of the cursor can be more extensive, and inside the DO loop, we can process each package. It is also really important to include the EXIT statement; otherwise, we would create an infinite loop.

The new SQL syntax

From version 7.40 SP08, SAP introduced a few important changes in SQL:
  • Inline declaration
  • SQL expression
There are more changes, but the focus here is on the ones that are most important and most helpful in daily work.

Inline declaration

The main change compared to the old SQL concerns data declaration: previously, we needed to declare all the necessary fields that would be selected. With an inline declaration, this is no longer needed; during the SELECT from the database, a suitable structure or table is created automatically. This is really helpful, because when you need to select a new field or fields, you just add them to the field list.
The following are three SELECT statements. The first of these is created in the old SQL, while the second and third are created with the inline declaration. All the SELECT statement results are identical.
The first SELECT statement is also presented with the declaration of the table:
TYPES:
   BEGIN OF t_spfli,
     mandt     TYPE s_mandt,
     carrid    TYPE s_carr_id,
     connid    TYPE s_conn_id,
     countryfr TYPE land1,
     countryto TYPE land1,
   END OF t_spfli.

 DATA: lt_spfli TYPE TABLE OF t_spfli.

 SELECT mandt carrid connid countryfr countryto
   FROM spfli
   INTO TABLE lt_spfli.
In the following code snippet, the same fragment is written with an inline declaration (a separate declaration of the variable is not needed):
SELECT mandt, carrid, connid, countryfr, countryto
  FROM spfli
  INTO TABLE @DATA(lt_spfli).
The following is the second version of the new SQL:
SELECT FROM spfli
  FIELDS mandt, carrid, connid, countryfr, countryto
  INTO TABLE @DATA(lt_spfli).
With the explicit declaration, the structure can also contain fields that are not selected, whereas the inline declaration is created with only the fields that were selected.
In the new SQL, it is also possible to add a field that is not selected from the database.
To add such a field, we add an ABAP variable to the list of SELECT fields. The following example shows the first SELECT statement from the inline declaration example with an added field named flag, which is of a Boolean-like type:
DATA: lv_flag TYPE abap_bool.

SELECT mandt, carrid, connid,
        countryfr, countryto, @lv_flag AS flag
   FROM spfli
   INTO TABLE @DATA(lt_spfli).
In this case, only one field is added, but it is also possible to add more fields, structures, tables, and so on.
If the new SQL is used, it is also necessary to mark every ABAP variable as a host variable, for example, when it is used in the WHERE clause of a SELECT. In the following SELECT statement, the variable named lv_carrid is used in WHERE, and it needs to be escaped with @:
DATA: lv_carrid TYPE s_carr_id VALUE 'AA'.

SELECT mandt, carrid, connid
   FROM sflight
   INTO TABLE @DATA(lt_spfli)
   WHERE carrid = @lv_carrid.

SQL expression

SQL expressions introduced the ability to add, for example, arithmetic calculations or CASE expressions to a SELECT.
The first example involves a CASE expression. In the code, you add CASE to the list of fields, followed by the list of conditions and the name of the field where the result is to be shown.
In the following example, a CASE expression is used on the carrid field. When the field in an entry is equal to AA, the field named flag will be X; if carrid is not equal to AA, the flag field stays empty:
SELECT mandt, carrid, connid, countryfr, countryto,
   CASE carrid
     WHEN 'AA' THEN 'X'
     ELSE ' '
   END AS flag
   FROM spfli
   INTO TABLE @DATA(lt_spfli).
To create a calculation, you need to define a new field that holds the result of the calculation. In the following example, the result of an addition of two fields, seatsocc_b and seatsocc_f, is created:
SELECT mandt, carrid, connid,
   ( seatsocc_b + seatsocc_f ) AS occupy
   FROM sflight
   INTO TABLE @DATA(lt_spfli).
The addition of the fields in brackets is stored in the occupy field.

Using the SQL Trace tool for performance analysis

We discussed how to exercise care in relation to the performance of database reads, but how can we measure it? For this, we can use the SQL Trace tool and the Runtime Analysis tool.
To start the SQL trace, we open transaction ST05; Runtime Analysis is available in transaction SAT.
The main window of the SQL trace looks like this:
Here, we have several options to analyze the tracing, but right now, we will focus on SQL Trace. To start tracing, we need to click on the Activate Trace button. Right after clicking on that button, tracing begins. Now, we can execute our program.
After ending a program, click on Deactivate Trace. When we want to see the trace, click on Display Trace.
After doing this, we get the window where we can select which Trace Types will be shown, or the time period of the trace. When we select our values, click on RUN. At this point, we will get the values of the selected trace:
Let's check how time-consuming it is to join two tables. To do this, we will follow these steps:
  1. Run the ST05 transaction
  2. Activate the trace
  3. Execute our program with JOIN SELECT
  4. Deactivate the trace
  5. Display the trace
We now need to locate interesting rows. In the following window, we can see trace information about this call:
From this window, we can get values such as the following:
  • The duration of SQL
  • The number of selected records
  • The object name that is used for this call
  • The program
  • The SELECT statement that is used for this call
To measure performance, we can also use Runtime Analysis, which can be called with the SAT transaction.
The main window of this transaction appears as follows:
In this transaction, we can run the chosen program in a similar manner to running the program in the SQL Trace tool. The remaining values can also be useful, such as the time needed to execute a loop over the entries.
To run the program/transaction/function module, we mark the corresponding radio button, enter the name, and click on Execute.
The window that lists the values looks like this:
On the left-hand side of the screen, we can see the profile of the trace result, while in the Hit List on the right-hand side of the screen, we can see the execution time of each statement, as well as how long the entire statement needs.
We can now run an experiment to determine which method of selecting data from more than one table is the most efficient. The SAT transaction is the best option for this, as the JOIN option gets all its data directly in the SELECT, while the other options also need additional statements. The rules for the experiment are as follows:
  • Use the same result (the same data and number of entries) of execution in the internal table
  • Run only the SELECT and the requisite statements in all versions of the code
  • Use the same code as in the Reading data from several tables section
  • Run each version three times
  • Present all times in microseconds
The first version used in this comparison is the version with FOR ALL ENTRIES.
The whole program takes the following amount of time:
The following is the execution time table of FOR ALL ENTRIES:
First run: 5,167
Second run: 5,525
Third run: 5,814
Average: 5,502

Now, we will look at the time required to execute SELECT...ENDSELECT:
The following is the execution time table of SELECT...ENDSELECT:
First run: 5,983
Second run: 6,647
Third run: 6,700
Average: 6,443.3
Finally, the time required to execute the JOIN version is displayed as follows:
The following is the execution time table of JOIN:
First run: 5,011
Second run: 5,039
Third run: 5,081
Average: 5,043.7

As you can see, JOIN is the most efficient in this case. The small differences here are caused by the small number of rows in the tables.
