0406 - Data Provisioning - SLT for HANA 1.0 - How to accelerate the initial load.pdf
<Internal>
How to accelerate the initial load
LT Replication Server for HANA 1.0 SPS03 (DMIS SP05)
SLO
30/07/2012
Version 6
Author: I048980
How to accelerate the initial load
Prerequisite: a schema is configured on a successfully installed SLT server. This SLT server is on
HANA 1.0 SPS03 (DMIS SP05). The target tables you plan to load/replicate are TRANSPARENT
tables.
As of HANA 1.0 SPS03 (DMIS SP05), the default reading type for the initial load is “3-cluster”.
This guide shows how to use reading type “1-access plan” instead for better load performance.
The following table lists the differences between the two reading types:
Property | 1-access plan | 3-cluster
Parallel load | Possible | Currently not possible (*)
Access plan calculation time | Longer (depends on the size of the source table) | Shorter (almost 0)
Entries per portion | 5.000 lines per portion; the number of portions in the collective access plan might differ | Fixed: 10.000 lines
Index used to read source table during the initial load | Additional secondary index needs to be built in the source system before the load | Primary index
Behavior in case of interruption | Continues from where it stopped | Restarts with last cluster
* We are currently working on this.
If no additional secondary index is built beforehand, the load performance might be poor
(depending on the system settings and the available indices). We changed the default reading type
from 1 to 3 because the index that reading type 1 needs normally does NOT exist in the source
system, which makes the initial load slow.
This document shows you how to accelerate the initial load of transparent tables with reading
type 1, so that you can benefit from parallel loading and faster reading performance per single
thread (job). This method has proved practical and effective, and it is most beneficial for huge
transparent tables.
The following section describes the way to build the secondary index and some additional steps
to follow.
Important Notes
Please make sure the following Notes are installed on your SLT system:
SAP Note Number | Short Text | Description
1655246 | HANA LTR SP5: General Corrections | General fixes after applying DMIS SP05.
1656370 | HANA LTR SP5: General Corrections 2 | General fixes after applying DMIS SP05.
Preparation
1. Build the secondary index on source system
Check the definition of the source table and, from the primary key, identify the field that has the
most distinct values (see References [A] and [B] below). This field is then used for the access plan
calculation (dividing the data into portions based on ranges of this key field) and for the initial
load, so a secondary index on this field needs to be built on the table to accelerate the process.
As a follow-up step, updating the table statistics is also recommended after the index creation.
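The idea of dividing the data into portions by key range can be illustrated with a minimal Python sketch. It is not SLT's actual access plan calculation; the function name `build_portions` and the generated key values are assumptions made for illustration, and only the portion size of 5.000 lines comes from this guide.

```python
from typing import List, Tuple


def build_portions(sorted_keys: List[str], portion_size: int = 5000) -> List[Tuple[str, str]]:
    """Split an ordered list of key values into (low, high) ranges.

    Illustrative only: each range covers at most `portion_size` keys, so
    independent jobs could each read one range via an index on that field.
    """
    if portion_size < 1:
        raise ValueError("portion_size must be at least 1")
    portions = []
    for start in range(0, len(sorted_keys), portion_size):
        chunk = sorted_keys[start:start + portion_size]
        portions.append((chunk[0], chunk[-1]))
    return portions


# Hypothetical data: 12,000 ten-digit document numbers, already sorted.
keys = [f"{n:010d}" for n in range(1, 12001)]
portions = build_portions(keys)
for low, high in portions:
    print(low, "-", high)
```

Each printed range corresponds to one portion that a load job could select with a range condition on the chosen field, which is why an index on that field matters for read performance.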
[A] Identify the most selective field on a (transparent) table in the source system (DB
independent)
Start transaction DB05 and choose the table for which you want to build the secondary index. It
is recommended to run the analysis in the background (option: Submit analysis in background).
Example: COEJ
Start transaction SM37 to monitor the job during runtime. When the job status has reached
"finished", press the button “Spool” to display the result of the request.
The suitable fields for secondary indices can be determined from the number of distinct values of
the fields considered in the analysis. Please take into account that the numbers shown in the
result are cumulative. The selectiveness of a particular field is therefore given by the ratio
between its cumulative distinct value and that of the preceding field. In the following example,
BELNR is the best candidate (34555/30 = 1151.8).
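The ratio calculation on the cumulative DB05 figures can be sketched in Python as follows. This is a minimal illustration, not part of any SAP tool: the function name is hypothetical, and apart from the pair 30 and 34555 quoted in this guide, the field names and counts in the sample list are assumed values.

```python
from typing import List, Tuple


def most_selective_field(cumulative: List[Tuple[str, int]]) -> Tuple[str, float]:
    """Return the key field with the largest ratio between its cumulative
    distinct count and the cumulative count of the preceding key field."""
    best_field, best_ratio = "", 0.0
    previous = 1  # before the first key field there is exactly one (empty) combination
    for field, distinct in cumulative:
        if distinct < 1:
            raise ValueError(f"distinct count for {field} must be at least 1")
        ratio = distinct / previous
        if ratio > best_ratio:
            best_field, best_ratio = field, ratio
        previous = distinct
    return best_field, best_ratio


# Cumulative distinct values in key order. Only 30 and 34555 are taken from
# the guide; the other names and numbers are assumptions for illustration.
sample = [("MANDT", 1), ("KOKRS", 30), ("BELNR", 34555), ("BUZEI", 120000)]
field, ratio = most_selective_field(sample)
print(field, round(ratio, 1))
```

With these figures the field with the largest ratio is BELNR, matching the division shown in the text.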
If you have to deal with a large number of RFC connections, make sure that the parameter
settings are adjusted correctly for high interface load. Adjusting these parameters can
increase performance. For a more detailed description, see SAP Note 384971.
[B] Identify the most selective field on a (transparent) table in the source system (Oracle)
If your source system runs on an Oracle database, you can use transaction DB02OLD to identify the
most selective field of a table.
Click the button “Detailed analysis” and enter the table name as “Object name”.
In the result screen, click the button “Table columns”.
There you will find the number of distinct values for each column of the table.
Choose the most distinctive key field (whether it is a key field is indicated in column “Key”). In
the sample above, the candidate field is BELNR.
2. Cross check system capacity and plan for the parallel level
For a parallel load, each job occupies one BGD work process in the SLT system and one DIA work
process in the source system.
Provided that you have only ONE schema on the SLT, consider the following questions:
How many BGD work processes are configured in your SLT? (X)
How many DIA work processes at most can be occupied by SLT in the source system if you
want business operations to keep running smoothly? (Y)
2.1. Setting the number of jobs assigned to one schema on SLT
Transaction: LTR
In the initial screen, click on the target schema name and a new screen will pop up with all the
detailed information with regard to this schema.
By clicking button “Edit”, it is possible to adjust the jobs assigned to this schema:
Number of Replay Jobs: Number of jobs assigned to this schema (number of DTL* jobs in SM37).
Number of Replay Jobs <= X - 3
Initial Load Jobs: the maximum number of jobs, OUT of Number of Replay Jobs, that can be used for
the initial load in one schema.
This field must not be greater than Number of Replay Jobs.
Initial Load Jobs: >= 0, <= min (X-3, Y)
* Plan these settings before the load actually starts.
* For multiple schemas in one SLT, make sure that SLT can handle the sum of the required BGD
jobs. The requirement in the source system (number of DIA work processes) applies to each source
system separately.
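The two limits above can be worked through with a small Python sketch. The function and parameter names are hypothetical, and the sample values X = 10 and Y = 4 are assumptions chosen only to illustrate the arithmetic.

```python
def job_limits(bgd_in_slt: int, dia_available_in_source: int) -> dict:
    """Upper bounds for the LTR job settings of a single schema.

    bgd_in_slt              -- X: BGD work processes configured in SLT
    dia_available_in_source -- Y: DIA work processes SLT may occupy in the source
    """
    max_replay_jobs = bgd_in_slt - 3  # Number of Replay Jobs <= X - 3
    if max_replay_jobs < 1:
        raise ValueError("X must be at least 4 to leave room for one replay job")
    # Initial Load Jobs: >= 0 and <= min(X - 3, Y)
    max_initial_load_jobs = max(0, min(max_replay_jobs, dia_available_in_source))
    return {
        "max_replay_jobs": max_replay_jobs,
        "max_initial_load_jobs": max_initial_load_jobs,
    }


# Assumed example: X = 10 BGD work processes in SLT, Y = 4 DIA work processes in the source.
limits = job_limits(10, 4)
print(limits)
```

In this assumed case, up to 7 replay jobs could be assigned to the schema, of which at most 4 could be used for the initial load.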
3. Pre-configure the table properties before the initial load on SLT
Before you start to load/replicate a table, you need to pre-configure the table properties in your SLT server. In this way, both the selected key field (for access plan calculation) and the parallelization (for initial load) are defined in advance, and there is no need to adjust these properties during runtime.
Table: IUUC_PERF_OPTION
Field name | Field label | Value | Description
MT_ID | Identification | <xxx> | Mass transfer identifier
TABNAME | Table Name | <TABNAME> | Target table
CLUSTER_PHYSICAL | phys. clust. | unknown | Not relevant for HANA scenario.
LOGTAB_IN_RCV | logtab rcvr | unknown | Not relevant for HANA scenario.
PARALL_JOBS | parall jobs | 01 - N | How many jobs should be used for this table (only meaningful if the number is smaller than the number of jobs for the initial load itself)
PARALL_FIELDNAME | paral fld name | <SEL_FIELD> | Selected field for the access plan calculation (the secondary index on this field is still required)
SEQNUM | Sequence number | 01 - 99 | Priority of the table (only relevant if you have several big tables in the initial load and you want to prioritize them). A smaller number means higher priority.
TABNAME_CLONE | Table Name Clone | | Leave it blank. Not relevant for HANA scenario.
MAX_IN_BLOCK | Hint Max in bloc | | Leave it blank. Not relevant for HANA scenario.
READING_TYPE | Reading Type | acc. plan. calculation | Reading type 1-acc. plan. calculation
GENERIC_PARALLEL | gener paral | unknown | Not relevant for HANA scenario.
NUM_RECS_LOGTAB | NumRecLogTab | | You can override the default value of 5.000 or leave it blank.
NUM_ORA_HASH_PAR | Hash Partitions | | Leave it blank. Not relevant for HANA scenario.
SEQUN_CACHE_SIZE | seq cache size | | Leave it blank. Not relevant for HANA scenario.
SPC_LOGTAB_CLEAN | spc. logtab cle | | Leave it blank. Not relevant for HANA scenario.
Go to transaction SE16 and input IUUC_PERF_OPTION as the table name, click button “Create Entries” (F5).
On the next screen, click button “New Entries” (F5) again.
The following screenshot shows you an example:
In this example we plan to replicate table MARA under Mass Transfer ID 915. We use MATNR as the selected field for access plan calculation (*). We plan to use 2 parallel jobs to load this table.
* A secondary index that contains only field MATNR should be created on table MARA in the source system before you start to process this table.
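As a way to sanity-check such an entry before typing it into SE16, the MARA example can be modelled in a short Python sketch. This is a hypothetical in-memory model, not an SAP API: the class `PerfOptionEntry`, its `problems` method, and the assumed value of 4 initial load jobs are illustrative, while the field names and value ranges follow the IUUC_PERF_OPTION description in this guide.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PerfOptionEntry:
    """Hypothetical model of one IUUC_PERF_OPTION row (only the relevant fields)."""
    mt_id: str
    tabname: str
    parall_jobs: int
    parall_fieldname: str
    reading_type: str = "1"       # 1 = access plan calculation
    seqnum: Optional[int] = None  # optional priority, 01 - 99

    def problems(self, initial_load_jobs: int) -> List[str]:
        """Return a list of findings; an empty list means nothing was flagged."""
        found = []
        if self.reading_type != "1":
            found.append("READING_TYPE should be 1 (access plan calculation)")
        if not self.parall_fieldname:
            found.append("PARALL_FIELDNAME is required for the access plan calculation")
        if self.parall_jobs < 1:
            found.append("PARALL_JOBS must be at least 1")
        elif self.parall_jobs >= initial_load_jobs:
            found.append("PARALL_JOBS only has an effect if it is smaller "
                         "than the number of initial load jobs")
        if self.seqnum is not None and not 1 <= self.seqnum <= 99:
            found.append("SEQNUM must be between 01 and 99")
        return found


# Values from the MARA example; 4 initial load jobs is an assumed schema setting.
entry = PerfOptionEntry(mt_id="915", tabname="MARA", parall_jobs=2, parall_fieldname="MATNR")
issues = entry.problems(initial_load_jobs=4)
print(issues)
```

The check does not verify that the secondary index on MATNR exists in the source system; that still has to be confirmed there.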
Start the load
1. Select load/replicate for the target tables
Make sure all the jobs are running in SLT.
Select to load/replicate tables via HANA studio Data Provisioning Editor.
How to check the parallel runtime info
Transaction: MWBMON (on SLT server)
In the initial screen, enter your Mass Transfer ID in field “Identification” and “Access plan” = 1.
Click EXECUTE (F8).
Switch to the tab strip “Runtime Information”. You will see statistics for the tables that are
being loaded at the moment. Column “RUN” = <n> means that n processes are loading this table.
To refresh the data, click the button “Refresh”. If new tables start to load, you might
have to exit (Shift+F3) and enter the transaction again.