BI2012 Christensen 20 Tips and Tricks

© 2012 Wellesley Information Services. All rights reserved. 20 Tips and Tricks to Improve Data Load Performance Jesper Christensen COMERIT

Transcript of BI2012 Christensen 20 Tips and Tricks

  • In This Session
    - Gain insight into SAP NetWeaver BW data load processes, how they work, and what tools are available to monitor and optimize their performance
    - Receive best practices to maximize data load performance while reducing long-term maintenance costs
    - Understand the benefits of optimized data load processes
    - Find out how to enable version history to track code changes and how to create reusable ETL logic to improve throughput and reduce data load time
    - Get tips on when and how to use customer exits in DataSources and variables to manage risk and reduce maintenance costs
    - Identify the challenges and benefits of semantic partitioning and the importance of efficient data models

  • What We'll Cover
    - Loading data in SAP NetWeaver BW
    - Finding performance bottlenecks
    - Optimizing the database
    - Optimizing the ABAP code
    - Optimizing the data models
    - Optimizing the data updates
    - Wrap-up

  • SAP NetWeaver BW Data Load Processing Overview
    SAP NetWeaver BW data load processing consists of three main activities:
    - Extraction: collecting the data in the source systems and preparing it before sending it to SAP NetWeaver BW
    - Transformation: transforming the data using routines, lookups, formulas, etc.
    - Load: updating the data into InfoProviders, that is, DataStore Objects (DSOs), cubes, and master data


  • Dataflow in SAP NetWeaver BW (figure; source: SAP)

  • Extraction Interface Types (figure; source: help.sap.com)

  • DataSources Supported by SAP NetWeaver Extraction
    - SAP NetWeaver BW Service API
      - Allows data from SAP systems to be extracted in standardized form and accessed directly
      - These can be SAP application systems or SAP NetWeaver BW systems
    - File interface
      - Permits extraction from and direct access to files, such as CSV files
    - Web services
      - Permit you to send data to the SAP NetWeaver BW system under external control

  • DataSources Supported by SAP NetWeaver Extraction (cont.)
    - Universal Data (UD) Connect
      - Permits extraction from and direct access to relational data
    - Database (DB) Connect
      - Permits extraction from and direct access to data located in tables or views of a database management system
    - Staging Business Application Programming Interfaces (BAPIs)
      - Open interfaces that SAP BusinessObjects Data Services and certified third-party tools can use to extract data from older systems

  • Extraction Time Can Be Split into Two Categories
    - Extraction time
      - DB time to select the data to be extracted
      - Logic applied during extraction, such as joins, lookups, and filtering
    - Middleware and network time
      - The time used to transfer the data from the source system to the target SAP NetWeaver BW system
      - Interface types such as Web services and Universal Data (UD) Connect are good for small amounts of data and cannot handle large volumes
      - Fixed-format files are larger to transfer but faster to load into SAP NetWeaver BW
      - WAN network time can become a bottleneck during peak hours

  • Transformation Types
    SAP NetWeaver BW supports the 3.x and the 7.x methods of transforming the data
    - 3.x uses transfer rules and update rules
      - Two steps of logic to process the dataset
      - Loads to different targets must be processed together
      - Used to have better performance than transformations
      - Old method; no more development or performance enhancements; do not continue to use
    - 7.x uses transformations
      - A single step of logic to process the dataset
      - Loads to different targets can be processed independently
      - Better performance
      - Always use this option for new development

  • Loading Data to InfoProvider Types
    Loading of the data to InfoProviders differs depending on type
    - DSO
      - Update of the activation queue
      - Activation of data (update of active table and change log)
      - SID determination
        - Should in general be switched off for DSOs
    - Master data
      - Update of master data tables
      - SID determination
      - Check for duplicate key values
        - Very time consuming for time-dependent attributes
      - Attribute change run to activate the master data
      - Generate navigation data

  • Loading Data to InfoProvider Types (cont.)
    Loading of the data to InfoProviders differs depending on type (cont.)
    - Cubes
      - Update of data to the InfoCube star schema
      - SID determination
      - Roll up data to aggregates
      - Update data to SAP NetWeaver BW Accelerator (SAP NetWeaver BWA)
    - Performance considerations for loading the data
      - Ensure that the database parameters are in place
      - Implement the correct SAP NetWeaver BW settings for your InfoProviders



  • Tip 01: SAP NetWeaver BW 7.x Statistics
    - SAP NetWeaver BW includes a great statistics tool
      - It collects information on most SAP NetWeaver BW-specific activity, such as data loads and queries
    - It's delivered as business content
      - So you must activate it just like all business content
    - How to Activate Admin Cockpit document on help.sap.com: http://help.sap.com/saphelp_nw04s/helpdata/en/46/f9bd550d40537de10000000a1553f6/frameset.htm

  • Tip 01: SAP NetWeaver BW 7.x Statistics (cont.)
    - Define standard measures that can be monitored on a daily, weekly, and monthly basis to evaluate data load performance trends
      - Records processed per minute, or time to process 1 million records
      - Time spent on extraction
      - Time spent in transformations
      - Top 10 long-running loads
      - Total time spent on attribute and hierarchy change runs
    - Use the standard queries and reports as a starting point
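    The two throughput measures are two views of the same load statistics. As a worked sketch with hypothetical figures (a load of N = 2,400,000 records taking t = 480 seconds; these numbers are illustrative and do not come from the session):

```latex
\[
\text{records per minute} = \frac{N}{t/60} = \frac{2{,}400{,}000}{480/60} = 300{,}000
\]
\[
\text{time per 1 million records} = \frac{t}{N/10^{6}} = \frac{480\ \text{s}}{2.4} = 200\ \text{s}
\]
```

    Tracking either figure per load over time makes a slowdown visible even when data volumes change from day to day.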


  • Tip 02: See Details About Performance in the Monitor
    - The load monitor (transaction code RSMO) gives more details about the processing steps
      - InfoPackage details
      - Data Transfer Process (DTP) details


  • Tip 03: Use SE30 to Test Performance
    - Transaction code SE30 (ABAP Runtime Analysis) gives a detailed view of performance
      - Remember to set the accuracy to Low
      - Run transaction code RSA3 (Extractor Checker) under SE30 to measure a DataSource extraction
    - Note: SE30 can also be used for transformations by simulating the DTP run

  • Tip 03: Use SE30 to Test Performance (cont.)
    - The detailed runtime view will show you the bottlenecks
    - Sort descending by Net Time and you will see your bottleneck at the top


  • Tip 04: Implement the Correct DB Parameters
    - Key DB parameters
      - SAP has recommended some parameter values for SAP NetWeaver BW that usually improve performance
      - Expect to evaluate these parameter settings frequently, though, to ensure that the DB operates optimally
    - See three key SAP Notes:
      - 830576: Parameter recommendations for Oracle 10g
      - 387946: Use of locally managed tablespaces for BW systems
      - 1044441: Basis parameterization for NW 7.0 BI systems


  • Tip 05: Manage Database Statistics
    - DB statistics are also crucial for SAP NetWeaver BW performance
      - Without DB statistics, the DB will not know the optimal execution path for an SQL statement
    - To set up DB statistics:
      - Set up a BRCONNECT job using DB20 to recalculate DB statistics
      - Use program RSANAORA to analyze specific tables

    Calculating DB statistics can run very slowly under Oracle when you use the SAP NetWeaver BW programs to do it. Make sure you use BRCONNECT.

  • Tip 06: Build Secondary Indices
    - The select statements used during extraction or in user exit enhancements should always use a database index
    - Build secondary indices in transaction code SE11, or on the DSOs used in select statements



  • Tip 07: Coding Tips: Dynamic Calls
    - Code the extractor user exits so that they call a dynamic program per DataSource
      - Isolate the code per DataSource in a self-contained program
      - Minimize the risk that a syntax error in the code for one DataSource impacts extraction from all other DataSources
    - Example
      - Program name = ZBW + <DataSource name>
      - Form name = DOZBW + <DataSource name>
    - This same technique can be used with customer exit variable code

  • Tip 07: Coding Tips: Dynamic Calls (cont.)
    Illustration: Sample dynamic program call (figure)
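    The slide's illustration did not survive in this transcript. A minimal sketch of such a dynamic call is given here, assuming it sits in the extractor user exit include (commonly ZXRSAU01, where the exit interface supplies I_DATASOURCE and C_T_DATA) and that the per-DataSource programs follow the ZBW/DOZBW naming from Tip 07. The variable names L_PROGRAM and L_FORM are hypothetical.

```abap
*----------------------------------------------------------------------*
* Fragment for the extractor user exit include (assumed: ZXRSAU01).
* I_DATASOURCE and C_T_DATA come from the exit interface; they are not
* declared here. Naming convention assumed from Tip 07:
*   program = ZBW   + DataSource name
*   form    = DOZBW + DataSource name
*----------------------------------------------------------------------*
DATA: l_program TYPE sy-repid,
      l_form    TYPE c LENGTH 30.

CONCATENATE 'ZBW' i_datasource INTO l_program.
CONCATENATE 'DO'  l_program    INTO l_form.

* IF FOUND turns the call into a no-op when no program or form exists
* for this DataSource, so DataSources without an enhancement simply
* pass through unchanged.
PERFORM (l_form) IN PROGRAM (l_program) IF FOUND
  TABLES c_t_data.
```

    Form names are limited to 30 characters, so very long DataSource names would need a shorter prefix or a mapping table. That detail is an assumption of this sketch and is not covered in the session.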

  • Tip 08: Coding Tips: Field Symbols
    - Performance consideration: where possible, use field symbols to populate fields in the data package
    - The move costs of a LOOP ... INTO statement depend on the size of a table line
      - The larger the line size, the longer the move will take
    - By applying a LOOP ... ASSIGNING statement you can attach a field symbol to the table lines and operate directly on the line contents
      - This is a much faster way to access the internal table lines without moving their contents


  • Tip 08: User Exit Field Symbols
    Illustration: Sample use of field symbols

    User Exit (without field symbols)

    REPORT YBWZDS_AGR_USER.
    *****************************************************************
    * Form called dynamically must start with DOYBW + <DataSource name>
    *****************************************************************
    FORM DOYBWZDS_AGR_USER TABLES C_T_DATA STRUCTURE ZOXBWD0001.
      data: l_logsys type logsys,
            l_s_data like ZOXBWD0001.

      select single logsys from t000 into l_logsys
        where mandt = sy-mandt.
    * Each pass copies the line into l_s_data and back again
      loop at c_t_data into l_s_data.
        l_s_data-load_dt = sy-datum.
        l_s_data-logsys = l_logsys.
        modify c_t_data from l_s_data index sy-tabix.
      endloop.
    ENDFORM.

    User Exit (with field symbols)

    REPORT YBWZDS_AGR_USER.
    *****************************************************************
    * Form called dynamically must start with DOYBW + <DataSource name>
    *****************************************************************
    FORM DOYBWZDS_AGR_USER TABLES C_T_DATA STRUCTURE ZOXBWD0001.
      data: l_logsys type logsys.
      field-symbols: <l_s_data> like c_t_data.

      select single logsys from t000 into l_logsys
        where mandt = sy-mandt.
    * The field symbol points at the table line, so no copy is made
      loop at c_t_data assigning <l_s_data>.
        <l_s_data>-load_dt = sy-datum.
        <l_s_data>-logsys = l_logsys.
      endloop.
    ENDFORM.

  • Tip 09: Coding Tips: Read Instead of Loop
    - Use a READ statement to access a table rather than a LOOP WHERE
      - The cost of a LOOP WHERE is much higher than a READ with table key or binary search
    - When a loop is still required, use a READ first to find the starting row, then LOOP FROM that index instead of LOOP WHERE

  • Tip 09: User Exit: Read Instead of Loop
    Illustration: Sample use of READ instead of LOOP WHERE
    (itab holds VBAP rows and is assumed to be declared and filled elsewhere, sorted by VBELN)

    User Exit (without read)

    REPORT YBW2LIS_13_VDITM.
    *****************************************************************
    * Form called dynamically must start with DOYBW + <DataSource name>
    *****************************************************************
    FORM DOYBW2LIS_13_VDITM TABLES C_T_DATA STRUCTURE ZOXBWD0001.
      data: l_logsys type logsys,
            l_s_data like ZOXBWD0001.
      field-symbols: <l_s_data> like c_t_data,
                     <l_s_vbap> like VBAP.

      loop at c_t_data assigning <l_s_data>.
    * LOOP WHERE scans all of itab for every data record
        loop at itab assigning <l_s_vbap>
          where VBELN = <l_s_data>-VBELN.
          <l_s_data>-NETVALUE = <l_s_data>-NETVALUE + <l_s_vbap>-NETWR.
        endloop.
      endloop.
    ENDFORM.

    User Exit (with read)

    REPORT YBWZDS_AGR_USER.
    *****************************************************************
    * Form called dynamically must start with DOYBW + <DataSource name>
    *****************************************************************
    FORM DOYBWZDS_AGR_USER TABLES C_T_DATA STRUCTURE ZOXBWD0001.
      data: l_logsys type logsys,
            l_idx type sy-tabix.
      field-symbols: <l_s_data> like c_t_data,
                     <l_s_vbap> like VBAP.

      loop at c_t_data assigning <l_s_data>.
    * Binary search finds the first matching row in the sorted itab
        read table itab transporting no fields
          with key VBELN = <l_s_data>-VBELN binary search.
        check sy-subrc = 0.
        l_idx = sy-tabix.
    * Start at that row and stop as soon as the key changes
        loop at itab assigning <l_s_vbap> from l_idx.
          if <l_s_vbap>-VBELN <> <l_s_data>-VBELN.
            exit.
          endif.
          <l_s_data>-NETVALUE = <l_s_data>-NETVALUE + <l_s_vbap>-NETWR.
        endloop.
      endloop.
    ENDFORM.

  • Tip 10: Delta Enable Generic DataSources
    - Improve extract performance by creating delta-enabled generic DataSources
    - Simple:
      - By date
      - By timestamp
      - By sequential number (unique table key)
    - Complex:
      - Pointers: ABAP techniques can be used to record an array of pointers to identify new and changed records

  • Tip 10: Delta Enable Generic DataSources (cont.)
    Illustration: Delta enabling a generic DataSource (figure)
    Ensure that you set the upper or lower limits correctly based on the data you are extracting!

  • Tip 11: Lookups
    - Do not use single selects for lookups!
    - For better performance:
      - Use start routines to read lookup data to an internal table
      - Read the internal table to populate field values in routines
    - For best performance:
      - Add lookup fields to the InfoSource
      - Use a start routine and field symbols to populate blank fields for the entire data package at one time (see illustration on slide titled User Exit Field Symbols)
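    The "best performance" variant can be sketched as follows. This is a standalone illustration, not code from the session: the structure TY_SOURCE, its fields USERNAME and USERGROUP, and the table names are hypothetical stand-ins for the generated data package of a real transformation. The lookup reads USR02 (user name to user group), the same table used in the Tip 12 illustration.

```abap
REPORT zbw_lookup_sketch.

* Hypothetical data package layout; in a real transformation the
* package table and its line type are generated by BW.
TYPES: BEGIN OF ty_source,
         username  TYPE usr02-bname,
         usergroup TYPE usr02-class,
       END OF ty_source,
       BEGIN OF ty_lookup,
         bname     TYPE usr02-bname,
         usergroup TYPE usr02-class,
       END OF ty_lookup.

DATA: source_package TYPE STANDARD TABLE OF ty_source,
      gt_lookup      TYPE SORTED TABLE OF ty_lookup
                     WITH UNIQUE KEY bname.

FIELD-SYMBOLS: <ls_source> TYPE ty_source,
               <ls_lookup> TYPE ty_lookup.

START-OF-SELECTION.
* (source_package would be filled by the data load)

* One array select per data package instead of one SELECT SINGLE per
* record. FOR ALL ENTRIES with an empty driver table would read every
* row, hence the guard.
  IF source_package IS NOT INITIAL.
    SELECT bname class FROM usr02
      INTO TABLE gt_lookup
      FOR ALL ENTRIES IN source_package
      WHERE bname = source_package-username.
  ENDIF.

* Fill the blank field for the whole package in one pass. The field
* symbol changes the table line in place; the sorted table key makes
* each READ a binary search.
  LOOP AT source_package ASSIGNING <ls_source>
    WHERE usergroup IS INITIAL.
    READ TABLE gt_lookup ASSIGNING <ls_lookup>
      WITH TABLE KEY bname = <ls_source>-username.
    IF sy-subrc = 0.
      <ls_source>-usergroup = <ls_lookup>-usergroup.
    ENDIF.
  ENDLOOP.
```

    Declaring the lookup table as SORTED with a unique key is a design choice of this sketch; a standard table that is sorted once and read with BINARY SEARCH, as in Tip 09, would serve the same purpose.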

  • Tip 12: Program Includes
    - Use includes for all complex routine logic
      - Access logic by using perform statements
    - Increase portability of transformation logic
      - Use same read statements for multiple lookups
      - Reduce risk of errors in obscure places
    - Decrease maintenance cost of complex update rules
      - One place to go to fix/enhance logic
      - Code is consistent and easier to follow
    - Enable version management of code
      - Track changes over time
      - Compare between systems
      - Revert to previous versions

  • Tip 12: Program Includes (cont.)
    Illustration: Select into internal table

    Start routine

    FORM startup
      TABLES MONITOR STRUCTURE RSMONITOR "user defined monitoring
             MONITOR_RECNO STRUCTURE RSMONITORS
             DATA_PACKAGE STRUCTURE DATA_PACKAGE
      USING RECORD_ALL LIKE SY-TABIX
            SOURCE_SYSTEM LIKE RSUPDSIMULH-LOGSYS
      CHANGING ABORT LIKE SY-SUBRC. "set ABORT <> 0 to cancel update
    *
    *$*$ begin of routine - insert your code only below this line *-*
    * fill the internal tables "MONITOR" and/or "MONITOR_RECNO",
    * to make monitor entries
      perform READ_USR02_TO_MEMORY_FOR_0BWTC_C02
        TABLES MONITOR DATA_PACKAGE
        USING RECORD_ALL SOURCE_SYSTEM
        CHANGING ABORT.
    * if abort is not equal zero, the update process will be canceled
    * ABORT = 0.
    *$*$ end of routine - insert your code only before this line *-*

    Program include

    *****************************************************************
    * INITIALIZATION (ONE-TIME PER DATA PACKET)
    * TO READ FROM DATABASE (ALL RECORDS FOR DATA PACKAGE)
    *****************************************************************
    * FORM READ_USR02_TO_MEMORY_FOR_0BWTC_C02
    *---------------------------------------------------------------*
    Form READ_USR02_TO_MEMORY_FOR_0BWTC_C02
      TABLES MONITOR STRUCTURE RSMONITOR
             DATA_PACKAGE STRUCTURE /BIC/CS80BWTC_C02
      USING RECORD_ALL LIKE SY-TABIX
            SOURCE_SYSTEM LIKE RSUPDSIMULH-LOGSYS
      CHANGING ABORT LIKE SY-SUBRC.

    * Refresh the internal table.
      refresh: GT_USR02.

    * Read USR02 user data to memory for this data package
      select * into corresponding fields of table GT_USR02
        from USR02
        FOR ALL ENTRIES IN DATA_PACKAGE
        where BNAME = DATA_PACKAGE-TCTUSERNM
        order by primary key.

    * if abort is not equal zero, the update process will be canceled
      ABORT = 0.
    ENDFORM. "READ_USR02_TO_MEMORY_FOR_0BWTC_C02

  • Tip 12: Program Includes (cont.)
    Illustration: Include perform statements

    Update routine

    FORM compute_key_field
      TABLES MONITOR STRUCTURE RSMONITOR "user defined monitoring
      USING COMM_STRUCTURE LIKE /BIC/CS0BWTC_C02
            RECORD_NO LIKE SY-TABIX
            RECORD_ALL LIKE SY-TABIX
            SOURCE_SYSTEM LIKE RSUPDSIMULH-LOGSYS
      CHANGING RESULT LIKE /BI0/V0BWTC_C02T-USERGROUP
               RETURNCODE LIKE SY-SUBRC
               ABORT LIKE SY-SUBRC. "set ABORT <> 0 to cancel update
    *
    *$*$ begin of routine - insert your code only below this line *-*
    * fill the internal table "MONITOR", to make monitor entries
      PERFORM READ_GT_USR02
        USING COMM_STRUCTURE-TCTUSERNM RECORD_NO RECORD_ALL SOURCE_SYSTEM
        CHANGING GS_USR02 ABORT.

      RESULT = GS_USR02-CLASS.
    * if abort is not equal zero, the update process will be canceled
    *$*$ end of routine - insert your code only before this line *-*
    ENDFORM.

    Program include

    *****************************************************************
    * RECORD PROCESSING (RUN PER RECORD)
    * TO READ FROM MEMORY (ONE RECORD)
    *****************************************************************
    * FORM READ_GT_USR02
    *---------------------------------------------------------------*
    FORM READ_GT_USR02
      USING TCTUSERNM LIKE USR02-BNAME
            RECORD_NO LIKE SY-TABIX
            RECORD_ALL LIKE SY-TABIX
            SOURCE_SYSTEM LIKE RSUPDSIMULH-LOGSYS
      CHANGING GS_USR02
               ABORT LIKE SY-SUBRC. "ABORT <> 0 cancels update

      STATICS: L_RECORD LIKE SY-TABIX.
      IF RECORD_NO <> L_RECORD.
        L_RECORD = RECORD_NO.
        CLEAR GS_USR02.

    * Read user data from internal table GT_USR02
        READ TABLE GT_USR02 WITH KEY BNAME = TCTUSERNM INTO GS_USR02.

      ENDIF.
    ENDFORM. "READ_GT_USR02

  • Tip 13: Use Start and End Routines
    - Start routines can be used to process the data efficiently before the record-by-record processing starts
      - They are the most efficient place to delete records from the data package before time is spent processing them
    - End routines in SAP NetWeaver BW 7.x allow the data to be processed after it has passed through the transformation
      - They are the most efficient place to copy data records (e.g., for generating year-to-date figures)
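    As a minimal sketch of the start-routine deletion described above: the structure TY_SOURCE and its DELETED flag are hypothetical, and the two sample records stand in for a data package. In a real 7.x transformation only the DELETE statement would go into the start routine, operating on the generated SOURCE_PACKAGE table.

```abap
REPORT zbw_start_routine_sketch.

* Hypothetical source structure; in a real 7.x transformation the
* SOURCE_PACKAGE table and its line type are generated by BW.
TYPES: BEGIN OF ty_source,
         doc_number TYPE c LENGTH 10,
         deleted    TYPE c LENGTH 1,
         amount     TYPE i,
       END OF ty_source.

DATA: source_package TYPE STANDARD TABLE OF ty_source,
      ls_source      TYPE ty_source,
      lv_lines       TYPE i.

START-OF-SELECTION.
* Two sample records stand in for a data package.
  ls_source-doc_number = '0000000001'.
  ls_source-deleted    = space.
  ls_source-amount     = 100.
  APPEND ls_source TO source_package.

  ls_source-doc_number = '0000000002'.
  ls_source-deleted    = 'X'.
  ls_source-amount     = 250.
  APPEND ls_source TO source_package.

* Start routine logic: one array delete before any per-record work,
* so no rule or routine ever runs for the unwanted records.
  DELETE source_package WHERE deleted = 'X'.

  lv_lines = lines( source_package ).
  WRITE: / 'Records left for processing:', lv_lines.
```

    A single DELETE ... WHERE on the package is generally cheaper than skipping records one at a time inside the field routines, which is the point the tip makes.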


  • Tip 14: Data Modeling: Defining Dimensions
    - Use as many dimensions as possible
      - Separate common filter characteristics into their own dimension
    - Use line-item dimensions for high-cardinality characteristics such as document numbers
      - Do not set the high cardinality flag!
    - Define related characteristics in the same dimension
    - Calculate the expected number of dimension entries
      - Try not to exceed 10% of expected fact table entries
      - Verify the dimension design after the first data loads using program SAP_INFOCUBE_DESIGNS
    - Add all relevant time characteristics
      - If 0CALMONTH is the lowest granularity, add 0CALMONTH2, 0CALQUARTER, 0CALQUART1, 0HALFYEAR1, and 0CALYEAR
      - Provides greatest reporting flexibility without need to reload

  • Tip 15: Implement Semantic Partitioning
    - What is it?
      - An architectural design to enable parallel data loading and query execution
      - Partitioning criteria: Year, Region, or Actual/Plan

    Source: SAP

  • Tip 15: Implement Semantic Partitioning (cont.)
    - Benefits of semantic partitioning:
      - Reduction in SAP NetWeaver BWA footprint (when partitioned by year)
      - Parallel data loading (when not partitioned by year)
      - Parallel query execution
        - Best case when the partitioning criterion is set as a constant
        - Almost as good to create variables to filter on 0INFOPROV
      - Archival of a single InfoCube does not impact others
      - Easier DB maintenance
    - Performance benefits are so significant that semantic partitioning should be deployed on virtually every data model!

  • Tip 15: Implement Semantic Partitioning (cont.)
    Example: Semantic partitioning by year (figure; source: SAP)


  • Tip 16: Switch Off SID Determination for DSOs
    - Switch off SID determination for DSOs that are not used in reporting
    - SID determination is required only for reporting DSOs and takes up 40-70% of the activation time

  • Tip 17: Activate Parallel Processing
    - Parallel processing is possible for most steps in SAP NetWeaver BW
    - DTP parallel processing
    - DSO settings: transaction code RSODSO_SETTINGS

  • Tip 18: Compress Data
    - Compression of InfoCubes helps with two things in the dataflow:
      - It makes the tables that are updated smaller and hence faster to update
      - The process variant that drops and recreates indices during loading touches only the indices on the F fact table; because compression moves data out of that table into the E fact table, the index rebuild is much faster
    - Recommendation
      - Compress data that is older than 2-8 days, depending on your load schedule

  • Tip 19: Implement Number Range Buffering of DIMs and SIDs
    - The number range table (NRIV) is accessed for every new distinct record that is loaded to SAP NetWeaver BW, either as master data or as a dimension entry in an InfoCube
    - The NRIV table is accessed with a select for update statement, which can be quite slow
    - Buffering should be done as follows:
      - Determine the large number ranges (document numbers, dimensions with documents or many distinct values)
      - Go to transaction code SNRO and set up buffering


  • Tip 20: Implement SAP NetWeaver BW Accelerator
    - SAP NetWeaver BWA is superior to aggregates when it comes to improving performance
      - Aggregates require continuous tuning as the data and query requirements change over time
      - SAP NetWeaver BWA requires limited maintenance effort in comparison
    - If you can afford it, you should invest in SAP NetWeaver BWA

  • Tip 20: Implement SAP NetWeaver BW Accelerator (cont.)
    Disk speed is growing slower than other hardware components

    Architectural drivers
    - 1990
      - Disk-based data storage
      - Simple consumption of apps (fat client UI, EDI)
      - General-purpose, application-agnostic database
    - 2010
      - In-memory data stores
      - Multi-channel UI, high event volume, cross-industry value chains
      - Application-aware and intelligent data management

    Technology drivers

    Component            1990          2010            Improvement
    Addressable memory   2^16          2^64            2^48x
    Memory               0.02 MB/$     50.15 MB/$      2502x
    CPU                  0.05 MIPS/$   253.31 MIPS/$   5066x
    Disk data transfer   5 MBPS        600 MBPS        120x
    Network speed        100 Mbps      100 Gbps        1000x

    Source: 1990 numbers, SAP AG; 2010 numbers, Dr. Berg

    Physical hard drive speeds grew by only 120 times since 1990. All other hardware components grew faster.

  • Tip 20: Implement SAP NetWeaver BW Accelerator (cont.)
    Real example (source: SAP): the average query execution took 58.8 seconds; after SAP NetWeaver BW Accelerator, the average query took 17.9 seconds (295% faster overall)

  • Tip 20: Implement SAP NetWeaver BW Accelerator (cont.)
    - With SAP NetWeaver BW 7.3, you can have data in SAP NetWeaver BW Accelerator; InfoCubes are not required
      - This saves the loading time to the BW cube star schema
    - You should implement SAP NetWeaver BWA if you want to consistently improve query performance and data load performance


  • Tip 20: Implement SAP NetWeaver BW Accelerator (cont.)
    - SAP NetWeaver BWA is an appliance, but it does require some maintenance activities to keep it running smoothly
    - Monitor SAP NetWeaver BWA utilization to avoid overloading
      - The rule of thumb is that your data should take up less than 50% of the memory size
      - Overloading SAP NetWeaver BWA will cause performance degradation
    - Compress the cubes and rebuild indices on a regular basis
    - SAP NetWeaver BWA is not a cheap toy. The licensing is based on blades used. Avoid using more space than needed by dropping and rebuilding the SAP NetWeaver BWA indices on a regular basis


  • Tip 20: Implement SAP NetWeaver BW Accelerator (cont.)
    - Avoid aggregates, but consider them as a backup for SAP NetWeaver BW Accelerator
    - They come at a cost
      - Additional step in data loading
      - Longer runtime for master data and hierarchy activations
    - Check that the query is using the aggregate via RSRT
    - Aggregate design steps (slide diagram, covering the phases before and after aggregate creation):
      - Gather information about end-user query requirements and drill-down patterns
      - You can suggest aggregates based on query design
      - Execute the query multiple times using realistic drill-down scenarios
      - Allow time for users to execute queries and collect SAP NetWeaver BW statistics
      - You can suggest aggregates based on SAP NetWeaver BW statistics
      - Analyze the use of aggregates
      - Modify aggregates for optimization


  • Resources
    - Joe Darlak of COMERIT, SAP NetWeaver BI and Portals 2010 conference (Orlando, Florida): "Practical Tips to Improve Data Loading Performance and Efficiency in SAP NetWeaver by Up to 75%"
    - Training: BW360, BW Performance and Administration class


  • 7 Key Points to Take Home
    - Use the SAP NetWeaver BW statistics to find data loads that require optimization; target the top 5-10 for optimization every month
    - Use SE30 to analyze ABAP runtime for DataSources and transformations
    - Review and implement the recommended database parameters for SAP NetWeaver BW
    - Ensure that all SQL statements used in the data loading process use indices and that statistics are calculated for the tables
    - Make sure that the ABAP coding used in extraction exits and transformations is optimized
    - Review and optimize the data models to avoid unnecessary processing
    - Use parallel processing during data loading and updates

  • Your Turn!
    How to contact me: Jesper Moselund [email protected]

  • Disclaimer
    SAP, R/3, mySAP, mySAP.com, SAP NetWeaver, Duet, PartnerEdge, and other SAP products and services mentioned herein as well as their respective logos are trademarks or registered trademarks of SAP AG in Germany and in several other countries all over the world. All other product and service names mentioned are the trademarks of their respective companies. Wellesley Information Services is neither owned nor controlled by SAP.
