B307 Mload Part 2

Page 1: B307 Mload Part 2

TCS Internal

Module 7: A MultiLoad Application

After completing this module, you will be able to:

Describe the tables involved in a MultiLoad job.

Set error limits as a count or as a percentage of loaded rows.

Specify a checkpoint interval.

Redefine input record layout.

Page 2: B307 Mload Part 2

New Accounts Application – Description

Input: New Account Information
• Customer details (new or existing)
• Transaction(s): applied to the balance and added to history

MultiLoad loads this input into four tables:

Table              Key columns    Index
Customer           C#             UPI
Accounts           A#             UPI
Trans_Hist         A#, DATE       NUPI
Account_Customer   A#, C#         NUSI, UPI

• Each New Account requires an INSERT into the Accounts table and an INSERT into Account_Customer.

• An Account can be opened for new or pre-existing customer(s) – UPSERT to the Customer table.

• Each New Account will probably require an opening transaction – UPDATE to Account (balance) and INSERT to Trans_History.

Page 3: B307 Mload Part 2

New Accounts Application Script (1 of 3)

.LOGTABLE Newacct_logtable_mld;

.LOGON tdpid/username, password;

.BEGIN IMPORT MLOAD TABLES Accounts, Account_Customer, Customer, Trans_Hist ;

.LAYOUT New_Acc_Layout ;

.FILLER in_Field_Indicator 1 CHAR(1) ;

.FIELD in_Account_Number 2 INTEGER ;

.FIELD in_Number * INTEGER ;

.FIELD in_Street * CHAR(25) ;

.FIELD in_City * CHAR(20) ;

.FIELD in_State * CHAR(2) ;

.FIELD in_Zip_Code * INTEGER ;

.FIELD in_Balance_Forward * INTEGER ;

.FIELD in_Balance_Current * INTEGER ;

.FIELD in_Customer_Number 2 INTEGER ;

.FIELD in_Last_Name * CHAR(25) ;

.FIELD in_First_Name * CHAR(20) ;

.FIELD in_Social_Security * INTEGER ;

.FIELD in_AC_Account_Number 2 INTEGER ;

.FIELD in_AC_Customer_Number * INTEGER ;

.FIELD in_Trans_Number 2 INTEGER ;

.FIELD in_Trans_Account_Number * INTEGER ;

.FIELD in_Trans_ID * INTEGER ;

.FIELD in_Trans_Amount * INTEGER ;

Page 4: B307 Mload Part 2

New Accounts Application Script (2 of 3)

.DML LABEL Label_A ;
INSERT INTO Accounts VALUES
  ( :in_Account_Number, :in_Number, :in_Street, :in_City,
    :in_State, :in_Zip_Code, :in_Balance_Forward, :in_Balance_Current ) ;

.DML LABEL Label_B ;
INSERT INTO Account_Customer VALUES
  ( :in_AC_Account_Number, :in_AC_Customer_Number ) ;

.DML LABEL Label_C
  MARK MISSING UPDATE ROWS
  DO INSERT FOR MISSING UPDATE ROWS ;
UPDATE Customer
  SET Last_Name = :in_Last_Name
  WHERE Customer_Number = :in_Customer_Number ;
INSERT INTO Customer VALUES
  ( :in_Customer_Number, :in_Last_Name, :in_First_Name, :in_Social_Security ) ;

.DML LABEL Label_D ;
INSERT INTO Trans_Hist VALUES
  ( :in_Trans_Number, DATE, :in_Trans_Account_Number, :in_Trans_ID, :in_Trans_Amount ) ;
UPDATE Accounts
  SET Balance_Current = Balance_Current + :in_Trans_Amount
  WHERE Account_Number = :in_Trans_Account_Number ;

.IMPORT INFILE datain4
  LAYOUT New_Acc_Layout
  APPLY Label_A WHERE ( in_Field_Indicator = 'A' )
  APPLY Label_B WHERE ( in_Field_Indicator = 'B' )
  APPLY Label_C WHERE ( in_Field_Indicator = 'C' )
  APPLY Label_D WHERE ( in_Field_Indicator = 'D' ) ;

.END MLOAD ;
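The APPLY ... WHERE clauses route each input record to a DML label according to its first byte. The following Python sketch models only that routing decision; it is not MultiLoad code, and the names APPLY_RULES and route are hypothetical.

```python
# Hypothetical model of the APPLY ... WHERE routing in the .IMPORT command.
# Keys are values of in_Field_Indicator (byte 1 of the record);
# values are the DML labels from the script.
APPLY_RULES = {
    "A": "Label_A",  # INSERT into Accounts
    "B": "Label_B",  # INSERT into Account_Customer
    "C": "Label_C",  # UPSERT into Customer
    "D": "Label_D",  # INSERT into Trans_Hist, UPDATE Accounts
}


def route(record: bytes):
    """Return the label applied to a record, or None if no APPLY clause matches."""
    indicator = record[:1].decode("ascii", errors="replace")
    return APPLY_RULES.get(indicator)


if __name__ == "__main__":
    for rec in (b"A....", b"C....", b"X...."):
        print(rec[:1], "->", route(rec))
```

A record whose indicator matches none of the four WHERE conditions is applied to no label in this model.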

Page 5: B307 Mload Part 2

New Accounts Application Script (3 of 3)

.IF (&SYSINSCNT1 > 0 OR &SYSUPDCNT1 > 0) THEN;
COLLECT STATISTICS ON Accounts;
.ENDIF;

.IF (&SYSINSCNT2 > 0) THEN;
COLLECT STATISTICS ON Account_Customer;
.ENDIF;

.IF (&SYSINSCNT3 > 0 OR &SYSUPDCNT3 > 0) THEN;
COLLECT STATISTICS ON Customer;
.ENDIF;

.IF (&SYSINSCNT4 > 0) THEN;
COLLECT STATISTICS ON Trans_Hist;
.ENDIF;

.LOGOFF ;

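Note: Tables are numbered in the order in which they appear in the .BEGIN statement:

.BEGIN IMPORT MLOAD TABLES Accounts, Account_Customer, Customer, Trans_Hist ;

Accounts is the 1st table, Account_Customer the 2nd, Customer the 3rd, and Trans_Hist the 4th.

MultiLoad environment variables can be checked to optionally COLLECT STATISTICS as part of the job. In each variable name, _ is a table number from 1 to 5:

&SYSDELCNT_
&SYSINSCNT_
&SYSUPDCNT_
&SYSETCNT_
&SYSUVCNT_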
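As a rough illustration of how the numbered variables map to tables, the Python sketch below models the decision made by the .IF blocks. It is a simplification: it checks both the insert and the update count for every table, whereas the script checks only the insert count for the 2nd and 4th tables. The function name and argument layout are hypothetical.

```python
# Table order as given on the .BEGIN IMPORT MLOAD TABLES clause.
TABLES = ["Accounts", "Account_Customer", "Customer", "Trans_Hist"]


def tables_needing_stats(ins_counts, upd_counts):
    """Return the tables that had at least one insert or update.

    ins_counts[i] models &SYSINSCNT<i+1>, upd_counts[i] models &SYSUPDCNT<i+1>.
    """
    return [
        table
        for table, ins, upd in zip(TABLES, ins_counts, upd_counts)
        if ins > 0 or upd > 0
    ]


if __name__ == "__main__":
    print(tables_needing_stats([5, 0, 0, 2], [0, 0, 3, 0]))
```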

Page 6: B307 Mload Part 2

.BEGIN IMPORT Task Commands

• Specifies the tables and optionally the work and error tables used in this MultiLoad job.

• Also used to specify miscellaneous MultiLoad options such as checkpoint, sessions, etc.

.BEGIN [IMPORT] MLOAD

TABLES tname1, tname2, ...

WORKTABLES wt_table1, wt_table2, …

ERRORTABLES et_table1 uv_table1, et_table2 uv_table2, ...

ERRLIMIT errcount [errpercent]

CHECKPOINT rate (Default – 15 min.)

SESSIONS limit (Default – 1 per AMP + 2)

TENACITY hours (Default – 4 hours)

SLEEP minutes (Default – 6 minutes)

AMPCHECK NONE | APPLY | ALL

NOTIFY OFF | LOW | MEDIUM | HIGH . . .

;

.END MLOAD ;

Page 7: B307 Mload Part 2

Work Tables

IMPORT Task: WORKTABLES wt_table1, wt_table2, …

DELETE Task: WORKTABLES wt_table1

• Default is in user’s default database and the work table is named WT_TableName.

• Alternative may be specified as DataBaseName.WorkTableName.

• There must be one work table defined for each data table.

Example:

.BEGIN [IMPORT] MLOAD
  TABLES Employee, PayCheck
  WORKTABLES util_db.WT_Emp
            ,util_db.WT_Pay
  ERRORTABLES util_db.ET_Emp util_db.UV_Emp
             ,util_db.ET_Pay util_db.UV_Pay
  . . . ;

The Error Tables are described on the next page.

Page 8: B307 Mload Part 2

Error Tables

ERRORTABLES et_tab1 uv_tab1, et_tab2 uv_tab2, ...

• Error table 1 (et)
– Default is the user’s database and the table is named ET_Tablename.
– Contains any errors that occur in the Acquisition Phase.
– Contains primary index overflow errors that occur in the Application Phase.

• Error table 2 (uv)
– Default is the user’s database and the table is named UV_Tablename.
– Contains Application Phase errors:
   • Uniqueness violations
   • Constraint errors
   • Overflow errors on columns other than the primary index

Example:

.BEGIN [IMPORT] MLOAD
  TABLES Employee, PayCheck
  WORKTABLES util_db.WT_Emp
            ,util_db.WT_Pay
  ERRORTABLES util_db.ET_Emp util_db.UV_Emp
             ,util_db.ET_Pay util_db.UV_Pay
  . . . ;
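When WORKTABLES and ERRORTABLES are omitted, the default names are built from the target table name with the WT_, ET_ and UV_ prefixes. A small Python sketch of that naming rule follows; the function name and dictionary keys are hypothetical, and the default database is left out.

```python
def default_support_tables(table_name: str) -> dict:
    """Return the default work and error table names for one target table."""
    return {
        "work": f"WT_{table_name}",    # work table
        "error1": f"ET_{table_name}",  # Acquisition Phase errors
        "error2": f"UV_{table_name}",  # Application Phase errors
    }


if __name__ == "__main__":
    for target in ("Employee", "PayCheck"):
        print(target, default_support_tables(target))
```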

Page 9: B307 Mload Part 2

ERRLIMIT

Without ERRPERCENT:

ERRLIMIT ErrCount

• Specifies the approximate number of data errors permitted during Acquisition.

• Does not count uniqueness violations.

With ERRPERCENT:

ERRLIMIT ErrCount ErrPercent

• Specifies a percentage of data errors after an approximate number of records (ErrCount) has been transmitted.

Example:

ERRLIMIT 10000 5

In this example, after processing 10,000 input records, the system looks for an error rate of 5%.
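The two forms of ERRLIMIT can be sketched in Python as follows. This models only the rule as stated on the slide; the function name is hypothetical, and the exact comparison ("greater than" the limit) and the point at which MultiLoad evaluates it are assumptions, since the slide calls both values approximate.

```python
def errlimit_exceeded(errors, records_sent, errcount, errpercent=None):
    """Model of ERRLIMIT errcount [errpercent].

    Without errpercent: the limit is a plain count of data errors.
    With errpercent: once errcount records have been sent, the limit is
    a percentage of the records sent so far.
    """
    if errpercent is None:
        return errors > errcount
    if records_sent < errcount:
        return False  # too early to evaluate the percentage
    return errors * 100 > errpercent * records_sent


if __name__ == "__main__":
    # ERRLIMIT 10000 5 with 600 errors in the first 10,000 records (6%).
    print(errlimit_exceeded(600, 10_000, 10_000, 5))
```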

Page 10: B307 Mload Part 2

CHECKPOINT

• Rate may be specified in the Acquisition Phase of a complex IMPORT task as:

– A number of incoming records (exact count; not less than 60)

– A time interval in minutes (approximate; less than 60)

• If no CHECKPOINT value is specified, MultiLoad checkpoints every 15 minutes (the default) and at the end of each Phase.

Example 1:

CHECKPOINT 30

In this example, a 30-minute time interval is specified.

Example 2:

CHECKPOINT 100000

In this example, a checkpoint every 100,000 input records is specified.
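The rule for reading a CHECKPOINT rate (below 60 is minutes, 60 or more is a record count) can be sketched in Python. The function name and the returned tuple are hypothetical; only positive rates are modeled.

```python
def interpret_checkpoint(rate=None):
    """Return (unit, value) for a CHECKPOINT rate as described on the slide."""
    if rate is None:
        return ("minutes", 15)  # default when CHECKPOINT is not specified
    if rate <= 0:
        raise ValueError("this sketch only models positive rates")
    if rate < 60:
        return ("minutes", rate)  # approximate time interval
    return ("records", rate)      # exact count of incoming records


if __name__ == "__main__":
    for value in (None, 30, 100000):
        print(value, "->", interpret_checkpoint(value))
```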

Page 11: B307 Mload Part 2

More .BEGIN Parameters

SESSIONS

• Used to specify the maximum, and optionally, minimum sessions generated by MultiLoad.

TENACITY

• Number of hours MultiLoad will try to establish a connection to the system.

• The default is 4 hours.

SLEEP

• Number of minutes MultiLoad waits before retrying a logon; must be greater than 0.

• The default is 6 minutes.

NOTIFY

• NOTIFY LOW

• NOTIFY MEDIUM for the most significant events.

• NOTIFY HIGH for every MultiLoad event that involves an operational decision point.

• NOTIFY OFF suppresses the notify option.

Note: The MultiLoad manual specifies in detail which events are associated with each level.

.BEGIN parameters:

SESSIONS 64 48

TENACITY 10

SLEEP 3

NOTIFY OFF
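TENACITY and SLEEP together bound the logon retries. The Python sketch below lists the minutes at which a logon would be attempted, assuming a first attempt at minute 0 and a retry every SLEEP minutes until the TENACITY window ends. That schedule is an assumption made for illustration, and the function name is hypothetical.

```python
def logon_attempt_minutes(tenacity_hours=4, sleep_minutes=6):
    """Return the minutes (from job start) at which a logon is attempted.

    Defaults follow the slide: TENACITY 4 hours, SLEEP 6 minutes.
    """
    if sleep_minutes <= 0:
        raise ValueError("SLEEP must be greater than 0")
    return list(range(0, tenacity_hours * 60, sleep_minutes))


if __name__ == "__main__":
    attempts = logon_attempt_minutes(tenacity_hours=10, sleep_minutes=3)
    print(len(attempts), "attempts, first few at minutes", attempts[:3])
```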

Page 12: B307 Mload Part 2

More .BEGIN Parameters: AMPCHECK

AMPCHECK NONE | APPLY | ALL

• NONE

– MultiLoad will not perform an AMPCHECK.

– It will proceed if AMPs are offline, provided all target tables are FALLBACK.

• APPLY

– MultiLoad will continue in all phases except the Application phase with AMPs offline, provided all target tables are FALLBACK.

– This is the default.

• ALL

– MultiLoad will not proceed with down AMPs, regardless of the protection-type of the target tables.

– Most conservative option.
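The three AMPCHECK settings can be summarized as a small decision function. This Python sketch only restates the rules above; the function name, argument names, and phase labels are hypothetical.

```python
def may_proceed(ampcheck, amps_down, all_fallback, phase):
    """Return True if the job may continue under the given AMPCHECK setting.

    ampcheck:     "NONE", "APPLY" or "ALL"
    amps_down:    True if any AMP is offline
    all_fallback: True if every target table is FALLBACK
    phase:        e.g. "acquisition" or "application"
    """
    if ampcheck not in ("NONE", "APPLY", "ALL"):
        raise ValueError(f"unknown AMPCHECK option: {ampcheck}")
    if not amps_down:
        return True
    if ampcheck == "ALL" or not all_fallback:
        return False
    if ampcheck == "APPLY":
        return phase != "application"
    return True  # NONE with all target tables FALLBACK


if __name__ == "__main__":
    print(may_proceed("APPLY", True, True, "acquisition"))
    print(may_proceed("APPLY", True, True, "application"))
```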

Page 13: B307 Mload Part 2

DELETE Task Commands

• Specifies the table and optionally the work and error tables used in this MultiLoad Delete task.

• Also used to specify miscellaneous MultiLoad options such as tenacity, sleep, etc.

.BEGIN DELETE MLOAD

TABLES tname1

WORKTABLES wt_table1

ERRORTABLES et_table1

TENACITY hours (Default – 4 hours)

SLEEP minutes (Default – 6 minutes)

NOTIFY OFF | LOW | MEDIUM | HIGH . . .

;

.END MLOAD ;

Page 14: B307 Mload Part 2

.LAYOUT Parameters — INDICATORS

INDICATORS

– The indicator bytes contain one bit per field; a bit equal to 1 represents a null data field.

.LAYOUT Record_Layout INDICATORS;

[Diagram: each physical input record begins with the indicator byte(s), followed by fields F1, F2 and F3, as described by the MultiLoad .LAYOUT definition.]
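Reading the indicator bits can be sketched in Python as follows. The sketch assumes one bit per field, taken from the most significant bit of the first indicator byte onward; that bit order is an assumption here, and the function name is hypothetical.

```python
def null_flags(indicator_bytes: bytes, field_count: int):
    """Return one boolean per field: True means the field is null.

    Assumes field i is described by bit i, counting from the most
    significant bit of the first indicator byte.
    """
    flags = []
    for i in range(field_count):
        byte = indicator_bytes[i // 8]
        bit = (byte >> (7 - i % 8)) & 1
        flags.append(bool(bit))
    return flags


if __name__ == "__main__":
    # One indicator byte for fields F1, F2, F3 with only F2 null.
    print(null_flags(bytes([0b01000000]), 3))
```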

Page 15: B307 Mload Part 2

.FIELD and .FILLER

.FIELD fieldname { startpos datadesc | fieldexpr }

[ NULLIF nullexpr ]

[ DROP {LEADING / TRAILING } { BLANKS / NULLS }

[ [ AND ] {TRAILING / LEADING } { NULLS / BLANKS } ] ] ;

.FILLER [ fieldname ] startpos datadesc ;

.FIELD

• Input fields supporting redefinition and concatenation.

startpos identifies the starting position of a field in the record; the first position is 1.

fieldexpr specifies a concatenation of fields in the format:

fieldname1 || fieldname2 [ || fieldname3 …]

The DROP LEADING / TRAILING BLANKS / NULLS option applies only to character datatypes; the resulting field is sent as a VARCHAR with a 2-byte length field.

.FILLER

• Identifies data NOT to be sent to the Teradata database.

Page 16: B307 Mload Part 2

Redefining the Input — Example

• A bank loads daily transactions sequentially on a tape for batch processing by MultiLoad.

• An input data record might be an Add, Update or Delete, each of which has a different length and contains different fields, as illustrated:

Record type               Layout
Add New Account           A  PI  F1  F2  F3
Update Existing Account   U  PI  F4  F5
Delete Inactive Account   D  PI

.LAYOUT Record_Layout ;

.FILLER trans_type 1 CHAR(1) ;

.FIELD PI 2 INTEGER ;

.FIELD F1 * INTEGER ;

.FIELD F2 * CHAR(25) ;

.FIELD F3 * CHAR(20) ;

.FIELD F4 6 CHAR(2) ;

.FIELD F5 * INTEGER ;

Note: FILLER data is not placed into ML work tables.
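To see how F4 redefines the bytes already described by F1, the start positions in the layout above can be resolved with a short Python sketch. It assumes that INTEGER occupies 4 bytes, that CHAR(n) occupies n bytes, and that * means "the position after the previous field". The helper names are hypothetical.

```python
def size_of(datadesc: str) -> int:
    """Return the byte length of a field description (INTEGER or CHAR(n) only)."""
    if datadesc.startswith("CHAR(") and datadesc.endswith(")"):
        return int(datadesc[5:-1])
    if datadesc == "INTEGER":
        return 4
    raise ValueError(f"unsupported datadesc: {datadesc}")


# (name, startpos, datadesc) taken from the .LAYOUT Record_Layout example.
LAYOUT = [
    ("trans_type", 1, "CHAR(1)"),
    ("PI", 2, "INTEGER"),
    ("F1", "*", "INTEGER"),
    ("F2", "*", "CHAR(25)"),
    ("F3", "*", "CHAR(20)"),
    ("F4", 6, "CHAR(2)"),   # redefines the bytes that start at position 6
    ("F5", "*", "INTEGER"),
]


def resolve(layout):
    """Return {name: (first_byte, last_byte)} using 1-based positions."""
    spans = {}
    next_pos = 1
    for name, start, datadesc in layout:
        pos = next_pos if start == "*" else start
        length = size_of(datadesc)
        spans[name] = (pos, pos + length - 1)
        next_pos = pos + length
    return spans


if __name__ == "__main__":
    for name, span in resolve(LAYOUT).items():
        print(name, span)
```

In this model F1 and F4 both start at position 6, which is what lets one layout describe both the Add and the Update record.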

Page 17: B307 Mload Part 2

The .DML Command Options

Defines Labels along with Error Treatment conditions for one or more following INSERTs, UPDATEs or DELETEs to be applied under various conditions:

.DML LABEL Labelname
  [ { MARK | IGNORE } DUPLICATE [ INSERT | UPDATE ] ROWS ]
  [ { MARK | IGNORE } MISSING [ UPDATE | DELETE ] ROWS ]
  [ DO INSERT FOR [ MISSING UPDATE ] ROWS ] ;

Operation                       Default
INSERT (Duplicate violation)    Marked in UV_tablename
UPDATE (Duplicate violation)    Marked in UV_tablename
UPDATE (Fails - missing row)    Marked in UV_tablename
DELETE (Fails - missing row)    Marked in UV_tablename
UPSERT (If successful)          Ignored
UPSERT (Fails)                  Mark failure of INSERT in UV_tablename

Example of UPSERT failure:
1. PI value doesn’t exist, so UPDATE can’t occur.
2. INSERT fails because of check violation - e.g., can’t put character data in a numeric field.

MARK or IGNORE: Whether or not to record duplicate or missing INSERT, UPDATE, or DELETE rows in the UV error table and continue processing.

Page 18: B307 Mload Part 2

The .DML Command Options (cont.)

.DML LABEL Labelname
  [ { MARK | IGNORE } DUPLICATE [ INSERT | UPDATE ] ROWS ]
  [ { MARK | IGNORE } MISSING [ UPDATE | DELETE ] ROWS ]
  [ DO INSERT FOR [ MISSING UPDATE ] ROWS ] ;

DO INSERT FOR MISSING UPDATE: Key statement that indicates an UPSERT. An SQL UPDATE followed by an SQL INSERT is required.

Example 1:

.DML LABEL Action1 DO INSERT FOR MISSING UPDATE ROWS;

The default for an UPSERT operation is to not mark missing update rows.

Example 2:

.DML LABEL Action2 MARK MISSING UPDATE ROWS DO INSERT FOR MISSING UPDATE ROWS;

When MARK MISSING UPDATE ROWS is used with an UPSERT, data rows that cannot be updated (no row with that PI value exists) are placed in the UV_table. If the insert also fails, the insert record is also marked in the UV_table.
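The difference between the two examples can be modeled in Python. This is a sketch of the marking rules only, using a dictionary as the target table and a list as the UV table; the function name, the "pi" key, and the insert_fails switch are hypothetical stand-ins.

```python
def upsert(table, uv_table, row, mark_missing_update=False, insert_fails=False):
    """Model of DO INSERT FOR MISSING UPDATE ROWS.

    table:    dict keyed by primary index value
    uv_table: list that collects marked rows
    """
    key = row["pi"]
    if key in table:
        table[key] = row  # UPDATE succeeds, nothing is marked
        return "updated"
    if mark_missing_update:
        # MARK MISSING UPDATE ROWS: record the row that could not be updated.
        uv_table.append(("missing update", row))
    if insert_fails:
        # A failed INSERT is marked in the UV table in either case.
        uv_table.append(("insert failed", row))
        return "failed"
    table[key] = row
    return "inserted"


if __name__ == "__main__":
    customers, uv = {}, []
    print(upsert(customers, uv, {"pi": 1, "name": "Smith"}))
    print(upsert(customers, uv, {"pi": 1, "name": "Smythe"}))
    print(upsert(customers, uv, {"pi": 2, "name": "Jones"}, mark_missing_update=True))
    print(uv)
```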

Page 19: B307 Mload Part 2

Summary

• On the .BEGIN statement, optionally, the names of work and error tables can be specified.

• You can:

– Specify error limits and checkpoints.

– Limit sessions.

– Designate time allowed for connection.

– Specify retry time limit.

– Designate the level of notification you prefer.

– Designate how MultiLoad will proceed if AMPs are offline.

• .LAYOUT parameters define the data format.

• .DML commands define Labels and Error Treatment conditions for one or more operations.

• You can use FastLoad or MultiLoad INMODs.

Page 20: B307 Mload Part 2

Review Questions

1. Complete the BEGIN statement to accomplish the following:

– Specify an error limit count of 200,000 and an error percentage of 5%.

– Specify a checkpoint at 50,000 records.

– Limit the sessions to 4.

– Set the number of hours to try to establish connection as 6.

.LOGTABLE RestartLog_mld;

.LOGON ________________;

.BEGIN [IMPORT] MLOAD TABLES Trans_Hist
  ERRLIMIT 200000 5
  CHECKPOINT 50000
  SESSIONS 4
  TENACITY 6
;

.END MLOAD ;
