Teradata Bteq,Mload,Fload,Fexport,Tpump and Sampling

Teradata Facilitator Kunal Agarwal

description

About utilities in Teradata



Teradata BTEQ

BTEQ stands for Basic Teradata Query.

It is a general-purpose, command-based program that allows users on a workstation to communicate with one or more Teradata RDBMS systems.

Allows users to format reports for both print and screen output.

BTEQ formats the results and returns them to the screen, a file, or a designated printer.

Can be run in both interactive and batch mode.


Frequently used BTEQ commands

LOGON - start a BTEQ session

SESSIONS - specify the number of sessions to use

LOGOFF - end the current sessions without exiting BTEQ

EXIT or QUIT - end the current sessions and exit BTEQ

OS - execute an MS-DOS, PC-DOS, or UNIX command from within the BTEQ environment

RUN - execute Teradata SQL requests and BTEQ commands from a specified run file

EXPORT - open a file with a specific format to transfer information from the Teradata RDBMS

IMPORT - open a file with a specific format to transfer information to the Teradata RDBMS


Testing & Branching commands

IF...THEN... - Tests the condition stated in the IF clause, then resumes command execution based on the outcome of the test.

GOTO - skips all intervening commands and SQL statements until the specified label is reached.

LABEL - marks the destination of the branch operation.

.GOTO labelname

.LABEL labelname

Testing status values

ERRORCODE - Indicates the actual completion code associated with the request.

ERRORLEVEL - Indicates the severity level associated with an error code.

ACTIVITYCOUNT - Indicates the actual number of rows affected by the request.


Testing & Branching

Testing conditions

= equal to

<> , != , ~= , or ^= not equal to

> greater than

>= greater than or equal to

< less than

<= less than or equal to

Example

.IF ACTIVITYCOUNT > 0 THEN .GOTO label1

.IF ERRORCODE <> 0 THEN .GOTO label2
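The way these tests pick a branch can be modelled outside BTEQ. The following is a minimal Python sketch, not BTEQ itself: the function name `branch_target`, the simplified command pattern, and the `status` dictionary are all assumptions made for illustration. It only covers the single form `.IF <status> <operator> <integer> THEN .GOTO <label>` with the operators listed above.

```python
import operator
import re

# Comparison operators from the slide, mapped to Python equivalents.
OPERATORS = {
    "=": operator.eq,
    "<>": operator.ne, "!=": operator.ne, "~=": operator.ne, "^=": operator.ne,
    ">": operator.gt, ">=": operator.ge,
    "<": operator.lt, "<=": operator.le,
}

# Hypothetical, simplified pattern (an assumption, not the full BTEQ grammar):
#   .IF <status> <op> <integer> THEN .GOTO <label>
IF_GOTO = re.compile(
    r"^\.IF\s+(\w+)\s*(<>|!=|~=|\^=|>=|<=|=|>|<)\s*(\d+)\s+THEN\s+\.GOTO\s+(\w+)\s*;?$",
    re.IGNORECASE,
)

def branch_target(line, status):
    """Return the label to jump to, or None if the condition is false."""
    match = IF_GOTO.match(line.strip())
    if match is None:
        raise ValueError(f"unsupported command: {line!r}")
    name, op, value, label = match.groups()
    if OPERATORS[op](status[name.upper()], int(value)):
        return label
    return None

# Status values as they might look after a successful 3-row SELECT.
status = {"ACTIVITYCOUNT": 3, "ERRORCODE": 0}
print(branch_target(".IF ACTIVITYCOUNT > 0 THEN .GOTO label1", status))
print(branch_target(".IF ERRORCODE <> 0 THEN .GOTO label2", status))
```

With the status values shown, the first test is true and yields `label1`; the second is false, so execution would simply continue with the next command.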


Termination Error codes

When a job terminates, the utility returns a completion code to the client system:

• 00 = Normal completion

• 04 = Warning

• 08 = User error

• 12 = Severe internal error

• 16 = No message destination available
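A calling script usually has to turn the completion code into a pass/fail decision. The sketch below shows one way to do that in Python. The function names are hypothetical, and treating a warning (04) as acceptable by default is an assumption that a given site may not share.

```python
# Completion codes and their meanings, as listed on the slide.
COMPLETION_CODES = {
    0: "Normal completion",
    4: "Warning",
    8: "User error",
    12: "Severe internal error",
    16: "No message destination available",
}

def describe_completion(code):
    """Return the slide's description for a completion code."""
    return COMPLETION_CODES.get(code, f"Unrecognized completion code {code}")

def job_failed(code, tolerate_warnings=True):
    """Decide whether a job should be treated as failed.

    Assumption: warnings (04) are tolerated unless the caller says otherwise.
    """
    threshold = 4 if tolerate_warnings else 0
    return code > threshold

for code in (0, 4, 8):
    print(code, describe_completion(code), "failed" if job_failed(code) else "ok")
```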


Case Sampling


Sampling

Select employee_no

From employee_table

Sample 5;

Or use Sample .5 to return 50 percent of the rows

Multiple samples

Select employee_no, SAMPLEID

From employee_table

Sample 2,2,2;


SELECT Student_ID ,Course_ID ,SAMPLEID FROM student_course_table SAMPLE 5, 5, 5 ORDER BY 3, 1, 2 ;
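What a multi-sample request returns can be imitated in a few lines of Python. This is only a model of the behaviour, assuming the default of sampling without replacement so that the samples do not share rows, and assuming SAMPLEID numbers the samples from 1 in the order they are requested. The function name `sample_rows` and the `seed` parameter are hypothetical.

```python
import random

def sample_rows(rows, sizes, seed=None):
    """Draw several mutually exclusive samples and tag each row with a sample id.

    Models `SAMPLE n1, n2, ...` under the assumptions stated in the lead-in.
    Returns a list of (row, sampleid) pairs.
    """
    rng = random.Random(seed)
    total = sum(sizes)
    # Draw all rows at once without replacement, then cut them into samples.
    picked = rng.sample(rows, min(total, len(rows)))
    result = []
    start = 0
    for sample_id, size in enumerate(sizes, start=1):
        for row in picked[start:start + size]:
            result.append((row, sample_id))
        start += size
    return result

employees = list(range(1001, 1021))          # 20 made-up employee numbers
for employee_no, sampleid in sample_rows(employees, [2, 2, 2], seed=7):
    print(employee_no, sampleid)
```

Each run prints six distinct employee numbers, two per sample id.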


Select employeeid,departmentno,sampleid

From employee2

SAMPLE

WHEN departmentno=100 then .25

WHEN departmentno=200 then .25

WHEN departmentno=300 then .25

WHEN departmentno=400 then .25

END ;
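The WHEN ... THEN form samples each group separately (stratified sampling). The Python sketch below imitates that idea on made-up rows. It is an approximation: rounding each stratum down to a whole number of rows and numbering the sample ids by WHEN clause are assumptions, and `stratified_sample` is a hypothetical helper. Rows that match no WHEN clause are left out.

```python
import random

def stratified_sample(rows, key, fractions, seed=None):
    """Sample a fraction of the rows within each stratum.

    `fractions` maps a stratum value (e.g. a department number) to the
    fraction of that stratum to return, in WHEN-clause order.
    """
    rng = random.Random(seed)
    result = []
    for sample_id, (value, fraction) in enumerate(fractions.items(), start=1):
        stratum = [row for row in rows if row[key] == value]
        count = int(len(stratum) * fraction)   # assumption: round down
        for row in rng.sample(stratum, count):
            result.append({**row, "sampleid": sample_id})
    return result

# Made-up data: 8 employees in each of five departments.
employee2 = [
    {"employeeid": dept + i, "departmentno": dept}
    for dept in (100, 200, 300, 400, 500)
    for i in range(8)
]
fractions = {100: 0.25, 200: 0.25, 300: 0.25, 400: 0.25}
for row in stratified_sample(employee2, "departmentno", fractions, seed=3):
    print(row["employeeid"], row["departmentno"], row["sampleid"])
```

With 8 rows per department and a fraction of .25, the sketch returns two rows from each of the four listed departments and none from department 500.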


Multiload


Multi Load

MultiLoad is a command-driven utility

Can be used to do fast, high-volume maintenance on multiple tables and views

Each MultiLoad import task can do multiple data insert, update, and delete functions on up to five different tables or views.

Each MultiLoad delete task can remove large numbers of rows from a single table.

Can be run in interactive or batch mode


Multi Load – What it does…

Five different phases in a MultiLoad task

Preliminary phase

- Parses and validates all of the MultiLoad commands and Teradata SQL statements in your MultiLoad job.

- Establishes sessions and process control with the Teradata RDBMS.

- Submits special Teradata SQL requests to the Teradata RDBMS.

- Creates and protects temporary work tables and error tables in the Teradata RDBMS.

DML Transaction phase

- Submits the DML statements specifying the insert, update and delete tasks to the Teradata RDBMS.

Acquisition phase (for an import task)

- Imports data from the specified input data source.

- Evaluates each record according to specified application conditions.

- Loads the selected records into the work tables in the Teradata RDBMS.

(There is no acquisition phase activity for a MultiLoad delete task.)


Multi Load – What it does…

Application phase

- Acquires locks on the specified target tables and views in the Teradata RDBMS.

- For an import task, inserts the data from the temporary work tables into the target tables or views in the Teradata RDBMS.

- For a delete task, deletes the specified rows from the target table in the Teradata RDBMS.

- Updates the error tables associated with each MultiLoad task.

Cleanup phase

- Forces an automatic restart/rebuild if an AMP went down and came back online during the application phase.

- Releases all locks on the target tables and views.

- Drops the temporary work tables and all empty error tables from the Teradata RDBMS.

- Reports the transaction statistics associated with the import and delete tasks.
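The phase list can be written down as data, which makes the one difference between the two task types explicit. This is an illustrative Python sketch only; the function name `active_phases` is hypothetical and nothing here talks to MultiLoad.

```python
# Phase names follow the slides; the function itself is illustrative only.
PHASES = ["Preliminary", "DML Transaction", "Acquisition", "Application", "Cleanup"]

def active_phases(task):
    """Return the phases that do work for an 'import' or 'delete' task."""
    if task not in ("import", "delete"):
        raise ValueError(f"unknown task type: {task!r}")
    if task == "delete":
        # The slides state there is no acquisition activity for a delete task.
        return [phase for phase in PHASES if phase != "Acquisition"]
    return list(PHASES)

print(active_phases("import"))
print(active_phases("delete"))
```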


Frequently used MultiLoad Commands

Support commands

ACCEPT - Allows the value of one or more utility variables to be accepted from either a file or an environment variable.

LOGOFF - Disconnects all active sessions and terminates MultiLoad on the client system.

LOGON - Specifies the LOGON command string to be used in connecting all sessions established by MultiLoad.

LOGTABLE - Identifies the table to be used to journal checkpoint information required for safe, automatic restart of MultiLoad when the client or Teradata RDBMS system fails.

RUN FILE - Invokes the specified external source as the current source of commands and statements.

SYSTEM - Suspends operation of MultiLoad and executes any valid local operating system command.

ROUTE MESSAGES - Identifies the destination of output produced by the MultiLoad support environment.


Frequently used MultiLoad Commands

Task commands

BEGIN MLOAD / BEGIN DELETE MLOAD - Specifies:

• The kind of MultiLoad task to be executed

• The target tables in the Teradata RDBMS

• The parameters for executing the task

DML LABEL - Defines a label and error treatment options for a following group of DML statements.

END MLOAD - Indicates completion of MultiLoad command entries and initiates execution of the task.

FIELD - Used with the LAYOUT command to define a field of the data source record that is sent to the Teradata RDBMS.

IMPORT - Identifies the data source, the layout used to describe the data record and optional conditions for performing DML operations.

LAYOUT - Introduces the record format of the data source to be used in the MultiLoad task. This command is followed by a sequence or combination of FIELD, FILLER, and TABLE commands.

TABLE - Used with the LAYOUT command to identify a table whose column names and data descriptions are used as the field names and data descriptions of the data source records.


Supported SQL statements in MultiLoad

Some of the supported SQL statements are:

ALTER TABLE - Change the column configuration or options of an existing table.

COLLECT STATISTICS - Collect statistical data for one or more columns of a table.

CREATE DATABASE, CREATE MACRO, CREATE TABLE, CREATE VIEW - Create a new database, macro, table or view.

DATABASE - Specify a new default database for the current session.

DELETE - Remove rows from a table.

UPDATE - Change the column values of an existing row in a table.


Invoking MultiLoad

mload </home/mluser/tests/test1 >/home/mluser/tests/out1

This command specifies both an input file and an output file:

• /home/mluser/tests/test1 is the input file that provides the MultiLoad job script.

• /home/mluser/tests/out1 is the destination file for output data.

mload </home/mluser/tests/test1

This command specifies only an input file. In this case, the output is written to the standard output device.

mload

This command specifies neither an input nor an output device. In this case, the terminal provides both the command input and the output data destination.
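The same redirections can be set up from a scheduler or wrapper program. Below is a minimal Python sketch using the standard `subprocess` module. The helper name `run_job` is hypothetical, and it assumes the utility is given as an argument list such as `["mload"]` and can be found on the PATH. It returns the utility's completion code.

```python
import subprocess

def run_job(command, script_path, output_path=None):
    """Run a utility with the job script on stdin, like `cmd <script >output`.

    If output_path is None, output goes to the caller's standard output,
    which mirrors the input-file-only form of the command.
    """
    with open(script_path, "rb") as script:
        if output_path is None:
            return subprocess.run(command, stdin=script).returncode
        with open(output_path, "wb") as out:
            return subprocess.run(command, stdin=script, stdout=out).returncode

# Usage (not executed here; the paths are the ones from the slide):
# code = run_job(["mload"], "/home/mluser/tests/test1", "/home/mluser/tests/out1")
```

Passing the command as a list avoids shell quoting issues, at the cost of not supporting shell syntax inside the command itself.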


Fastload


Fast Load

Command-driven utility used for loading large amounts of data into Teradata tables

Uses multiple sessions to load data

Only one table can be loaded per job

Target table must be empty and have no secondary indexes

Does not load duplicate records even if the target is a multiset table.

Can be invoked in batch or interactive mode

Can be run on both network-attached and channel-attached systems


Fast Load – what it does…

Logs on to the Teradata RDBMS for a specified number of sessions, using the username, password and tdpid/acctid information.

Loads the input data into the FastLoad table on the Teradata RDBMS.

Logs off from the Teradata RDBMS.

If the load operation was successful, returns the following information about the FastLoad operation and then terminates:

- Total number of records read, skipped and sent to the Teradata RDBMS

- Number of errors posted to the FastLoad error tables

- Number of inserts applied

- Number of duplicate rows


Fast Load – Input data formats

Formatted data - conforms to the format of data from a Teradata RDBMS source, such as a BTEQ export file.

Each record has:

- A 2-byte data length field

- A variable-length indicator field (optional)

- A variable-length input data field

- A 1-byte end-of-record delimiter

Unformatted data – does not conform to Teradata source format

Each record has:

- A variable-length indicator field (optional)

- A variable-length input data field

- No end-of-record delimiter


Fast Load – Input data formats

Binary format – similar to Formatted data format except that no end-of-record delimiter is present

Each record has:

- A 2-byte data length field

- A variable-length indicator field (optional)

- A variable-length input data field
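The length-prefixed layouts can be made concrete with a small Python sketch that packs and unpacks such records. Several details are assumptions the slides do not state: that the 2-byte length counts the indicator and data bytes but not itself or the delimiter, that it is stored little-endian, and that the end-of-record byte is a newline. The function names are hypothetical.

```python
import struct

def pack_record(data, indicators=b"", end_of_record=b"\n", byteorder="<"):
    """Build one record: 2-byte length, optional indicators, data, delimiter.

    Pass end_of_record=b"" for the binary layout, which has no delimiter.
    Assumption: the length covers the indicator and data bytes only.
    """
    body = indicators + data
    return struct.pack(byteorder + "H", len(body)) + body + end_of_record

def unpack_records(buffer, has_end_of_record=True, byteorder="<"):
    """Split a buffer of length-prefixed records back into their bodies."""
    records = []
    offset = 0
    while offset < len(buffer):
        (length,) = struct.unpack_from(byteorder + "H", buffer, offset)
        offset += 2
        records.append(buffer[offset:offset + length])
        offset += length
        if has_end_of_record:
            offset += 1          # skip the 1-byte end-of-record delimiter
    return records

formatted = pack_record(b"abc") + pack_record(b"de")
print(formatted)
print(unpack_records(formatted))
```

The length prefix is what lets a reader find record boundaries without scanning for a delimiter, which is why the binary layout can omit the delimiter entirely.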

Text format – similar to fixed width files

Variable-length text – similar to variable length delimited text files

The delimiter is specified using the SET RECORD command in the FastLoad script


Frequently used FastLoad Commands

Session control commands:

LOGOFF/QUIT - Ends FastLoad sessions and terminates FastLoad.

LOGON - Begins one or more FastLoad sessions.

OS - Enters client operating system commands.

SESSIONS - Specifies the number of FastLoad sessions logged on with a LOGON command and, optionally, the minimum number of sessions required to run the job.

SHOW - Shows the current field/file definitions established by DEFINE commands.

SLEEP - Specifies the number of minutes that FastLoad pauses before retrying a logon operation.

TENACITY - Specifies the number of hours that FastLoad continues trying to log on when the maximum number of load jobs is already running.


Frequently used FastLoad Commands

Data handling commands:

BEGIN LOADING - Identifies the tables used in the FastLoad operation.

DEFINE - Describes each field of an input data source record and specifies the name of the input data source.

END LOADING - Informs the Teradata RDBMS that all input data has been sent.

ERRLIMIT - Limits the number of errors detected during the loading phase. Processing stops when the limit is reached.

RECORD - Specifies the number of a record in an input data source at which FastLoad begins to read data.

SET RECORD - Specifies that the input data records are either:

• Formatted

• Unformatted

• Binary

• Text

• Variable-length text

Note: The SET RECORD command applies only to network-attached systems.


FastLoad – supported SQL statements

Fastload supports only the following SQL statements:

CREATE TABLE - Defines the columns, index and other qualities of a table.

DATABASE - Changes the default database.

DELETE - Deletes rows from a table.

DROP TABLE - Removes a table and all of its rows from a database.

INSERT - Inserts rows into a table.


Invoking Fastload

Can be invoked in either interactive mode or batch mode

• fastload </home/fluser/tests/test1 >/home/fluser/tests/out1

This command specifies both an input file and an output file:

• /home/fluser/tests/test1 is the input file (the FastLoad job script).

• /home/fluser/tests/out1 is the destination file for output data.

• fastload </home/fluser/tests/test1

This command specifies only an input file. In this case, the output is written to the standard output device.

• fastload

This command specifies neither an input nor an output device (interactive mode).


Terminating Fastload

Normal termination

.Logoff/.Quit command – terminates and logs off the fastload session

– If given before END LOADING, the fastload is paused, which is useful in multi-file fastload

Return Status:

0 – job completes normally

4 – warning condition occurred

8 – user error occurred

12 – fatal error occurred

16 – no message destination is available

The 2 error tables are either dropped or maintained based on the return code

Abort termination

Network attached system – press ctrl+c thrice

Channel attached system – abort using the client system console


Paused Fastload…

The fastload may be paused due to error conditions or for running multifile fastload jobs.

When a fastload is paused, the error tables are not dropped and the target table is locked.

For multifile fastload, no end loading will be specified.

In multifile fastload job, the target table will be in locked state till the final load script with the end loading command is run.

To release the target table lock, run a fastload script that names the same error tables and contains only the BEGIN LOADING and END LOADING statements.


FastExport


Fast Export

FastExport is a command-driven utility that uses multiple sessions to quickly transfer large amounts of data from tables and views of the Teradata RDBMS to a client-based application

Can be invoked in batch or interactive mode

Can be run on both network-attached and channel-attached systems


Fast Export – What it does…

Logs on to the Teradata RDBMS for a specified number of sessions, using the username, password and tdpid/acctid information.

Retrieves the specified data from the Teradata RDBMS, in accordance with the format and selection specifications.

Exports the data to the specified file.

Logs off the Teradata RDBMS.

Can generate a MultiLoad script if mlscript option is specified.


FastExport Commands


FastExport – Supported Teradata SQL Statements


Invoking FastExport

fexp < /home/fexpuser/tests/test1 > /home/fexpuser/tests/out1

This command specifies both an input file and an output file:

/home/fexpuser/tests/test1 is the input file that provides the FastExport job script.

/home/fexpuser/tests/out1 is the destination file for output data.

fexp < /home/fexpuser/tests/test1

This command specifies only an input file. In this case, the output is written to the standard output device, which is usually your terminal.

fexp

This command specifies neither an input nor an output file. In this case, the terminal provides both the command input and the output data destination.


Terminating FastExport

Normal termination

.Logoff command – terminates and logs off the FastExport session

Return Status:

0 – job completes normally

4 – warning condition occurred

8 – user error occurred

12 – fatal error occurred

16 – no message destination is available

The restart log tables are either dropped or maintained based on the return code

Abort termination

Network attached system – press ctrl+c thrice

Channel attached system – abort using the client system console


Restarting paused FastExport job

Restart log tables are not dropped

Pausing can be intentional or due to error

Unintentional conditions that can pause the job can be due to:

- FastExport script error

- Hardware failure

Restarting the paused FastExport job

- Resubmit the job after correcting the cause of the error

- On restarting, the job continues from the last checkpoint saved.
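Resuming from a saved checkpoint can be shown with a generic Python model. This is not how FastExport stores its checkpoints, which the slides do not describe. The dictionary `log` stands in for the restart log table, `fail_at` simulates an error, and all names and the checkpoint interval are assumptions.

```python
def run_export(rows, output, log, checkpoint_every=2, fail_at=None):
    """Copy rows to output, recording progress in a restart log.

    On restart, work resumes from the last checkpoint saved in `log`.
    """
    start = log.get("rows_done", 0)
    del output[start:]            # discard anything written after the checkpoint
    for index in range(start, len(rows)):
        if index == fail_at:
            raise RuntimeError("simulated failure")
        output.append(rows[index])
        if (index + 1) % checkpoint_every == 0:
            log["rows_done"] = index + 1      # save a checkpoint
    log.clear()                   # normal completion: the restart log is dropped

rows = list("abcdefg")
output, log = [], {}
try:
    run_export(rows, output, log, fail_at=5)  # first attempt stops part-way
except RuntimeError:
    print("paused, restart log:", log)
run_export(rows, output, log)                 # resubmitted job resumes
print(output, log)
```

In this model the first attempt leaves the log at the last saved checkpoint, and the resubmitted run finishes the remaining rows and then empties the log, in the same way that restart log tables are kept after a pause and dropped after normal completion.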


TPump


TPump

TPump is a data loading utility that helps you maintain (update, delete, insert) the data

Used if the system is too busy to devote a designated batch window to upload data

Provides an alternative to MultiLoad for the low volume batch maintenance of large databases

Command driven

Complements Multiload

Multiload – bulk load entire data in a single batch

TPump – load the data in a stream throughout the day


TPump – real-time usage

To load near-real-time data into multiple tables.

For Cloning different operations on Target tables at the same time by different users.

To load into tables where the traffic varies over the day (for example, high in the morning and low in the evening).


Difference between Tpump and Mload

TPump loads a maximum of 64 tables; Mload loads 5.

TPump takes no table-level lock; it uses row-level locks instead.

TPump creates macros to perform the DML operations; Mload runs in phases.

TPump handles smaller volumes of data; Mload handles larger volumes.

TPump processes data packet by packet, whereas Mload processes it block by block.

TPump has only one error table; Mload has 2 error tables.


TPump Commands

ACCEPT - Allows the value of one or more utility variables to be accepted from a file.

LOGOFF - Disconnects all active sessions and terminates TPump support on the client.

LOGON - Specifies the LOGON string to be used in connecting all sessions established by TPump.

LOGTABLE - Identifies the table to be used for journaling checkpoint information required for safe, automatic restart of the TPump support environment.

RUN FILE - Invokes the specified external source as the current source of commands and statements.

SET - Assigns a data type and a value to a utility variable.

SYSTEM - Suspends TPump to issue commands to the local OS.


TPump Commands

BEGIN LOAD - Specifies the kind of TPump task to be executed, the target tables and the parameters for executing the task.

FIELD - Defines a field of the data source record. Used with the LAYOUT command.

DML - Defines a label and error treatment option(s) for the DML statements that follow.

END LOAD - Indicates completion of TPump command entries and initiates execution of the task.

IMPORT - Identifies the data source, the layout, and the DML operation(s) to be performed.

LAYOUT - Introduces the record format of the data source to be used in the TPump task. This command is followed by a sequence or combination of FIELD, FILLER, and TABLE commands.


TPump –Supported SQLs

ALTER TABLE

CREATE DATABASE

CREATE MACRO

CREATE TABLE

CREATE VIEW

DATABASE

DELETE

DELETE DATABASE

DROP DATABASE

INSERT

RENAME


Invoking TPump

tpump </home/mluser/tests/test1 >/home/mluser/tests/out1

This command specifies both an input file and an output file:

• /home/mluser/tests/test1 is the input file that provides the TPump job script.

• /home/mluser/tests/out1 is the destination file for output data.

tpump </home/mluser/tests/test1

This command specifies only an input file. In this case, the output is written to the standard output device.