Day 1 Data Stage Administrator And Director 11.0

Page 1: Day 1 Data Stage Administrator And Director 11.0

C3: Protected

DataStage Administrator and Director

Basic

Page 2: Day 1 Data Stage Administrator And Director 11.0

©Copyright 2005, Cognizant Academy, All Rights Reserved

About the Author

Created By: Mandhagini P.S (127057)

Credential Information:

An expert in DataStage having 3 years of IT experience

Version and Date:

DS/PPT/1106/1.0

Page 3: Day 1 Data Stage Administrator And Director 11.0

Icons Used

• Questions

• A Welcome Break

• Coding Standards

• Demo

• Key Contacts

• Reference

• Test Your Understanding

• Hands-on Exercise

Page 4: Day 1 Data Stage Administrator And Director 11.0

DataStage Administrator and Director: Overview

Introduction:

DataStage is a widely used data warehousing (DW) tool for developing complex ETL jobs. It supports real-time integration and provides a user-friendly interface. DataStage also has many features that simplify back-end queries.

DataStage Administrator lets you set up DataStage projects and perform general administration of DataStage.

DataStage Director lets you monitor, schedule, and run jobs, and view the job log after a job has run.

Page 5: Day 1 Data Stage Administrator And Director 11.0

DataStage Administrator and Director: Objectives

Objective: After completing this chapter, you will be able to:

Identify what the DataStage tool is

Define DataStage Administrator

Work with DataStage Administrator

Explain DataStage Director

Work with DataStage Director

Page 6: Day 1 Data Stage Administrator And Director 11.0

DataStage Administrator: Logging In

• Logging in to a DataStage server using the Administrator requires the host name of the server (the fully qualified name, if necessary) or the server’s IP address, and an operating system username and password.

• For UNIX servers, users logging in as root, as a root-equivalent account, or as dsadm have full administrative rights.

• For Windows servers, users who are members of the Local Administrators group (standalone server) or the Domain Administrators group (domain controller or servers in an Active Directory forest) have full administrative rights.

Page 7: Day 1 Data Stage Administrator And Director 11.0

DataStage Administrator: Logging In (Contd.)

The Administrator Login Dialog Box

Enter the hostname or IP address of the server where DataStage is installed

Enter your operating system username and password

Page 8: Day 1 Data Stage Administrator And Director 11.0

Viewing the Project List

• The Projects page lists the DataStage projects and shows the pathname of the selected project in the Project pathname field. The Projects page has the following buttons:

– Add: Adds new DataStage projects. This button is enabled only if you have administrator status.

– Delete: Deletes projects. This button is enabled only if you have administrator status.

– Properties: Views or sets the properties of the selected project.

– NLS: Lets you change project maps and locales (if the NLS option was installed during the server installation).

– Command: Issues DataStage Engine commands directly from the selected project.

Page 9: Day 1 Data Stage Administrator And Director 11.0

Adding Projects

• Provided that you have the proper permissions, you can add as many projects to the DataStage server as necessary.

• In normal projects, any DataStage developer can create, delete, or modify any object within the project once it has been created.

Tip: The default directory in which projects are created is under the root directory of the DataStage server installation. For example, if the server was installed to /appl/Ascential/DataStage, the projects would be installed to /appl/Ascential/DataStage/Projects/{project name}.
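
The default project location described in the tip on this slide can be sketched in a few lines of Python. This is only an illustration of the path rule, not something DataStage itself exposes: the function name default_project_path and the project name dev_sales are hypothetical.

```python
import posixpath


def default_project_path(install_root: str, project_name: str) -> str:
    """Return <install_root>/Projects/<project_name> (UNIX-style path)."""
    # Hypothetical sanity check: a project name is a single path component.
    if not project_name or "/" in project_name:
        raise ValueError("project name must be a single, non-empty path component")
    return posixpath.join(install_root, "Projects", project_name)


# Install root taken from the tip; the project name is made up.
print(default_project_path("/appl/Ascential/DataStage", "dev_sales"))
```

A different location can still be chosen when the project is added; the sketch only covers the default.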

Page 10: Day 1 Data Stage Administrator And Director 11.0

Deleting Projects

Make sure you have a current backup of your project, just in case!

Highlight the project to be deleted

Page 11: Day 1 Data Stage Administrator And Director 11.0

General Project Options

• Enable job administration in Director: enabling this feature lets the user run Cleanup Resources and Clear Status File from the Job menu of DataStage Director.

• Enable Runtime Column Propagation for Parallel Jobs: if you enable this feature, stages in parallel jobs can handle undefined columns that they encounter when the job is run, and propagate these columns through to the rest of the stages in the job.

• Auto-purge of job log: this setting automatically purges job log entries based on the auto-purge action setting. For example, if you specify auto-purge up to the previous 3 job runs, only the entries for the 3 most recent job runs are kept as new job runs complete.

Page 12: Day 1 Data Stage Administrator And Director 11.0

General Project Options (Contd.)

Auto purge settings for job logs—not a global or retroactive setting

Create environment variables

Page 13: Day 1 Data Stage Administrator And Director 11.0

Setting Project-wide Environment Variables

• You can set project-wide defaults for general environment variables, or for ones specific to parallel jobs, from this page.

• You can also specify new variables. All of these are then available to be used in jobs.

• In each of the categories except User Defined, only the default value can be modified. In the User Defined category, users can create new environment variables and assign default values.
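
The rule on this slide (predefined categories allow only a change of default value, while new variables are created under User Defined) can be modelled with a small Python sketch. It is a simplified illustration, not the Administrator's implementation. The class and method names are hypothetical, and the variable names, category label "Parallel", and values are examples only.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

USER_DEFINED = "User Defined"


@dataclass
class EnvVar:
    name: str
    category: str
    default: str = ""


class ProjectEnvironment:
    """Project-wide environment variable defaults (simplified model)."""

    def __init__(self, predefined: Iterable[EnvVar]) -> None:
        self._vars: Dict[str, EnvVar] = {v.name: v for v in predefined}

    def set_default(self, name: str, value: str) -> None:
        # Allowed in every category: only the default value changes.
        self._vars[name].default = value

    def add_user_defined(self, name: str, default: str = "") -> None:
        # New variables can only be created in the User Defined category.
        if name in self._vars:
            raise ValueError(f"{name} already exists")
        self._vars[name] = EnvVar(name, USER_DEFINED, default)

    def defaults(self) -> Dict[str, str]:
        # The project-wide defaults that are available to jobs.
        return {v.name: v.default for v in self._vars.values()}


# Example values are made up for illustration.
env = ProjectEnvironment([EnvVar("APT_CONFIG_FILE", "Parallel", "/hypothetical/default.apt")])
env.set_default("APT_CONFIG_FILE", "/hypothetical/4node.apt")
env.add_user_defined("SRC_DB_NAME", "SALES_DEV")
print(env.defaults())
```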

Page 14: Day 1 Data Stage Administrator And Director 11.0

Setting Project-wide Environment Variables (Contd.)

Page 15: Day 1 Data Stage Administrator And Director 11.0

Enable Server-Side Job Tracing

You can trace the activities on the server to help diagnose project problems.

Enable or disable tracing in the project

Trace files that have been created

View or delete the currently highlighted file

Page 16: Day 1 Data Stage Administrator And Director 11.0

Validating User Account for Job Scheduling

Select a user account with proper access to the DataStage project

Verification that the currently selected user account can schedule jobs

• This tab applies to Windows NT/2000 servers only.

• DataStage uses the Windows NT Schedule service to schedule jobs.

Page 17: Day 1 Data Stage Administrator And Director 11.0

Performance Tuning Options

Some performance tuning options are:

• Row buffering

• Hashed file stage caching

Page 18: Day 1 Data Stage Administrator And Director 11.0

Server Commands

Select a project and click ‘Command’

Enter a valid DataStage command

When you execute the command, a new window will show the response from the engine

Page 19: Day 1 Data Stage Administrator And Director 11.0

Assigning Roles (Operator/Developer) to User Accounts

There are four roles for a DataStage user account:

• DataStage Developer: Has full access to all areas of a DataStage project.

• DataStage Production Manager: Has full access to all areas of a DataStage project, and can also create and manipulate protected projects.

• DataStage Operator: Has permission to run and manage DataStage jobs.

• <None>: Does not have permission to log on to DataStage.
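
The four roles above can be summarised as a lookup table. The Python sketch below is one way to express that mapping. The role names come from the slide, but the permission labels (log_on, run_jobs, edit_project, manage_protected_projects) are hypothetical names chosen for illustration, not DataStage identifiers.

```python
from enum import Enum


class Role(Enum):
    DEVELOPER = "DataStage Developer"
    PRODUCTION_MANAGER = "DataStage Production Manager"
    OPERATOR = "DataStage Operator"
    NONE = "<None>"


# Hypothetical permission labels derived from the role descriptions.
PERMISSIONS = {
    Role.DEVELOPER: {"log_on", "run_jobs", "edit_project"},
    Role.PRODUCTION_MANAGER: {"log_on", "run_jobs", "edit_project",
                              "manage_protected_projects"},
    Role.OPERATOR: {"log_on", "run_jobs"},
    Role.NONE: set(),
}


def can(role: Role, action: str) -> bool:
    """Return True if the role's description covers the given action."""
    return action in PERMISSIONS[role]


print(can(Role.OPERATOR, "manage_protected_projects"))
print(can(Role.PRODUCTION_MANAGER, "manage_protected_projects"))
```

Read this way, only the Production Manager role covers creating and manipulating protected projects.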

Page 20: Day 1 Data Stage Administrator And Director 11.0

Assigning Roles (Operator/Developer) to User Accounts (Contd.)

Select the user role to assign to particular user accounts.

Page 21: Day 1 Data Stage Administrator And Director 11.0

Settings for Parallel Jobs

• Enable Runtime Column Propagation for Parallel Jobs

When this feature is enabled, stages in parallel jobs can handle undefined columns that they encounter when the job is run, and propagate these columns through to the rest of the job.

• Enable Remote Execution of Parallel Jobs

Select this to specify that parallel jobs in this project are to be deployed on a USS (UNIX System Services) machine. When this option is selected, the Remote tab is enabled and you can specify details about the jobs that are deployed.
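
To make the Runtime Column Propagation setting on this slide more concrete, here is a toy Python model of a single stage. It is a deliberately simplified sketch, not how the parallel engine is implemented. The function run_stage and the column names are hypothetical.

```python
from typing import Dict, List


def run_stage(row: Dict[str, object], defined_columns: List[str],
              rcp_enabled: bool) -> Dict[str, object]:
    """Pass a row through a stage that only declares `defined_columns`."""
    # Columns declared in the stage's metadata are always passed on.
    out = {c: row[c] for c in defined_columns if c in row}
    if rcp_enabled:
        # With propagation on, columns the stage does not define are
        # carried through to the next stage instead of being dropped.
        for column, value in row.items():
            out.setdefault(column, value)
    return out


row = {"id": 1, "name": "Asha", "region": "South"}  # "region" is undefined in the stage
print(run_stage(row, ["id", "name"], rcp_enabled=False))
print(run_stage(row, ["id", "name"], rcp_enabled=True))
```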

Page 22: Day 1 Data Stage Administrator And Director 11.0

Settings for Parallel Jobs (Contd.)

Enable these options.

Page 23: Day 1 Data Stage Administrator And Director 11.0

Settings for Parallel Jobs (Contd.)

Page 24: Day 1 Data Stage Administrator And Director 11.0

DataStage Director: Logging In

• Logging in to a DataStage server using the Director requires the host name of the server (the fully qualified name, if necessary) or the server’s IP address, and an operating system username and password.

Page 25: Day 1 Data Stage Administrator And Director 11.0

DataStage Director: Logging In (Contd.)

The Director Login Dialog Box

Enter the hostname or IP address of the server where DataStage is installed

Enter your operating system username and password

Select the project to attach to

Page 26: Day 1 Data Stage Administrator And Director 11.0

Viewing the Job Run Status

• The Job Status view shows the status of all the jobs in the currently selected job category, or, if the job category pane is hidden, in the current project. The view has the following columns:

– Job name: The name of the job.

– Status: The status of the job.

– Started on date: The time and date a job was started. These fields are only filled in for a job with a status of Running.

– Last ran on date: The time and date the job was finished, stopped, or aborted. These columns are blank for jobs that have never been run.

– Description: A description of the job, if available.

• To view more details about a job’s status, select the job and do one of the following:

– Choose View > Detail.

– Right-click to display the shortcut menu and choose Detail.

– Double-click the job.

Page 27: Day 1 Data Stage Administrator And Director 11.0

Viewing the Job Run Status (Contd.)

Detailed information about a job’s status

Page 28: Day 1 Data Stage Administrator And Director 11.0

Validating a Job

• You can check that a job or job invocation will run successfully by validating it.

• Jobs should be validated before running them for the first time, or after making any significant changes to job parameters.

• When a server job is validated, the following checks are made without actually extracting, converting, or writing data:

– Connections are made to the data sources or data warehouse.

– SQL SELECT statements are prepared.

– Files are opened. Intermediate files in Hashed File, UniVerse, or ODBC stages that use the local data source are created, if they do not already exist.
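
Validation as described above is started from the Director GUI. For scripted environments, the sketch below shows how the equivalent request might be assembled for the dsjob command-line client. It assumes dsjob is installed and accepts the -run, -mode, -param and -wait options in this form, which should be checked against the installed version. The helper name, project name and job name are hypothetical, and the sketch only builds the argument list without executing anything.

```python
from typing import Dict, List, Optional

# Modes assumed to be accepted by "dsjob -run -mode ...".
VALID_MODES = {"NORMAL", "RESET", "VALIDATE"}


def dsjob_run_command(project: str, job: str, mode: str = "NORMAL",
                      params: Optional[Dict[str, str]] = None,
                      wait: bool = False) -> List[str]:
    """Build (but do not execute) a dsjob argument list."""
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported mode: {mode}")
    argv = ["dsjob", "-run", "-mode", mode]
    # Job parameters are passed as name=value pairs.
    for name, value in (params or {}).items():
        argv += ["-param", f"{name}={value}"]
    if wait:
        argv.append("-wait")
    # Project and job names come last.
    argv += [project, job]
    return argv


# Validate first, then run; both names are made up for illustration.
print(dsjob_run_command("dev_sales", "LoadCustomers", mode="VALIDATE"))
print(dsjob_run_command("dev_sales", "LoadCustomers",
                        params={"RunDate": "2005-11-06"}, wait=True))
```

The returned list could be handed to subprocess.run on a machine where the client is available.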

Page 29: Day 1 Data Stage Administrator And Director 11.0

Validating a Job (Contd.)

Click Validate when Job Run Options and parameters have been set

Page 30: Day 1 Data Stage Administrator And Director 11.0

Running a Job

Click Run when Job Run Options, parameters, and tracing options have been set

Page 31: Day 1 Data Stage Administrator And Director 11.0

Monitoring a Job

Expand tree to see all links attached to an active stage

Optionally show CPU utilization for each active stage

Page 32: Day 1 Data Stage Administrator And Director 11.0

Stopping a Job

Click Stop button to stop a running job

Page 33: Day 1 Data Stage Administrator And Director 11.0

Resetting a Job

• If a job has stopped or aborted, it is difficult to determine whether all the required data was written to the target data tables. When a job has a status of Stopped or Aborted, you must reset it before running the job again. By resetting a job, you set it back to a runnable state and, optionally, return your target files to the state they were in before the job was run.

• To reset a job or job invocation:

1. Select the job or invocation you want to reset in the Job Status view.

2. Choose Job > Reset or click the Reset button on the toolbar. A message box appears.

3. Click Yes to reset the tables. All the files in the job are reinstated to the state they were in before the job was run. The job’s status is updated to “Has been reset”.
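
The reset rule on this slide amounts to a small state check: a job in the Stopped or Aborted state cannot be run again until it has been reset. A minimal Python sketch of that rule follows. The status strings are the ones named on the slides, while the function names and the restriction on resetting a running job are assumptions made for the example.

```python
NEEDS_RESET = frozenset({"Stopped", "Aborted"})


def must_reset_before_run(status: str) -> bool:
    """True if a job with this status has to be reset before its next run."""
    return status in NEEDS_RESET


def reset_job(status: str) -> str:
    # Assumption for this sketch: a running job has to be stopped first.
    if status == "Running":
        raise ValueError("stop the job before resetting it")
    return "Has been reset"


def run_job(status: str) -> str:
    if must_reset_before_run(status):
        raise ValueError(f"job is {status}: reset it before running again")
    return "Running"


status = "Aborted"
status = reset_job(status)   # back to a runnable state
status = run_job(status)
print(status)
```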

Page 34: Day 1 Data Stage Administrator And Director 11.0

Resetting a Job (Contd.)

Click Reset button to return a job to a runnable state

Page 35: Day 1 Data Stage Administrator And Director 11.0

Interpreting the Job Execution Details in Log View

Current run: black

Previous run: blue

Additional information is available for this entry (…)

Page 36: Day 1 Data Stage Administrator And Director 11.0

Log Event Detail Window

Detail information can be copied to the system clipboard and pasted into a text editor—useful for sending errors to support!

Additional lines of information regarding this particular event

Page 37: Day 1 Data Stage Administrator And Director 11.0

Filtering Log Events

Where to start showing log entries

Where to stop showing log entries

How many log entries to show

What type of log entries to show

Page 38: Day 1 Data Stage Administrator And Director 11.0

Clearing Log Entries

Immediately delete log entries or automatically purge entries

Which entries to remove immediately

Which entries to remove automatically

Page 39: Day 1 Data Stage Administrator And Director 11.0

Clearing Log Entries (Contd.)

Options in Auto-purge:

• Up to previous (job runs): Purges old log entries, leaving the specified number of recent job run entries in the file.

• Older than (days): Purges all log entries older than the specified number of days.

Specify the number of job runs or days by clicking the arrow buttons or entering the value directly.
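
The two auto-purge options on this slide can be expressed as two filter functions. The Python sketch below is a simplified model of the policy, not the engine's log format: the LogEntry type, its run_id field and the function names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class LogEntry(NamedTuple):
    run_id: int          # hypothetical run counter; higher means more recent
    timestamp: datetime
    message: str


def purge_up_to_previous(entries: List[LogEntry], keep_runs: int) -> List[LogEntry]:
    """Keep only the entries that belong to the `keep_runs` most recent runs."""
    recent_runs = set(sorted({e.run_id for e in entries}, reverse=True)[:keep_runs])
    return [e for e in entries if e.run_id in recent_runs]


def purge_older_than(entries: List[LogEntry], days: int, now: datetime) -> List[LogEntry]:
    """Keep only the entries written within the last `days` days."""
    cutoff = now - timedelta(days=days)
    return [e for e in entries if e.timestamp >= cutoff]


# Five runs on five consecutive days (made-up data).
log = [LogEntry(run, datetime(2005, 11, run), f"run {run} finished") for run in range(1, 6)]
print([e.run_id for e in purge_up_to_previous(log, keep_runs=3)])
print([e.run_id for e in purge_older_than(log, days=2, now=datetime(2005, 11, 6))])
```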

Page 40: Day 1 Data Stage Administrator And Director 11.0

Schedule View

Page 41: Day 1 Data Stage Administrator And Director 11.0

Scheduling a Job Execution

You can schedule a job to run in a number of ways:

• Once today at a specified time

• Once tomorrow at a specified time

• On a specific day and at a particular time

• Daily at a particular time

• On the next occurrence of a particular date and time
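
To show how some of the scheduling choices above resolve to a concrete run time, here is a short Python sketch. It covers the "today", "tomorrow", "daily" and "specific day" cases, with the last one interpreted as a day of the week. It illustrates the date arithmetic only and makes no claim about how the scheduler service works; the function names are hypothetical.

```python
from datetime import datetime, time, timedelta


def once_today(now: datetime, at: time) -> datetime:
    return datetime.combine(now.date(), at)


def once_tomorrow(now: datetime, at: time) -> datetime:
    return datetime.combine(now.date() + timedelta(days=1), at)


def next_daily(now: datetime, at: time) -> datetime:
    """Next run of a job scheduled daily at `at`."""
    candidate = datetime.combine(now.date(), at)
    return candidate if candidate > now else candidate + timedelta(days=1)


def next_weekday(now: datetime, weekday: int, at: time) -> datetime:
    """Next run on a given weekday (0 = Monday, as in datetime.weekday())."""
    days_ahead = (weekday - now.weekday()) % 7
    candidate = datetime.combine(now.date() + timedelta(days=days_ahead), at)
    return candidate if candidate > now else candidate + timedelta(days=7)


now = datetime(2005, 11, 7, 10, 0)
print(once_tomorrow(now, time(2, 30)))
print(next_daily(now, time(9, 0)))
```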

Page 42: Day 1 Data Stage Administrator And Director 11.0

Scheduling a Job Execution (Contd.)

Select a job and click the Schedule button

Page 43: Day 1 Data Stage Administrator And Director 11.0

Rescheduling a Job Execution

Select a previously scheduled job and click Reschedule button

Page 44: Day 1 Data Stage Administrator And Director 11.0

Un-scheduling a Job Execution

Right-click a previously scheduled job and click Unschedule

Page 45: Day 1 Data Stage Administrator And Director 11.0

Cleaning Up Resources

• If the Enable Job Administration in Director option has been set in the DataStage Administrator, certain functions are available to help you clean up the resources of a job that has hung or aborted, or to return a job to a state in which you can rerun it after the cause of the problem has been fixed.

• Use these functions with care, and only after you have tried to reset the job and you are sure it has hung or aborted.

• The Cleanup Resources command lets you:

– View and end job processes

– View and release the associated locks

Page 46: Day 1 Data Stage Administrator And Director 11.0

Cleaning Up Resources (Contd.)

Operating system’s process ID number

Logout (kill) selected O/S process

Engine locks associated with processes

Page 47: Day 1 Data Stage Administrator And Director 11.0

Clearing the Status File

Select a hung job and choose Clear Status File from the Job menu

Page 48: Day 1 Data Stage Administrator And Director 11.0

Clearing the Status File (Contd.)

Before you clear a status file you should:

• Try to reset the job.

• Ensure that all the job’s processes have ended.

Page 49: Day 1 Data Stage Administrator And Director 11.0

• Allow time for questions from participants

Page 50: Day 1 Data Stage Administrator And Director 11.0

Test Your Understanding

• What is the use of having User Defined Environment Variables?

• Can a DataStage operator manipulate a protected project?

• What is the default cache size of a hashed file?

• When will “Clear Status File” be enabled in Director?

• What does (…) in the job log mean?

• Where do you see the CPU utilization of each stage in a job?

Page 51: Day 1 Data Stage Administrator And Director 11.0

DataStage Administrator and Director: Summary

• DataStage is an ETL tool widely used in Data Warehousing. It has 4 components: Administrator, Director, Designer and Manager.

• Administrator can be used to:

– Create or delete projects

– Assign roles to user accounts

– Set project specific environment variables

– Enable tracing and performance tuning

• Director can be used to:

– View job statistics

– Validate/Run/Monitor/Stop/Reset and Schedule jobs

– View logs/ filter log events and clear log entries

– Clean up job resources

Page 52: Day 1 Data Stage Administrator And Director 11.0

DataStage Administrator and Director: Source

• DataStage 7.5.1 manual

Disclaimer: Parts of the content of this course are based on the materials available from the Web sites and books listed above. The materials that can be accessed from linked sites are not maintained by Cognizant Academy and we are not responsible for the contents thereof. All trademarks, service marks, and trade names in this course are the marks of the respective owner(s).

Page 53: Day 1 Data Stage Administrator And Director 11.0

You have successfully completed DataStage Administrator and Director.