Working With the Load Balancer Overview


  • 8/14/2019 Working With the Load Balancer Overview


    Working with the Load Balancer Overview

    By PenchalaRaju.Yanamala

    The Load Balancer dispatches tasks to Integration Service processes running on nodes. When you run a workflow, the Load Balancer dispatches the Session, Command, and predefined Event-Wait tasks within the workflow. If the Integration Service is configured to check resources, the Load Balancer matches task requirements with resource availability to identify the best node to run a task. It may dispatch tasks to a single node or across nodes.

    To identify the nodes that can run a task, the Load Balancer matches the resources required by the task with the resources available on each node. It dispatches tasks in the order it receives them. When the Load Balancer has more Session and Command tasks to dispatch than the Integration Service can run at the time, the Load Balancer places the tasks in the dispatch queue. When nodes become available, the Load Balancer dispatches the waiting tasks from the queue in the order determined by the workflow service level.

    You assign resources and service levels using the Workflow Manager. You can perform the following tasks:

    Assign service levels. You assign service levels to workflows. Service levels establish priority among workflow tasks that are waiting to be dispatched.

    Assign resources. You assign resources to tasks. Session, Command, and predefined Event-Wait tasks require PowerCenter resources to succeed. If the Integration Service is configured to check resources, the Load Balancer dispatches these tasks to nodes where the resources are available.

    Assigning Service Levels to Workflows

    Service levels determine the order in which the Load Balancer dispatches tasks from the dispatch queue. When multiple tasks are waiting to be dispatched, the Load Balancer dispatches high priority tasks before low priority tasks. You create service levels and configure the dispatch priorities in the Administration Console.

    You assign service levels to workflows on the General tab of the workflow properties.
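The ordering rule described above can be illustrated with a small priority queue. This is only a sketch of the idea, not the Load Balancer's actual implementation: the class name, the task names, and the convention that a lower number means a higher dispatch priority are all assumptions made for the example.

```python
import heapq
import itertools


class DispatchQueue:
    """Toy dispatch queue. Assumption: a lower number is a higher priority."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker that preserves arrival order

    def enqueue(self, task_name, dispatch_priority):
        # Tuples compare by priority first, then by arrival number.
        heapq.heappush(self._heap, (dispatch_priority, next(self._arrival), task_name))

    def dispatch_next(self):
        """Return the highest-priority waiting task, or None if the queue is empty."""
        if not self._heap:
            return None
        _, _, task_name = heapq.heappop(self._heap)
        return task_name


queue = DispatchQueue()
queue.enqueue("s_load_orders", 5)       # hypothetical task names
queue.enqueue("s_load_customers", 1)
queue.enqueue("cmd_archive", 5)
order = [queue.dispatch_next() for _ in range(3)]
print(order)  # ['s_load_customers', 's_load_orders', 'cmd_archive']
```

Tasks that share a priority leave the queue in the order they arrived, which mirrors the statement that the Load Balancer otherwise dispatches tasks in the order it receives them.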

    Assigning Resources to Tasks

    PowerCenter resources are the database connections, files, directories, node names, and operating system types required by a task to make the task succeed. The Load Balancer may use resources to dispatch tasks. If the Integration Service is not configured to run on a grid or check resources, the Load Balancer ignores resource requirements. It dispatches all tasks to the master Integration Service process running on the node.

    If the Integration Service runs on a grid and is configured to check resources, the Load Balancer uses resources to dispatch tasks. The Integration Service matches the resources required by tasks in a workflow with the resources available on each node in the grid to determine which nodes can run the tasks.


    The Load Balancer distributes the Session, Command, and predefined Event-Wait tasks to nodes with available resources. For example, if a session requires a file resource for a reserved words file, the Load Balancer dispatches the session to nodes that have access to the file. A task fails if the Integration Service cannot identify a node where the required resource is available.
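The matching rule above amounts to a subset test: a node can run a task only if it offers every resource the task requires. The following sketch shows that test under simplified assumptions. The function names, node names, and resource names are hypothetical, and the real Load Balancer also weighs other factors when choosing among eligible nodes.

```python
def eligible_nodes(required, node_resources):
    """Return the names of nodes whose resources cover every required resource."""
    required = set(required)
    return [name for name, available in node_resources.items()
            if required <= set(available)]


def pick_node(task_name, required, node_resources):
    """Pick the first eligible node; fail the task if no node qualifies."""
    nodes = eligible_nodes(required, node_resources)
    if not nodes:
        raise LookupError(
            f"no node provides {sorted(required)} for task {task_name}")
    return nodes[0]


# Hypothetical grid: only node01 can reach the reserved words file.
grid = {
    "node01": {"reserved_words_file", "ora_conn"},
    "node02": {"ora_conn"},
}
print(eligible_nodes({"reserved_words_file"}, grid))  # ['node01']
print(eligible_nodes({"ora_conn"}, grid))             # ['node01', 'node02']
```

Raising an error when the list is empty models the last sentence above: the task fails when no node offers the required resource.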

    In the Administration Console, you define the resources that are available to each node. Resources are either predefined or user-defined. Predefined resources include connections available to a node, node name, and operating system type. User-defined resources include file/directory resources and custom resources.

    In the task properties, you assign PowerCenter resources to nonreusable tasks that require those resources. You cannot assign resources to reusable tasks.

    Table 24-1 lists resource types and the repository objects to which you can assign them:

    Table 24-1. Resource Types and Associated Repository Objects

    Resource Type           Predefined/User-Defined   Repository Objects that Use Resources
    Custom                  User-defined              Session, Command, and predefined Event-Wait task instances and all mapping objects within a session.
    File/Directory          User-defined              Session, Command, and predefined Event-Wait task instances, and the following mapping objects within a session: Source qualifiers, Aggregator transformation, Custom transformation, External Procedure transformation, Joiner transformation, Lookup transformation, Sorter transformation, Java transformation, HTTP transformation, SQL transformation, Union transformation, Targets.
    Node Name               Predefined                Session, Command, and predefined Event-Wait task instances and all mapping objects within a session.
    Operating System Type   Predefined                Session, Command, and predefined Event-Wait task instances and all mapping objects within a session.

    If you try to assign a resource type that does not apply to a repository object, the Workflow Manager displays the following error message:

    The selected resource cannot be applied to this type of object. Please select a different resource.

    The Workflow Manager assigns connection resources. When you use a relational, FTP, or external loader connection, the Workflow Manager assigns the connection resource to sources, targets, and transformations in a session instance. You cannot manually assign a connection resource in the Workflow Manager.

    To assign resources to a task instance:

    1. Open the task properties in the Worklet or Workflow Designer. If the task is an Event-Wait task, you can assign resources only if the task waits for a predefined event.
    2. On the General tab, click Edit.
    3. In the Edit Resources dialog box, click the Add button to add a resource.
    4. In the Select Resource dialog box, choose an object you want to assign a resource to. The Resources list shows the resources available to the nodes where the Integration Service runs.
    5. Select the resource to assign and click Select.
    6. In the Edit Resources dialog box, click OK.

    Row Error Logging Overview

    When you configure a session, you can log row errors in a central location. When a row error occurs, the Integration Service logs error information that lets you determine the cause and source of the error. The Integration Service logs information such as source name, row ID, current row data, transformation, timestamp, error code, error message, repository name, folder name, session name, and mapping information.

    You can log row errors into relational tables or flat files. When you enable error logging, the Integration Service creates the error tables or an error log file the first time it runs the session. Error logs are cumulative. If the error logs exist, the Integration Service appends error data to the existing error logs.

    You can log source row data. Source row data includes row data, source row ID, and source row type from the source qualifier where an error occurs. The Integration Service cannot identify the row in the source qualifier that contains an error if the error occurs after a non pass-through partition point with more than one partition or one of the following active sources:

    Aggregator
    Custom, configured as an active transformation
    Joiner
    Normalizer (pipeline)
    Rank
    Sorter

    By default, the Integration Service logs transformation errors in the session log and reject rows in the reject file. When you enable error logging, the Integration Service does not generate a reject file or write dropped rows to the session log.

    Without a reject file, the Integration Service does not log Transaction Control transformation rollback or commit errors. If you want to write rows to the session log in addition to the row error log, you can enable verbose data tracing.

    Note: When you log row errors, session performance may decrease because the Integration Service processes one row at a time instead of a block of rows at once.

    Error Log Code Pages

    The Integration Service writes data to the error log file differently depending on the Integration Service process operating system:

    UNIX. The Integration Service writes data to the error log file using the Integration Service process code page. However, you can configure the Integration Service to write to the error log file using UTF-8 by enabling the LogsInUTF8 Integration Service property.

    Windows. The Integration Service writes all characters in the error log file using the UTF-8 encoding format.

    The code page for the relational database where the error tables exist must be a subset of the target code page. If the error log table code page is not a subset of the target code page, the Integration Service might write inconsistent data in the error log tables.

    Understanding the Error Log Tables

    When you choose relational database error logging, the Integration Service creates the following error tables the first time you run a session:

    PMERR_DATA. Stores data and metadata about a transformation row error and its corresponding source row.

    PMERR_MSG. Stores metadata about an error and the error message.

    PMERR_SESS. Stores metadata about the session.

    PMERR_TRANS. Stores metadata about the source and transformation ports, such as name and datatype, when a transformation error occurs.

    You specify the database connection to the database where the Integration Service creates these tables. If the error tables exist for a session, the Integration Service appends row errors to these tables.

    Relational database error logging lets you collect row errors from multiple sessions in one set of error tables. To do this, you specify the same error log table name prefix for all sessions. You can issue select statements on the generated error tables to retrieve error data for a particular session.
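A select statement of the kind mentioned above might filter the shared error tables by session identifier. The sketch below uses Python's built-in sqlite3 module with an in-memory database purely as a stand-in for the real error log database. The prefix DEMO_, the cut-down column list, and the sample rows are assumptions for illustration. The column names come from the PMERR_MSG table described in this document.

```python
import sqlite3

PREFIX = "DEMO_"  # hypothetical error log table name prefix


def create_demo_table(conn, prefix):
    """Create a cut-down stand-in for PMERR_MSG (subset of its columns)."""
    conn.execute(
        f"CREATE TABLE {prefix}PMERR_MSG ("
        "WORKFLOW_RUN_ID INTEGER, SESS_INST_ID INTEGER, TRANS_NAME TEXT, "
        "ERROR_CODE INTEGER, ERROR_MSG TEXT, LINE_NO INTEGER)"
    )


def errors_for_session(conn, prefix, sess_inst_id):
    """Return (TRANS_NAME, ERROR_CODE, ERROR_MSG) rows logged for one session."""
    sql = (
        f"SELECT TRANS_NAME, ERROR_CODE, ERROR_MSG FROM {prefix}PMERR_MSG "
        "WHERE SESS_INST_ID = ? "
        "ORDER BY WORKFLOW_RUN_ID, ERROR_CODE, LINE_NO"
    )
    return conn.execute(sql, (sess_inst_id,)).fetchall()


conn = sqlite3.connect(":memory:")
create_demo_table(conn, PREFIX)
conn.executemany(
    f"INSERT INTO {PREFIX}PMERR_MSG VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1310, 19, "agg_REL_basic", 11019, "NULL detected on input.", 1),
        (1310, 19, "agg_REL_basic", 11131, "divisor is zero", 1),
        (1311, 20, "exp_other", 11019, "NULL detected on input.", 1),
    ],
)
session_errors = errors_for_session(conn, PREFIX, 19)
print(session_errors)
```

Because every session writes to the same prefixed tables, the WHERE clause on SESS_INST_ID is what separates one session's errors from another's. In practice you would probably also filter on WORKFLOW_RUN_ID to isolate a single run.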

    You can specify a prefix for the error tables. The error table names can have up to eleven characters. Do not specify a prefix that exceeds 19 characters when naming Oracle, Sybase, or Teradata error log tables, as these databases have a maximum length of 30 characters for table names. You can use a parameter or variable for the table name prefix. Use any parameter or variable type that you can define in the parameter file. For example, you can use a session parameter, $ParamMyErrPrefix, as the error log table name prefix, and set $ParamMyErrPrefix to the table prefix in a parameter file.
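The 19-character figure follows from simple arithmetic: a 30-character table name limit minus the 11 characters of the longest error table name. A minimal check, with hypothetical function names, could look like this:

```python
# Longest generated table name among PMERR_DATA, PMERR_MSG, PMERR_SESS, PMERR_TRANS.
LONGEST_ERROR_TABLE_NAME = len("PMERR_TRANS")  # 11 characters


def max_prefix_length(db_table_name_limit):
    """Longest prefix that still fits the database's table name limit."""
    return db_table_name_limit - LONGEST_ERROR_TABLE_NAME


def prefix_fits(prefix, db_table_name_limit=30):
    """True if prefix plus the longest error table name stays within the limit."""
    return len(prefix) <= max_prefix_length(db_table_name_limit)


print(max_prefix_length(30))    # 19
print(prefix_fits("NIGHTLY_"))  # True
print(prefix_fits("X" * 20))    # False
```

The default limit of 30 reflects the Oracle, Sybase, and Teradata limit stated above. Pass a different limit for other databases.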

    The Integration Service creates the error tables without specifying primary and foreign keys. However, you can specify key columns.

    PMERR_DATA

    When the Integration Service encounters a row error, it inserts an entry into the PMERR_DATA table. This table stores data and metadata about a transformation row error and its corresponding source row.

    The following table describes the structure of the PMERR_DATA table:

    Column Name           Datatype      Description
    REPOSITORY_GID        Varchar       Unique identifier for the repository.
    WORKFLOW_RUN_ID       Integer       Unique identifier for the workflow.
    WORKLET_RUN_ID        Integer       Unique identifier for the worklet. If a session is not part of a worklet, this value is 0.
    SESS_INST_ID          Integer       Unique identifier for the session.
    TRANS_MAPPLET_INST    Varchar       Name of the mapplet where an error occurred.
    TRANS_NAME            Varchar       Name of the transformation where an error occurred.
    TRANS_GROUP           Varchar       Name of the input group or output group where an error occurred. Defaults to either input or output if the transformation does not have a group.
    TRANS_PART_INDEX      Integer       Specifies the partition number of the transformation where an error occurred.
    TRANS_ROW_ID          Integer       Specifies the row ID generated by the last active source.
    TRANS_ROW_DATA        Long Varchar  Delimited string containing all column data, including the column indicator. Column indicators are: D - valid, N - null, T - truncated, B - binary, U - data unavailable. The fixed delimiter between column data and column indicator is colon ( : ). The delimiter between the columns is pipe ( | ). You can

    PMERR_MSG

    When the Integration Service encounters a row error, it inserts an entry into the PMERR_MSG table. This table stores metadata about the error and the error message.

    The following table describes the structure of the PMERR_MSG table:

    Column Name           Datatype      Description
    REPOSITORY_GID        Varchar       Unique identifier for the repository.
    WORKFLOW_RUN_ID       Integer       Unique identifier for the workflow.
    WORKLET_RUN_ID        Integer       Unique identifier for the worklet. If a session is not part of a worklet, this value is 0.
    SESS_INST_ID          Integer       Unique identifier for the session.
    MAPPLET_INST_NAME     Varchar       Mapplet to which the transformation belongs. If the transformation is not part of a mapplet, this value is n/a.
    TRANS_NAME            Varchar       Name of the transformation where an error occurred.
    TRANS_GROUP           Varchar       Name of the input group or output group where an error occurred. Defaults to either input or output if the transformation does not have a group.
    TRANS_PART_INDEX      Integer       Specifies the partition number of the transformation where an error occurred.
    TRANS_ROW_ID          Integer       Specifies the row ID generated by the last active source.
    ERROR_SEQ_NUM         Integer       Counter for the number of errors per row in each transformation group. If a session has multiple partitions, the Integration Service maintains this counter for each partition. For example, if a transformation generates three errors in partition 1 and two errors in partition 2, ERROR_SEQ_NUM generates the values 1, 2, and 3 for partition 1, and values 1 and 2 for partition 2.
    ERROR_TIMESTAMP       Date/Time     Timestamp of the Integration Service when the error occurred.
    ERROR_UTC_TIME        Integer       Coordinated Universal Time, called Greenwich Mean Time, of when an error occurred.
    ERROR_CODE            Integer       Error code that the error generates.
    ERROR_MSG             Long Varchar  Error message, which can span multiple rows. When the data exceeds 2000 bytes, the Integration Service creates a new row. The line number for each row error entry is stored in the LINE_NO column.
    ERROR_TYPE            Integer       Type of error that occurred. The Integration Service uses the following values: 1 - Reader error, 2 - Writer error, 3 - Transformation error.
    LINE_NO               Integer       Specifies the line number for each row error entry in ERROR_MSG that spans multiple rows.

    Note: Use the column names in bold to join tables.
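Because ERROR_MSG can be split across several rows, a consumer has to stitch the pieces back together in LINE_NO order. The sketch below shows one way to do that for rows already fetched as dictionaries. The choice of key columns used to group the pieces, and the decision to join pieces without a separator, are assumptions for illustration and are not taken from the product documentation.

```python
from collections import defaultdict

# Assumed grouping key: the columns that identify one logged error.
KEY_COLUMNS = ("WORKFLOW_RUN_ID", "SESS_INST_ID", "TRANS_NAME",
               "TRANS_PART_INDEX", "TRANS_ROW_ID", "ERROR_SEQ_NUM")


def reassemble_messages(rows):
    """Join ERROR_MSG pieces that share a key, ordered by LINE_NO."""
    pieces = defaultdict(list)
    for row in rows:
        key = tuple(row[column] for column in KEY_COLUMNS)
        pieces[key].append((row["LINE_NO"], row["ERROR_MSG"]))
    # sorted() orders the (LINE_NO, text) tuples by line number first.
    return {key: "".join(text for _, text in sorted(parts))
            for key, parts in pieces.items()}


base = {"WORKFLOW_RUN_ID": 1310, "SESS_INST_ID": 19, "TRANS_NAME": "agg_REL_basic",
        "TRANS_PART_INDEX": 1, "TRANS_ROW_ID": 5, "ERROR_SEQ_NUM": 1}
rows = [
    dict(base, LINE_NO=2, ERROR_MSG="second piece"),
    dict(base, LINE_NO=1, ERROR_MSG="first piece, "),
]
messages = reassemble_messages(rows)
print(messages[(1310, 19, "agg_REL_basic", 1, 5, 1)])  # first piece, second piece
```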

    PMERR_SESS

    When you choose relational database error logging, the Integration Service inserts entries into the PMERR_SESS table. This table stores metadata about the session where an error occurred.

    The following table describes the structure of the PMERR_SESS table:

    Column Name           Datatype      Description
    REPOSITORY_GID        Varchar       Unique identifier for the repository.
    WORKFLOW_RUN_ID       Integer       Unique identifier for the workflow.
    WORKLET_RUN_ID        Integer       Unique identifier for the worklet. If a session is not part of a worklet, this value is 0.
    SESS_INST_ID          Integer       Unique identifier for the session.
    SESS_START_TIME       Date/Time     Timestamp of the Integration Service when a session starts.
    SESS_START_UTC_TIME   Integer       Coordinated Universal Time, called Greenwich Mean Time, of when the session starts.
    REPOSITORY_NAME       Varchar       Repository name where sessions are stored.
    FOLDER_NAME           Varchar       Specifies the folder where the mapping and session are located.
    WORKFLOW_NAME         Varchar       Specifies the workflow that runs the session being logged.
    TASK_INST_PATH        Varchar       Fully qualified session name that can span multiple rows. The Integration Service creates a new line for the session name. The Integration Service also creates a new line for each worklet in the qualified session name. For example, you have a session named WL1.WL2.S1. Each component of the name appears on a new line: WL1, WL2, S1. The Integration Service writes the line number in the LINE_NO column.
    MAPPING_NAME          Varchar       Specifies the mapping that the session uses.
    LINE_NO               Integer       Specifies the line number for each row error entry in TASK_INST_PATH that spans multiple rows.

    Note: Use the column names in bold to join tables.
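The UTC columns hold integers instead of formatted timestamps. Assuming the integer counts seconds since 1970-01-01 00:00:00 UTC (an assumption: the table above does not state the unit or the epoch), it can be turned into a readable timestamp as follows. The function name and the sample value are hypothetical.

```python
from datetime import datetime, timezone


def utc_seconds_to_datetime(value):
    """Convert an integer UTC time to a timezone-aware datetime.

    Assumption: the value is a count of seconds since the Unix epoch.
    """
    return datetime.fromtimestamp(value, tz=timezone.utc)


start = utc_seconds_to_datetime(1565740800)  # arbitrary example value
print(start.isoformat())  # 2019-08-14T00:00:00+00:00
```

Keeping the result timezone-aware avoids confusing it with SESS_START_TIME, which is the Integration Service's own timestamp.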


    PMERR_TRANS

    When the Integration Service encounters a transformation error, it inserts an entry into the PMERR_TRANS table. This table stores metadata, such as the name and datatype of the source and transformation ports.

    The following table describes the structure of the PMERR_TRANS table:

    Column Name           Datatype      Description
    REPOSITORY_GID        Varchar       Unique identifier for the repository.
    WORKFLOW_RUN_ID       Integer       Unique identifier for the workflow.
    WORKLET_RUN_ID        Integer       Unique identifier for the worklet. If a session is not part of a worklet, this value is 0.
    SESS_INST_ID          Integer       Unique identifier for the session.
    TRANS_MAPPLET_INST    Varchar       Specifies the instance of a mapplet.
    TRANS_NAME            Varchar       Name of the transformation where an error occurred.
    TRANS_GROUP           Varchar       Name of the input group or output group where an error occurred. Defaults to either input or output if the transformation does not have a group.
    TRANS_ATTR            Varchar       Lists the port names and datatypes of the input or output group where the error occurred. Port name and datatype pairs are separated by commas, for example: portname1:datatype, portname2:datatype. This value can span multiple rows. When the data exceeds 2000 bytes, the Integration Service creates a new row for the transformation attributes and writes the line number in the LINE_NO column.
    SOURCE_MAPPLET_INST   Varchar       Name of the mapplet in which the source resides.
    SOURCE_NAME           Varchar       Name of the source qualifier. n/a appears when a row error occurs downstream of an active source that is not a source qualifier or a non pass-through partition point with more than one partition.
    SOURCE_ATTR           Varchar       Lists the connected field(s) in the source qualifier where an error occurred. When an error occurs in multiple fields, each field name is entered on a new line. Writes the line number in the LINE_NO column.
    LINE_NO               Integer       Specifies the line number for each row error entry in TRANS_ATTR and SOURCE_ATTR that spans multiple rows.

    Note: Use the column names in bold to join tables.
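The TRANS_ATTR format described above (comma-separated portname:datatype pairs) is straightforward to split into pairs. The function below is a sketch with a hypothetical name. It assumes that neither port names nor datatypes contain commas, which may not hold for datatypes written with precision and scale.

```python
def parse_trans_attr(value):
    """Split 'port1:type1, port2:type2' into [(port, datatype), ...]."""
    ports = []
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma
        name, _, datatype = pair.partition(":")
        ports.append((name.strip(), datatype.strip()))
    return ports


print(parse_trans_attr("portname1:datatype, portname2:datatype"))
# [('portname1', 'datatype'), ('portname2', 'datatype')]
```

When the value spans multiple rows, the pieces would need to be joined in LINE_NO order before parsing.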

    Understanding the Error Log File


    You can create an error log file to collect all errors that occur in a session. This error log file is a column delimited line sequential file. By specifying a unique error log file name, you can create a separate log file for each session in a workflow. When you want to analyze the row errors for one session, use an error log file.

    In an error log file, double pipes || delimit error logging columns. By default, pipe | delimits row data. You can change this row data delimiter by setting the Data Column Delimiter error log option.

    Error log files have the following structure:

    [Session Header]

    [Column Header]

    [Column Data]

    Session header contains session run information similar to the information stored in the PMERR_SESS table. Column header contains data column names. Column data contains row data and error message information.

    The following sample error log file contains a session header, column header, and column data:

    **********************************************************************

    Repository GID: fe4817ab-7d87-465f-9110-354222424df0

    Repository: CustomerInfo

    Folder: Row_Error_Logging

    Workflow: wf_basic_REL_errors_AGG_case

    Session: s_m_basic_REL_errors_AGG_case

    Mapping: m_basic_REL_errors_AGG_case

    Workflow Run ID: 1310

    Worklet Run ID: 0

    Session Instance ID: 19

    Session Start Time: 08/03/2004 16:57:01

    Session Start Time (UTC): 1067126221

    **********************************************************************


    Transformation||Transformation Mapplet Name||Transformation Group||Partition Index||Transformation Row ID||Error Sequence||Error Timestamp||Error UTC Time||Error Code||Error Message||Error Type||Transformation Data||Source Mapplet Name||Source Name||Source Row ID||Source Row Type||Source Data

    agg_REL_basic||N/A||Input||1||1||1||08/03/2004 16:57:03||1067126223||11019||Port [CUST_ID_NULL]: Default value is: ERROR([ERROR]: [AGG] CUST_ID - NULL detected on input.\n... nl:ERROR(s:'[AGG] CUST_ID - NULL detected on input.')).||3||D:1221|N:|N:|N:|D:Kauai Dive Shoppe|D:4-976 Sugarloaf Hwy|D:Kapaa Kauai|D:HI|D:94766|D:[AGG] DEFAULT SID VALUE.|D:01/01/2001 00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||1||0||D:1221|D:Kauai Dive Shoppe|D:4-976 Sugarloaf Hwy|D:Kapaa Kauai|D:HI|D:94766

    agg_REL_basic||N/A||Input||1||4||1||08/03/2004 16:57:03||1067126223||11019||Port [CITY_IN]: Default value is: ERROR([ERROR]: [AGG] Null detected for City_IN.\n... nl:ERROR(s:'[AGG] Null detected for City_IN.')).||3||D:1354|N:|N:|D:1354|T:Cayman Divers World|D:PO Box 541|N:|D:Gr|N:|D:[AGG] DEFAULT SID VALUE.|D:01/01/2001 00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||4||0||D:1354|D:Cayman Divers World Unlim|D:PO Box 541|N:|D:Gr|N:

    agg_REL_basic||N/A||Input||1||5||1||08/03/2004 16:57:03||1067126223||11131||Transformation [agg_REL_basic] had an error evaluating variable column [Var_Divide_by_Price]. Error message is [ [/]: divisor is zero\n... f:(f:2 / f:(f:1 - f:TO_FLOAT(i:1)))].||3||D:1356|N:|N:|D:1356|T:Tom Sawyer Diving C|T:632-1 Third Frydenh|D:Christiansted|D:St|D:00820|D:[AGG] DEFAULT SID VALUE.|D:01/01/2001 00:00:00||mplt_add_NULLs_to_QACUST3||SQ_QACUST3||5||0||D:1356|D:Tom Sawyer Diving Centre|D:632-1 Third Frydenho|D:Christiansted|D:St|D:00820
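A file laid out like the sample above can be split into named fields by breaking each line on the double pipe. The sketch below is a minimal illustration with hypothetical function names and a shortened, made-up header and row. It assumes that no field value itself contains the || sequence and that the row data delimiter is the default single pipe.

```python
LOG_COLUMN_DELIMITER = "||"


def parse_error_log_rows(header_line, data_lines):
    """Map each data line onto the column names from the header line."""
    names = [name.strip() for name in header_line.split(LOG_COLUMN_DELIMITER)]
    records = []
    for line in data_lines:
        values = line.rstrip("\n").split(LOG_COLUMN_DELIMITER)
        if len(values) != len(names):
            raise ValueError(
                f"expected {len(names)} columns, found {len(values)}")
        records.append(dict(zip(names, values)))
    return records


# Shortened header and row, for illustration only.
header = "Transformation||Error Code||Error Message||Transformation Data"
lines = ["agg_REL_basic||11019||NULL detected on input.||D:1221|N:|D:HI"]
records = parse_error_log_rows(header, lines)
print(records[0]["Error Code"])           # 11019
print(records[0]["Transformation Data"])  # D:1221|N:|D:HI
```

Splitting on || first and on | second is what makes the two-level delimiter scheme workable. It is also why the row data delimiter should differ from the column delimiter.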

    The following table describes the columns in an error log file:

    Log File Column Header        Description
    Transformation                Name of the transformation used by a mapping where an error occurred.
    Transformation Mapplet Name   Name of the mapplet that contains the transformation. n/a appears when this information is not available.
    Transformation Group          Name of the input or output group where an error occurred. Defaults to either input or output if the transformation does not have a group.
    Partition Index               Specifies the partition number of the transformation partition where an error occurred.
    Transformation Row ID         Specifies the row ID for the error row.
    Error Sequence                Counter for the number of errors per row in each transformation group. If a session has multiple partitions, the Integration Service maintains this counter for each partition. For example, if a transformation generates three errors in partition 1 and two errors in partition 2, ERROR_SEQ_NUM generates the values 1, 2, and 3 for partition 1, and values 1 and 2 for partition 2.
    Error Timestamp               Timestamp of the Integration Service when the error occurred.
    Error UTC Time                Coordinated Universal Time, called Greenwich Mean Time, when the error occurred.
    Error Code                    Error code that corresponds to the error message.
    Error Message                 Error message.
    Error Type                    Type of error that occurred. The Integration Service uses the following values: 1 - Reader error, 2 - Writer error, 3 - Transformation error.
    Transformation Data           Delimited string containing all column data, including the column indicator. Column indicators are: D - valid, O - overflow, N - null, T - truncated, B - binary, U - data unavailable. The fixed delimiter between column data and column indicator is a colon ( : ). The delimiter between the columns is a pipe ( | ). You can override the column delimiter in the error handling settings. The Integration Service converts all column data to text string in the error file. For binary data, the Integration Service uses only the column indicator.
    Source Name                   Name of the source qualifier. N/A appears when a row error occurs downstream of an active source that is not a source qualifier or a non pass-through partition point with more than one partition.
    Source Row ID                 Value that the source qualifier assigns to each row it reads. If the Integration Service cannot identify the row, the value is -1.
    Source Row Type               Row indicator that tells whether the row was marked for insert, update, delete, or reject. 0 - Insert, 1 - Update, 2 - Delete, 3 - Reject.
    Source Data                   Delimited string containing all column data, including the column indicator. Column indicators are: D - valid, O - overflow, N - null, T - truncated, B - binary, U - data unavailable. The fixed delimiter between column data and column indicator is a colon ( : ). The delimiter between the columns is a pipe ( | ). You can override the column delimiter in the error handling settings. The Integration Service converts all column data to text string in the error table or error file. For binary data, the Integration Service uses only the column indicator.
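The Transformation Data and Source Data strings pair each value with a one-letter indicator. A small parser, sketched here with hypothetical names, can turn such a string into (indicator meaning, value) pairs. It assumes the default pipe delimiter and that no value contains the delimiter. Values may contain colons, because only the first colon in each field separates the indicator from the data.

```python
# Indicator letters and meanings as listed in the column descriptions above.
INDICATORS = {
    "D": "valid",
    "O": "overflow",
    "N": "null",
    "T": "truncated",
    "B": "binary",
    "U": "data unavailable",
}


def parse_row_data(row_data, column_delimiter="|"):
    """Return [(indicator meaning, value), ...] for one delimited row string."""
    columns = []
    for field in row_data.split(column_delimiter):
        indicator, _, value = field.partition(":")  # split on the first colon only
        if indicator not in INDICATORS:
            raise ValueError(f"unknown column indicator {indicator!r}")
        columns.append((INDICATORS[indicator], value))
    return columns


parsed = parse_row_data("D:1221|N:|T:Cayman Divers World|D:01/01/2001 00:00:00")
for meaning, value in parsed:
    print(meaning, repr(value))
```

Pass a different column_delimiter if the Data Column Delimiter option was changed for the session.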

    Configuring Error Log Options

    You configure error logging for each session on the Config Object tab of the session properties. When you enable error logging, you can choose to create the error log in a relational database or as a flat file. If you do not enable error logging, the Integration Service does not create an error log.

    Tip: Use the Workflow Manager to create a reusable set of attributes for the Config Object tab.

    To configure error logging options:

    1. Double-click the Session task to open the session properties.
    2. Select the Config Object tab.
    3. Specify the error log type.

    The following table describes the error logging settings of the Config Object tab:

    Error Log Options             Description
    Error Log Type                Specifies the type of error log to create. You can specify relational database, flat file, or none. By default, the Integration Service does not create an error log.
    Error Log DB Connection       Specifies the database connection for a relational log. This option is required when you enable relational database logging.
    Error Log Table Name Prefix   Specifies the table name prefix for relational logs. The Integration Service appends 11 characters to the prefix name. Oracle and Sybase have a 30 character limit for table names. If a table name exceeds 30 characters, the session fails. You can use a parameter or variable for the error log table name prefix. Use any parameter or variable type that you can define in the parameter file.
    Error Log File Directory      Specifies the directory where errors are logged. By default, the error log file directory is $PMBadFilesDir\. This option is required when you enable flat file logging.
    Error Log File Name           Specifies error log file name. The character limit for the error log file name is 255. By default, the error log file name is PMError.log. This option is required when you enable flat file logging.
    Log Row Data                  Specifies whether or not to log transformation row data. When you enable error logging, the Integration Service logs transformation row data by default. If you disable this property, n/a or -1 appears in transformation row data fields.
    Log Source Row Data           If you choose not to log source row data, or if source row data is unavailable, the Integration Service writes an indicator such as n/a or -1, depending on the column datatype. If you do not need to capture source row data, consider disabling this option to increase Integration Service performance.
    Data Column Delimiter         Delimiter for string type source row data and transformation group row data. By default, the Integration Service uses a pipe ( | ) delimiter. Verify that you do not use the same delimiter for the row data as the error logging columns. If you use the same delimiter, you may find it difficult to read the error log file.

    4. Click OK.
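The dependencies among the options in the table above (which settings are required for which log type, the file name limit, and the delimiter caution) can be summarized as a small validation routine. This is purely illustrative: the dictionary keys and the function name are hypothetical and do not correspond to any PowerCenter API.

```python
# Options the table marks as required for each error log type.
REQUIRED_OPTIONS = {
    "relational": ["db_connection"],
    "flat_file": ["file_directory", "file_name"],
    "none": [],
}
MAX_FILE_NAME_LENGTH = 255
LOG_COLUMN_DELIMITER = "||"


def validate_error_log_options(options):
    """Return a list of problems found in a dict of error log settings."""
    log_type = options.get("error_log_type", "none")  # no error log by default
    if log_type not in REQUIRED_OPTIONS:
        return [f"unknown error log type: {log_type!r}"]
    problems = []
    for name in REQUIRED_OPTIONS[log_type]:
        if not options.get(name):
            problems.append(f"{name} is required for {log_type} logging")
    if len(options.get("file_name", "")) > MAX_FILE_NAME_LENGTH:
        problems.append("file_name exceeds 255 characters")
    if options.get("data_column_delimiter", "|") == LOG_COLUMN_DELIMITER:
        problems.append("data_column_delimiter matches the error logging column delimiter")
    return problems


flat_file_settings = {
    "error_log_type": "flat_file",
    "file_directory": "$PMBadFilesDir",
    "file_name": "PMError.log",
}
print(validate_error_log_options(flat_file_settings))            # []
print(validate_error_log_options({"error_log_type": "relational"}))
# ['db_connection is required for relational logging']
```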