Informatica_Questions & Answers


  • 8/13/2019 Informatica_Questions & Answers


1 Can we have multiple conditions in a Filter?

Yes. We can place multiple conditions in one filter condition, but we cannot give multiple conditions in separate groups (the Filter transformation has no groups).

2 How are the flags coded in the Update Strategy?

DD_INSERT = 0, DD_UPDATE = 1, DD_DELETE = 2, DD_REJECT = 3
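These flag codes behave like plain integer constants. A minimal Python sketch of how an Update Strategy expression might flag rows (illustrative names only, not Informatica code):

```python
# Integer codes mirroring Informatica's update-strategy constants.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(row):
    """Flag a row the way an Update Strategy expression might:
    reject rows without a key, update known rows, insert new ones."""
    if row.get("id") is None:
        return DD_REJECT
    return DD_UPDATE if row.get("exists_in_target") else DD_INSERT
```

The `exists_in_target` field stands in for a lookup result; in a real mapping that check usually comes from a Lookup transformation.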

3 What different things can you do using pmcmd?

pmcmd is a command-line program used to communicate with the Informatica server. Using pmcmd you can start and stop workflows and sessions without opening the Workflow Manager.

4 What kind of test plan? What kind of validation do you do?

5 What is the usage of unconnected/connected lookup?

Lookup is used to get a related value from the source or target. There are two types of lookups: connected and unconnected. If only a single return value is needed, we can go for an unconnected lookup. If we want to return multiple columns, we need to use a connected lookup. A dynamic cache and user-defined default values are supported only in a connected lookup.

6 What is the difference between Connected and Unconnected Lookups?

Connected lookup:
- receives input from the pipeline
- can use a dynamic or static cache
- can return multiple columns as output
- if there is no match for the lookup condition, the PowerCenter server returns the default value; we can define user-defined default values

Unconnected lookup:
- receives input from the result of a :LKP expression in another transformation
- uses a static cache only
- returns a single value through its return port
- if there is no match for the lookup condition, the PowerCenter server returns NULL; we cannot define user-defined default values
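The behavioural difference can be sketched in Python, with a plain dict standing in for the lookup cache (table and column names are made up):

```python
# A tiny stand-in for a cached lookup table, keyed on customer id.
LOOKUP_CACHE = {101: {"name": "Ann", "city": "Oslo"}}

def unconnected_lookup(key):
    """Like a :LKP call: one return value, NULL (None) when no match."""
    row = LOOKUP_CACHE.get(key)
    return row["name"] if row else None

def connected_lookup(key):
    """A connected lookup can return several columns and substitute
    user-defined default values when the condition finds no match."""
    default_row = {"name": "N/A", "city": "N/A"}  # user-defined defaults
    return LOOKUP_CACHE.get(key, default_row)
```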

7 If you have data coming from different sources, what transformation will you use in your Designer?

For heterogeneous sources we need to use the Joiner transformation.

8 What are the different ports in Informatica?

Input, Output, Input/Output, Variable, Group By, Rank, and Lookup ports


9 What is a Variable port? Why is it used?

A variable port is used within a transformation; it is local to that particular transformation. We can store temporary results in it to perform calculations, and its value is retained from row to row for the duration of the session run.
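The row-to-row behaviour of a variable port can be sketched in Python (column names are illustrative):

```python
def expression_with_variable_port(rows):
    """v_total plays the role of a variable port: local to this
    'transformation' and carried over from one row to the next."""
    v_total = 0
    out = []
    for row in rows:
        v_total += row["amount"]                       # variable-port expression
        out.append({**row, "running_total": v_total})  # output port
    return out
```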

10 What is the difference between Active and Passive transformations?

An Active transformation can change the number of rows that pass through it, e.g. Source Qualifier, Filter, Router, Joiner, Aggregator, Union, Update Strategy. A Passive transformation does not change the number of rows that pass through it, e.g. Lookup, Sequence Generator, Stored Procedure, External Procedure, Expression.

11 What is a Mapplet?

A mapplet is a set of transformations that can be reused.

12 What is an Aggregator transformation?

The Aggregator transformation allows us to perform calculations such as averages and sums. It differs from the Expression transformation in that you can use aggregate functions and group-by ports; the Expression transformation performs calculations on a row-by-row basis. The PowerCenter server performs aggregate calculations as it reads, and stores the necessary group and row data in an aggregate cache. We can improve performance by using the Sorted Input option; if we use sorted input, the data must already be sorted on the group-by ports when it reaches the Aggregator.

13 What is the Router transformation? How is it different from the Filter transformation?

The Router transformation is similar to the Filter transformation because both are used to test a condition. A Filter transformation tests the data for one condition and drops the rows that do not meet the condition. A Router transformation can test data for one or more conditions and gives you the option to route rows that do not meet any of the conditions to a default output group. A Router transformation contains one input group, one or more output groups, and one default group.
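The contrast can be sketched in Python: a single-condition filter that drops rows versus multi-group routing with a default group (group names here are made up):

```python
def filter_rows(rows, condition):
    """Filter: a single condition; rows that fail it are simply dropped."""
    return [r for r in rows if condition(r)]

def route_rows(rows, group_conditions):
    """Router: each row is tested against every group condition and goes
    to every group it matches; rows matching none go to DEFAULT."""
    groups = {name: [] for name in group_conditions}
    groups["DEFAULT"] = []
    for r in rows:
        matched = False
        for name, condition in group_conditions.items():
            if condition(r):
                groups[name].append(r)
                matched = True
        if not matched:
            groups["DEFAULT"].append(r)
    return groups
```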

14 What are connected and unconnected transformations?

A connected transformation is connected to other transformations in the pipeline. An unconnected transformation is not connected to other transformations in the mapping; it is called from within another transformation and returns a value to that transformation.

15 What is the Normalizer transformation?

Normalization is the process of organizing data. In database terms, this includes creating normalized tables and establishing relationships between those tables according to rules designed both to protect the data and to make the database more flexible by eliminating redundancy. The Normalizer transformation normalizes records from COBOL and relational sources; use it in place of a Source Qualifier for COBOL sources, or to pivot multiple-occurring columns from relational sources into separate rows.

16 How do you use a sequence created in Oracle in Informatica?

1. We can call it in the Source Qualifier transformation's SQL override (sequence_name.NEXTVAL).
2. By using a Stored Procedure transformation we can get the value of the sequence.


17 What are Source Qualifier transformations?

When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier transformation represents the rows that the Integration Service reads when it runs a session.

18 What are caches and their types in Informatica?

The PowerCenter server builds a cache in memory for the Rank, Aggregator, Joiner, Lookup, and Sorter transformations in a mapping. It allocates memory for the cache based on the amount defined in the transformation or session properties. Types of caches: index and data caches, and lookup caches (static, dynamic, shared, persistent, recache from source).

19 What is incremental aggregation?

Incremental aggregation is used with the Aggregator transformation. Once the Aggregator transformation is placed in the mapping, we check the Incremental Aggregation property in the session properties so that the data is aggregated incrementally. The first time you run an incremental aggregation session, the Integration Service processes the entire source.

At the end of the session, the Integration Service stores the aggregated data in two cache files, the index and data cache files, saved in the cache file directory. The next time you run the session, the Integration Service aggregates the new rows with the cached aggregated values in the cache files.

When you run a session with an incremental Aggregator transformation, the Integration Service creates a backup of the Aggregator cache files in $PMCacheDir at the beginning of the session run. It promotes the backup cache to the initial cache at the beginning of a session recovery run. The Integration Service cannot restore the backup cache file if the session aborts.
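The merge of new rows into previously cached aggregates can be sketched in Python (sum and count per department is an assumed example, not the actual cache-file format):

```python
def incremental_aggregate(cache, new_rows):
    """Merge new source rows into the aggregate values that were saved
    in the cache at the end of the previous run."""
    for row in new_rows:
        total, count = cache.get(row["dept"], (0, 0))
        cache[row["dept"]] = (total + row["salary"], count + 1)
    return cache
```

Only the new rows are read on each run; the historical totals come from the cache.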

20 What is reject loading?

By default, the Integration Service process creates a reject file for each target in the session. The reject file contains rows of data that the writer does not write to targets.

The writer may reject a row in the following circumstances:
- It is flagged for reject by an Update Strategy or Custom transformation.
- It violates a database constraint, such as a primary key constraint.
- A field in the row was truncated or overflowed, and the target database is configured to reject truncated or overflowed data.

By default, the Integration Service process saves the reject file in the directory entered for the service process variable $PMBadFileDir in the Workflow Manager, and names the reject file target_table_name.bad.

Note: If you enable row error logging, the Integration Service process does not create a reject file.


21 What are sessions and batches?

Session - a session is a set of instructions that tells the Informatica server how and when to move data from sources to targets. After creating the session, we can use either the Server Manager or the command-line program pmcmd to start or stop the session.

Batches - a batch provides a way to group sessions for either serial or parallel execution by the Informatica server. There are two types of batches: sequential and concurrent.

22 Significance of the Source Qualifier transformation

When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier transformation represents the rows that the Integration Service reads when it runs a session. We can do the following tasks using the SQ transformation:
1. Join data coming from the same database
2. Filter the rows when the Informatica server reads the source data

23 What are the 2 modes of data movement in the Informatica Server?

ASCII and Unicode

24 Why do we use Lookup transformations?

By using a Lookup transformation we can do the following:
1. Get a related value
2. Perform a calculation
3. Update slowly changing dimension tables - we can use a Lookup transformation to determine whether records already exist in the target or not.

25 What are conformed dimensions?

Conformed dimensions can be shared by multiple fact tables.

26 What is data warehousing?

A data warehouse is a relational database designed for query and analysis. It contains historical data derived from transaction data, and it can also include data from other sources.

27 What is a reusable transformation? What is a mapplet? Explain the difference between them.

A reusable transformation is a single transformation that can be used in any other mapping. A mapplet is a reusable object that contains multiple transformations: the set of transformation logic embedded in the mapplet can be used as reusable logic in any number of mappings.


28 What happens when you use the delete, update, reject, or insert statement in your Update Strategy?

It performs the action specified in the Update Strategy transformation, and it also depends on the Treat Source Rows As option in the session properties.

29 Where do you define users and privileges in Informatica?

In the Repository Manager.

30 When you run the session, does the debugger load the data to the target?

It will load the data, but if we select the Discard Target Data option it will not load.

31 Can you use a flat file and a table (relational) as sources together?

    Yes.

32 Suppose I need to separate the data for delete and insert to the target depending on a condition; which transformation do you use?

Router or Filter

33 What is the difference between the lookup data cache and index cache?

Data cache - output column data other than the condition columns
Index cache - the condition columns

34 What is an indicator file and how can it be used?

This is one of the output files of Informatica, generated when the session runs. If you use a flat file as a target, you can configure the Integration Service to create an indicator file for target row type information. For each target row, the indicator file contains a number to indicate whether the row was marked for insert, update, delete, or reject. The Integration Service process names this file target_name.ind and stores it in the same directory as the target file.

35 What is a Filter transformation? What options do you have in a Filter transformation?

The Filter transformation is used to filter the data based on the condition specified in the filter condition value.

36 What happens to the discarded rows in a Filter transformation?

The discarded rows are ignored (dropped) by the Informatica server; they are not written to the reject file.


37 What are the two programs that communicate with the Informatica Server?

Informatica provides the Server Manager and pmcmd programs to communicate with the Informatica Server:

Server Manager - a client application used to create and manage sessions and batches, and to monitor and stop the Informatica Server. You can use information provided through the Server Manager to troubleshoot sessions and improve session performance.

pmcmd - a command-line program that allows you to start and stop sessions and batches, stop the Informatica Server, and verify whether the Informatica Server is running.

38 What can you do with the Designer?

    The Designer has tools to help you build mappings and mapplets so you can specify how to move andtransform data between sources and targets. The Designer helps you create source definitions, targetdefinitions, and transformations to build the mappings.

39 What are the different types of tracing levels you have in transformations?

Normal: the Integration Service logs initialization and status information, errors encountered, and skipped rows due to transformation row errors. It summarizes session results, but not at the level of individual rows.

Terse: the Integration Service logs initialization information, error messages, and notification of rejected data.

Verbose Initialization: in addition to normal tracing, the Integration Service logs additional initialization details, the names of index and data files used, and detailed transformation statistics.

Verbose Data: in addition to verbose initialization tracing, the Integration Service logs each row that passes into the mapping and provides detailed transformation statistics. When you configure the tracing level to Verbose Data, the Integration Service writes row data for all rows in a block when it processes a transformation.

40 What is a Mapplet and how do you create a Mapplet?

A mapplet is reusable transformation logic. We create mapplets in the Mapplet Designer.

41 If the data source is in the form of an Excel spreadsheet, how do you use it?

1. Install the Microsoft Excel ODBC driver.
2. Create a data source (DSN) for the driver.
3. Define named ranges in the Excel sheet and set the datatypes for all the columns.
4. Import the Excel source into the Source Analyzer.


42 When do you use a connected lookup and when do you use an unconnected lookup?

A connected Lookup transformation is part of the mapping pipeline; by using it we can receive multiple return values. An unconnected Lookup transformation is separate from the flow; we call it in an Expression transformation using the :LKP qualifier, and we can reuse the same lookup in multiple transformations.

43 How many values does the Informatica server return when it passes through a connected lookup and an unconnected lookup?

An unconnected lookup returns a single value through its return port; a connected lookup can return one or more output values.

44 What kind of modifications can you do/perform with each transformation?

Expression - performs calculations
Aggregator - finds aggregate values, nested aggregates
Filter - filters the records
Router - filters on multiple conditions
Stored Procedure - calls an Oracle stored procedure

45 Expressions in transformations - explain briefly how you use them.

Use the Expression transformation to calculate values in a single row before you write to the target. For example, you might need to adjust employee salaries, concatenate first and last names, or convert strings to numbers. Use the Expression transformation to perform any non-aggregate calculations. You can also use the Expression transformation to test conditional statements before you output the results to target tables or other transformations.

46 In case a flat file (which comes through FTP as the source) has not arrived, what happens?

We will get a fatal error and the session will fail.


47 What does a Load Manager do?

The Load Manager is the primary Informatica Server process (in grid versions, a component of the Integration Service that dispatches Session, Command, and predefined Event-Wait tasks across nodes). It performs the following tasks:
- Manages session and batch scheduling.
- Locks the session and reads session properties.
- Reads the parameter file.
- Expands the server and session variables and parameters.
- Verifies permissions and privileges.
- Validates source and target code pages.
- Creates the session log file.

48 What is a cache?

It stores temporary results while running the session.

49 What is an Expression transformation?

The Expression transformation is a passive transformation used to perform row-level (non-aggregate) calculations.

50 I have two sources, S1 having 100 records and S2 having 10000 records, and I want to join them using a Joiner transformation. Which of these two sources (S1, S2) should be the master to improve performance? Why?

The master should be S1. In general, the master table should contain fewer rows and the detail table more rows. The cache is built from the master table rows, and each detail record is then processed against that cache; with a smaller master the cache is smaller and performance improves.
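Why the smaller master helps can be sketched in Python: the master rows are cached once, then the detail rows stream past the cache (key names are illustrative):

```python
def joiner_normal(master_rows, detail_rows, key):
    """Build the cache from the (small) master source, then probe it
    once per (large) detail row - a normal (inner) join."""
    cache = {}
    for m in master_rows:
        cache.setdefault(m[key], []).append(m)
    joined = []
    for d in detail_rows:
        for m in cache.get(d[key], []):
            joined.append({**m, **d})
    return joined
```

Memory usage is proportional to the master side, which is why S1 (100 rows) should be the master.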

51 I have a source and I want to generate sequence numbers using mappings in Informatica, but I don't want to use a Sequence Generator transformation. Is there any other way?

1. We can use an Oracle sequence.
2. In an Expression transformation, declare a variable port and increment it by 1 for every row processed.

52 What is a bad file?

A bad file is a file that contains the rejected information.

53 What is the first column of the bad file?

The record/row indicator. Row indicator values: 0 - insert, 1 - update, 2 - delete, 3 - reject.
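Reading that indicator column can be sketched in Python (the .bad layout here is simplified to a comma-separated line for illustration; the real file also carries column indicators):

```python
ROW_INDICATOR = {0: "insert", 1: "update", 2: "delete", 3: "reject"}

def parse_bad_line(line):
    """The first column of a reject-file line is the row indicator."""
    indicator, _, rest = line.partition(",")
    return ROW_INDICATOR[int(indicator)], rest
```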

54 What are the contents of the cache directory on the server?

Data cache - output column data other than the condition columns
Index cache - the condition columns


55 Is Lookup an active transformation or a passive transformation?

    Passive

56 What is a Mapping?

A mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation. Mappings represent the data flow between sources and targets. When the Integration Service runs a session, it uses the instructions configured in the mapping to read, transform, and write data.

Every mapping must contain the following components:

Source definition - describes the characteristics of a source table or file.
Transformation - modifies data before writing it to targets; use different transformation objects to perform different functions.
Target definition - defines the target table or file.
Links - connect sources, targets, and transformations so the Integration Service can move the data as it transforms it.

57 What are the types of transformations?

Active and Passive. An active transformation can change the number of rows that pass through it; a passive transformation cannot change the number of rows that pass through it.

58 If a Sequence Generator (with an increment of 1) is connected to (say) 3 targets and each target uses the NEXTVAL port, what value will each target get?

If NEXTVAL reaches the 3 targets through the same downstream transformation (for example, one Expression transformation), all 3 targets receive the same value for each row. If NEXTVAL is connected to the 3 targets directly from the Sequence Generator, each target receives different values, spaced apart in steps of 3 (e.g. 1, 4, 7, ...).

59 Have you used the ABORT and DECODE functions?

ABORT can be used to stop processing at that row. Generally, you use ABORT within an IIF or DECODE function to set rules for aborting a session.


60 What do you know about the Informatica server architecture? (Load Manager, DTM, Reader, Writer, Transformer)

The Load Manager is the first process started when the session runs. It manages session and batch scheduling, locks the session and reads its properties, reads the parameter file, validates the parameter values and session variables, creates the session logs, and creates the DTM process.

The DTM process is the second process associated with the session run. Its primary purpose is to create and manage the threads that carry out the session tasks - the reader, transformer, and writer threads. The DTM allocates process memory for the session and divides it into buffers; this is also known as buffer memory. It creates the main thread, called the master thread, which creates and manages all other threads.

61 What are the default values for variables?

String - empty string, Numeric - 0, Date - 1/1/1753

62 How many ways can you filter the records?

Using the Filter, Router, Source Qualifier, Rank, or Update Strategy transformations.


63 How do you identify the bottlenecks in mappings?

We should look for performance bottlenecks in the following order: target, source, mapping, session, system.

Identifying target bottlenecks: The most common performance bottleneck occurs when the Informatica Server writes to a target database. You can identify target bottlenecks by configuring the session to write to a flat file target. If the session performance increases significantly when you write to a flat file, you have a target bottleneck. If your session already writes to a flat file target, you probably do not have a target bottleneck; you can optimize session performance by writing to a flat file target local to the Informatica Server. Causes for a target bottleneck may include small checkpoint intervals, a small database network packet size, or problems during heavy loading operations.

Identifying source bottlenecks: If the session reads from a flat file source, we probably do not have a source bottleneck; you can improve session performance by setting the number of bytes the Informatica server reads per line. If the session reads from a relational source, we can use a Filter transformation, a read test mapping, or a database query to identify source bottlenecks.

Using a read test session - use the following steps to create a read test mapping:
1. Make a copy of the original mapping.
2. In the copied mapping, keep only the sources, source qualifiers, and any custom joins or queries.
3. Remove all other transformations.
4. Connect the source qualifiers to a file target.

Use the read test mapping in a test session. If the test session performance is similar to the original session, you have a source bottleneck.


64 How do you improve session performance?

If we do not have a source, target, or mapping bottleneck, we may have a session bottleneck. You can identify a session bottleneck by using the performance details. The Integration Service creates performance details when you enable Collect Performance Data in the Performance settings of the session properties.

Performance details display information about each transformation. All transformations have some basic counters that indicate the number of input rows, output rows, and error rows. Small cache sizes, low buffer memory, and small commit intervals can cause session bottlenecks. To improve performance:
1. Implement partitioning at the session level.
2. Increase the cache sizes and commit intervals.
3. Run concurrent sessions.
4. Optimize the transformations.
5. Reduce the error records.

65 What are Business Components? Where do they exist?

Business Components are available inside a repository folder (the Business Components node in the Designer's Navigator).

66 What are shortcuts? Where are they used?

We create shortcuts by dragging an object from one folder to another; the source folder must be shared. After dragging the object we can rename it, but we cannot change anything else. Whenever the main object changes, the changes are inherited by the shortcut as well.

67 While importing a relational source definition from a database, what metadata of the source do you import?

Source name, column names, datatypes, constraints, and database name

68 How many ways can you update a relational source definition, and what are they?

1. We can reimport the source definition.
2. Manually edit the definition by adding new columns or updating the existing columns.


69 What are the unsupported repository objects for a mapplet?

1. Normalizer transformations
2. XML transformations
3. XML targets
4. COBOL sources
5. Other targets
6. Other mapplets
7. Pre- and post-session stored procedures
8. Joiner transformations
9. Non-reusable sequence generators

70 What are mapping parameters and mapping variables?

A mapping parameter represents a constant value that you can define before running a session; it retains the same value throughout the entire session. When you use a mapping parameter, you declare and use the parameter in a mapping or mapplet, then define the value of the parameter in a parameter file for the session.

Unlike a mapping parameter, a mapping variable represents a value that can change throughout the session. The Informatica server saves the value of a mapping variable to the repository at the end of the session run and uses that value the next time you run the session.

71 Can you use the mapping parameters or variables created in one mapping in another mapping?

No. We can only use them if the same mapping parameters and variables are also created in the other mapping.

72 Can you use the mapping parameters or variables created in one mapping in any other reusable transformation?

Yes.

73 How can you improve session performance in an Aggregator transformation?

By using the Sorted Input option.


74 What are the differences between the Joiner transformation and the Source Qualifier transformation?

The Joiner transformation is used to join heterogeneous sources; the Source Qualifier is used to join tables in the same database.

75 In which conditions can we not use a Joiner transformation (limitations of the Joiner transformation)?

1. If either input pipeline comes from an Update Strategy transformation.
2. If either input pipeline is connected directly to a Sequence Generator transformation.

76 What are the settings that you use to configure the Joiner transformation?

- Join type: Normal, Master Outer, Detail Outer, Full Outer
- Join condition
- Master and detail source designation

77 What are the join types in the Joiner transformation?

Normal, Master Outer, Detail Outer, Full Outer

78 How does the Informatica server sort string values in the Rank transformation?

If the data movement mode is ASCII, string comparison uses the binary sort order. If the data movement mode is Unicode, string comparison uses the sort order defined in the session properties.

79 What is the rank index in the Rank transformation?

The Designer automatically creates a RANKINDEX port for each Rank transformation. The PowerCenter server uses the rank index port to store the ranking position for each row in a group. It is an output port only.
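The per-group ranking with a RANKINDEX output can be sketched in Python (top-N salaries per department is an assumed example):

```python
from collections import defaultdict

def rank_transformation(rows, group_key, rank_port, top=3):
    """Per group, keep the top-N rows by rank_port and emit a
    RANKINDEX output column (1 = highest value in the group)."""
    groups = defaultdict(list)
    for r in rows:
        groups[r[group_key]].append(r)
    out = []
    for members in groups.values():
        members.sort(key=lambda r: r[rank_port], reverse=True)
        out.extend({**r, "RANKINDEX": i}
                   for i, r in enumerate(members[:top], start=1))
    return out
```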


80 What is the Router transformation?

The Router transformation is similar to the Filter transformation because both are used to test a condition. A Filter transformation tests the data for one condition and drops the rows that do not meet the condition. A Router transformation can test data for one or more conditions and gives you the option to route rows that do not meet any of the conditions to a default output group. A Router transformation contains one input group, one or more output groups, and one default group.

81 What are the types of groups in the Router transformation?

1. Input group
2. Output groups - two types: user-defined groups and the default group (we cannot change the default group)

82 Why do we use the Stored Procedure transformation?

To call stored procedures in the database from a mapping; the transformation runs in either connected or unconnected mode.

83 What are the types of data that pass between the Informatica server and a stored procedure?

We can send data to the stored procedure and also get data back from it. There are three types of data that pass between the Informatica server and the stored procedure: input/output values, return value, and status code.

84 What is the status code?

Status codes provide error handling for the Integration Service during a workflow. The stored procedure issues a status code that notifies whether or not the stored procedure completed successfully. You cannot see this value; the Integration Service uses it to determine whether to continue running the session or stop. You configure options in the Workflow Manager to continue or stop the session in the event of a stored procedure error.


85 What are the tasks that the Source Qualifier performs?

When you add a relational or a flat file source definition to a mapping, you need to connect it to a Source Qualifier transformation. The Source Qualifier transformation represents the rows that the Integration Service reads when it runs a session. We can do the following tasks using the SQ transformation:
1. Join data coming from the same database
2. Filter the rows when the Informatica server reads the source data
3. Use an outer join instead of a normal join
4. Specify sorted order

86 What is the default join that the Source Qualifier provides?

Normal (equijoin).

87 What are the basic needs to join two sources in a Source Qualifier?

The two sources should come from the same database, and the datatypes of the columns in the join condition must match.

88 What is the Update Strategy transformation?

We use the Update Strategy transformation to handle changes to existing rows. When you design a data warehouse, you need to decide what type of information to store in targets; as part of the target table design, you need to determine whether to maintain all the historic data or just the most recent changes.

89 Describe the two levels at which the Update Strategy transformation is set.

Mapping level and session level.
Mapping level - we use the Update Strategy transformation to flag rows for insert, update, delete, or reject.
Session level - we set the Treat Source Rows As property to Insert, Update, Delete, or Data Driven.

90 What is Data Driven?

If the mapping for the session contains an Update Strategy transformation, this field is marked Data Driven by default. If you do not choose Data Driven when a mapping contains an Update Strategy, the Workflow Manager displays a warning; when you run the session, the Integration Service does not follow the instructions in the Update Strategy in the mapping to determine how to flag rows.

91 What are the options in the target session properties for the Update Strategy transformation?

Insert - select this option to insert a row into a target table.
Delete - select this option to delete a row from a table.
Update - you have the following options in this situation:
  Update as Update - update each row flagged for update if it exists in the target table.
  Update as Insert - insert each row flagged for update.
  Update else Insert - update the row if it exists; otherwise, insert it.
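The three update options can be sketched in Python, with the target as a list of keyed rows (names illustrative):

```python
def apply_update(target, row, mode):
    """Apply one row flagged for update under the given target option."""
    exists = any(r["id"] == row["id"] for r in target)
    if mode == "update_as_update":
        if exists:                       # update only when the key exists
            target[:] = [row if r["id"] == row["id"] else r for r in target]
    elif mode == "update_as_insert":
        target.append(row)               # always insert a new row
    elif mode == "update_else_insert":
        if exists:
            target[:] = [row if r["id"] == row["id"] else r for r in target]
        else:
            target.append(row)
    return target
```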

92 What are the mappings that we use for slowly changing dimension tables?

Type 1, Type 2, and Type 3 mappings


93 What are the different types of Type 2 dimension mappings?

Type 2 Version Data mapping, Type 2 Flag Current mapping, and Type 2 Effective Date Range mapping.

94 How can you recognize whether or not the newly added rows in the source get inserted in the target?

We can check by using the version number, flag, or effective date of the particular records.

95 What are the two types of processes that Informatica runs the session with?

The Load Manager process and the DTM process. (On a grid, the Load Balancer dispatches the Session, Command, and predefined Event-Wait tasks within the workflow: it matches task requirements with resource availability to identify the best node to run a task, and dispatches the task to an Integration Service process running on that node; it may dispatch tasks to a single node or across nodes.)

96 Can you generate reports in Informatica?

Yes.

97 What is the Metadata Reporter?

We can generate reports on repository metadata by using the Metadata Reporter.

98 Define mapping and sessions.

Mapping - a mapping is a set of source and target definitions linked by transformation objects that define the rules for data transformation. Mappings represent the data flow between sources and targets. When the Integration Service runs a session, it uses the instructions configured in the mapping to read, transform, and write data.
Session - a session is a set of instructions that tells the Informatica server how and when to move data from sources to targets.

    99 Which tool do you use to create and manage sessions and batches, and to monitor and stop the Informatica server?

    Informatica Server Manager.

    100 Why do we use session partitioning in Informatica?

    To increase session performance.

    101 To achieve session partitioning, what are the necessary tasks you have to do?

    When running sessions, the PowerCenter Server can achieve high performance by partitioning the pipeline and performing the extract, transformation, and load for each partition in parallel. To accomplish this, use the following session and server configuration:
    1. Configure the session with multiple partitions.
    2. Install the PowerCenter Server on a machine with multiple CPUs.

    You can configure the partition type at most transformations in the pipeline. The PowerCenter Server can partition data using round-robin, hash, key-range, and database partitioning.
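    Two of the partition types named above can be sketched in a few lines. This is an illustrative model (function names are mine, not PowerCenter's): round-robin spreads rows evenly across partitions, while hash partitioning keeps rows with the same key value together in one partition.

```python
# Sketch of two partition types: round-robin and hash partitioning.
def round_robin(rows, n_partitions):
    """Distribute rows evenly, one at a time, across partitions."""
    parts = [[] for _ in range(n_partitions)]
    for i, row in enumerate(rows):
        parts[i % n_partitions].append(row)
    return parts

def hash_partition(rows, n_partitions, key):
    """Route each row by a hash of its key, so equal keys share a partition."""
    parts = [[] for _ in range(n_partitions)]
    for row in rows:
        parts[hash(row[key]) % n_partitions].append(row)
    return parts

rows = [{"cust": c, "amt": a} for c, a in
        [("A", 10), ("B", 20), ("A", 30), ("C", 40)]]
print([len(p) for p in round_robin(rows, 2)])   # [2, 2] - even spread
grouped = hash_partition(rows, 2, "cust")
# both "A" rows land in the same partition, whichever one that is
```

    Key-range partitioning would instead compare the key against configured ranges; database partitioning delegates the split to the source database.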



    102 How does the Informatica server increase session performance through partitioning the source?

    When you run a session that partitions relational or Application sources, the Integration Service creates a separate connection to the source database for each partition. It then creates an SQL query for each partition. You can customize the query for each source partition by entering filter conditions in the Transformations view on the Mapping tab. You can also override the SQL query for each source partition using the Transformations view on the Mapping tab.

    103 Why do you use repository connectivity?

    Each time you edit or schedule a session, the Informatica server communicates directly with the repository to check whether the session and users are valid. All the metadata of sessions and mappings is stored in the repository.

    104 What are the tasks that the Load Manager process will do?

    The Load Manager is a component of the Integration Service that dispatches Session, Command, and predefined Event-Wait tasks across nodes in a grid. The Load Manager is the primary Informatica server process. It performs the following tasks:
    Manages workflow and batch scheduling.
    Locks the workflow and reads workflow properties.
    Reads the parameter file.
    Expands the server and workflow variables and parameters.
    Verifies permissions and privileges.
    Validates source and target code pages.
    Creates the workflow log file.

    105 What is the DTM process? After the Load Manager performs validations for the session, it creates the DTM process. The DTM process is the second process associated with the session run. The primary purpose of the DTM process is to create and manage threads that carry out the session tasks. The DTM allocates process memory for the session and divides it into buffers; this is also known as buffer memory. It creates the main thread, called the master thread, which creates and manages all other threads.


    106 What are the different threads in the DTM process?

    MASTER THREAD - Main thread of the DTM process. Creates and manages all other threads.
    MAPPING THREAD - One thread for each session. Fetches session and mapping information.
    PRE- AND POST-SESSION THREAD - One thread each to perform pre- and post-session operations.
    READER THREAD - One thread for each partition for each source pipeline.
    WRITER THREAD - One thread for each partition, if targets exist in the source pipeline, to write to the target.
    TRANSFORMATION THREAD - One or more transformation threads for each partition.

    107 What are the data movement modes in Informatica?

    ASCII
    Unicode

    108 What are the output files that the Informatica server creates during the session run?

    The PowerCenter Server creates the following output files:
    PowerCenter Server log
    Workflow log file
    Session log file
    Session details file
    Performance details file
    Reject files
    Row error logs
    Recovery tables and files
    Control file
    Post-session email
    Output file

    109 In which circumstances does the Informatica server create reject files?

    The row is flagged for reject by an Update Strategy or Custom transformation.
    The row violates a database constraint, such as a primary key constraint.
    A field in the row was truncated or overflowed, and the target database is configured to reject truncated or overflowed data.


    110 Can you copy a session to a different folder or repository?

    Yes.

    111 What is a batch? Describe the types of batches.

    Grouping sessions is called a batch. There are two types of batches: sequential and concurrent.

    112 Can you copy batches? No.

    113 How many sessions can you create in a batch?

    Any number of sessions.

    114 When does the Informatica server mark a batch as failed?

    When a session fails (and the session property indicates to run only if the previous session was successful).

    115 What is the command used to run a batch?

    The pmcmd command.

    116 What are the different options used to configure sequential batches?

    1. Run the session if the previous session was successful.
    2. Always run the session.

    117 In a sequential batch, can you run a session if the previous session fails?

    Yes, by setting the "Always run the session" property.

    118 Can you start batches within a batch?

    No.

    119 Can you start a session inside a batch individually?

    Yes.

    120 How can you stop a batch? By using the pmcmd command.

    121 What are session parameters?

    Session parameters represent values you can change between session runs, such as database connections or source and target files. You use a session parameter in session or workflow properties and define the parameter value in a parameter file. You can also create workflow variables in the workflow properties and define their values in a parameter file. Examples: database connections, cache file directories, source file names, target file names, reject file names.

    122 What is a parameter file? A parameter file contains the mapping parameters, mapping variables, and session parameters.

    123 How can you access a remote source from your session?

    1. We need to create a database connection for the relational source or target.
    2. We need to create an FTP connection for flat files.


    124 What is the difference between partitioning of relational targets and partitioning of file targets?

    If we create partitions on a relational target, the server creates multiple database connections, one for each partition, and each connection runs its SQL query and writes to the target. If we create partitions on a file target, the server creates multiple threads to write to the target file.

    125 What are the transformations that restrict the partitioning of sessions?

    Normalizer transformation, XML target, and Joiner transformation for master data.

    126 Performance tuning in Informatica?

    After finding the bottlenecks in the mapping:

    Optimizing the target: if it is a relational target, drop indexes and key constraints, increase checkpoint intervals, use bulk loading, use external loading, minimize deadlocks, increase the database network packet size, and optimize Oracle target databases. If the target is a flat file, move the file to the local server.

    Optimizing the source: optimize the query, use conditional filters, and increase the network packet size.

    Optimizing the mapping: use Source Qualifier/Filter transformations to filter the data, remove unnecessary datatype conversions, minimize aggregate function calls, and tune the transformations.

    Optimizing the session: reduce error tracing, increase the commit interval, increase cache sizes for index and data, allocate buffer memory, run sessions and workflows concurrently, remove staging areas, and use pushdown optimization.


    127 Define the Informatica repository.

    The Informatica repository is a relational database that stores information, or metadata, used by the Informatica Server and Client tools. Metadata can include information such as mappings describing how to transform source data, sessions indicating when you want the Informatica Server to perform the transformations, and connect strings for sources and targets.

    The repository also stores administrative information such as usernames and passwords, permissions and privileges, and product version.

    Use the Repository Manager to create the repository. The Repository Manager connects to the repository database and runs the code needed to create the repository tables. These tables store metadata in the specific format the Informatica server and client tools use.

    128 What are the types of metadata stored in the repository?

    The following types of metadata are stored in the repository:
    Database connections
    Global objects
    Mappings
    Mapplets
    Multidimensional metadata
    Reusable transformations
    Sessions and batches
    Shortcuts
    Source definitions
    Target definitions

    129 What is the PowerCenter repository?

    The PowerCenter repository is used to store Informatica's metadata. Information such as mapping names, locations, target definitions, source definitions, transformations, and flows is stored as metadata in the repository.

    130 How can you work with a remote database in Informatica? Did you work directly by using a remote connection?

    We have to create a remote connection for this. But this is not advisable in Informatica; we should import the source/target objects into the local machine.


    131 What is incremental aggregation?

    Incremental aggregation is used with the Aggregator transformation. Once the Aggregator transformation is placed in the mapping, we need to check the Incremental Aggregation property in the session properties, so that the data is aggregated incrementally. The first time you run an incremental aggregation session, the Integration Service processes the entire source.

    At the end of the session, the Integration Service stores the aggregated data in two cache files, the index and data cache files. The Integration Service saves the cache files in the cache file directory. The next time you run the session, the Integration Service aggregates the new rows with the cached aggregated values in the cache files.

    When you run a session with an incremental Aggregator transformation, the Integration Service creates a backup of the Aggregator cache files in $PMCacheDir at the beginning of the session run. The Integration Service promotes the backup cache to the initial cache at the beginning of a session recovery run. The Integration Service cannot restore the backup cache file if the session aborts.
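    The caching idea described above can be sketched in a few lines. This is a minimal model (the function and cache structure are assumptions for illustration): aggregated values persist between runs, and each new run merges only the new rows into the cached totals instead of re-reading all history.

```python
# Sketch of incremental aggregation: merge new rows into a persistent cache
# of running aggregates instead of reprocessing the full source each run.
def incremental_sum(cache, new_rows, group_key, measure):
    """cache: dict {group_value: running_sum}, updated in place and returned."""
    for row in new_rows:
        g = row[group_key]
        cache[g] = cache.get(g, 0) + row[measure]
    return cache

cache = {}  # stands in for the saved index/data cache files
# first session run: the whole source is processed
incremental_sum(cache, [{"dept": "A", "sales": 100},
                        {"dept": "B", "sales": 50}], "dept", "sales")
# next session run: only the new rows are aggregated into the cache
incremental_sum(cache, [{"dept": "A", "sales": 25}], "dept", "sales")
print(cache)  # {'A': 125, 'B': 50}
```

    In PowerCenter the "cache" is the pair of index and data cache files saved under the cache file directory between runs.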

    132 What are the scheduling options to run a session?

    You can schedule a workflow to run continuously, repeat at a given time or interval, or you can manually start a workflow. The Integration Service runs a scheduled workflow as configured.

    By default, the workflow runs on demand. You can change the schedule settings by editing the scheduler. If you change schedule settings, the Integration Service reschedules the workflow according to the new settings.

    Scheduling options:
    Run on server initialization - the Integration Service runs the workflow as soon as the service is initialized.
    Run on demand - the Integration Service runs the workflow when we start the workflow manually.

    133 What is tracing level and what are the types of tracing levels?

    The tracing level determines the amount of detail the Integration Service writes to the session log.
    Normal: the Integration Service logs initialization and status information, errors encountered, and skipped rows due to transformation row errors. It summarizes session results, but not at the level of individual rows.
    Terse: the Integration Service logs initialization information, error messages, and notification of rejected data.
    Verbose Initialization: in addition to normal tracing, the Integration Service logs additional initialization details, names of index and data files used, and detailed transformation statistics.
    Verbose Data: in addition to verbose initialization tracing, the Integration Service logs each row that passes into the mapping.


    134 What is the difference between the Stored Procedure transformation and the External Procedure transformation?

    In the case of the Stored Procedure transformation, the procedure is compiled and executed in a relational data source; you need a database connection to import the stored procedure into the mapping. In the External Procedure transformation, the procedure or function is executed outside the data source; it must be built as a DLL to be accessed in the mapping, and no database connection is needed.

    135 Explain recovering sessions.

    If you stop a session, or if an error causes a session to stop, refer to the session and error logs to determine the cause of failure. Correct the errors, and then complete the session. The method you use to complete the session depends on the properties of the mapping, session, and Informatica Server configuration.

    Use one of the following methods to complete the session:
    Run the session again if the Informatica Server has not issued a commit.
    Truncate the target tables and run the session again if the session is not recoverable.

    136 If a session fails after loading 10,000 records into the target, how can you load the records starting from the 10,001st record when you run the session next time?

    By setting the Perform Recovery option in the session properties.

    137 Explain Perform Recovery.

    When the Informatica Server starts a recovery session, it reads the OPB_SRVR_RECOVERY table and notes the row ID of the last row committed to the target database. The Informatica Server then reads all sources again and starts processing from the next row ID. For example, if the Informatica Server commits 10,000 rows before the session fails, when you run recovery, the Informatica Server bypasses the rows up to 10,000 and starts loading with row 10,001. By default, Perform Recovery is disabled in the Informatica Server setup. You must enable Recovery in the Informatica Server setup before you run a session so the Informatica Server can create and/or write to the OPB_SRVR_RECOVERY table.
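    The skip-and-resume behavior described above can be sketched as follows. This is a toy model (the function and row format are mine, not Informatica's): record the last committed row ID, then on restart bypass every row up to and including it.

```python
# Sketch of the recovery logic: skip rows already committed in the failed run.
def load_with_recovery(rows, write, last_committed_id=0):
    """rows: iterable of (row_id, data); write: callable that loads one row."""
    loaded = 0
    for row_id, data in rows:
        if row_id <= last_committed_id:
            continue            # bypassed: committed before the failure
        write(data)
        loaded += 1
    return loaded

source = [(i, f"row-{i}") for i in range(1, 15001)]
out = []
# recovery run: 10,000 rows were committed before the session failed
n = load_with_recovery(source, out.append, last_committed_id=10000)
print(n)        # 5000
print(out[0])   # row-10001
```

    The real server gets `last_committed_id` from the OPB_SRVR_RECOVERY table rather than from a parameter.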


    138 How do you recover a standalone session?

    A standalone session is a session that is not nested in a batch. If a standalone session fails, you can run recovery using a menu command or pmcmd. These options are not available for batched sessions.

    To recover a session using the menu:
    1. In the Server Manager, highlight the session you want to recover.
    2. Select Server Requests > Stop from the menu.
    3. With the failed session highlighted, select Server Requests > Start Session in Recovery Mode from the menu.

    To recover a session using pmcmd:
    1. From the command line, stop the session.
    2. From the command line, start recovery.

    139 How can you recover sessions in sequential batches?

    If you configure a session in a sequential batch to stop on failure, you can run recovery starting with the failed session. The Informatica Server completes the session and then runs the rest of the batch. Use the Perform Recovery session property.

    To recover sessions in sequential batches configured to stop on failure:
    1. In the Server Manager, open the session property sheet.
    2. On the Log Files tab, select Perform Recovery, and click OK.
    3. Run the session.
    4. After the batch completes, open the session property sheet.
    5. Clear Perform Recovery, and click OK.

    If you do not clear Perform Recovery, the next time you run the session, the Informatica Server attempts to recover the previous session. If you do not configure a session in a sequential batch to stop on failure, and the remaining sessions in the batch complete, recover the failed session as a standalone session.


    140 How do you recover sessions in concurrent batches?

    If multiple sessions in a concurrent batch fail, you might want to truncate all targets and run the batch again. However, if a session in a concurrent batch fails and the rest of the sessions complete successfully, you can recover the session as a standalone session.

    To recover a session in a concurrent batch:
    1. Copy the failed session using Operations > Copy Session.
    2. Drag the copied session outside the batch to be a standalone session.
    3. Follow the steps to recover a standalone session.
    4. Delete the standalone copy.

    141 How can you complete unrecoverable sessions?

    Under certain circumstances, when a session does not complete, you need to truncate the target tables and run the session from the beginning. Run the session from the beginning when the Informatica Server cannot run recovery, or when running recovery might result in inconsistent data.

    142 What are the circumstances in which the Informatica server results in an unrecoverable session?

    The Source Qualifier transformation does not use sorted ports.
    You change the partition information after the initial session fails.
    Perform Recovery is disabled in the Informatica server configuration.
    The sources or targets change after the initial session fails.
    The mapping contains a Sequence Generator or Normalizer transformation.
    A concurrent batch contains multiple failed sessions.

    143 If I have made any modifications to my table in the back end, do they reflect in the Informatica warehouse, Mapping Designer, or Source Analyzer?

    They will not reflect automatically; we need to re-import the objects, or add the changes manually.

    144 After dragging the ports of three sources (SQL Server, Oracle, Informix) to a single Source Qualifier, can you map these three ports directly to the target?

    If we can join the three sources in the Source Qualifier, then we can map the ports to the target. (For heterogeneous sources such as these, a Joiner transformation is normally required.)

    145 Server Variables: $PMCacheDir, $PMBadFileDir, $PMSourceFileDir, $PMTargetFileDir, $PMSessionLogDir, $PMWorkflowLogDir


    146 Folders: folders provide a way to organize and store all metadata in the repository, including mappings, schemas, and sessions. Folders are designed to be flexible, to help you logically organize the repository. Each folder has a set of configurable properties that help you define how users access the folder. For example, you can create a folder that allows all repository users to see objects within the folder, but not to edit them. Or, you can create a folder that allows users to share objects within the folder. You can create shared folders to reuse objects across folders.

    147 Multiple Servers


    Sl. No. Questions


    1 What is a Data Warehouse?

    2 What is the conventional definition of a DWH? Explain each term.

    3 Draw the architecture of a Data Warehousing system.

    4 What are the goals of the Data Warehouse?

    5 What are the approaches in constructing a Data Warehouse and the data mart?

    6 Data Mart

    7 Can a data mart be independent?


    13 Difference between OLAP & OLTP?

    14 What are the different types of OLAP? Give an example.

    15 Which is the suitable data model for a data warehouse? Why?

    16 Star Schema

    17 What are the benefits of STAR SCHEMA?

    18 What are Additive Facts? Or what is meant by an Additive Fact?

    19 Snowflake Schema

    20 What is Galaxy schema?

    21 What is Dimension & Fact?

    22 Different types of Dimensions

    23 Are the dimensional tables normalized? If so, when?

    24 What is a Transaction fact table & a Centipede fact table?

    25 Different types of Facts?

    26 What are the types of Factless fact tables?

    27 What is Granularity?

    28 Is the Fact table normalized?

    29 Can 2 Fact Tables share the same dimension tables?

    30 Give examples of the fact tables, dimensional tables, and data marts/DWH used in your project. Explain what data each contains.

    31 What are fact constellations?

    32 What is a Factless fact table?

    33 What is metadata?

    34 What is data quality?

    35 How do you achieve data quality?

    36 What is Parsing & Data Mining?

    37 What are surrogate keys?

    38 Name a few data modelling tools.

    39 Materialized views?

    40 Can you insert into materialized views?

    41 Definition of Ad-hoc Queries?

    42 What is ODS (Operational Data Store), DSS (Decision Support System), Data Staging Area, Data Presentation Area?

    43 What is Market-Basket analysis?

    SCD Types


    44
    45 What is a Hypercube?
    46
    47 Explain the performance improvement techniques in DW?
    48 Explain slice and dice?

    A data warehouse is a relational database designed for query and analysis. It contains historical data derived from transaction data, and also includes data from other sources.

    Characteristics of a data warehouse:
    Subject-oriented: the data in the data warehouse is organized so that all the data elements relating to the same real-world event or object are linked together.

    Time-variant: the changes to the data in the data warehouse are tracked and recorded so that reports can be produced showing changes over time.
    Non-volatile: data in the data warehouse is never over-written or deleted; once committed, the data is static, read-only, and retained for future reporting.
    Integrated: the data warehouse contains data from most or all of an organization's operational systems, and this data is made consistent.

    The main goals of our Data Warehouse are:

    1. Understand the users' needs by business area and job responsibilities.
    2. Determine the decisions the business users want to make with the help of the data warehouse.
    3. Choose the most effective, actionable subset of the OLTP data to present in the data warehouse.
    4. Make sure the data is accurate and can be trusted, labeling it consistently across the enterprise.
    5. Continuously monitor the accuracy of the data and the content of the delivered reports.

    Top-down approach.
    Bottom-up approach: data marts are first created to provide reporting and analytical capabilities for specific business processes. Data marts contain atomic data and, if necessary, summarized data. These data marts can eventually be unioned together to create a comprehensive data warehouse.

    A data mart is a subset of an organizational data store, usually oriented to a specific purpose or major data subject, that may be distributed to support business needs. Data marts are analytical data stores designed to focus on specific business functions for a specific community within an organization. Data marts are often derived from subsets of data in a data warehouse, though in the bottom-up data warehouse design methodology the data warehouse is created from the union of the data marts.

    Data marts can be used by small organizations, so a data mart can be independent.

    OLTP and operational sources.


    Database:
    1. It is a collection of data (Online Transaction Processing).
    2. Normalized form.
    3. Complex joins.
    4. More DML operations.
    5. Holds current data.

    Data warehouse:
    1. It is a relational database for query and analytical purposes (Online Analytical Processing).
    2. Partially normalized/denormalized form.
    3. Read-only data.
    4. Holds current and historical data.
    5. Records are based on a surrogate key field.
    6. Cannot delete the records.

    Multidimensional analysis is a data analysis process that groups data into two basic categories: data dimensions and measurements.

    OLAP - Online Analytical Processing. To retrieve data, complex joins may have to be performed.
    ROLAP - Relational OLAP, which provides multidimensional analysis of data stored in a relational database (RDBMS).
    MOLAP - Multidimensional OLAP, which provides analysis of data stored in a multidimensional data cube.
    HOLAP - Hybrid OLAP, a combination of both ROLAP and MOLAP that can simultaneously provide multidimensional analysis of data stored in a multidimensional database and in a relational database (RDBMS).
    DOLAP - Desktop OLAP or Database OLAP, which provides multidimensional analysis locally on the client machine on the data downloaded to it.

    OLAP:
    1. Online Analytical Processing.
    2. Read-only data.
    3. Partially normalized/denormalized tables.
    4. Holds current and historical data.
    5. Records are based on a surrogate key field.
    6. Cannot delete the records.
    7. Simplified data model.

    OLTP:
    1. Online Transaction Processing.
    2. Continuously updates data.
    3. Normalized form.
    4. Holds current data.
    5. Records are maintained on a primary key field.
    6. Can delete tables or records.

    In OLAP there are mainly two different types: Multidimensional OLAP (MOLAP) and Relational OLAP (ROLAP). Hybrid OLAP (HOLAP) refers to technologies that combine MOLAP and ROLAP.

    Star schema: it is a denormalized model. There is no need to use complicated joins, and queries return results fast.

    A star schema is the simplest data warehouse design. The main feature of the schema is a table at the center, called the fact table, with surrounding dimension tables. Fact tables in a star schema are in third normal form, while dimensions are in denormalized form. Because it is a denormalized model, there is no need to use complicated joins, and queries return results fast.

    Additive facts are facts that can be summed up across all of the dimensions in the fact table.
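    A small illustration of additivity (the sample rows are invented for the example): a sales amount can be meaningfully summed along any dimension — by product, by date, or over the whole table.

```python
# An additive fact (sales amount) summed along different dimensions.
fact_rows = [
    {"date": "2024-01-01", "product": "pen",  "amount": 10},
    {"date": "2024-01-01", "product": "book", "amount": 40},
    {"date": "2024-01-02", "product": "pen",  "amount": 15},
]

def total_by(rows, dim):
    """Sum the 'amount' measure grouped by the given dimension column."""
    sums = {}
    for r in rows:
        sums[r[dim]] = sums.get(r[dim], 0) + r["amount"]
    return sums

print(total_by(fact_rows, "product"))       # {'pen': 25, 'book': 40}
print(total_by(fact_rows, "date"))          # {'2024-01-01': 50, '2024-01-02': 15}
print(sum(r["amount"] for r in fact_rows))  # 65
```

    A measure like unit price would not be additive: summing prices across rows produces a meaningless total, which is what distinguishes semi-additive and non-additive facts.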

    A snowflake schema is a more complex variation of the star schema design. The main difference is that the dimension tables in a snowflake schema are normalized, so they have a typical relational database design. These are used when a dimension table becomes very big and a star schema can't represent the complexity of the data structure.

    A galaxy schema is more complex than the star and snowflake schemas because it contains multiple fact tables. This allows dimension tables to be shared among many fact tables. It is very hard to manage.

    Dimension: a dimension contains the descriptive information for a fact table. The size of a dimension table is smaller than that of the fact table, and a schema has more dimension tables than fact tables. A surrogate key is used to prevent primary key violations. Column values are in numeric and text representations.

    Fact: a fact table contains measurements. The size of the fact table is larger than a dimension table, and a schema has fewer fact tables than dimension tables. Column values are always numeric.

    Junk Dimension: when you have lots of small dimensions that would each hold few records, cluttering your database with mini identifier tables, all records from these small dimension tables are loaded into one dimension table, called a junk dimension table.
    Conformed Dimension: a dimension used by multiple fact tables, for example a Time dimension or Date dimension.
    Degenerate Dimension.
    Slowly Changing Dimensions.

    If the schema is a snowflake, then the dimensions are in normalized form.

    Additive facts
    Semi-additive facts
    Non-additive facts

    A fact table which doesn't contain any facts is called a factless fact table. The first type of factless fact table is a table that records an event; many event-tracking tables in dimensional data warehouses turn out to be factless. A second kind of factless fact table is called a coverage table.

    Granularity: the first step in designing a fact table is to determine the granularity of the fact table, the level of detail represented by a single fact row.

    Yes, the fact table will be in third normal form.

    Yes

    In our project, the rating identity instr fact table is the fact table; date_dim, all_org_dim, and instrument_dim are the dimension tables.

    For each star schema it is possible to construct a fact constellation schema (for example, by splitting the original star schema into more star schemas, each of which describes facts at another level of the dimension hierarchies). The fact constellation architecture contains multiple fact tables that share many dimension tables.

    The main shortcoming of the fact constellation schema is a more complicated design, because many variants of aggregation must be considered.

    A fact table which doesn't contain any facts is called a factless fact table.

    Metadata is information about data.

    Data quality is the reliability and effectiveness of data, particularly in a data warehouse. Data quality assurance (DQA) is the process of verifying the reliability and effectiveness of data. Maintaining data quality requires going through the data periodically and scrubbing it.

    To achieve good quality information, a database must have:
    good system design
    accurate and complete data
    a user-friendly interface
    data validation

    Surrogate keys are sequence-generated numbers; a surrogate key is always of numeric data type.

    Data modelling tools: ERwin, Rational Rose.
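    Surrogate key generation can be sketched as a simple sequence plus a lookup from the natural (source-system) key. This is an illustrative model; the class name and key format are assumptions for the example.

```python
# Sketch of surrogate key assignment: each new natural key gets the next
# value from a sequence, independent of any source-system identifier.
from itertools import count

class SurrogateKeyGenerator:
    def __init__(self, start=1):
        self._seq = count(start)
        self._keys = {}          # natural key -> surrogate key

    def key_for(self, natural_key):
        """Return the existing surrogate key, or assign the next one."""
        if natural_key not in self._keys:
            self._keys[natural_key] = next(self._seq)
        return self._keys[natural_key]

gen = SurrogateKeyGenerator()
print(gen.key_for("CUST-007"))  # 1
print(gen.key_for("CUST-042"))  # 2
print(gen.key_for("CUST-007"))  # 1 - same natural key, same surrogate key
```

    In Informatica this role is typically played by the Sequence Generator transformation feeding the dimension's primary key.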

    A materialized view is a table that actually contains rows but behaves like a view; that is, the data in the table changes when the data in the underlying tables changes, refreshed on a periodic basis.

    No, we do not insert records into materialized views; a query refresh runs at certain periods, and the materialized view then gets the data from the source tables.

    Queries that are not predefined or anticipated, and usually run just once, are called ad-hoc queries. These are typical in the data warehousing environment.

    ODS: the Operational Data Store, which holds transactional data, is the source of a warehouse. Data from the ODS is staged, transformed, and then moved to the data warehouse.

    DSS: a Decision Support System gathers and presents data from a wide range of sources, typically for business purposes. DSS applications are systems and subsystems that help people make decisions based on data pulled from a wide range of sources. This data is used for analytical and reporting purposes.

    Data Staging Area: the data warehouse staging area is a temporary location where data from source systems is copied.

    There are three types of Slowly Changing Dimensions:
    TYPE 1 SCD: the new information overrides the current information. No history is kept.
    TYPE 2 SCD: the new record is added to the table. History is available.
    TYPE 3 SCD: this maintains partial history.
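    The Type 1 vs Type 2 behaviors above can be contrasted in a short sketch. This is an illustrative model (function names, column names, and the `current`/date flags are my assumptions, corresponding to the flag and effective-date Type 2 variants mentioned earlier in this document).

```python
# Sketch of SCD handling for a changed attribute in a customer dimension.
from datetime import date

def scd_type1(dim_rows, natural_key, new_row):
    """Type 1: overwrite in place - no history kept."""
    for row in dim_rows:
        if row["cust_id"] == natural_key:
            row.update(new_row)
            return
    dim_rows.append({"cust_id": natural_key, **new_row})

def scd_type2(dim_rows, natural_key, new_row, today=None):
    """Type 2: expire the current version and add a new one - history kept."""
    today = today or date.today().isoformat()
    for row in dim_rows:
        if row["cust_id"] == natural_key and row["current"]:
            row["current"] = False
            row["end_date"] = today
    dim_rows.append({"cust_id": natural_key, **new_row,
                     "current": True, "start_date": today, "end_date": None})

dim = [{"cust_id": 7, "city": "Pune", "current": True,
        "start_date": "2023-01-01", "end_date": None}]
scd_type2(dim, 7, {"city": "Delhi"}, today="2024-06-01")
print(len(dim))                                  # 2 - old version kept, expired
print([r["city"] for r in dim if r["current"]])  # ['Delhi']
```

    A Type 3 variant would instead add a column such as `previous_city` to the existing row, keeping only partial history.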

    We need to find the bottleneck of the mapping, then proceed to improve the performance of the warehouse by using partitions, query tuning, and indexing.