a Etl Intro
Transcript of a Etl Intro
-
8/2/2019 a Etl Intro
1/32
Global Business Services
2007 IBM Corporation
Informatica ETL Tool
- By Srini Veeravalli
Siebel Analytics Factory
26th Sept, 2007
-
Synopsis:
Introduction to Informatica 7x
Importance of Informatica for ETL Applications
Informatica Architecture
Informatica Client module
Informatica Server
Transformations in Informatica
Working with Workflow Manager
Working with Workflow Monitor
Demonstration of Sample Informatica Mapping
Informatica - ETL Tool
-
Informatica PowerCenter: All product functionality, including the ability to register multiple servers. PowerCenter lets you create a single repository that you can configure as a global repository.
Informatica PowerMart: Includes all features except distributed metadata and multiple registered servers.
Informatica Suite
-
Extraction, Transformation, Loading
Can extract large volumes of data from multiple platforms, handle complex transformations on the data, and support high-speed loads.
It can simplify and accelerate the process of moving data warehouses from development to test to production.
Importance of Informatica for ETL Applications
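The extract, transform, and load steps named on this slide can be sketched in a few lines of Python. This is only an illustration of the ETL idea, not of how Informatica works internally: the CSV text, the `orders` table, and the function names are all assumptions made up for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source data; a real source could be a file, database, or application.
SOURCE_CSV = """order_id,customer,amount
1,alice,10.50
2,bob,20.00
3,alice,5.25
"""

def extract(text):
    """Extract: read rows from the CSV source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: convert types and standardize the customer name."""
    return [
        (int(r["order_id"]), r["customer"].strip().title(), float(r["amount"]))
        for r in rows
    ]

def load(rows, conn):
    """Load: bulk-insert the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders (order_id INTEGER, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    conn.commit()

# In-memory SQLite database standing in for the warehouse target.
conn = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), conn)
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

Keeping the three steps as separate functions mirrors the separation between source definitions, transformation logic, and target definitions described later in the deck.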
-
Sources
PowerCenter and PowerMart access the following sources:
Relational.
File.
Application.
Mainframe.
Other. Microsoft Excel and Access.
Informatica Architecture
-
Targets
PowerCenter and PowerMart can load data into the following targets:
Relational.
File.
Application.
Other. Microsoft Access.
You can load data into targets using ODBC or native drivers, FTP, or external loaders.
Informatica Architecture
-
Informatica Architecture
Informatica provides the following integrated components:
Informatica repository. The Informatica repository is at the center of the Informatica suite. The Informatica Client and Server access the repository to save and retrieve metadata.
Informatica Repository Server. The Informatica Repository Server manages connections to the repository from client applications.
Informatica Client. Use the Informatica Client to manage users, define sources and targets, and build mappings and mapplets with the transformation logic.
Informatica Server. The Informatica Server extracts the source data, performs the data transformation, and loads the transformed data into the targets.
-
Informatica Architecture
[Diagram: Sources, Server, Target, and Repository. Informatica Client tools: Repository Manager, Designer, Workflow Manager/Monitor. Designer tools: Source Analyzer, Warehouse Designer, Transformation Developer, Mapping Designer, Mapplet Designer.]
-
Connectivity
-
Informatica Client
Repository Manager. Use the Repository Manager to create and administer the metadata repository.
Designer. Use the Designer to create mappings that contain transformation instructions for the Informatica Server.
Source Analyzer. Import or create source definitions.
Warehouse Designer. Import or create target definitions.
Transformation Developer. Develop reusable transformations to use in mappings.
Mapplet Designer. Create sets of transformations to use in mappings.
Mapping Designer. Create mappings that the Informatica Server uses to extract, transform, and load data.
Workflow Manager. Use the Workflow Manager to create, schedule, and run workflows.
Workflow Monitor. Use the Workflow Monitor to monitor scheduled and running workflows for each Informatica Server.
-
Informatica Client Rep Server Admin Console
-
Use the Administration Console to add repository configurations to the Console Tree.
When you add a repository configuration, you can perform the following actions:
Create a repository in a database.
Change the Repository Server managing the repository.
Upgrade an existing repository from an earlier version.
Informatica Client Rep Server Admin Console
-
Informatica Client Repository Manager
-
Repository Manager Tasks
You can use the Repository Manager to perform the following tasks:
Add a repository. You can add multiple repositories.
Remove a repository. You can remove one or more repositories.
Connect to a repository. You can connect to one repository or multiple repositories in a domain.
Export and import repository connection information. You can export repository connection information from the client registry to a file.
Truncate session and workflow log entries. You can truncate the list of session and workflow logs.
Search for target definitions containing a keyword. You can use a keyword to search for a target definition.
Search for repository objects. You can search for repository objects containing specified text or keywords.
Informatica Client Repository Manager
-
The Designer has five tools to help you build mappings and mapplets so you can specify how to move and transform data between sources and targets. The Designer helps you create source definitions, target definitions, and transformations to build your mappings.
The Designer allows you to work with multiple tools at one time and to work in multiple folders and repositories at the same time.
Designer Tools
The Designer provides the following tools:
Source Analyzer. Use to import or create source definitions for flat file, XML, COBOL, Application, and relational sources.
Warehouse Designer. Use to import or create target definitions.
Transformation Developer. Use to create reusable transformations.
Mapplet Designer. Use to create mapplets.
Mapping Designer. Use to create mappings.
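One way to picture how these pieces relate is as plain data structures: a mapping names a source and a target and holds an ordered list of transformations, and a mapplet is a reusable set of transformations that can be dropped into a mapping. The sketch below is a conceptual model only. The class and function names (`Mapping`, `make_mapplet`, `trim_names`, `upper_names`) and the definition names are hypothetical and are not Informatica objects or APIs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Transformation = Callable[[List[Row]], List[Row]]

def trim_names(rows):
    """Reusable transformation: strip whitespace from the name field."""
    return [{**r, "name": str(r["name"]).strip()} for r in rows]

def upper_names(rows):
    """Reusable transformation: upper-case the name field."""
    return [{**r, "name": str(r["name"]).upper()} for r in rows]

def make_mapplet(*steps):
    """A mapplet here is a reusable set of transformations applied in order."""
    def run(rows):
        for step in steps:
            rows = step(rows)
        return rows
    return run

@dataclass
class Mapping:
    """A mapping ties a source definition to a target definition."""
    source: str
    target: str
    transformations: List[Transformation] = field(default_factory=list)

    def run(self, rows):
        for transformation in self.transformations:
            rows = transformation(rows)
        return rows

clean_names = make_mapplet(trim_names, upper_names)
mapping = Mapping(source="CUSTOMERS_SRC", target="CUSTOMERS_TGT",
                  transformations=[clean_names])
result = mapping.run([{"id": 1, "name": "  ada "}])
```

Because the mapplet has the same call shape as a single transformation, the same `clean_names` object could be reused in any number of mappings.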
Informatica Client Designer
-
Informatica Client Designer Source Analyzer
-
Importing Source / Target Definition from Database
Informatica Client Designer Source Analyzer
-
Informatica Client Designer Warehouse Designer
-
Informatica Client Designer Mapping Designer
-
Designer Windows
The Designer consists of the following windows:
Navigator. Use to connect to and work in multiple repositories and folders. You can also copy and delete objects and create shortcuts using the Navigator.
Workspace. Use to view or edit sources, targets, mapplets, transformations, and mappings. You can work with a single tool at a time in the workspace.
Status bar. Displays the status of the operation you perform.
Output. Provides details when you perform certain tasks, such as saving your work or validating a mapping.
Overview. An optional window to simplify viewing workbooks containing large mappings or a large number of objects.
Informatica Client Designer Windows
-
Transformations
Transformations are the manipulation of data from how it appears in the source system(s) into another form in the data warehouse.
This includes:
Data merging: The process of standardizing data types and fields.
Cleansing: Identifying and changing inconsistencies or inaccuracies.
Eliminating inconsistencies in the data from multiple sources.
Converting data from different systems into a single consistent data set suitable for analysis.
Aggregation: The process whereby multiple detailed values are combined into a single summary value, typically by summing numbers that represent dollars spent or units sold.
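Cleansing and aggregation, as described on this slide, can be sketched as two small Python functions. The sample rows, the field names, and the rule of dropping rows whose `units` value is not a whole number are assumptions chosen for illustration.

```python
from collections import defaultdict

# Hypothetical detail rows gathered from several source systems.
raw = [
    {"region": " east", "units": "3"},
    {"region": "East ", "units": "4"},
    {"region": "WEST", "units": "5"},
    {"region": "west", "units": "n/a"},  # inaccurate value to be cleansed out
]

def cleanse(rows):
    """Standardize region spelling and data types; drop rows with bad units."""
    cleaned = []
    for row in rows:
        units = row["units"].strip()
        if not units.isdigit():
            continue
        cleaned.append({"region": row["region"].strip().lower(), "units": int(units)})
    return cleaned

def aggregate(rows):
    """Combine detail rows into one summed value per region."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["region"]] += row["units"]
    return dict(totals)

summary = aggregate(cleanse(raw))
```

Cleansing runs before aggregation so that " east" and "East " are counted as the same region.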
-
Transformations in Informatica
Aggregator Transformation: The Aggregator transformation allows you to perform aggregate calculations, such as averages and sums.
Expression Transformation: Use the Expression transformation to calculate values in a single row before you write to the target. You can use the Expression transformation to perform any non-aggregate calculations.
Advanced External Procedure Transformation: Advanced External Procedure transformations operate in conjunction with procedures you create outside of the Designer interface to extend PowerCenter/PowerMart functionality.
External Procedure Transformation: External Procedure transformations operate in conjunction with procedures you create outside of the Designer interface to extend PowerCenter/PowerMart functionality.
Filter Transformation: The Filter transformation allows you to filter rows in a mapping.
Rank Transformation: Allows you to select only the top or bottom rank of data.
Router Transformation: Similar to the Filter transformation, but used for two or more filter conditions.
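The difference between filtering, routing, and ranking is easier to see on a handful of rows. The following Python sketch imitates the behaviour described above. The function names, group names, and sample data are hypothetical, and sending unmatched rows to a `DEFAULT` group is a design choice of this sketch.

```python
rows = [
    {"name": "a", "sales": 50},
    {"name": "b", "sales": 500},
    {"name": "c", "sales": 5000},
    {"name": "d", "sales": 120},
]

def filter_rows(rows, condition):
    """Filter: one condition; rows that fail it are dropped."""
    return [r for r in rows if condition(r)]

def route_rows(rows, groups):
    """Router: several named conditions; unmatched rows go to DEFAULT."""
    routed = {name: [] for name in groups}
    routed["DEFAULT"] = []
    for row in rows:
        matched = False
        for name, condition in groups.items():
            if condition(row):
                routed[name].append(row)
                matched = True
        if not matched:
            routed["DEFAULT"].append(row)
    return routed

def rank_rows(rows, key, n, top=True):
    """Rank: keep only the top (or bottom) n rows by the given key."""
    return sorted(rows, key=lambda r: r[key], reverse=top)[:n]

big = filter_rows(rows, lambda r: r["sales"] >= 500)
routed = route_rows(rows, {
    "SMALL": lambda r: r["sales"] < 100,
    "LARGE": lambda r: r["sales"] >= 1000,
})
top2 = rank_rows(rows, "sales", 2)
```

A single router pass replaces what would otherwise be one filter per condition, each re-reading the same rows.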
-
Joiner Transformation: The Joiner transformation joins two related heterogeneous sources residing in different locations or file systems. The combination of sources can be varied. You can use the following sources:
Two relational tables existing in separate databases
Two flat files in potentially different file systems
Two different ODBC sources
Two instances of the same XML source
A relational table and a flat file source
A relational table and an XML source
Lookup Transformation: Use a Lookup transformation in your mapping to look up data in a relational table, view, or synonym.
Normalizer Transformation: Normalization is the process of organizing data. Use the Normalizer transformation with COBOL sources, which are often stored in a denormalized format.
Sequence Generator Transformation: Generates numeric values.
Stored Procedure Transformation: An important tool for populating and maintaining databases.
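As a rough illustration of joining heterogeneous sources, the sketch below joins a relational table (an in-memory SQLite table) with a flat file (CSV text) and numbers the output rows the way a sequence generator would. The table, the file contents, and the names `join_sources` and `order_key` are assumptions for the example, and the join is an inner join on `cust_id`.

```python
import csv
import io
import itertools
import sqlite3

# Relational source: a hypothetical customers table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (cust_id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Ada"), (2, "Grace")])

# Flat-file source: hypothetical order rows; cust_id 3 has no matching customer.
ORDERS_FILE = "cust_id,amount\n1,10\n2,20\n1,5\n3,99\n"

def join_sources(conn, orders_text):
    """Inner-join the flat-file rows to the relational table on cust_id."""
    names = dict(conn.execute("SELECT cust_id, name FROM customers"))
    next_key = itertools.count(1)  # plays the role of a sequence generator
    joined = []
    for row in csv.DictReader(io.StringIO(orders_text)):
        cust_id = int(row["cust_id"])
        if cust_id in names:
            joined.append({
                "order_key": next(next_key),
                "name": names[cust_id],
                "amount": int(row["amount"]),
            })
    return joined

joined = join_sources(conn, ORDERS_FILE)
```

Loading the smaller source into a dictionary first also resembles a lookup: each flat-file row is matched against values already held in memory.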
Transformations in Informatica
-
Transformations in Informatica
Sorter Transformation: The Sorter transformation allows you to sort data. You can sort data from a source transformation in ascending or descending order according to a specified sort key.
Source Qualifier Transformation: The Source Qualifier represents the rows that the Informatica Server reads when it executes a session. The Source Qualifier displays the transformation datatypes. The transformation datatypes in the Source Qualifier determine how the source database binds data when the Informatica Server reads it.
XML Source Qualifier Transformation: When you add an XML source definition to a mapping, you need to connect it to an XML Source Qualifier transformation.
Update Strategy Transformation: Updates the target based on the flag values.
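Flag-driven updates can be sketched as follows: each incoming row is sorted by key, given a flag, and then applied to the target according to that flag. The flag names and numeric values follow Informatica's DD_INSERT, DD_UPDATE, DD_DELETE, and DD_REJECT constants. Everything else (the dictionary target, the `deleted` field, and the rule for choosing a flag) is a hypothetical stand-in.

```python
# Flag values named after Informatica's update strategy constants.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(row, target):
    """Choose a flag for one row (illustrative rule, not Informatica's)."""
    if row.get("deleted"):
        return DD_DELETE if row["id"] in target else DD_REJECT
    return DD_UPDATE if row["id"] in target else DD_INSERT

def apply_rows(rows, target):
    """Sort rows by key, then insert, update, delete, or reject each one."""
    rejected = []
    for row in sorted(rows, key=lambda r: r["id"]):  # ascending sort key
        flag = flag_row(row, target)
        if flag in (DD_INSERT, DD_UPDATE):
            target[row["id"]] = row["name"]
        elif flag == DD_DELETE:
            del target[row["id"]]
        else:
            rejected.append(row)
    return rejected

# Hypothetical target keyed by id, and a batch of incoming changes.
target = {1: "old name", 2: "to remove"}
incoming = [
    {"id": 3, "name": "new"},
    {"id": 1, "name": "updated"},
    {"id": 2, "name": "", "deleted": True},
    {"id": 9, "name": "", "deleted": True},
]
rejected = apply_rows(incoming, target)
```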
-
Workflow Manager
-
Workflow Manager Tools
The Workflow Manager consists of three tools to help you develop a workflow:
Task Developer. Use the Task Developer to create tasks you want to execute in the workflow.
Workflow Designer. Use the Workflow Designer to create a workflow by connecting tasks with links. You can also create tasks in the Workflow Designer as you develop the workflow.
Worklet Designer. Use the Worklet Designer to create a worklet.
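A workflow built by connecting tasks with links is, in effect, a directed graph. The Python sketch below models that idea with the standard-library `graphlib` module (Python 3.9 or later). The task names and the `run_workflow` helper are hypothetical, and this says nothing about how the Workflow Manager stores or schedules workflows.

```python
from graphlib import TopologicalSorter

# Each link is (upstream task, downstream task); names are made up.
links = [
    ("start", "s_load_customers"),
    ("start", "s_load_orders"),
    ("s_load_customers", "s_build_summary"),
    ("s_load_orders", "s_build_summary"),
]

def run_workflow(links, run_task):
    """Run every task after all of the tasks linked into it have run."""
    graph = {}
    for upstream, downstream in links:
        graph.setdefault(downstream, set()).add(upstream)
        graph.setdefault(upstream, set())
    order = list(TopologicalSorter(graph).static_order())
    for name in order:
        run_task(name)
    return order

executed = []
order = run_workflow(links, executed.append)
```

The two load tasks have no link between them, so their relative order is not fixed; only the links constrain execution order.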
Workflow Manager
-
Workflow Manager Server Editor
-
The Informatica Server moves data from sources to targets based on workflow and mapping metadata stored in a repository.
A session is a type of workflow task. A session is a set of instructions that describes how to move data from sources to targets using a mapping.
When a workflow starts, the Informatica Server retrieves mapping, workflow, and session metadata from the repository to extract data from the source, transform it, and load it into the target.
The Informatica Server uses the following processes to run a workflow:
The Load Manager process. Starts and locks the workflow, runs workflow tasks, and starts the DTM to run sessions.
The Data Transformation Manager (DTM) process. Performs session validations. Creates threads to initialize the session, read, write, and transform data.
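The division of labour between the two processes can be imitated with Python threads and queues: one function stands in for the Load Manager and runs sessions under a lock, and another stands in for the DTM and starts separate reader, transformer, and writer threads. This is a loose analogy under stated assumptions. The function names, the doubling transformation, and the use of lists as sources and targets are all invented for the sketch.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the row stream

def reader(source, out_q):
    """Reader thread: pull rows from the source."""
    for row in source:
        out_q.put(row)
    out_q.put(SENTINEL)

def transformer(in_q, out_q):
    """Transformation thread: apply a placeholder transformation."""
    while True:
        row = in_q.get()
        if row is SENTINEL:
            out_q.put(SENTINEL)
            break
        out_q.put(row * 2)

def writer(in_q, target):
    """Writer thread: push transformed rows to the target."""
    while True:
        row = in_q.get()
        if row is SENTINEL:
            break
        target.append(row)

def run_session(source):
    """Stand-in for the DTM: one thread each to read, transform, and write."""
    target = []
    q1, q2 = queue.Queue(), queue.Queue()
    threads = [
        threading.Thread(target=reader, args=(source, q1)),
        threading.Thread(target=transformer, args=(q1, q2)),
        threading.Thread(target=writer, args=(q2, target)),
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return target

def run_workflow(sessions):
    """Stand-in for the Load Manager: lock the workflow, run each session."""
    workflow_lock = threading.Lock()
    with workflow_lock:
        return [run_session(source) for source in sessions]

results = run_workflow([[1, 2, 3], [10, 20]])
```

Running the three stages as separate threads lets reading, transforming, and writing overlap instead of waiting for each stage to finish.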
Informatica Server
-
Workflow Monitor
[Screenshot labels: Navigator window, Time window, Output window]
-
Workflow Monitor is a tool that allows you to monitor workflows and tasks. You can view details about a workflow. You can run, stop, abort, and resume workflows from the Workflow Monitor.
The Workflow Monitor consists of the following windows:
Navigator window. Displays monitored repositories, servers, and repository objects.
Output window. Displays messages from the Informatica Server and the Repository Server.
Time window. Displays progress of workflow runs.
Gantt Chart view. Displays details about workflow runs in chronological (Gantt Chart) format.
Task view. Displays details about workflow runs in a report format, organized by task, folder, or status.
Workflow Monitor
-
Questions?
-
Thank You