© 2006 IBM Corporation
IBM Information Integration Capabilities
The IBM Solution: IBM Information Server
Delivering information you can trust
IBM Information Server
Unified Deployment
Understand: Discover, model, and govern information structure and content
Cleanse: Standardize, merge, and correct information
Transform: Combine and restructure information for new uses
Deliver: Synchronize, virtualize, and move information for in-line delivery
Unified Metadata Management
Parallel Processing
Rich Connectivity to Applications, Data, and Content
Where is my information?
How do I get it when I need it?
What does it mean?
Can I trust it?
How do I get it in the form I need?
How do I get it where it needs to go?
How do I control it?
Why Is it Important to Start with Understanding?
Physical Metadata: WebSphere Information Analyzer
Data-centric analysis of application, database and file-based sources
Secure, detailed profiling of fields, across fields, and across sources
Creation of metadata from profiling results
Results instantly promotable across IBM Information Server
Understand
Analyze source data structures, and monitor adherence to integration and quality rules
WebSphere Information Analyzer
Data Analysts
Subject Matter Experts
Physical View
WebSphere Information Analyzer
What is it?
Next generation data profiling and analysis tool for heterogeneous enterprise data sources
• Integrates profiling capabilities from three distinct products
What does it do?
Analyzes data sources to discover structure, contents and quality of information
• Infers the “reality” of the data, not just the data definition
• Finds and reports missing, inaccurate and inconsistent data
• Allows review of the quality of data throughout the life cycle
Who uses it?
Business and Data Analysts, Data Quality Specialists, Data Architects and Data Stewards
WebSphere Information Analyzer
End-to-End Data Profiling and Content Analysis
– Combines data profiling, data audit, and data format investigation technologies
– Provides column, primary key, foreign key, and cross-domain analysis
– Incorporates comparative analysis against established baselines over time
– Leverages central repository for analysis results with project- and role-level data security
Driven by Business
– Intuitive and Collaborative Environment
– Visualization of data analysis
– Extensive Reporting of analytical results
Exploiting Unique Information Integration Platform Advantages
– Shared metadata and connectivity services
– Shared analytical results with WebSphere DataStage/QualityStage
– Parallel Engine technology for highly scalable performance
A single unified and integrated framework
A new and exciting visual design
Pillar menu focused on methodology and user-based tasks, not products
Environment that promotes collaboration
Personalization and customization
Information Analyzer Home Screen
Full graphical enablement and display of key analytical data
Potential problems flagged for easy identification
Multiple open workspaces and tabs for easy navigation to facilitate review
Ability to filter results to quickly focus on business issues
Information Analyzer Drill Down
Quality Controls for Completeness and Validity of data values
Incomplete or Invalid values set by value, range, or reference sources
Consistency checks for data formats
Information Analyzer Validation
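
The completeness and validity controls listed on this slide can be illustrated with a minimal Python sketch. This is not how Information Analyzer implements them; the function name `classify_value`, its parameters, and the sample data are hypothetical and chosen only to show rules set by value, by range, and by reference source.

```python
from typing import Iterable, Optional, Set, Tuple


def classify_value(
    value: Optional[str],
    incomplete_values: Set[str],
    valid_range: Optional[Tuple[int, int]] = None,
    reference_values: Optional[Iterable[str]] = None,
) -> str:
    """Label one value as 'incomplete', 'invalid', or 'valid' (hypothetical rules)."""
    # Completeness: missing values and designated placeholder values.
    if value is None or value.strip() in incomplete_values:
        return "incomplete"
    text = value.strip()
    # Validity by range: the value must parse as an integer inside the range.
    if valid_range is not None:
        try:
            number = int(text)
        except ValueError:
            return "invalid"
        low, high = valid_range
        if not low <= number <= high:
            return "invalid"
    # Validity by reference source: the value must appear in the reference set.
    if reference_values is not None and text not in set(reference_values):
        return "invalid"
    return "valid"


ages = ["34", "", "N/A", "212", "abc", "57"]
age_labels = [classify_value(a, {"", "N/A"}, valid_range=(0, 120)) for a in ages]

states = ["MA", "NH", "Mass", None]
state_labels = [classify_value(s, {""}, reference_values={"MA", "NH"}) for s in states]
```

Here the age column is checked against a range and the state column against a reference list, which mirrors the "value, range, or reference sources" wording of the slide.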
Information Analyzer Spotlight: Column Analysis
• Domain Values & Validation
• Data Classification
• Data Properties
• Formats
Frequencies of data values and format patterns
Classification of data by system and user
Inferences of data properties (e.g. data type, length, uniqueness)
Information Analyzer Spotlight
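
As a rough illustration of what column analysis produces, the sketch below computes value frequencies, format-pattern frequencies, and a few inferred properties for one column. It is an assumption-laden sketch, not the product's algorithm: the pattern alphabet (`A` for letters, `9` for digits), the function names, and the sample values are all hypothetical.

```python
from collections import Counter
from typing import Dict, List


def format_pattern(value: str) -> str:
    """Map letters to 'A' and digits to '9'; keep every other character as-is."""
    out = []
    for ch in value:
        if ch.isalpha():
            out.append("A")
        elif ch.isdigit():
            out.append("9")
        else:
            out.append(ch)
    return "".join(out)


def profile_column(values: List[str]) -> Dict[str, object]:
    """Return frequency distributions and simple inferred properties for a column."""
    non_empty = [v for v in values if v]
    all_digits = bool(non_empty) and all(v.isdigit() for v in non_empty)
    return {
        "value_frequencies": Counter(values),
        "format_frequencies": Counter(format_pattern(v) for v in non_empty),
        "inferred_type": "integer" if all_digits else "string",
        "max_length": max((len(v) for v in values), default=0),
        "is_unique": len(set(values)) == len(values),
    }


phones = ["6173380300", "415-392-2000", "3380321", "508-466-1200"]
profile = profile_column(phones)
```

For the sample telephone column, the format frequencies show three distinct patterns, which is the kind of inconsistency a format review is meant to surface.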
Quality Controls for Completeness and Validity of data values
Incomplete or Invalid values set by value, range, or reference sources
Conformity checks for data formats
Information Analyzer Spotlight
Easily generate reference tables of default, valid, or invalid data
Incorporate transformation mapping values
Preview table output
Export reference tables to desired location for ongoing use
Leverage in WebSphere DataStage or QualityStage jobs
Information Analyzer Spotlight
Drilldown to underlying data
Review exception conditions from profiling or data rules
View in workspace with associated information
Filter drilldown results to enhance understanding
Information Analyzer Spotlight
Information Analyzer Spotlight: Table Analysis
• Primary Keys (single or multi-column)
• Key Duplicates
Evaluate single or multi-column primary keys
Summary and detail of column uniqueness
Details of primary key duplicates
Review of frequency distribution
Information Analyzer Spotlight
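
The primary key evaluation described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the uniqueness measure (distinct key values divided by row count), the function name `key_analysis`, and the sample rows are hypothetical, not taken from Information Analyzer.

```python
from collections import Counter
from typing import Dict, List, Sequence

Row = Dict[str, str]


def key_analysis(rows: List[Row], key_columns: Sequence[str]) -> Dict[str, object]:
    """Evaluate a single- or multi-column candidate key."""
    keys = [tuple(row[c] for c in key_columns) for row in rows]
    counts = Counter(keys)
    # Key values that occur more than once are the primary key duplicates.
    duplicates = {k: n for k, n in counts.items() if n > 1}
    uniqueness = len(counts) / len(keys) if keys else 0.0
    return {"uniqueness": uniqueness, "duplicates": duplicates}


rows = [
    {"cust_id": "90328574", "state": "NH"},
    {"cust_id": "90328575", "state": "NH"},
    {"cust_id": "90233489", "state": "MA"},
    {"cust_id": "90328574", "state": "MA"},
]
single = key_analysis(rows, ["cust_id"])
multi = key_analysis(rows, ["cust_id", "state"])
```

In this sample, `cust_id` alone has a duplicate, while the two-column combination is unique, which is the distinction the single versus multi-column evaluation is meant to expose.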
Information Analyzer Spotlight: Cross Table Analysis
• Foreign Key Relationships
• Referential Integrity
• Cross-Domain Relationships
• Data Redundancy
Evaluate single or multi-column foreign keys across any number of tables and sources
Summary of referential integrity
Details of key violations including orphaned values
Test any set of common domains for compatibility or redundancy
Information Analyzer Spotlight
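
A minimal sketch of the foreign key and cross-domain checks follows. It assumes simple in-memory lists of key values; the function names, the integrity and overlap measures, and the sample data are hypothetical and are not the product's definitions.

```python
from typing import Dict, List, Set


def referential_integrity(child_values: List[str], parent_values: List[str]) -> Dict[str, object]:
    """Summarize how many child key values have a parent, and list the orphans."""
    parent_set: Set[str] = set(parent_values)
    orphans = sorted({v for v in child_values if v not in parent_set})
    matched = sum(1 for v in child_values if v in parent_set)
    integrity = matched / len(child_values) if child_values else 1.0
    return {"integrity": integrity, "orphans": orphans}


def domain_overlap(a: List[str], b: List[str]) -> float:
    """Fraction of the distinct values in a that also occur in b."""
    distinct_a = set(a)
    if not distinct_a:
        return 0.0
    return len(distinct_a & set(b)) / len(distinct_a)


customers = ["C1", "C2", "C3"]
order_customers = ["C1", "C1", "C3", "C9"]
result = referential_integrity(order_customers, customers)
overlap = domain_overlap(order_customers, customers)
```

A high overlap between two columns suggests a common domain (a possible foreign key or redundant data), while the orphan list gives the detail of key violations.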
Information Analyzer Spotlight: Baseline Analysis
• Current-to-Prior Comparison
• Content & Structural Variation
Compare a checkpoint or current analysis to a baseline
Table-level summary & column-level details
Identify changes in structure or content
Includes changes in quality measures
Turns data profiling into an ongoing event throughout project lifecycle
Information Analyzer Spotlight
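
The idea of comparing a current analysis to a baseline can be sketched as a dictionary diff. This is a simplified illustration, assuming each analysis is stored as a mapping from column name to a small set of measures; the structure, the measure names, and `compare_to_baseline` are hypothetical.

```python
from typing import Dict, List

ColumnProfile = Dict[str, object]


def compare_to_baseline(
    baseline: Dict[str, ColumnProfile],
    current: Dict[str, ColumnProfile],
) -> List[str]:
    """List structural changes (columns added or removed) and changed measures."""
    changes: List[str] = []
    for column in sorted(set(baseline) | set(current)):
        if column not in current:
            changes.append(f"{column}: column removed")
        elif column not in baseline:
            changes.append(f"{column}: column added")
        else:
            measures = sorted(set(baseline[column]) | set(current[column]))
            for measure in measures:
                old = baseline[column].get(measure)
                new = current[column].get(measure)
                if old != new:
                    changes.append(f"{column}.{measure}: {old!r} -> {new!r}")
    return changes


baseline = {
    "zip": {"type": "string", "null_pct": 0.0},
    "phone": {"type": "string", "null_pct": 2.5},
}
current = {
    "zip": {"type": "string", "null_pct": 1.2},
    "email": {"type": "string", "null_pct": 0.0},
}
changes = compare_to_baseline(baseline, current)
```

Running the same comparison at each checkpoint is one way to treat profiling as an ongoing activity rather than a one-time step.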
All analytical processes can be scheduled
Scheduling supports: start date or delay, repeating definitions, end date or delay, and repeat count to stop schedule
Information Analyzer Spotlight
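
To make the scheduling options concrete, here is a small sketch that expands a start date, a repeat interval, and either an end date or a repeat count into planned run times. It only illustrates the parameters named on the slide; `planned_runs` and its arguments are hypothetical and do not correspond to any Information Server API.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def planned_runs(
    start: datetime,
    interval: timedelta,
    end: Optional[datetime] = None,
    repeat_count: Optional[int] = None,
) -> List[datetime]:
    """Expand a repeating schedule that stops at an end date or after a repeat count."""
    if end is None and repeat_count is None:
        raise ValueError("need an end date or a repeat count to stop the schedule")
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    runs: List[datetime] = []
    current = start
    while (end is None or current <= end) and (
        repeat_count is None or len(runs) < repeat_count
    ):
        runs.append(current)
        current += interval
    return runs


weekly = planned_runs(datetime(2006, 1, 2, 1, 0), timedelta(days=7), repeat_count=3)
daily = planned_runs(datetime(2006, 1, 2), timedelta(days=1), end=datetime(2006, 1, 4))
```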
Information Analyzer Spotlight
Notes on any Repository Object
– Metadata Information
– Any Analytical Result
Supports user-defined Status and Type for subsequent reporting
Multi-level security and administration framework:
Suite
Product
Project
Data source
Standard Authentication controls
User, role, and privilege assignment
Environment that supports critical compliance regulations
Information Analyzer Highlights
Metadata discovery shared across Suite
Projects register interest only in Data Sources of concern
Metadata Import focused on user interest
Analytical results published in secured framework
Information Analyzer Spotlight
Why Should I Care About Cleansing Information?
Lack of information standards
– Different formats & structures across different systems
Data surprises in individual fields
– Data misplaced in the database
Information buried in free-form fields
Data myopia
– Lack of consistent identifiers inhibits a single view
The redundancy nightmare
– Duplicate records with a lack of standards

Kate A. Roberts 416 Columbus Ave #2, Boston, Mass 02116
Catherine Roberts Four sixteen Columbus APT2, Boston, MA 02116
Mrs. K. Roberts 416 Columbus Suite #2, Suffolk County 02116

Name                      Tax ID         Telephone
J Smith DBA Lime Cons.    228-02-1975    6173380300
Williams & Co. C/O Bill   025-37-1888    415-392-2000
1st Natl Provident        34-2671434     3380321
HP 15 State St.           508-466-1200   Orlando

WING ASSY DRILL 4 HOLE USE 5J868A HEXBOLT 1/4 INCH
WING ASSEMBY, USE 5J868-A HEX BOLT .25” - DRILL FOUR HOLES
USE 4 5J868A BOLTS (HEX .25) - DRILL HOLES FOR EA ON WING ASSEM
RUDER, TAP 6 WHOLES, SECURE W/KL2301 RIVETS (10 CM)

19-84-103 RS232 Cable 6' M-F CandS
CS-89641 6 ft. Cable Male-F, RS232 #87951
C&SUCH6 Male/Female 25 PIN 6 Foot Cable

90328574 IBM 187 N.Pk. Str. Salem NH 01456
90328575 I.B.M. Inc. 187 N.Pk. St. Salem NH 01456
90238495 Int. Bus. Machines 187 No. Park St Salem NH 04156
90233479 International Bus. M. 187 Park Ave Salem NH 04156
90233489 Inter-Nation Consults 15 Main Street Andover MA 02341
90345672 I.B. Manufacturing Park Blvd. Bostno MA 04106
Data Cleansing: WebSphere QualityStage
Specialized data quality functions seamlessly integrated with DataStage
Visual tools for defining complex matching and survivorship logic
Ensures clean, standardized, de-duplicated information
Enables a single version of the truth
Cleanse
Subject Matter Experts
Standardize and correct source data fields, and match records together across sources to create a single view
WebSphere QualityStage™
Visual Match Rule Design
Data Analysts
Integrated Approach - QualityStage & Information Analyzer
Sharing metadata
Both Information Analyzer and QualityStage store Table metadata in the common repository
• Allows sharing of metadata definitions
• Provides single metadata import from data source, for use in both tools
– Analytical information available in QS Designer
• Enables QualityStage user to see analysis data for shared tables
• “Analytical Information” tab on the EditRow dialog when looking at the details of an individual column from…
  – …a Table Definition
  – …a stage editor
• “Analytical Information” tab on the TableDefinition dialog
Standardization Benefits
Direct from DB or flat file
Optimize disk
Rules are now ‘first class’ objects
Introduction to New Match Design Environment - Features
The Major Components
Holding Area
Histogram
Data Viewer
Decision Rules
Pass Composer
Cutoff Tuning
The Major Components (cont.)
Statistics
Baseline Analysis
Customizable Graphics
QualityStage Process
Data Quality Assessment (DQA): Investigation
Data Re-Engineering (DRE): Standardization, Matching, Survivorship

Blk 1, 1 St, 05-00
05-00 Frist St, Block 1
1 First Str, #05-00
1, St, #05-00

Blk 1|First St|05-00
Blk 1|First St|05-00
1|First St|#05-00
1|St|#05-00

Blk 1|First St|05-00
Blk 1|First St|05-00
1|First St|#05-00
1|St|#05-00

#05-00, Blk 1, First St
#05-00, 1, St

0001 25.0% L^^T^-^
0001 25.0% ^-^+TL^
0001 25.0% ^OT#^-^
0001 25.0% ^T#^-^
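
The standardization step in this process can be illustrated with a very small lookup-based sketch. It is not a QualityStage rule set: the `TOKEN_RULES` table and the `standardize` function are hypothetical, and the sketch only normalizes token spellings; it does not parse the address into separate fields as the pipe-delimited records on the slide do.

```python
import re
from typing import Dict

# Hypothetical lookup table: upper-cased token variant -> standard form.
TOKEN_RULES: Dict[str, str] = {
    "BLOCK": "Blk",
    "BLK": "Blk",
    "STR": "St",
    "STREET": "St",
    "ST": "St",
    "FRIST": "First",
    "FIRST": "First",
}


def standardize(text: str) -> str:
    """Split free-form text into tokens and replace known variants."""
    # Tokens are runs of word characters, '#' and '-'; commas and spaces separate them.
    tokens = re.findall(r"[#\w-]+", text)
    return " ".join(TOKEN_RULES.get(token.upper(), token) for token in tokens)


cleaned = [
    standardize("05-00 Frist St, Block 1"),
    standardize("1 First Str, #05-00"),
]
```

A table-driven approach like this keeps the correction rules as data, which loosely echoes the slide's point that rules are managed as objects in their own right.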
Investigation - Character
1. Double Click
2. Select Column
3. Add
9. Define output as desired
Standardization
Job: Tech Symposium\QualityStage\2.Standardarize\StanAndGenMatchFreqODBC
1. Double Click
1. Double Click
6. Stage Properties
7. Output tab to map columns
8. OK
Match Design – Unduplicate - Overview
The Major Components
Holding Area
Histogram
Data Viewer
Decision Rules
Pass Composer
Cutoff Tuning
Match Design - Unduplicate
1. Create Specification
Blank Specification
2. Select Match Type
3. Double click on link to load metadata
4. Load
5. Navigate and OK
OK
6. Click on ‘MyPass’
‘Blocking’
‘Match Commands’
8. Save Match Specification
9. Give Name and ‘Save’
10. Configuration
11. Data Sample
12. Data Frequency
13. Data Source Name
14. User Name (qsmatch)
15. Password (qsmatch)
16. Add Blocking Columns
17. Select Column
18. Add MATCH Column
19. Business Name
20. Compare Type
21. Data Column Right-Click
Frequencies
22. Select
23. Parameter
Match Design – Unduplicate (Fully Configured)
![Page 62: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/62.jpg)
62
Match Design – Unduplicate
Grouping option:Match Sets: See all matches and duplicates togetherMatch Pairs+Sort: See the master record repeated
![Page 63: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/63.jpg)
63
Match Design – Unduplicate
Default Display (Grouped by Match Sets)
Grouped by Match Pairs and then sorted Ascending by Weight
![Page 64: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/64.jpg)
64
Match Design – Unduplicate
Compare Weights:See how any two records score
![Page 65: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/65.jpg)
65
Match Design – Unduplicate
Statistics Tab
Change What Shows
![Page 66: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/66.jpg)
66
Match Design – Unduplicate
Change How Shows
![Page 67: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/67.jpg)
67
Match Design – Unduplicate
TOTAL Statistics Tab
Change What Shows
Change How Shows
![Page 68: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/68.jpg)
68
Match Implementation - Unduplicate
![Page 69: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/69.jpg)
69
Unduplication Implementation
Job: Tech Symposium\QualityStage\3.Unduplicate\Unduplicate
1. Double Click
![Page 70: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/70.jpg)
70
Unduplication Implementation
2. Click ‘…’
![Page 71: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/71.jpg)
71
Unduplication Implementation
8. Output Tab to map columns
![Page 72: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/72.jpg)
72
Survive
![Page 73: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/73.jpg)
73
Survive
Job: Tech Symposium\QualityStage\4.Survive\Survive
1. Double Click
![Page 74: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/74.jpg)
74
Survive
2. Select Group Identification Column
3. Highlight and ‘Modify Rule’
![Page 75: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/75.jpg)
75
Survive
4. Output Column
5. Technique
![Page 76: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/76.jpg)
76
Survive
Out-of-the-box Techniques
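The Survive steps (choose a group identification column, then pair each output column with a technique) can be sketched as below. This is an illustration of the idea only: the two techniques shown, the column names, and the `survive` function are assumptions, not the stage's actual technique list or API.

```python
from collections import Counter, defaultdict

# Illustrative techniques; the names are stand-ins chosen for this sketch.
TECHNIQUES = {
    "longest": lambda values: max(values, key=len),
    "most_frequent": lambda values: Counter(values).most_common(1)[0][0],
}

# Hypothetical matched records; "set_id" plays the group identification column.
RECORDS = [
    {"set_id": 1, "name": "J Smith", "phone": "555-0100"},
    {"set_id": 1, "name": "John Smith", "phone": ""},
    {"set_id": 1, "name": "John Smith", "phone": "555-0100"},
    {"set_id": 2, "name": "Ann Lee", "phone": "555-0199"},
]

# One rule per output column: column name -> technique name.
RULES = {"name": "longest", "phone": "most_frequent"}


def survive(records, group_col, rules):
    """Build one surviving record per group by applying each column's technique."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec[group_col]].append(rec)

    survivors = []
    for group_id, members in groups.items():
        best = {group_col: group_id}
        for column, technique in rules.items():
            # Blank values never compete for survival.
            values = [m[column] for m in members if m.get(column)]
            best[column] = TECHNIQUES[technique](values) if values else ""
        survivors.append(best)
    return survivors


if __name__ == "__main__":
    for row in survive(RECORDS, "set_id", RULES):
        print(row)
```

Because each column is decided independently, the surviving record can combine values from different source records in the same group.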
![Page 77: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/77.jpg)
77
Survive
‘Complex’ available
![Page 78: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/78.jpg)
78
Single Design Environment
All phases of data quality:
– Investigate
– Standardize
– Match
  • Unduplicate
  • Reference
– Survive
![Page 79: IBM Information Integration Capabilities · PDF fileNotes on any Repository Object – Metadata Information – Any Analytical Result Supports user- ... DataStage Visual tools for](https://reader035.fdocuments.net/reader035/viewer/2022081323/5a78e7377f8b9a77088ccfe5/html5/thumbnails/79.jpg)
79