2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Transcript of 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Page 1: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

8/8/2019 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

http://slidepdf.com/reader/full/2552-ibm-websphere-quality-stage-80-deep-dive 1/30

IBM WebSphere QualityStage 8.0

Deep Dive

Page 2: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Agenda

What is QualityStage?

Outline of Product Strategy

Tour of QualityStage 8.0

 - Product Look & Feel

 - Palette of functionality

How QualityStage has improved functionally

Demonstration

How to upgrade

Q&A

Page 3: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

What is QualityStage?

Page 4: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Need for Data Quality

Critical Problems

 - Need to create & maintain 360-degree views of customers, suppliers, products, locations, events

 - Need to leverage data - make reliable decisions, comply with regulations, meet service agreements

Why?

 - No common standards across organization

 - Unexpected values stored in fields

 - Required information buried in free-form fields

 - Fields evolve - used for multiple purposes

 - No reliable keys for consolidated views

 - Operational data degrades 2% per month

Alternative Approaches

 - Denial - problem misunderstood and ignored until too late; load and explode

 - Hand-coding - clerical exception processing; very time consuming and resource intensive

 - Simplistic cleansing apps - evolved from direct marketing & list hygiene, lack flexibility

Kent Fried Chick

Kentucky Fried

Kentucky Fried Chicken

KFC

Molly Talber DBA KFC

Mrs. M. Talber 

John & Molly Talber 

Talber, KFC, ATIMA

(Figure labels: Data Sources, Data Values)

227G CB&N ATURAL STICKMOZZ WRAPPER

227G CB&N AT STICK PQUE/MOZZ WRAPP.

Page 5: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Measuring & Resolving: Designing Data Quality Rules

Data quality rules are embedded into data flows

 - Investigate source data

 - Standardize information

 - Match records together

 - Survive the best data across sources into a new record
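The final "survive" step can be illustrated with a small sketch. The rule used here (keep the longest non-empty value for each field across a group of matched records) is an assumption chosen for illustration; the slide does not say which survivorship rules QualityStage applies. The `survive` helper, the record layout, and the grouping of these three records are all hypothetical.

```python
def survive(records):
    """Build one surviving record from a group of matched records.

    Assumed rule (not from the slides): for each field, keep the longest
    non-empty value seen in any of the matched records.
    """
    survivor = {}
    for record in records:
        for field, value in record.items():
            value = (value or "").strip()
            # A longer value replaces a shorter one; empty values never win.
            if len(value) > len(survivor.get(field, "")):
                survivor[field] = value
    return survivor


# Hypothetical group of records that a match step has already linked.
matched_group = [
    {"name": "KFC", "contact": "Mrs. M. Talber"},
    {"name": "Kentucky Fried Chicken", "contact": ""},
    {"name": "Kentucky Fried", "contact": "Molly Talber"},
]
best = survive(matched_group)
```

In practice a survivorship rule would more likely combine several criteria (source reliability, recency, completeness); "longest value" is simply the easiest one to show in a few lines.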

Page 6: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Investigation - Word

Parsing: Separating multi-valued fields into individual pieces

123 | St. | Virginia | St.

Lexical analysis: Determining business significance of individual pieces

Number | Street Type | Alpha | Street Type

123 | St. | Virginia | St.

Context Sensitive: Identifying various data structures and content

House Number | Street Name | Street Type

123 | St. Virginia | St.

"The instructions for handling the data are inherent within the data itself."
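The three investigation steps on this slide (parsing, lexical analysis, context-sensitive interpretation) can be sketched in a few lines of Python. This is only an illustration of the idea: the class names, the `STREET_TYPE_WORDS` lookup, and the positional rule in `interpret` are assumptions, not QualityStage's actual rule sets.

```python
# Hypothetical lookup of words that usually denote a street type.
STREET_TYPE_WORDS = {"ST", "STR", "AVE", "AV", "AVENUE", "RD", "ROAD"}


def parse(text):
    """Parsing: split a free-form field into individual tokens."""
    return text.split()


def classify(token):
    """Lexical analysis: assign a coarse class to a single token."""
    core = token.strip(".,").upper()
    if core.isdigit():
        return "NUMBER"
    if core in STREET_TYPE_WORDS:
        return "TYPE"
    return "ALPHA"


def interpret(tokens):
    """Context-sensitive step: use token positions to resolve ambiguity.

    Assumed rule: a leading NUMBER is the house number, a trailing TYPE is
    the street type, and everything in between is the street name, so a
    "St." that precedes a name is read as part of the name.
    """
    classes = [classify(t) for t in tokens]
    start = 1 if classes and classes[0] == "NUMBER" else 0
    has_type = len(tokens) > start and classes[-1] == "TYPE"
    end = len(tokens) - 1 if has_type else len(tokens)
    return {
        "house_number": tokens[0] if start else "",
        "street_name": " ".join(tokens[start:end]),
        "street_type": tokens[end] if has_type else "",
    }


tokens = parse("123 St. Virginia St.")
classes = [classify(t) for t in tokens]
fields = interpret(tokens)
```

Token by token, both occurrences of "St." look like a street type; only the positional rule separates the "St." in "St. Virginia" from the trailing street type, which is the point of the context-sensitive step.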

Page 7: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Standardization - Address

Input File:

Address Line 1               Address Line 2
639 N MILLS AVENUE           ORLANDO, FLA 32803
306 W MAIN STR,              CUMMING, GA 30130
3142 WEST CENTRAL AV         TOLEDO OH 43606
843 HEARD AVE                AUGUSTA-GA-30904
1139 GREENE ST ACCT #1234    AUGUSTA GEORGIA 30901
4275 OWENS ROAD SUITE 536    EVANS GA 30809

Result File:

House #  Dir  Str. Name  Type  Unit No.  NYSIIS   City     SOUNDEX  State  Zip    ACCT#
639      N    MILLS      AVE             MAL      ORLANDO  O645     FL     32803
306      W    MAIN       ST              MAN      CUMMING  C552     GA     30130
3142     W    CENTRAL    AVE             CANTRAL  TOLEDO   T430     OH     43606
843           HEARD      AVE             HAD      AUGUSTA  A223     GA     30904
1139          GREENE     ST              GRAN     AUGUSTA  A223     GA     30901  1234
4275          OWENS      RD    STE 536   ON       EVANS    E152     GA     30809

Results in strongly "typed", fixed-fielded, standardized data
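The SOUNDEX column in the result file can be reproduced with the standard American Soundex algorithm. The sketch below is a plain Python implementation written for this illustration, not QualityStage code; it is checked against the city codes shown on the slide. The NYSIIS column uses a different and more involved algorithm and is not covered here.

```python
# Standard American Soundex letter groups.
_GROUPS = (
    ("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
    ("L", "4"), ("MN", "5"), ("R", "6"),
)
_CODES = {letter: digit for letters, digit in _GROUPS for letter in letters}


def soundex(word):
    """Return the 4-character Soundex code of a word ("" for no letters)."""
    letters = [c for c in word.upper() if "A" <= c <= "Z"]
    if not letters:
        return ""
    result = [letters[0]]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not separate letters with the same code
        code = _CODES.get(ch, "")
        if code and code != prev:
            result.append(code)
        prev = code  # vowels reset prev, so a repeated code is kept after them
    return ("".join(result) + "000")[:4]


# City codes as they appear in the slide's result file.
slide_values = {
    "ORLANDO": "O645",
    "CUMMING": "C552",
    "TOLEDO": "T430",
    "AUGUSTA": "A223",
    "EVANS": "E152",
}
computed = {city: soundex(city) for city in slide_values}
```

Phonetic keys like this are typically used to group candidate records before matching, so that spelling variants of the same city or street land in the same group.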

Page 8: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Effective Matching

Matching is the most beneficial and technically challenging part of data quality

Matching:

 - Should be based on statistical probability

 - Must consider frequency, discriminating values, & reliability of fields when determining which fields to weight in a match

 - Against more fields of data produces higher quality matches

 - Logic is a very business-sensitive issue - business users should be involved in the design & review of match results
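One common way to turn "frequency, discriminating values, and reliability" into field weights is the Fellegi-Sunter style of probabilistic record linkage. The slide does not name a model, so treating it this way is an assumption; the field names and probabilities below are invented for illustration.

```python
import math


def agreement_weight(m, u):
    """Weight added when a field agrees: log2(m / u)."""
    return math.log2(m / u)


def disagreement_weight(m, u):
    """Weight added when a field disagrees (negative whenever m > u)."""
    return math.log2((1 - m) / (1 - u))


# Illustrative probabilities only; they are not taken from the slides.
# m: chance the field agrees when two records truly match (reliability).
# u: chance the field agrees by accident on a non-match (value frequency).
FIELDS = {
    "zip": {"m": 0.95, "u": 0.05},
    "state": {"m": 0.95, "u": 0.50},
    "street_name": {"m": 0.90, "u": 0.01},
}


def composite_weight(agreements):
    """Sum the per-field weights for one candidate pair of records."""
    total = 0.0
    for field, agrees in agreements.items():
        m, u = FIELDS[field]["m"], FIELDS[field]["u"]
        total += agreement_weight(m, u) if agrees else disagreement_weight(m, u)
    return total


score = composite_weight({"zip": True, "state": True, "street_name": False})
```

With these numbers, agreement on `street_name` contributes far more than agreement on `state`, because a street name rarely agrees by chance while half of all random pairs share a state. That is the sense in which a discriminating field deserves more weight.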

Page 9: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

What Do You Do with Match Results?

Clerical review

Record linkage

Survivorship

Append/Fix sources

Cross-reference

Page 10: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Product Strategy

Page 11: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

The IBM Solution: IBM Information Server

Delivering information you can trust

IBM Information Server

Discover, model, and govern information structure and content

Standardize, merge, and correct information

Combine and restructure information for new uses

Synchronize, virtualize and move information for in-line delivery

Unified Deployment

Unified Metadata Management

Page 12: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

One Stop Shopping For Data Quality and Data Transformation

QualityStage 8.0 delivers best-in-class data quality features All in One Environment

 - Delivers collaborative environment via common repository

 - Leverages shared canvas - Design as you Think

 - Removes dependencies on metabroker & plug-in technology to WebSphere DataStage

 - Read/Write directly from a Database with preferred connectivity options

 - Massive productivity gains and control over integration tasks

QualityStage 8.0 delivers advanced functionality to enable a New Class of Users in developing matching applications

 - A dynamic control center driven approach

 - Eliminates the need for additional products in showcasing underlying data and metadata information

 - Results are shown as you develop

Page 13: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

WebSphere DataStage & QualityStage Designer 8.0

Page 14: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Impact Analysis - Graphical View

 - Find dependencies... What does this item depend on?

 - Find where used... Where is this item used?

Impact Analysis:

Results shown using the Advanced Find window

Page 15: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Job Difference - Integrated report

Difference report displayed in Designer - jobs opened automatically from report hot links

Options available to:

 - Print report

 - Simple "Find" in report

 - Launch external diff tool for more in-depth diff of textual properties, e.g. column name, column length

Page 16: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Rich Reporting - Collaborative Documentation

Job Design flow provides visual representation of processes

Control description delivered as hotlink for quick navigation to relevant process

Thin client implementation provides collaboration

Page 17: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Let's take a tour through

QualityStage 8.0

Page 18: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

How we've improved functionally

Page 19: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Investigation Flexibility

Single Process

Land Results

Process Faster

Page 20: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Integrated Approach - QualityStage & Information Analyzer

Sharing metadata at Hawk

 - Both Information Analyzer and QualityStage store Table metadata in the common repository

   Allows sharing of metadata definitions

   Provides single metadata import from data source - for use in both tools

 - Analytical information available in QS Designer

   Enables QualityStage user to see analysis data for shared tables

"Analytical Information" tab on the EditRow dialog when looking at the details of an individual column from...

 - ...a Table Definition

 - ...a stage editor

"Analytical Information" tab on the Table Definition dialog

Page 21: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Standardization Benefits

Direct from DB or flat file

Optimize disk

Rules are now 'first class' objects

Page 22: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Introduction to New Match Design Environment - Features

The Major Components

 - Holding Area

 - Histogram

 - Data Viewer

 - Decision Rules

 - Pass Composer

 - Cutoff Tuning
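The Histogram and Cutoff Tuning components suggest a workflow of looking at the distribution of match weights and then moving cutoffs to see how many record pairs land in each outcome. The sketch below illustrates that idea only; the function names, bucket width, sample weights, and cutoff values are hypothetical and do not describe how the Match Designer is implemented.

```python
from collections import Counter


def histogram(weights, bucket_width=2):
    """Count candidate pairs per weight bucket, keyed by bucket lower bound."""
    return Counter(int(w // bucket_width) * bucket_width for w in weights)


def apply_cutoffs(weights, match_cutoff, clerical_cutoff):
    """Split pairs into match / clerical review / non-match by weight."""
    outcome = {"match": 0, "clerical": 0, "nonmatch": 0}
    for w in weights:
        if w >= match_cutoff:
            outcome["match"] += 1
        elif w >= clerical_cutoff:
            outcome["clerical"] += 1
        else:
            outcome["nonmatch"] += 1
    return outcome


# Made-up composite weights for ten candidate record pairs.
weights = [-6.5, -3.0, -1.2, 0.4, 2.8, 3.1, 7.9, 9.4, 11.0, 12.6]
buckets = histogram(weights)
loose = apply_cutoffs(weights, match_cutoff=7, clerical_cutoff=0)
strict = apply_cutoffs(weights, match_cutoff=10, clerical_cutoff=3)
```

Comparing `loose` with `strict` shows the trade-off being tuned: raising the cutoffs reduces automatic matches and pushes more pairs into clerical review or non-match.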

Page 23: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Introduction to New Match Design Environment - Features

The Major Components (cont.)

 - Statistics

 - Baseline Analysis

 - Customizable Graphics

Page 24: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Demonstration of Major 

Features

Page 25: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

How to upgrade

Page 26: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

QualityStage Import Features

Bring all stages into the canvas as a single stage

Process from end to end with minimal configuration

Write new projects using Hawk design framework

Page 27: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

QualityStage Import Features (2)

Bring all stages into the canvas as a process flow

Showcases each area for process and allows integration with other Hawk stages

Conversion utilities in plan

Page 28: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Platforms

Beta

 - Windows Server 2003

 - AIX 5.2, 5.3

 - Red Hat Enterprise Linux AS 3.0

 - DS & QS Client: Windows XP

GA Adds

 - Red Hat Enterprise Linux AS 4.0

 - SuSE Enterprise Linux 9, 10

 - HP-UX 11i1 (11.11), 11i2 (11.23) - PA-RISC

 - Solaris 2.9, 2.10

NLS Support, but not localized

Page 29: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

The IBM Information Server Advantage

A Complete Information Infrastructure

A comprehensive, unified foundation for enterprise information architectures, scalable to any volume and processing requirement

Auditable data quality as a foundation for trusted information across the enterprise

Metadata-driven integration, providing breakthrough productivity and flexibility for integrating and enriching information

Consistent, reusable information services - along with application services and process services, an enterprise essential

Accelerated time to value with proven, industry-aligned solutions and expertise

Broadest and deepest connectivity to information across diverse sources: structured, unstructured, mainframe, and applications

Page 30: 2552 IBM WebSphere Quality Stage 8.0 Deep Dive

Thank You & Questions