Defining a Person Metadata Model to Improve Data Quality

8th February 2011

Ian Woodrow

[email protected]

Atos, Atos and fish symbol, Atos Origin and fish symbol, Atos Consulting, and the fish itself are registered trademarks of Atos Origin SA. August 2006 © 2006 Atos Origin. Confidential information owned by Atos Origin, to be used by the recipient only. This document or any part of it, may not be reproduced, copied, circulated and/or distributed nor quoted without prior written approval from Atos Origin.


Make things as simple as possible, but no simpler
Einstein

Essentially, all models are wrong, but some are useful
Box & Draper

If you can’t measure it, you can’t manage it…
Various known

Your Presenter

Role – Service/Project Manager and Information Analyst

Projects

» Data Migration and Cleansing
» Data Standards Implementations
» Sales and Account Development
» Business and Data Analysts
» Accruals Accounting
» [email protected]

Employment

» National Audit Office (E&AD)
» Capgemini
» Freelance
» Atos Origin

Recent Training

» TOGAF9 Enterprise Data Architect
» Prince2 Practitioner
» Value Analysis.

Introduction

» Information Management Service
» Corporate Data Model
» Person Metadata Model
» Data Standards
» Data Profile Reporting
» Questions and Queries
» Contact Me.

Information Management Service

Corporate Data Model

PEOPLE/ORGANISATION
DOCUMENTS
ORGANISATION
COMPANY
SERVICES
EVENTS
LOCATIONS

Applications

» Registry
» Membership
» Benefit
» Biometrics
» Resources
» Overseas

Data Standards

Select Target Attribute

» Identify Local Application Alternatives
» Look for Commonality
» Review Government Standards
» Review UK National Standards
» Review Other Government Standards (e.g. NIST)
» Review International Standards
» Assess Applicability
» Decide Solution
» Confirm Solution
» Publish Solution

Target Attribute Metadata

» Datatype
» Length
» Description
» Values Lists
» Constraints
» Defaults
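The target attribute metadata listed above could be captured in a small record type. The following is a minimal sketch only: the class name AttributeStandard, its field names, and the example "Status" attribute are illustrative assumptions, not part of the CDM described in the slides.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class AttributeStandard:
    """Metadata for one target attribute (all names here are illustrative)."""
    name: str
    datatype: type
    length: Optional[int] = None                 # maximum length, if any
    description: str = ""
    values_list: Optional[List[str]] = None      # permitted values, if enumerated
    constraints: List[Callable[[object], bool]] = field(default_factory=list)
    default: Optional[object] = None

    def violations(self, value: object) -> List[str]:
        """Return the rules that `value` breaks; an empty list means valid."""
        if value is None:
            value = self.default                 # fall back to the default
        if value is None:
            return ["missing value and no default"]
        if not isinstance(value, self.datatype):
            return [f"expected {self.datatype.__name__}"]
        problems = []
        if self.length is not None and len(str(value)) > self.length:
            problems.append(f"longer than {self.length}")
        if self.values_list is not None and value not in self.values_list:
            problems.append("not in values list")
        for i, check in enumerate(self.constraints):
            if not check(value):
                problems.append(f"constraint {i} failed")
        return problems


# Hypothetical attribute, invented purely to exercise the sketch.
status = AttributeStandard(
    name="Status",
    datatype=str,
    length=1,
    description="Illustrative single-character status code",
    values_list=["A", "B", "C"],
    constraints=[str.isupper],
    default="A",
)

print(status.violations("B"))
print(status.violations("ZZ"))
```

Keeping the values list and constraints next to the datatype and length means one published definition can drive both documentation and validation.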

Tooling: Best of Breed Solution

PRODUCTS

» Erwin Data Modeller
» Model Manager
» Process Modeller (BPWin)
» Discovery

CHANNELS

» Intranet
» Tech Doc Library

CDM Person Subject Area

Cross-reference CDM to Applications

ATTRIBUTE | CDM | REGISTRY | MEMBERSHIP
Full Name | | |
Family Name | PERSON.FAMILY NAME | REGISTER.PRINCIPAL NAME | PEOPLE.SURNAME
Given Names | | |
Date of Birth | | |
Gender | | |
Nationality(s) | | |
Language(s) | | |
Alternative Details | | |
Organisation Start Date | | |
Organisation Reference | | |
International ID (Passport) | | |
NI Number | | |
First Line Address | | |
Postcode | | |
Telephone Number(s) | | |
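A cross-reference of this kind can be held as a simple lookup and used to rename application columns to their CDM equivalents. The sketch below uses only the Family Name row from the slide; the function names and the sample record (including the PEOPLE.SHOE_SIZE column) are hypothetical.

```python
# CDM attribute -> application-specific column.
# Only the Family Name row comes from the slide; other rows would follow the same shape.
CROSS_REFERENCE = {
    "PERSON.FAMILY NAME": {
        "REGISTRY": "REGISTER.PRINCIPAL NAME",
        "MEMBERSHIP": "PEOPLE.SURNAME",
    },
}


def to_cdm(application: str, record: dict) -> dict:
    """Rename an application record's columns to their CDM attribute names."""
    lookup = {
        columns[application]: cdm_name
        for cdm_name, columns in CROSS_REFERENCE.items()
        if application in columns
    }
    return {lookup[col]: value for col, value in record.items() if col in lookup}


def unmapped(application: str, record: dict) -> list:
    """List the record's columns that have no CDM cross-reference yet."""
    mapped = {columns.get(application) for columns in CROSS_REFERENCE.values()}
    return [col for col in record if col not in mapped]


# Hypothetical Membership record used only for illustration.
member = {"PEOPLE.SURNAME": "Smith", "PEOPLE.SHOE_SIZE": 9}
print(to_cdm("MEMBERSHIP", member))
print(unmapped("MEMBERSHIP", member))
```

The unmapped list gives a quick view of application columns that the CDM does not yet cover.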

Publishing the Results

Person Metadata Model – Vision

» To Create a Whole Customer View
   » Person Identity
   » Key Events
   » Sufficient for Identity Resolution

» Organisation Facts
   » Siloed Data Sources
   » Several Suppliers
   » Considering:
      » Master Data Management
      » Service Oriented Architecture

» Technical Facts
   » Different Databases
      - SQL Server
      - Oracle
      - Access
   » No ETL Tool on server

Benefits of Person Metadata Model Approach

» Address data shortfall in existing applications (enrichment)
» Form basis for new application data models
» Benchmark internal developments and suppliers' offerings
» Baseline for Profiling

Person Metadata Model

Person Baseline Analysis

Applications (21 existing + new)

CDM Attribute Scoring

Application Data Model to CDM Analysis

Data Standards - Approach

Select Target Attribute

» Identify Local Application Alternatives
» Look for Commonality
» Review Government Standards
» Review UK National Standards
» Review Other Government Standards (e.g. NIST)
» Review International Standards
» Assess Applicability
» Decide Solution
» Confirm Solution
» Publish Solution

Data Standards – Observations and Issues

» Local Standards

» Name
   - Two Fields (Given Names and Family Name)
   - Single Composite Fields

» Problems
   - Which is the Family Name (single field option)
   - Cultures with no Family Name
   - Length, Which Character Set to Use
   - Applying National Standards to International Situations

» Gender
   - Single Character pretty universally (but also text)
   - Which standard to apply (M/F or H/D, not known, not disclosed, not specified)

Data Standards: Given Names Resolution

» eGIF
   » Originally CDM mandated to be compliant with eGIF
   » Too short

» Application Standard
   » Acknowledged as almost long enough!

» New Standard for length 100 characters prevails

» Promoted the publication on CDM website/Release Note

Data Standards: Gender Resolution

» Application examples
   » Values: M/F, M/F/U, M/F/D

» eGIF:
   » 4 values 0, 1 (Male), 2 (Female), 9

» Use of ISO/IEC 5218:2004 standard, enables international interchange

» Standard for 4 UK values with mappings

» Promoted the publication on CDM website/Release Note
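The mapping from local application codes to the four ISO/IEC 5218 codes (0 not known, 1 male, 2 female, 9 not applicable) might look like the sketch below. The treatment of "U" as not known and of missing values as 0 are assumptions made for illustration; the slides do not give the actual mapping table, and codes such as "D" are deliberately left unmapped so they surface for review.

```python
from typing import Optional

# ISO/IEC 5218 codes for the representation of human sexes.
ISO_5218 = {0: "Not known", 1: "Male", 2: "Female", 9: "Not applicable"}

# Assumed mapping from local application codes; not taken from the slides.
LOCAL_TO_ISO = {"M": 1, "F": 2, "U": 0}


def to_iso_5218(raw: Optional[str]) -> Optional[int]:
    """Map a local gender code to ISO/IEC 5218, or None if it needs review."""
    if raw is None or raw.strip() == "":
        return 0  # assumption: a missing value is recorded as "not known"
    return LOCAL_TO_ISO.get(raw.strip().upper())


for code in ["M", " f ", "U", None, "D"]:
    mapped = to_iso_5218(code)
    label = ISO_5218[mapped] if mapped is not None else "needs review"
    print(repr(code), "->", mapped, label)
```

Returning None for unrecognised codes, rather than guessing, keeps the unresolved values visible to the data steward.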

Data Profile Reporting

» Standard Data Profiling
   - Field Type Distribution
   - Field Uniqueness
   - Relationship Integrity

» CDM Profiling
   - Person Baseline
   - Event Baseline

» Issues
   - Root Cause Analysis
   - People/Process/Technology Approach

Field Type Distribution Report

Presents the percentage of rows within a particular field that fall into any of the following field type categories:

» Null
» Integer
» String
» Decimal
» Space
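A field type distribution of this kind can be approximated with a small classifier. This is a sketch under stated assumptions: the function names are invented, empty strings are counted as Null, whitespace-only values as Space, and the integer and decimal patterns are deliberately strict. The profiling tool used in the project may classify values differently.

```python
import re
from collections import Counter

CATEGORIES = ["Null", "Integer", "String", "Decimal", "Space"]
INTEGER = re.compile(r"[+-]?[0-9]+")
DECIMAL = re.compile(r"[+-]?([0-9]+\.[0-9]*|\.[0-9]+)")


def classify(value) -> str:
    """Assign one value to a field type category."""
    if value is None or value == "":
        return "Null"       # assumption: empty strings count as Null
    text = str(value)
    if text.isspace():
        return "Space"      # whitespace-only content
    if INTEGER.fullmatch(text):
        return "Integer"
    if DECIMAL.fullmatch(text):
        return "Decimal"
    return "String"


def field_type_distribution(values) -> dict:
    """Percentage of rows in each category for one field."""
    values = list(values)
    counts = Counter(classify(v) for v in values)
    total = len(values)
    return {
        cat: (100.0 * counts[cat] / total if total else 0.0)
        for cat in CATEGORIES
    }


# Invented sample column for illustration.
sample = [None, "", " ", "12", "-3", "4.5", "Ann", "7a"]
print(field_type_distribution(sample))
```

A column that is expected to be numeric but shows a non-zero String percentage is a candidate for root cause analysis.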

Field Uniqueness Report

Presents the percentage uniqueness of a column

» Null Percentage
» All Distinct e.g. ID fields
» Reference Data (few distinct values)
» Form a view of adequate uniqueness
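The uniqueness measures above could be computed per column as in the following sketch. The function name, the result keys, and the choice to measure uniqueness over non-null values only are assumptions for illustration.

```python
def field_uniqueness(values) -> dict:
    """Null percentage and distinct-value statistics for one column."""
    values = list(values)
    total = len(values)
    non_null = [v for v in values if v is not None]
    distinct = len(set(non_null))
    return {
        "null_pct": 100.0 * (total - len(non_null)) / total if total else 0.0,
        "distinct": distinct,
        # assumption: uniqueness is measured over non-null values only
        "uniqueness_pct": 100.0 * distinct / len(non_null) if non_null else 0.0,
        "all_distinct": bool(non_null) and distinct == len(non_null),
    }


# Invented sample columns: an ID field and a reference-data field.
ids = field_uniqueness([1, 2, 3, 4])
gender = field_uniqueness(["M", "F", "M", None])
print(ids)
print(gender)
```

An ID field would be expected to report all_distinct as true, while a reference-data field such as gender should show only a handful of distinct values.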

Relationship Integrity Report

Presents the widows and orphans by number/percentage

[Figure: bar chart titled "TABLE 1.FIELD 1 and TABLE 2.FIELD 2 Join Analysis", valid as of xx/xx/xxxx, showing loaded, matching and non-matching rows and values for TABLE 1.FIELD 1 and TABLE 2.FIELD 2 on a power-of-ten axis. Legible data labels: 3016, 1247, 1247, 1769.]
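A join analysis of this kind can be sketched with plain set operations. The sketch assumes that "widows" are parent rows with no matching child and "orphans" are child rows with no matching parent; the slides do not define the terms, and the function name and sample keys are invented.

```python
def relationship_integrity(parent_keys, child_keys) -> dict:
    """Count widows and orphans between two joined key columns."""
    parent_keys, child_keys = list(parent_keys), list(child_keys)
    parent_set, child_set = set(parent_keys), set(child_keys)

    def pct(part, whole):
        return 100.0 * part / whole if whole else 0.0

    # assumption: widow = parent row without a child, orphan = child row without a parent
    widows = [k for k in parent_keys if k not in child_set]
    orphans = [k for k in child_keys if k not in parent_set]
    return {
        "widow_rows": len(widows),
        "widow_pct": pct(len(widows), len(parent_keys)),
        "orphan_rows": len(orphans),
        "orphan_pct": pct(len(orphans), len(child_keys)),
        "matching_values": len(parent_set & child_set),
    }


# Invented keys: TABLE 1.FIELD 1 as parent, TABLE 2.FIELD 2 as child.
result = relationship_integrity([1, 2, 3, 4], [2, 2, 3, 9])
print(result)
```

Counting rows and distinct values separately matters because one bad key can account for many orphaned rows.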

Profiling Examples: Given Names

» Presence – 101 Null
» Unique Values – 99,999 N/A
» Patterns – 4,000
   » Leading Spaces – 100
   » Contains Digits – 50
   » Initials Only – 25
   » Trailing Spaces – 50
   » Leading Punctuation – 100
   » Non-printable Characters – 10
» Minimum Value – 0
» Maximum Value – …

Profiling Examples: Gender

» Presence – 25,000 Null
» Unique Values – 25 values
» Patterns – 8
   » Leading Spaces – 1
   » Contains Digits – N/A
   » Initials Only – 2
   » Trailing Spaces – 3
   » Leading Punctuation – 4
   » Non-printable Characters – 5
» Minimum Value – male
» Maximum Value – vnm
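The measures in the two profiling examples could be produced by checks along the following lines. This is a hedged sketch: the function names, the pattern notation (letters as "A", digits as "9"), the definition of "initials only", and the sample values are all assumptions, and the counts it prints bear no relation to the figures on the slides.

```python
import re
import string

# Assumed definition of "initials only": single letters separated by dots or spaces.
INITIALS = re.compile(r"[A-Za-z](?:[.\s]+[A-Za-z])*\.?")


def pattern(text: str) -> str:
    """Reduce a value to its shape: letters become A, digits become 9."""
    return "".join(
        "A" if c.isalpha() else "9" if c.isdigit() else c for c in text
    )


def profile(values) -> dict:
    """Basic profile of one text column; None is treated as Null."""
    values = list(values)
    present = [v for v in values if v is not None]
    return {
        "null": len(values) - len(present),
        "unique_values": len(set(present)),
        "patterns": len({pattern(v) for v in present}),
        "leading_spaces": sum(v[:1].isspace() for v in present),
        "trailing_spaces": sum(v[-1:].isspace() for v in present),
        "contains_digits": sum(any(c.isdigit() for c in v) for v in present),
        "initials_only": sum(bool(INITIALS.fullmatch(v.strip())) for v in present),
        "leading_punctuation": sum(
            v[:1] != "" and v[0] in string.punctuation for v in present
        ),
        "non_printable": sum(any(not c.isprintable() for c in v) for v in present),
        "minimum": min(present, default=None),
        "maximum": max(present, default=None),
    }


# Invented given-name values for illustration only.
sample_names = ["Ann", " Bob", "J.", "M4ry", None, "-Sue", "Ann", "Al\t"]
report = profile(sample_names)
print(report)
```

Minimum and maximum here are simple string comparisons, which is why values with leading spaces or punctuation tend to surface as the minimum.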

Gender – Unique Values

CDM Profiling – Report and Presentation

» Report
   » Executive Summary
   » Introduction
   » Process Overview
   » Profile Scope
   » How to read the Document
   » Data Facts
   » Standard Reports
   » Person Baseline Analysis
   » Findings
   » Recommendations
   » Assumptions
   » Observation Details
   » Next Steps

» Presentation
   » Introduction
   » Executive Summary
   » Observations
      - People
      - Process
      - Technology
   » Business Drivers
   » Profile Objective
   » Profile Approach
   » Person Baseline Analysis
   » Next Steps

Addressing the Issues

» Root Cause Analysis
   » Fact Based
   » To preclude problem recurring

» Reporting Approach
   » People
   » Process
   » Technology

» Examine
   - Code
   - Specifications
   - UI and UI Standards
   - Use of dropdowns
   - Field Validation
   - Talk with Users

» General BA Tools.

Thank You

For more information please contact:

Ian Woodrow

m +44 (0)7974 674045
[email protected]

Atos Origin (UK)
4 Triton Square
NW1 3HG, London


www.atosorigin.com