Metadata Strategies - Data Squared


Transcript of Metadata Strategies - Data Squared

Page 1: Metadata Strategies - Data Squared

Metadata

Peter Aiken, Ph.D.

• 33+ years in data management
• Repeated international recognition
• Founder, Data Blueprint (datablueprint.com)
• Associate Professor of IS (vcu.edu)
• DAMA International (dama.org)
• 10 books and dozens of articles
• Experienced w/ 500+ data management practices
• Multi-year immersions:
– US DoD (DISA/Army/Marines/DLA)
– Nokia
– Deutsche Bank
– Wells Fargo
– Walmart
– …

• DAMA International President 2009-2013
• DAMA International Achievement Award 2001 (with Dr. E. F. "Ted" Codd)
• DAMA International Community Award 2005

Books:
• Monetizing Data Management: Unlocking the Value in Your Organization’s Most Important Asset. Peter Aiken with Juanita Billings, foreword by John Bottega
• The Case for the Chief Data Officer: Recasting the C-Suite to Leverage Your Most Valuable Asset. Peter Aiken and Michael Gorman

2Copyright 2017 by Data Blueprint Slide #

Page 2: Metadata Strategies - Data Squared

1. In the context of data management

2. What is it and why is it important?

3. Major types & subject areas

4. Benefits, application & sources

5. Implementation considerations

6. Guiding principles & building blocks

7. Specific teachable example

8. Take Aways, References and Q&A


Metadata

What is data management?

[Diagram: Sources; Data Engineering; Data Storage; Data Delivery; Uses and Reuses; Data Governance; Specialized Team Skills]

Understanding the current and future data needs of an enterprise and making that data effective and efficient in supporting business activities

Aiken, P., Allen, M. D., Parker, B., Mattia, A., "Measuring Data Management's Maturity: A Community's Self-Assessment," IEEE Computer (research feature, April 2007)

Data management practices connect data sources and uses in an organized and efficient manner
• Engineering
• Storage
• Delivery
• Governance

When executed, engineering, storage, and delivery implement governance

Note: this diagram does not depict data reuse well

Page 3: Metadata Strategies - Data Squared

What is data management?

[Diagram: Sources; Data Engineering; Data Storage; Data Delivery; Resources (optimized for reuse); Analytic Insight; Data Governance; Specialized Team Skills]

Data Management Practices Hierarchy

• Advanced Data Practices: MDM, Mining, Big Data, Analytics, Warehousing, SOA
• Foundational Data Practices: Data Management Strategy, Data Governance, Data Quality, Data Platform/Architecture, Data Operations
• Side labels on the diagram: Technologies, Capabilities

You can accomplish Advanced Data Practices without becoming proficient in the Foundational Data Practices; however, this will (with thanks to Tom DeMarco):
• Take longer
• Cost more
• Deliver less
• Present greater risk

Page 4: Metadata Strategies - Data Squared

DMM℠ Structure of 5 Integrated DM Practice Areas

• Data Management Strategy: Manage data coherently
• Data Governance: Manage data assets professionally
• Data Quality: Maintain fit-for-purpose data, efficiently and effectively
• Platform Architecture: Data architecture implementation
• Data Operations: Data life cycle management
• Supporting Processes: Organizational support

Data Management Strategy is often the weakest link

[Diagram: the same DMM structure (Data Management Strategy, Data Governance, Data Quality, Platform Architecture, Data Operations, Supporting Processes), annotated with the numbers 3, 3, 3, 3 and 1]

Page 5: Metadata Strategies - Data Squared

Data Management Body of Knowledge (DM BoK)

Data Management Body of Knowledge (DM BoK V2)

Page 6: Metadata Strategies - Data Squared

Metadata Management from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International


1. In the context of data management

2. What is it and why is it important?

3. Major types & subject areas

4. Benefits, application & sources

5. Implementation considerations

6. Guiding principles & building blocks

7. Specific teachable example

8. Take Aways, References and Q&A


Metadata

Page 7: Metadata Strategies - Data Squared

Meta data, Meta-data, or metadata
• In the history of language, whenever two words are pasted together to form a combined concept, a hyphen initially links them
• With the passage of time, the hyphen is lost. The argument can be made that that time has passed
• There is a trademark on the term "metadata," but it has not been enforced
• So, the term is "metadata"

Data About Data

Page 8: Metadata Strategies - Data Squared

Metadata Practices

[Diagram: Sources; Metadata Engineering; Metadata Storage; Metadata Delivery; Uses; Metadata Governance; Specialized Team Skills]

• If metadata is data then what technologies and techniques should we use to manage it?

• Data management technologies and techniques


• Your organization's networking group allocates the responsibility for knowing:

– All the devices permitted to logon to your network

– Locations of all permitted access points

• This responsibility belongs to a named individual(s)

Tracking network users and access points is metadata

Page 9: Metadata Strategies - Data Squared

The prefix meta-

1. Situated behind: metacarpus.

2. a. Later in time: metestrus. b. At a later stage of development: metanephros.

3. a. Change; transformation: metachromatism. b. Alternation: metagenesis.

4. a. Beyond; transcending; more comprehensive: metalinguistics. b. At a higher state of development: metazoan.

5. Having undergone metamorphosis: metasomatic.

6. a. Derivative or related chemical substance: metaprotein. b. Of or relating to one of three possible isomers of a benzene ring with two attached chemical groups, in which the carbon atoms with attached groups are separated by one unsubstituted carbon atom: meta-dibromobenzene.

Definition of the prefix meta- (emphasis added to sense 4; source: American Heritage English Dictionary © 1993 Houghton Mifflin).

Definitions
• Metadata is
– Everywhere in every data management activity and integral to all IT systems and applications.
– To data what data is to real life. Data reflects real life transactions, events, objects, relationships, etc. Metadata reflects data transactions, events, objects, relations, etc.
– The data that describe the structure and workings of an organization’s use of information, and which describe the systems it uses to manage that information. [quote from David Hay's book, page 4]
• Data describing various facets of a data asset, for the purpose of improving its usability throughout its life cycle [Gartner 2010]
• Metadata unlocks the value of data, and therefore requires management attention [Gartner 2011]
• Metadata Management is
– The set of processes that ensure proper creation, storage, integration, and control to support associated use of metadata

Page 10: Metadata Strategies - Data Squared

Analogy: a library card catalog
• Identifies
– What books are in the library, and
– Where they are located
• Search by
– Subject area
– Author, or
– Title
• Catalog shows
– Author
– Subject tags
– Publication date and
– Revision history
• Determine which books will meet the reader’s requirements
• Without the catalog, finding things is difficult, time consuming and frustrating

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 11: Metadata Strategies - Data Squared

Definition (continued)
• Metadata is the card catalog in a managed data environment
• Abstractly, Metadata is the descriptive tags or context on the data (the content) in a managed data environment
• Metadata shows business and technical users where to find information in data repositories
• Metadata provides details on where the data came from, how it got there, any transformations, and its level of quality
• Metadata provides assistance with what the data really means and how to interpret it

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
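The provenance details listed in this definition (where the data came from, how it got there, its transformations, and its level of quality) could be captured in a small record. What follows is a minimal sketch, not something taken from the DAMA Guide: the class name `LineageRecord`, its fields, the 0-to-1 quality scale, and the sample values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class LineageRecord:
    """Hypothetical metadata describing one data set's provenance."""
    dataset: str                      # where to find the data
    source: str                       # where the data came from
    transformations: list[str] = field(default_factory=list)  # how it got there
    quality_score: float = 0.0        # assumed scale: 0.0 (worst) to 1.0 (best)
    definition: str = ""              # what the data really means

    def describe(self) -> str:
        # Render the path from source, through each transformation, to the data set.
        steps = " -> ".join([self.source, *self.transformations, self.dataset])
        return f"{self.dataset}: {self.definition} [{steps}] quality={self.quality_score:.2f}"


# Illustrative values only; none of these names come from the slides.
record = LineageRecord(
    dataset="warehouse.customer",
    source="crm.contacts",
    transformations=["deduplicate", "standardize addresses"],
    quality_score=0.93,
    definition="One row per active customer",
)
print(record.describe())
```

A reader of such a record can answer "where did this come from and can I trust it?" without opening the data set itself.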

Page 12: Metadata Strategies - Data Squared

Defining Metadata

Metadata is any combination of any circle and the data in the center that unlocks the value of the data!

[Diagram: Data at the center, surrounded by circles labeled Who, What, When, Where, Why, and How]

Adapted from Brad Melton

Library Metadata Example

Libraries can operate efficiently through careful use of metadata (Card Catalog)
• Who: Author
• What: Title
• Where: Shelf Location
• When: Publication Date

A small amount of metadata (Card Catalog) unlocks the value of a large amount of data (the Library)

[Diagram: Library Book at the center, surrounded by circles labeled Who, What, When, Where, Why, and How]
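The card catalog idea can be sketched in a few lines of Python. This is an illustration only: the `CatalogCard` class, the `find` helper, and the sample entries are hypothetical, and the four fields simply mirror the Who/What/Where/When labels on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogCard:
    """One card: metadata about a book, not the book itself."""
    author: str        # Who
    title: str         # What
    shelf: str         # Where
    published: int     # When


# Hypothetical catalog entries, for illustration only.
CATALOG = [
    CatalogCard("Author A", "Data Modeling Basics", "QA76.1", 2001),
    CatalogCard("Author B", "Managing Data Quality", "QA76.3", 2009),
    CatalogCard("Author A", "Metadata in Practice", "QA76.9", 2012),
]


def find(catalog, **criteria):
    """Return the cards whose fields equal every given criterion."""
    return [
        card for card in catalog
        if all(getattr(card, name) == value for name, value in criteria.items())
    ]


# The reader locates books by searching the small catalog, not the shelves.
for card in find(CATALOG, author="Author A"):
    print(card.title, "->", card.shelf)
```

The search touches only the cards, which is the point of the analogy: a small, well-kept set of descriptions stands in for a walk through the whole library.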

Page 13: Metadata Strategies - Data Squared

Outlook Example

"Outlook" metadata is used to navigate/manage email
• Who: "To" & "From"
• What: "Subject"
• How: "Priority"
• Where: "USERID/Inbox", "USERID/Personal"
• Why: "Body"
• When: "Sent" & "Received"

• Find the important stuff/weed out junk
• Organize for future access/Outlook rules
• Imagine how managing e-mail (already non-trivial) would change if Outlook did not make use of metadata
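A mail rule of the kind mentioned above can be sketched as a filter that looks only at header metadata. This is not Outlook's API: the `Message` class, the `apply_rule` function, and the addresses are hypothetical stand-ins chosen to mirror the fields on the slide.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Message:
    sender: str          # Who ("From")
    subject: str         # What
    priority: str        # How
    folder: str          # Where
    received: datetime   # When
    body: str            # content; never read by the rule below


def apply_rule(messages, blocked_senders):
    """Move messages from blocked senders to Junk using header metadata only."""
    for msg in messages:
        if msg.sender in blocked_senders:
            msg.folder = "Junk"
    return messages


# Illustrative messages; addresses and subjects are made up.
inbox = [
    Message("boss@example.com", "Budget review", "High", "Inbox",
            datetime(2017, 3, 1, 9, 0), "..."),
    Message("promo@example.com", "You won!", "Low", "Inbox",
            datetime(2017, 3, 1, 9, 5), "..."),
]
apply_rule(inbox, blocked_senders={"promo@example.com"})
important = [m.subject for m in inbox if m.folder == "Inbox" and m.priority == "High"]
print(important)
```

Both the junk rule and the "important" view are computed without reading a single message body.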

Why Metadata Matters

• They know you rang a phone sex service at 2:24 am and spoke for 18 minutes. But they don't know what you talked about.

• They know you called the suicide prevention hotline from the Golden Gate Bridge. But the topic of the call remains a secret.

• They know you spoke with an HIV testing service, then your doctor, then your health insurance company in the same hour. But they don't know what was discussed.

• They know you received a call from the local NRA office while it was having a campaign against gun legislation, and then called your senators and congressional representatives immediately after. But the content of those calls remains safe from government intrusion.

• They know you called a gynecologist, spoke for a half hour, and then called the local Planned Parenthood's number later that day. But nobody knows what you spoke about.

– https://www.eff.org/deeplinks/2013/06/why-metadata-matters

Page 14: Metadata Strategies - Data Squared

1. In the context of data management

2. What is it and why is it important?

3. Major types & subject areas

4. Benefits, application & sources

5. Implementation considerations

6. Guiding principles & building blocks

7. Specific teachable example

8. Take Aways, References and Q&A


Metadata

Typically Managed Architectures
• Process Architecture
– Arrangement of inputs -> transformations = value -> outputs
– Typical elements: Functions, activities, workflow, events, cycles, products, procedures
• Systems Architecture
– Applications, software components, interfaces, projects
• Business Architecture
– Goals, strategies, roles, organizational structure, location(s)
• Security Architecture
– Arrangement of security controls in relation to IT Architecture
• Technical Architecture/Tarchitecture
– Relation of software capabilities/technology stack
– Structure of the technology infrastructure of an enterprise, solution or system
– Typical elements: Networks, hardware, software platforms, standards/protocols
• Data/Information Architecture
– Arrangement of data assets supporting organizational strategy
– Typical elements: specifications expressed as entities, relationships, attributes, definitions, values, vocabularies

Page 15: Metadata Strategies - Data Squared

Metadata Subject Areas (Subject Area: Components)

1) Business Analytics: Data definitions, reports, users, usage, performance
2) Business Architecture: Roles and organizations, goals and objectives
3) Business Definitions: Business terms and explanations for a particular concept, fact, or other item found in an organization
4) Business Rules: Standard calculations and derivation methods
5) Data Governance: Policies, standards, procedures, programs, roles, organizations, stewardship assignments
6) Data Integration: Sources, targets, transformations, lineage, ETL workflows, EAI, EII, migration/conversion
7) Data Quality: Defects, metrics, ratings
8) Document Content Management: Unstructured data, documents, taxonomies, ontologies, name sets, legal discovery, search engine indexes

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Metadata Subject Areas, continued (Subject Area: Components)

9) Information Technology Infrastructure: Platforms, networks, configurations, licenses
10) Conceptual Data Models: Entities, attributes, relationships and rules, business names and definitions
11) Logical Data Models: Files, tables, columns, views, business definitions, indexes, usage, performance, change management
12) Process Models: Functions, activities, roles, inputs/outputs, workflow, timing, stores
13) Systems Portfolio and IT Governance: Databases, applications, projects, and programs, integration roadmap, change management
14) Service-oriented Architecture (SOA) Information: Components, services, messages, master data
15) System Design and Development: Requirements, designs and test plans, impact
16) Systems Management: Data security, licenses, configuration, reliability, service levels

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 16: Metadata Strategies - Data Squared

[Figure: the Zachman framework, Version 3.0]

Classification names (columns):
• What: Inventory Sets
• How: Process Flows
• Where: Distribution Networks
• Who: Responsibility Assignments
• When: Timing Cycles
• Why: Motivation Intentions

Audience perspectives (rows) and model names:
• Executive Perspective (Business Context Planners): Scope Contexts (Scope Identification Lists)
• Business Mgmt Perspective (Business Concept Owners): Business Concepts (Business Definition Models)
• Architect Perspective (Business Logic Designers): System Logic (System Representation Models)
• Engineer Perspective (Business Physics Builders): Technology Physics (Technology Specification Models)
• Technician Perspective (Business Component Implementers): Tool Components (Tool Configuration Models)
• Enterprise Perspective (Users): Operations Instances (Implementations)

*Horizontal integration lines are shown for example purposes only and are not a complete set. Composite, integrative relationships connecting every cell horizontally potentially exist.

© 1987-2011 John A. Zachman, all rights reserved. Zachman® and Zachman International® are registered trademarks of John A. Zachman

1. In the context of data management

2. What is it and why is it important?

3. Major types & subject areas

4. Benefits, application & sources

5. Implementation considerations

6. Guiding principles & building blocks

7. Specific teachable example

8. Take Aways, References and Q&A


Metadata

Page 17: Metadata Strategies - Data Squared

7 Metadata Benefits

1. Increase the value of strategic information (e.g. data warehousing, CRM, SCM, etc.) by providing context for the data, thus aiding analysts in making more effective decisions.
2. Reduce training costs and lower the impact of staff turnover through thorough documentation of data context, history, and origin.
3. Reduce data-oriented research time by assisting business analysts in finding the information they need in a timely manner.
4. Improve communication by bridging the gap between business users and IT professionals, leveraging work done by other teams and increasing confidence in IT system data.
5. Increase speed of system development’s time-to-market by reducing system development life-cycle time.
6. Reduce risk of project failure through better impact analysis at various levels during change management.
7. Identify and reduce redundant data and processes, thereby reducing rework and use of redundant, out-of-date, or incorrect data.

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International
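Benefit 6, impact analysis, is one place where recorded metadata can be queried directly. The sketch below assumes lineage is stored as a simple "feeds" mapping and walks it breadth-first to find everything downstream of a changed item. The `FEEDS` mapping, the item names, and the `impacted_by` function are hypothetical and are not drawn from the DAMA Guide.

```python
from collections import deque

# Hypothetical lineage metadata: each key feeds the items listed as its value.
FEEDS = {
    "crm.contacts": ["staging.customer"],
    "staging.customer": ["warehouse.customer"],
    "warehouse.customer": ["report.churn", "report.revenue"],
    "erp.orders": ["report.revenue"],
}


def impacted_by(changed, feeds):
    """Return every downstream item reachable from `changed` (breadth-first)."""
    seen = set()
    queue = deque([changed])
    while queue:
        current = queue.popleft()
        for target in feeds.get(current, []):
            if target not in seen:   # guard against revisiting shared targets
                seen.add(target)
                queue.append(target)
    return sorted(seen)


print(impacted_by("staging.customer", FEEDS))
```

Before changing `staging.customer`, a team could run this query to see which reports would need retesting.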

Metadata for Unstructured Data
• Unstructured data
– Any data that is not in a database or data file, including documents or other media data
• Metadata describes both structured and unstructured data
• Metadata for unstructured data exists in many formats, responding to a variety of different requirements
• Examples of Metadata repositories describing unstructured data:
– Content management applications
– University websites
– Company intranet sites
– Data archives
– Electronic journals collections
– Community resource lists
• A common method for classifying Metadata in unstructured sources is to describe them as descriptive metadata, structural metadata, or administrative metadata

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 18: Metadata Strategies - Data Squared

Metadata for Unstructured Data: Examples
• Examples of descriptive metadata:
– Catalog information
– Thesauri keyword terms
• Examples of structural metadata
– Dublin Core
– Field structures
– Format (audio/visual, booklet)
– Thesauri keyword labels
– XML schemas
• Examples of administrative metadata
– Source(s)
– Integration/update schedule
– Access rights
– Page relationships (e.g. site navigational design)
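The three-way classification above (descriptive, structural, administrative) can be mirrored in a record that keeps each kind of metadata in its own group. This is a sketch under stated assumptions: the `DocumentMetadata` class, the `can_read` helper, and all keys and values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DocumentMetadata:
    """Hypothetical metadata record for one unstructured document."""
    descriptive: dict = field(default_factory=dict)     # catalog info, keywords
    structural: dict = field(default_factory=dict)      # format, field structure
    administrative: dict = field(default_factory=dict)  # source, access, schedule


# Illustrative values only.
doc = DocumentMetadata(
    descriptive={"title": "Annual report", "keywords": ["finance", "2016"]},
    structural={"format": "booklet", "schema": "report-v1.xsd"},
    administrative={"source": "Finance dept", "access": ["staff"],
                    "update_schedule": "yearly"},
)


def can_read(metadata, group):
    """Check access rights using administrative metadata only."""
    return group in metadata.administrative.get("access", [])


print(can_read(doc, "staff"), can_read(doc, "public"))
```

Keeping the groups separate lets each be maintained by a different role, for example librarians for descriptive terms and administrators for access rights.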

Envera Business Value

Page 19: Metadata Strategies - Data Squared

The Real Value of Metadata


Specific Example
• Four metadata sources:
1. Existing reference models (e.g., ADRM)
2. Conceptual model created two years ago
3. Existing systems (to be reverse engineered)
4. Enterprise data model

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 20: Metadata Strategies - Data Squared

1. In the context of data management

2. What is it and why is it important?

3. Major types & subject areas

4. Benefits, application & sources

5. Implementation considerations

6. Guiding principles & building blocks

7. Specific teachable example

8. Take Aways, References and Q&A


Metadata

Investing in Metadata?
• How can IT staff convince managers to plan, budget, and apply resources for metadata management?
• What is metadata and why is it important?
• What technologies are involved?
• Internet and intranet technologies are part of the answer and will get the immediate attention of management.

Page 21: Metadata Strategies - Data Squared

IBM's AD/Cycle Information Model

• Business Goals Model: Defines the mission of the enterprise, its long-range goals, and the business policies and assumptions that affect its operations.
• Business Rules Model: Records rules that govern the operation of the business and the Business Events that trigger execution of Business Processes.
• Enterprise Structure Model: Defines the scope of the enterprise to be modeled. Assigns a name to the model that serves to qualify each component of the model.
• Extension Support Model: Provides for tactical Information Model extensions to support special tool needs.
• Info Usage Model: Specifies which of the Entity-Relationship Model component instances are used by other Information Model components.
• Global Text Model: Supports recording of extended descriptive text for many of the Information Model components.
• DB2 Model: Refines the definition of a Relational Database design to a DB2-specific design.
• IMS Structures Model: Defines the component structures and elements and the application program views of an IMS Database.
• Flow Model: Specifies which of the Entity-Relationship Model component instances are passed between Process Model components.
• Applications Structure Model: Defines the overall scope of an automated Business Application, the components of the application and how they fit together.
• Data Structures Model: Defines the data structures and their elements used in an automated Business Application.
• Application Build Model: Defines the tools, parameters and environment required to build an automated Business Application.
• Derivations/Constraints Model: Records the rules for deriving legal values for instances of Entity-Relationship Model components, and for controlling the use or existence of E-R instances.
• Entity-Relationship Model: Defines the Business Entities, their properties (attributes) and the relationships they have with other Business Entities.
• Organization/Location Model: Records the organization structure and location definitions for use in describing the enterprise.
• Process Model: Defines Business Processes, their sub processes and components.
• Relational Database Model: Describes the components of a Relational Database design in terms common to all SAA relational DBMSs.
• Test Model: Identifies the various files (test procedures, test cases, etc.) affiliated with an automated Business Application for use in testing that application.
• Library Model: Records the existence of non-repository files and the role they play in defining and building an automated Business Application.
• Panel/Screen Model: Identifies the Panels and Screens and the fields they contain as elements used in an automated Business Application.
• Program Elements Model: Identifies the various pieces and elements of application program source that serve as input to the application build process.
• Value Domain Model: Defines the data characteristics and allowed values for information items.
• Strategy Model: Records business strategies to resolve problems, address goals, and take advantage of business opportunities. It also records the actions and steps to be taken.
• Resource/Problem Model: Identifies the problems and needs of the enterprise, the projects designed to address those needs, and the resources required.

[Diagram: the models listed above shown as connected components]

Roles and Responsibilities
• Suppliers:
– Data Stewards
– Data Architects
– Data Modelers
– Database Administrators
– Other Data Professionals
– Data Brokers
– Government and Industry Regulators
• Participants:
– Metadata Specialists
– Data Integration Architects
– Data Stewards
– Data Architects and Modelers
– Database Administrators
– Other DM Professionals
– Other IT Professionals
– DM Executives
– Business Users
• Consumers:
– Data Stewards
– Data Professionals
– Other IT Professionals
– Knowledge Workers
– Managers and Executives
– Customers and Collaborators
– Business Users

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 22: Metadata Strategies - Data Squared

Metadata …
• Isn't a noun
– One person's data is another's metadata
• Is more of a verb?
– Represents a use of existing facts, rather than a type of data itself
• A gerund
– A form that is derived from a verb but that functions as a noun
– e.g., "asking" in "do you mind my asking you?"
– Describes a use of data - not a type of data
• Describes the use of some attributes of data to understand or manage that same data from a different (usually higher) level of abstraction
• Value proposition
– Is this data worth including within the scope of our metadata practices?

Keep the proper focus
• Wrong question:
– Is this metadata?
• Right question:
– Should we include this data item within the scope of our metadata practices?

Page 23: Metadata Strategies - Data Squared

Definition of Bed

Entity: BED
Data Asset Type: Principal Data Entity
Purpose: This is a substructure within the Room substructure of the Facility Location. It contains information about beds within rooms.
Source: Maintenance Manual for File and Table Data (Software Version 3.0, Release 3.1)
Attributes: Bed.Description, Bed.Status, Bed.Sex.To.Be.Assigned, Bed.Reserve.Reason
Associations: >0-+ Room
Status: Validated

Purpose statement incorporates motivations

A purpose statement describing
– Why the organization is maintaining information about this business concept;
– Sources of information about it;
– A partial list of the attributes or characteristics of the entity; and
– Associations with other data items; this one is read as "One room contains zero or many beds."
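The BED definition can be held as a structured record so that tools can check it, not just people. The sketch below copies the field values from the slide; the `EntityDefinition` class, the dictionary form of the association, and the `is_complete` check are assumptions added for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class EntityDefinition:
    """Metadata describing a data entity, following the BED slide's fields."""
    name: str
    asset_type: str
    purpose: str
    source: str
    attributes: list
    associations: dict = field(default_factory=dict)  # other entity -> reading
    status: str = "Draft"


bed = EntityDefinition(
    name="BED",
    asset_type="Principal Data Entity",
    purpose=("This is a substructure within the Room substructure of the "
             "Facility Location. It contains information about beds within rooms."),
    source=("Maintenance Manual for File and Table Data "
            "(Software Version 3.0, Release 3.1)"),
    attributes=["Bed.Description", "Bed.Status",
                "Bed.Sex.To.Be.Assigned", "Bed.Reserve.Reason"],
    associations={"Room": "One room contains zero or many beds."},
    status="Validated",
)


def is_complete(entity):
    """Hypothetical completeness check: every descriptive field is filled in."""
    return all([entity.purpose, entity.source, entity.attributes, entity.associations])


print(bed.name, bed.status, is_complete(bed))
```

A check like `is_complete` is one way a team might decide when a definition is ready to move from draft to validated.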


Page 24: Metadata Strategies - Data Squared

Metadata Implementation Phases

For 1 manageable business problem!

Metadata Uses

[Bar charts; horizontal axis from 0 to 2500]

• Develop Workforce (29.9%)
• Administer Workforce (28.8%)
• Compensate Employees (23.7%)
• Monitor Workplace (8.1%)
• Define Business (4.4%)
• Target System (3.9%)
• EDI Manager (.9%)
• Target System Tools (.3%)

Administer Workforce
• Manage Positions (2%)
• Plan Careers (~5%)
• Administer Training (~5%)
• Plan Successions (~5%)
• Manage Competencies (20%)
• Recruit Workforce (62%)

Page 25: Metadata Strategies - Data Squared

Technology
• Metadata repositories
• Data modeling tools
• Database management systems
• Data integration tools
• Business intelligence tools
• System management tools
• Object modeling tools
• Process modeling tools
• Report generating tools
• Data quality tools
• Data development and administration tools
• Reference and master data management tools

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Whither the "data dictionary"?

• The classic "data dictionary" pretty much died by the early '90s
– ... they ever-so-kindly renamed "data dictionary" to "metadata repository" & then promptly went belly up. Ask pretty much any IBMer today if they've ever heard of AD/Cycle or RepositoryManager... guaranteed response will be a blank stare.
• My calculation says 5% survival rate from 1973 to 2003
– (Dave Eddy)

Page 26: Metadata Strategies - Data Squared

Build Your Own Metadata Repository


Sample Low-tech Repository
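The repository pictured on this slide is not reproduced in the transcript, so what follows is only one possible low-tech sketch, assuming Python's built-in sqlite3 module. The two-table layout and the column names are hypothetical; the sample rows reuse the Bed entity and two of its attributes from the earlier purpose-statement slide.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained; a file path
# would make the repository persistent.
conn = sqlite3.connect(":memory:")

# Hypothetical layout: one table for entities, one for their attributes.
conn.execute("""
    CREATE TABLE entity (
        name   TEXT PRIMARY KEY,
        status TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE attribute (
        entity_name TEXT NOT NULL REFERENCES entity(name),
        name        TEXT NOT NULL,
        PRIMARY KEY (entity_name, name)
    )""")

conn.execute("INSERT INTO entity VALUES (?, ?)", ("Bed", "Validated"))
conn.executemany(
    "INSERT INTO attribute VALUES (?, ?)",
    [("Bed", "Bed.Description"), ("Bed", "Bed.Status")],
)


def attributes_of(entity_name):
    """Return the attribute names recorded for one entity, sorted by name."""
    rows = conn.execute(
        "SELECT name FROM attribute WHERE entity_name = ? ORDER BY name",
        (entity_name,),
    )
    return [row[0] for row in rows]


print(attributes_of("Bed"))
```

A spreadsheet with the same two tabs would serve the same purpose; the point is that a useful repository can start from tools already on hand.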


Page 27: Metadata Strategies - Data Squared


FTI Metadata Model


Page 28: Metadata Strategies - Data Squared

1. In the context of data management

2. What is it and why is it important?

3. Major types & subject areas

4. Benefits, application & sources

5. Implementation considerations

6. Guiding principles & building blocks

7. Specific teachable example

8. Take Aways, References and Q&A


Metadata

Goals and Principles
• Provide organizational understanding of terms and usage
• Integrate metadata from diverse sources
• Provide easy, integrated access to metadata
• Ensure metadata quality and security


from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 29: Metadata Strategies - Data Squared

Activities
• Understand metadata requirements
• Define the metadata architecture
• Develop and maintain standards
• Implement a managed environment
• Create and maintain metadata
• Integrate metadata
• Manage metadata repository-like functionality
• Distribute and deliver metadata
• Query, report and analyze metadata


from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Activities: Metadata Standards Types
• Two major types:
– Industry or consensus standards
– International standards
• High level framework can show
– How standards are related
– How they rely on each other for context and usage


from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 30: Metadata Strategies - Data Squared

Primary Deliverables
• Metadata repository-like functionality
• Quality metadata
• Metadata analysis
• Data lineage
• Change impact analysis
• Metadata control procedures
• Metadata models and architecture
• Metadata management operational analysis
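Two of the deliverables above, data lineage and change impact analysis, can be read as walks over the same "feeds into" graph in opposite directions. The sketch below is illustrative only: the `LINEAGE` mapping and every item name in it are hypothetical, and a breadth-first traversal is just one simple way to do the walk.

```python
from collections import deque

# Hypothetical lineage metadata: each key feeds the items in its list.
LINEAGE = {
    "hr.employee": ["staging.employee"],
    "staging.employee": ["warehouse.dim_employee"],
    "warehouse.dim_employee": ["report.headcount", "report.turnover"],
    "report.headcount": [],
    "report.turnover": [],
}


def downstream(item, lineage):
    """Change impact: every item reachable from `item`, breadth-first."""
    seen = []
    queue = deque(lineage.get(item, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.append(node)
        queue.extend(lineage.get(node, []))
    return seen


def upstream(item, lineage):
    """Data lineage: every item that `item` is derived from."""
    inverted = {}
    for source, targets in lineage.items():
        for target in targets:
            inverted.setdefault(target, []).append(source)
    return downstream(item, inverted)


print(downstream("staging.employee", LINEAGE))
print(upstream("report.headcount", LINEAGE))
```

Asking "what breaks if this changes?" uses `downstream`; asking "where did this number come from?" uses `upstream` on the same metadata.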


from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

15 Guiding Principles
1. Establish and maintain a clear approach, policies, and goals for metadata management and usage
2. Secure sustained commitment, funding, and vocal support from senior management
3. Take an enterprise perspective to ensure future extensibility, but implement through iterative and incremental delivery
4. Develop your people and process before evaluating, purchasing, and installing technologies
5. Create or adopt standards to ensure enterprise interoperability
6. Ensure effective acquisition approaches for both internal and external metadata
7. Maximize user access since a solution that is not accessed or is under-accessed will not show business value


from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 31: Metadata Strategies - Data Squared

15 Guiding Principles, continued
8. Understand and communicate the necessity of Metadata and the purpose of each type of metadata; socialization of the value of Metadata will encourage business usage
9. Measure content and usage
10. Leverage XML, messaging and web services
11. Establish and maintain enterprise-wide business involvement in data stewardship, assigning accountability for metadata
12. Define and monitor procedures and processes to ensure correct policy implementation
13. Include a focus on roles, staffing, standards, procedures, training, & metrics
14. Provide dedicated Metadata experts to the project and beyond
15. Certify Metadata quality


from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

1. In the context of data management

2. What is it and why is it important?

3. Major types & subject areas

4. Benefits, application & sources

5. Implementation considerations

6. Guiding principles & building blocks

7. Specific teachable example

8. Take Aways, References and Q&A


Metadata

Page 32: Metadata Strategies - Data Squared

Example: iTunes Metadata
• Example:
– iTunes Metadata
• Insert a recently purchased CD
• iTunes can:
– Count the number of tracks (25)
– Determine the length of each track


• When connected to the Internet iTunes connects to the Gracenote(.com) Media Database and retrieves:
– CD Name
– Artist
– Track Names
– Genre
– Artwork

• Sure would be a pain to type in all this information


Page 33: Metadata Strategies - Data Squared

• To organize iTunes
– I create a "New Smart Playlist" with the rule: Artist contains "Miles Davis"


• Notice I didn't get the desired results
• I already had another Miles Davis recording in iTunes
• Must fine-tune the request to get the desired results
– Album contains "The complete birth of the cool"
• Now I can move the playlist "Miles Davis" to a folder
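The smart playlist steps above amount to filtering a library by rules over its metadata. Here is a rough sketch of that idea, not of how iTunes is implemented: the `Track` fields, the helper names, and the three sample tracks (including the choice of the second album) are assumptions, and the match is assumed to ignore case.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    title: str
    artist: str
    album: str


def contains(field_name, text):
    """Build a rule: the named metadata field contains `text`, ignoring case."""
    needle = text.lower()
    return lambda track: needle in getattr(track, field_name).lower()


def smart_playlist(library, rules):
    """Keep the tracks that satisfy every rule."""
    return [track for track in library if all(rule(track) for rule in rules)]


# Hypothetical library: one new CD plus an older Miles Davis recording.
library = [
    Track("Jeru", "Miles Davis", "The Complete Birth of the Cool"),
    Track("Move", "Miles Davis", "The Complete Birth of the Cool"),
    Track("So What", "Miles Davis", "Kind of Blue"),
]

# First attempt: the artist rule alone also matches the older recording.
broad = smart_playlist(library, [contains("artist", "Miles Davis")])

# Fine-tuned request: add the album rule to get only the new CD.
narrow = smart_playlist(library, [
    contains("artist", "Miles Davis"),
    contains("album", "The complete birth of the cool"),
])

print(len(broad), [track.title for track in narrow])
```

Because the rules only look at metadata fields, the same filter would work for any item type that carries those fields.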


Page 34: Metadata Strategies - Data Squared

• The same:
– Interface
– Processing
– Data Structures
• are applied to
– Podcasts
– Movies
– Books
– .pdf files
• Economies of scale are enormous


1. In the context of data management

2. What is it and why is it important?

3. Major types & subject areas

4. Benefits, application & sources

5. Implementation considerations

6. Guiding principles & building blocks

7. Specific teachable example

8. Take Aways, References and Q&A


Metadata

Page 35: Metadata Strategies - Data Squared

Metadata Take Aways

• 'Data about data'

• Metadata unlocks the value of data, and therefore requires management attention [Gartner 2011]

• Metadata is less about what and more about how

• Metadata is the language of data governance

• Metadata defines the essence of integration challenges


Uses

Sources

Metadata Governance

Metadata Engineering

Metadata Delivery

Metadata Practices

Metadata Storage

Specialized Team Skills

References & Recommended Reading

from The DAMA Guide to the Data Management Body of Knowledge © 2009 by DAMA International

Page 36: Metadata Strategies - Data Squared


Page 37: Metadata Strategies - Data Squared


Questions?

It’s your turn! Use the chat feature or Twitter (#dataed) to submit your questions to Peter now.



Page 38: Metadata Strategies - Data Squared

Upcoming Events

Your Data Strategy January 9, 2018 @ 2:00 PM ET/11:00 AM PT

Sign up here: www.datablueprint.com/webinar-schedule or www.dataversity.net
