White Paper: Master Data Management

Product Information Management for HP Printing and Personal Systems

By

Atul Jena

Abhrajit Ghosh

Jagruti Dwibedi

Naveen Jindal School of Management

The University of Texas at Dallas

1. Executive Summary

2. Introduction

3. Liabilities of Bad Data

4. PIM Capabilities

5. PIM Architecture

6. PIM Implementation at HP

7. Data Governance

8. PIM Vendors

9. Conclusion

References


Executive Summary

Hewlett Packard is both a product and a service based company, and one thing noticed over the years of working there is the amount of time and money lost to poor-quality data. An organization like HP operates across multiple departments, managing vast amounts of data about its customers, products, suppliers, locations, and more. With multiple departments managing so much data, anomalies arise, and there is no single consolidated version of the truth about the business. It is an expensive problem.

Master Data Management (MDM) is a framework that aligns business processes to present master data to business users in a consistent and contextual manner. Presenting accurate data in this way helps business users make smarter, more economical decisions. Broadly, two domain-specific streams have emerged within MDM: Customer Data Integration (CDI) and Product Information Management (PIM).

This paper discusses Product Information Management for HP Printing and Personal Systems. From stating the liabilities of poor data quality to building a PIM architecture for product solutions, it highlights an end-to-end solution that merges and centralizes product information across the enterprise.

Disclaimer: This paper is a case study for HP PPS Global and is presented as a viewpoint on handling the data quality challenge. No internal product information of the company has been used.


Introduction

The product data quality challenge at HP is formidable because of the complexity of managing product information across numerous departments, hundreds to thousands of suppliers, and thousands to millions of individual product items. Poor data quality leads to inefficient internal processes and missed sales revenue. But as stated earlier, cleansing product data alone is not the answer: retailers, distributors, and manufacturers need a comprehensive solution that provides much more.

"With numerous manual data entry processes across multiple applications, product data errors are pervasive and result in purchase order discrepancies, longer lead times and inefficient use of human resources," said Andrew White, enterprise and supply chain management research director at technology consultancy Gartner.

To meet this challenge, HP needs a system that combines product information management with robust capabilities in data integration and governance. As a single repository for all product data distributed across all sales channels, the PIM should provide a cohesive, centralized platform for all channel commerce.

While everyone chases the customer-insight part of the equation (the 360° view of the customer), realizing the power and potential of product information (the single view of products) should be the goal, so that HP can recommend and promote the exact products its customers are likely to buy.


Liabilities of Bad Data

As a retailer, HP needs to know:

- All about its customers: their profiles, histories, preferences, and behaviors across all channels (web, mobile, social, call centers, in-store, customer service, etc.)

- All about its products: so that a personal insight into the things each customer is most likely to buy can be mapped

If product information is poorly managed, the business may become unsustainable in the market. The problems commonly faced with product information are:

- It is incomplete: shoppers are unsure and click away.

- It is out of date: updating each channel takes a long time.

- It is inconsistent: different images or descriptions appear in different channels.

- It is boring: it relies on generic data instead of on-brand descriptions, images, and video.

- Databases are inconsistent: for example, the mobile team has a different database from the web team and the store team.

- It takes ages to get to market: this causes "shelf lag" that eats up sales and margin.


PIM Capabilities

With a wide range of products, from printers to servers, HP leads the market in delivering the best experience through its products, and the enterprise stands to benefit greatly from the capabilities a PIM provides.

A PIM solution will allow HP to do the following:

- Locate and use appropriate data from heterogeneous sources.

- Access structured product data, which consists of such things as model name, product number, technical description, and feature set. (Unstructured data, such as warranty documents in PDF and product videos, is not easily modeled into a PIM repository.)

- Cleanse data and related content.

- Identify and create missing product information.

- Connect and transmit data.

- Unify and relate a single product instance to multiple types of content. By collecting, validating, and approving the product-related content, the PIM provides one synthesized representation, available for different purposes.

- Enable cross-media publishing of product catalogs.

- Distribute disparate product information from a single source.

- Enable multilingual catalog creation and deployment.

- Create personalized catalog views of the product information. Such a view contains only the product information that a specific user cares about.
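To make the structured-data and personalized-view capabilities concrete, here is a minimal Python sketch of a product record with structured attributes and a filtered catalog view. All field names and product values are illustrative assumptions, not HP's actual schema.

```python
from dataclasses import dataclass, field

# Hypothetical structured product record, as a PIM might model it.
@dataclass
class ProductRecord:
    model_name: str
    product_number: str
    technical_description: str
    features: dict = field(default_factory=dict)

def personalized_view(record: ProductRecord, wanted: set) -> dict:
    """Return only the attributes a specific user cares about."""
    full = {
        "model_name": record.model_name,
        "product_number": record.product_number,
        "technical_description": record.technical_description,
        **record.features,
    }
    return {k: v for k, v in full.items() if k in wanted}

printer = ProductRecord("OfficeJet Pro", "OJ-100", "All-in-one inkjet",
                        {"duplex": True, "wireless": True})
# A shopper-facing channel might only request the model name and one feature.
view = personalized_view(printer, {"model_name", "duplex"})
```

The same record can thus feed many channels, each receiving only the slice of product information it needs.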


PIM Architecture

There are four PIM solution architectures: External Reference, Registry,

Reconciliation Engine, and Transaction Hub.

External Reference (or Consolidation) is the low-end PIM solution architecture: a reference database that points to all data but does not actually contain any. It does not define, create, or manage a centralized platform where master data is integrated to create a "single version of the truth."

The Registry architecture consists of a registry of unique master entity identifiers. An entity resolution service identifies the master entity records, and the data-source links used to maintain the attributes are maintained by the data hub.

The Reconciliation Engine (or Coexistence) architecture is a step up from the Registry architecture. It harmonizes product master data across databases and acts as a central reference point. This architecture provides synchronization between itself and legacy systems; retailers will often implement it as an intermediate architecture (i.e., after they have outgrown the Registry architecture).

The Transaction Hub architecture stores the up-to-date product master data with its associated enriched attribute data. It also supports new and legacy transactional and analytical applications, and includes a business service and data integration layer. This architecture is well suited to companies that need to collect information, cleanse it, build it on the fly, and serve it to other destinations. Hence, it is an ideal solution for HP PPS.

The following figure illustrates the general PIM architecture. The PIM hub

contains the MDM Data Storage, the Validation Engine, the Workflow

Engine, References, and the Metadata. This information is made available

through the Security and Access Layer. The latter ensures that you present

content only to persons who are entitled to have it, even as you allow

authorized persons to modify that content.


Figure: The PIM Solution Architecture

The Enterprise Service Bus makes the information available both upstream and downstream using mechanisms such as publish/subscribe, web services, or batch FTP, allowing HP to collect the information or publish it to its consumers, whether they are supply chain, e-commerce, publishing, or stores.
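The publish/subscribe mechanism mentioned above can be sketched minimally as follows; the topic name and the consumer handlers are hypothetical placeholders for real ESB endpoints.

```python
from collections import defaultdict

# Minimal in-process publish/subscribe sketch: downstream consumers register
# handlers for a topic, and a published product update fans out to all of them.
class ServiceBus:
    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        for handler in self._subscribers[topic]:
            handler(message)

bus = ServiceBus()
received = []
# e-commerce and supply chain each subscribe to product updates
bus.subscribe("product.updated", lambda msg: received.append(("ecommerce", msg)))
bus.subscribe("product.updated", lambda msg: received.append(("supply_chain", msg)))
bus.publish("product.updated", {"product_number": "OJ-100", "price": 129.99})
# both consumers receive the same update from the single source
```

A production ESB would add durable queues, retries, and message formats, but the fan-out principle is the same.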


PIM Implementation at HP

HP PPS has broadly divided its products into two categories: Consumer (Pavilion, Envy, Omen, Deskjet, etc.) and Commercial (ProBook, EliteBook, Z Workstations, OfficeJet, etc.) units. Features of both series of units are modified over time; new products are introduced in each segment and outdated products are retired. Spare parts lists are maintained for all these units. Consumers constantly look for products online or seek tech support based on the information they see, so immaculate data presentation is an absolute need.

The MDM services should be robust enough to manage the master data, data quality, services such as authorization, the introduction of new products, and much more. A PIM allows the creation of a great deal of metadata, including descriptions of product categories, descriptions of the information that needs to be collected, the rules about that information, and the exceptions to those rules.

HP therefore needs a model robust enough to handle problems of duplication, wrong information, authorization, and so on. The following Transaction Hub model handles all of these.

Working Model of a Transaction Hub MDM Architecture

The MDM services component is composed of the following components

shown in the Figure:

Interface services: These services provide a consistent entry point to

invoke MDM services through a variety of technologies regardless of

how the service is called. In addition, the interface services have the


ability to accept multiple request message formats through support for pluggable parsers.

Lifecycle management services: Lifecycle management services

provide business and information services for all master data domains

such as customer, product, account or location to create, access and

manage master data held within the master data repository.

Data quality management services: The services in this group fall into two categories:

Data validation and cleansing services provide capabilities to specify and enforce data integrity rules.

Reconciliation services provide matching services, which check whether a new product is a duplicate of an existing product; conflict resolution services; and merge, collapse, and split services, which are used by data stewards to reconcile duplicates.
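As an illustration of what a matching service does, the sketch below normalizes a few identifying attributes into a match key and flags candidate duplicates. Real reconciliation engines use much richer probabilistic and fuzzy matching rules; this shows only the basic idea, and all record names are hypothetical.

```python
import re

def match_key(record: dict) -> str:
    """Build a normalized key from identifying attributes."""
    # lowercase the model name and strip spacing/punctuation
    name = re.sub(r"[^a-z0-9]", "", record.get("model_name", "").lower())
    # uppercase the product number and trim whitespace
    number = record.get("product_number", "").strip().upper()
    return f"{name}|{number}"

def find_duplicates(existing: list, candidate: dict) -> list:
    """Return existing records whose match key equals the candidate's."""
    key = match_key(candidate)
    return [r for r in existing if match_key(r) == key]

catalog = [{"model_name": "EliteBook 840", "product_number": "eb-840"}]
# Same product entered with different spacing and casing still matches.
dupes = find_duplicates(catalog, {"model_name": "Elite Book 840",
                                  "product_number": "EB-840 "})
```

A data steward would then review flagged candidates and invoke the merge or split services as appropriate.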

Master data event management services: The master data event

management services provide the ability to create business rules to react

to certain changes on master data and to trigger notification events.

Hierarchy and relationship management services: Hierarchy

services create and maintain hierarchies.

Authoring services: Authoring services are used to define or extend the

definition of master data entities, hierarchies, relationships and

groupings.

Base services: The base services component provides services in the

following four groups:

Privacy and security services implement authorization on four

different levels:

o Service level: determines who is allowed to use the service

o Entity level: determines who is allowed to read/write a particular

entity

o Attribute level: determines who can read/write which attribute

o Record level: determines who can update which particular

records

Audit logging services can write a complete history of all transactions and events, providing a full trace of what happened in the MDM system; this can also be used for problem determination or to comply with certain legal requirements.

The workflow services support collaborative authoring of master

data in processes like New Product Introduction and enable business

rules and delegation of tasks to external components.

Search services allow you to look up and retrieve master data.
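The four authorization levels listed under the privacy and security services could be sketched as a chain of checks, for example as follows; the permission structure and user names here are purely illustrative.

```python
# Hypothetical permission store: each user is granted access at the
# service, entity, attribute, and record levels.
PERMISSIONS = {
    "alice": {
        "services": {"update_product"},
        "entities": {"product"},
        "attributes": {"price", "description"},
        "records": {"OJ-100"},
    }
}

def authorized(user, service, entity, attribute, record_id):
    """Grant access only if every one of the four levels permits it."""
    p = PERMISSIONS.get(user)
    if p is None:
        return False
    return (service in p["services"]          # service level
            and entity in p["entities"]       # entity level
            and attribute in p["attributes"]  # attribute level
            and record_id in p["records"])    # record level

ok = authorized("alice", "update_product", "product", "price", "OJ-100")
denied = authorized("alice", "update_product", "product", "cost", "OJ-100")
```

A real MDM system would back this with roles and policies rather than a literal table, but the layered evaluation order is the point.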

Master data repository: The master data repository has the following

parts:

The metadata: This part of the repository has all relevant metadata

stored such as a description of the data model for the master data.

The master data: This part of the repository is where the master data is

physically stored.

The history data: The history data is a complete history on all the master

data entity changes in the repository. This enables point-in-time queries

against the MDM data.


The reference data: Here lookup tables such as country codes,

measurement units for products, marital status, and the like are stored.
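The point-in-time querying that the history data enables can be sketched as follows; timestamps are simple integers here purely for illustration.

```python
import bisect

# Sketch of history data: every change to a master record is stored with a
# timestamp, so a query can reconstruct the state as of any earlier time.
class History:
    def __init__(self):
        self._timestamps = []
        self._states = []

    def record_change(self, timestamp, state):
        # keep versions ordered by time
        i = bisect.bisect_right(self._timestamps, timestamp)
        self._timestamps.insert(i, timestamp)
        self._states.insert(i, state)

    def as_of(self, timestamp):
        """Return the latest state at or before the given time, or None."""
        i = bisect.bisect_right(self._timestamps, timestamp)
        return self._states[i - 1] if i else None

h = History()
h.record_change(1, {"price": 99})
h.record_change(5, {"price": 89})
# a point-in-time query at time 3 sees the version recorded at time 1
```

Real MDM repositories implement this with effective-dated rows in the database, but the query semantics are the same.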

How it works

Before MDM is applied, master data is scattered across applications. Thus, in the figure, the master data (for both HP consumer and commercial units) from the source application systems (1) has to be extracted, cleansed, standardized, de-duplicated, transformed, and loaded into the MDM system (2). These steps are performed in the Master Data Integration phase. For HP, once the MDM system built with the Transaction Hub MDM pattern is complete, all redundant copies of the master data in the source application systems can be deleted, as indicated by the white colour of the master data parts of the persistence layer. Furthermore, the source applications are "MDM enabled".

This means that whenever a transaction (3) is invoked on a source application system that affects both transactional data (for example, billing data for a Pavilion laptop) and master data, the master data portion of the transaction invokes a master data service of the MDM system for processing. Only the transactional part is processed locally.
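A minimal sketch of this routing, assuming a fixed set of master attributes (all attribute names are hypothetical):

```python
# Attributes owned by the MDM system; everything else is purely transactional.
MASTER_ATTRIBUTES = {"model_name", "product_number", "description"}

mdm_updates = []    # stands in for calls to MDM services
local_updates = []  # stands in for the source application's own processing

def process_transaction(changes: dict):
    """Split a transaction: master data goes to MDM, the rest stays local."""
    master_part = {k: v for k, v in changes.items() if k in MASTER_ATTRIBUTES}
    local_part = {k: v for k, v in changes.items() if k not in MASTER_ATTRIBUTES}
    if master_part:
        mdm_updates.append(master_part)   # invoke an MDM master data service
    if local_part:
        local_updates.append(local_part)  # process the transactional part locally

process_transaction({"description": "14-inch laptop", "billing_amount": 799})
# the description is routed to the MDM system; the billing amount stays local
```

This is the essence of an "MDM enabled" source application: it never writes master data to its own store.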

Customers/Consumers access applications (UI) which consume master data

by (4) invoking the MDM services to retrieve master data in a read-only

way.

An MDM UI (5) at the enterprise level is used to create and change master data; it can be part of an enterprise portal implementation, for example. The key imperative is that all changes to master data by any source system are performed only through services of the MDM system. This guarantees the required level of master data consistency at all times and enables customers to reach the correct, desired products.


Data Governance

Data Governance can be defined as the mechanism by which we ensure

that the right corporate data is available to the right people at the right

time in the right format with the right context through the right

channels.

With the unprecedented growth in the amount of Data in the recent

years, the data needs to be controlled and understood in order to process

it effectively in a secure manner. Data governance isn’t a definable

solution; rather, it’s a journey toward transparency—offering a clearer

understanding of what information you have, how to manage it, and

how it can be used to advance the enterprise.

A product information governance project may appear to be a daunting

effort when one begins to structure the data rules.

The best practice is to develop a data roadmap that provides a clear and precise understanding of the data and its use within HP. The roadmap should detail how data is required and submitted for use within the enterprise, account for the multiple uses of the data (purchasing, engineering, marketing, and maintenance), and specify the data elements and structure needed to accommodate each software system.

Benefits as a Result of Data Governance

There are many benefits of implementing an innovative data governance

and master data management system. Many of the basic benefits, both in

process and cost, are:

- Reduced inventory through identification of duplicate items

- Facilitation of inventory sharing and internal purchasing programs

- Reduced employee time spent searching for items

- Common spare-part usage strategies

- Reduced downtime in manufacturing equipment due to lack of information availability

- The ability to manage inventory using a just-in-time model

Data Governance supports both indirect and direct cost savings.

Businesses can begin to embrace the definition of operational data as an

asset of the corporation, ensuring improved data accuracy and

confidence of the data users.


PIM Vendors

It has been said that data outlasts applications. This means that an

organization’s business data survives the changing application landscape.

Technology advancements drive periodic application reengineering, but the

business products, suppliers, assets and customers remain.

The dominant PIM solutions are IBM InfoSphere, Oracle Product Hub, and

SAP NetWeaver. All three vendors, IBM, Oracle, and SAP, have been

involved with MDM for the past 10 years. They have reached their positions

of dominance through multiple acquisitions.

All three vendors offer a full MDM ecosystem, including data integration,

data quality, databases, messaging, and sometimes hardware.

Gartner's Magic Quadrant provides insight into the segment of the constantly evolving packaged MDM system market that focuses on managing product data to support supply chain management (SCM), CRM, and other customer-related strategies. It positions relevant technology providers on the basis of their Completeness of Vision relative to the market and their Ability to Execute on that vision.


Conclusion

PIM is Master Data Management applied to the product space. PIM is

enabled through business process improvements, organizational

improvements, and the alignment of multiple information technologies.

This paper has shown how HP could lose business because of poor data. Customer insight alone is not enough: as a retailer, HP relies heavily on its products, and in the product domain it is all about knowing the afflictions, meeting the challenges, and delivering on the promise of a personalized customer experience.

Retail business is all about staying ahead of the competition, and a PIM-integrated system will provide just that to HP by delivering a wonderful customer experience. As HP says, "If you are going to do something, make it matter."


References

- Informatica, Product Information Management: http://www.informatica.com/us/products/master-data-management/product-information-management/#fbid=sBHsv9WFkAk

- Christophe Marcant (Senior Specialist, Sapient), "Product Information Management: Definition, Purpose, and Offering"

- HP, "Data Governance: It is the data, stupid, govern it": http://h30507.www3.hp.com/t5/Journey-through-Enterprise-IT/Data-Governance-It-is-the-data-stupid-govern-it/ba-p/125983#.VIZrSTHF_d2

- HP document 4AA4-9093ENW: http://www8.hp.com/h20195/V2/GetDocument.aspx?docname=4AA4-9093ENW&cc=us&lc=en

- Jackie Roberts (VP, DATAFORGE), "Product Information Management (PIM) Data Governance"

- Gartner, Magic Quadrant reprint: http://www.gartner.com/technology/reprints.do?id=1-1QTLTLC&ct=140214&st=sb