Future proofing your IM investment; insure against business change

A member firm of Ernst & Young Global Limited Liability limited by a scheme approved under Professional Standards Legislation

Table of Contents

Today’s Reality

Data Warehouse Architecture Tiers

Where Change is Inevitable

Data Warehouse Design ‘Gurus’

Factoring in Change

Phase 0 - Signs of Life

Same Technique Regardless of Warehouse Maturity

SDLC/Waterfall versus Agile Approach

The Agile Architecture Approach

The Data Vault

Anchor Modelling

Defining a Common Organizational Model

Step 1

Step 2

Step 3

Phase 1 - The New Born

Remember

Phase 1+ - Growing up

Our Snapshot Recommendation

Today’s Reality

Why do people build Data Warehouses based only on how things are now, without the inherent ability to adapt or change?

Data Warehouse Architecture Tiers

The predominant data warehouse architecture built today is based around three key tiers:

(i) Get the data from the source systems & providers;

(ii) Combine or integrate the data, apply quality checks and calculate new metrics; and

(iii) Format the data to make reporting easy and efficient.
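As a rough illustration of how the three tiers hand data to one another, the sketch below chains one function per tier. The function names, the sample rows and the single quality rule are all assumptions made for illustration; a real warehouse would use dedicated loading and reporting tools.

```python
# Hypothetical three-tier flow: acquire -> integrate -> present.
# All names and sample rows are illustrative assumptions.

def acquire():
    """Tier (i): get raw rows from the source systems (hard-coded here)."""
    return [
        {"customer": "C1", "amount": "120.50"},
        {"customer": "C2", "amount": "n/a"},  # will fail the quality check
        {"customer": "C1", "amount": "30.00"},
    ]

def integrate(rows):
    """Tier (ii): apply a quality check and calculate a new metric."""
    totals = {}
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # reject rows whose amount is not numeric
        totals[row["customer"]] = totals.get(row["customer"], 0.0) + amount
    return totals

def present(totals):
    """Tier (iii): format the data so that reporting is easy."""
    return [f"{customer}: {total:.2f}" for customer, total in sorted(totals.items())]

report = present(integrate(acquire()))
print(report)
```

Because each tier only depends on the shape of the data handed to it, a change in one tier (for example, a replaced source system) can in principle be absorbed without rewriting the others.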

Where Change is Inevitable

Each of these three tiers is subject to change. For instance:

Tier 1. Old legacy systems are replaced with different or newer ones;

Tier 2. The business itself is constantly changing to improve and maximize its worth; and

Tier 3. Users of the data require more information and insight to make their decisions.

Data Warehouse Design ‘Gurus’

Typically, data warehouse design is based on the approach of one of three ‘gurus’: Bill Inmon (Inmon), Ralph Kimball (Kimball) or Dan Linstedt (Linstedt). The contents and design of tiers two and three can therefore differ depending on the approach chosen.

Each ‘guru’ has his followers and provokes passionate debate regarding the strengths and weaknesses of these approaches.

What is commonly overlooked, however, is that a data warehouse goes through phases of evolution and, depending on the phase, the development approach or methodology chosen has more impact on future proofing than the specific design construct chosen.

Factoring in Change

Ralph Hughes, Chief Systems Architect at Ceregenics, has published several books on the subject of agile data warehousing. We have found that this methodology is most effective for data warehouse development as it actively factors in change.

You must be able to adjust the design construct throughout the evolution of the data warehouse to handle the changing requirements, and this can be challenging.

A critical factor in any data warehouse design and build iteration is the ability to model the solution so that you can review and modify the design. Some data modelers assume that there can only be one “right” model. The extensive research behind Graeme Simsion’s book Data Modeling Theory and Practice (Simsion, 2007) concludes that there can be multiple models, each with relative merits when judged according to various factors, one of which is the ability to adapt to change. Assembling several candidate models facilitates the explicit evaluation of options and provides a critical communications asset, ensuring that everyone involved has the same picture in their head of what is being built and why.

Publications from renowned authors like Scott Ambler (Ambler, 2003) (Ambler, 2004) (Collier, 2011), John Giles (Giles, 2011) and Len Silverston (Silverston, 2009) (Silverston, 2012) cover the subject of Data Modelling using an Agile methodology.

We will now explain the design constructs used at specific stages during the evolution of a data warehouse and the specific techniques that insure your investment against inevitable future change to your underlying data structures.

Phase 0 - Signs of Life

A data warehouse starts life with individuals and groups within an organization collecting data and information to produce reports and make decisions.

Over time, some of these individuals connect up into localized workgroups or streams within a company. They compare what each is doing and adapt to generate a consistent approach. This goes on throughout a company, most often with each group unaware of what the others are doing.

This situation can exist even when a very mature data warehouse is operating.

Same Technique Regardless of Warehouse Maturity

The technique remains the same for both a new data warehouse and an existing, mature environment, even though the collective knowledge and experience might be at different levels.

You must:

1. Establish what is driving the effort;

2. Work out where the information is coming from and in what form;

3. Establish the dynamics, relationships and construction of the data;

4. Define what checks need to be applied to the incoming data;

5. Define if, where or how to store the data;

6. Define what value-add or derived information needs to be added;

7. Establish what form the output or report needs to be in; and

8. Build something and check with the owner if it’s okay.

SDLC/Waterfall versus Agile Approach

An SDLC or waterfall approach is likely to involve doing each of the above in sequence and in full, prior to moving on to the next step.

An agile approach would potentially cycle through the steps a number of times, refining and enhancing each time, based on feedback from the person who will be the eventual user of the work, known in agile terms as “the product owner”.

Both methods face the same issue, because each of the steps above can result in change when revisited at a future date:

How can I (or should I) design what I’m doing now to be useful in the future and require minimal effort if something changes later on?

The Agile Architecture Approach

The ideal Agile architecture approach when presented with new, unquantified requirements is to initially adopt the Kimball method.

Establish a source of the data;

Turn it into a star schema;

Prototype a report; and

Get feedback from the owner.
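The four steps above can be prototyped very quickly. The sketch below is one possible illustration using Python's built-in sqlite3 module: two dimension tables, one fact table and a prototype report query. The table names, column names and sample rows are assumptions for illustration only.

```python
import sqlite3

# In-memory database standing in for a prototype environment.
conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Dimensions: the types of data the report is sliced by.
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY, customer_name TEXT);
CREATE TABLE dim_product  (product_key  INTEGER PRIMARY KEY, product_name  TEXT);

-- Fact: the measures, keyed by the surrounding dimensions (the "star").
CREATE TABLE fact_sales (
    customer_key INTEGER REFERENCES dim_customer(customer_key),
    product_key  INTEGER REFERENCES dim_product(product_key),
    amount       REAL
);

INSERT INTO dim_customer VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO dim_product  VALUES (1, 'Widget'), (2, 'Gadget');
INSERT INTO fact_sales   VALUES (1, 1, 100.0), (1, 2, 50.0), (2, 1, 25.0);
""")

# Prototype report: total sales per customer, ready to show the owner.
report = conn.execute("""
    SELECT c.customer_name, SUM(f.amount)
    FROM fact_sales f
    JOIN dim_customer c ON c.customer_key = f.customer_key
    GROUP BY c.customer_name
    ORDER BY c.customer_name
""").fetchall()
print(report)
```

Feedback from the owner would then drive the next iteration, for example adding a date dimension or a new measure.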

This approach could be adopted for several iterations, reworking the schema and metrics. The same approach could be taken when the next stream or group is identified who need help generating the right reports.

Essentially, you are duplicating the design with a different context. Initially this approach works fine. However, as more information is introduced and more changes are made to the reporting requirements, it starts to require a lot of effort and rework, and can lead to confusion about which star schema design is current and which is old.

What does evolve from this process, however, is:

An understanding of the types of data required for the reporting (dimensions);

The dynamics or relationships between the data and the metrics or measures (facts); and

The firsthand experience with the quality of the data.

The Data Vault

Dan Linstedt came up with a construct called a Data Vault in 2000. The name is not truly representative, as it implies the data is all locked up (which it isn’t); for the modelling types among you, the construct is something akin to generalization.

It adopts the following basic pattern:

There are things (hubs);

These things have relationships with other things and even themselves (links); and

These things have their own attributes or qualities (satellites).

There are also advanced features such as point-in-time (PIT) tables, Bridges and User Groups.
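To make the hub, link and satellite pattern concrete, here is a heavily simplified sketch using Python's built-in sqlite3 module. The table and column names are assumptions, and details that a full Data Vault implementation would normally carry (such as record source columns) are deliberately left out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hubs: one row per business thing, identified by its business key.
CREATE TABLE hub_customer (customer_id INTEGER PRIMARY KEY, customer_no  TEXT UNIQUE);
CREATE TABLE hub_product  (product_id  INTEGER PRIMARY KEY, product_code TEXT UNIQUE);

-- Link: a relationship between hubs, holding keys only.
CREATE TABLE link_purchase (
    customer_id INTEGER REFERENCES hub_customer(customer_id),
    product_id  INTEGER REFERENCES hub_product(product_id)
);

-- Satellite: descriptive attributes of a hub, one row per load date.
CREATE TABLE sat_customer (
    customer_id INTEGER REFERENCES hub_customer(customer_id),
    load_date   TEXT,
    name        TEXT,
    PRIMARY KEY (customer_id, load_date)
);

INSERT INTO hub_customer  VALUES (1, 'CUST-001');
INSERT INTO hub_product   VALUES (1, 'PROD-A');
INSERT INTO link_purchase VALUES (1, 1);
INSERT INTO sat_customer  VALUES (1, '2016-01-01', 'Acme Pty'),
                                 (1, '2016-06-01', 'Acme Pty Ltd');
""")

# Latest known name for each customer: the hub row never changes,
# only new satellite rows are added as attributes change.
latest = conn.execute("""
    SELECT h.customer_no, s.name
    FROM hub_customer h
    JOIN sat_customer s ON s.customer_id = h.customer_id
    WHERE s.load_date = (SELECT MAX(load_date) FROM sat_customer
                         WHERE customer_id = h.customer_id)
""").fetchall()
print(latest)
```

Note that a change of customer name is recorded as an extra satellite row; neither the hub nor the link is touched.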

Anchor Modelling

A similar construct was implemented in Sweden in 2004 and subsequently published by Lars Rönnbäck from The Data Warehousing Institute in 2007.

“Anchor Modelling is an Agile information modelling technique that offers non-destructive extensibility mechanisms enabling robust and flexible management of changes. A key benefit of Anchor Modelling is that changes in a data warehouse environment only require extensions, not modifications.” (Rönnbäck, 2011)

Anchor Modelling adopts a sixth normal form (6NF) approach.

Anchor Modelling has the following pattern:

Things and events (anchors)

Properties of the things or events (attributes)

Relationships between things (ties)

Shared or common properties between things (knots).
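The four building blocks can be sketched as tables in the same way. The example below, again using Python's built-in sqlite3 module, is a minimal illustration only: the customer and policy subject matter, the naming suffixes and the omission of any history tracking are all assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Anchors: identity of a thing or event, nothing else.
CREATE TABLE customer_anchor (customer_id INTEGER PRIMARY KEY);
CREATE TABLE policy_anchor   (policy_id   INTEGER PRIMARY KEY);

-- Knot: a small set of shared values that several things can point to.
CREATE TABLE status_knot (status_id INTEGER PRIMARY KEY, status TEXT UNIQUE);

-- Attributes: one table per property of an anchor.
CREATE TABLE customer_name_attr (
    customer_id INTEGER PRIMARY KEY REFERENCES customer_anchor(customer_id),
    name        TEXT
);
CREATE TABLE policy_status_attr (
    policy_id INTEGER PRIMARY KEY REFERENCES policy_anchor(policy_id),
    status_id INTEGER REFERENCES status_knot(status_id)
);

-- Tie: a relationship between anchors.
CREATE TABLE customer_holds_policy_tie (
    customer_id INTEGER REFERENCES customer_anchor(customer_id),
    policy_id   INTEGER REFERENCES policy_anchor(policy_id),
    PRIMARY KEY (customer_id, policy_id)
);

INSERT INTO customer_anchor VALUES (1);
INSERT INTO policy_anchor   VALUES (10);
INSERT INTO status_knot     VALUES (1, 'Active'), (2, 'Lapsed');
INSERT INTO customer_name_attr VALUES (1, 'Acme');
INSERT INTO policy_status_attr VALUES (10, 1);
INSERT INTO customer_holds_policy_tie VALUES (1, 10);
""")

# Reassemble a readable row by joining tie, attributes and knot.
rows = conn.execute("""
    SELECT n.name, t.policy_id, k.status
    FROM customer_holds_policy_tie t
    JOIN customer_name_attr n ON n.customer_id = t.customer_id
    JOIN policy_status_attr p ON p.policy_id   = t.policy_id
    JOIN status_knot        k ON k.status_id   = p.status_id
""").fetchall()
print(rows)
```

Adding a new property later means adding a new attribute table, which is the "extensions, not modifications" behaviour described in the quotation above.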

Defining a Common Organizational Model

So with these basic building blocks, it is possible to model the business data from the bottom up and at the same time introduce some common generic patterns and themes. This approach is now more aligned with the Inmon philosophy. In fact, these building blocks provide the ideal platform to start defining a common model for the whole organization.

Here’s how.

Step 1

Establish what the various information types are: customer, product, order, policy, account, risk, asset, etc. These can be aligned to the dimensions created previously.

Step 2

Establish where any of these are used together, linked, have a dependency, or a cross-reference. The same pair of information types can have more than one of these links.
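Steps 1 and 2 can be captured in a very lightweight form before any modelling tool is involved. The sketch below is an illustrative assumption, not a prescribed notation: the information types and link names are hypothetical, and the point is simply that the same pair of types can carry more than one named link.

```python
from collections import defaultdict

# Step 1: the information types (hypothetical examples).
entity_types = {"customer", "product", "order", "account"}

# Step 2: named links between pairs of information types.
links = [
    ("customer", "order", "places"),
    ("order", "product", "contains"),
    ("customer", "account", "owns"),
    ("customer", "account", "is authorized signatory on"),
]

# Sanity check: every link must refer to a known information type.
for source, target, _name in links:
    if source not in entity_types or target not in entity_types:
        raise ValueError(f"unknown information type in link {source} -> {target}")

# Group link names by pair; one pair can hold several links.
links_by_pair = defaultdict(list)
for source, target, name in links:
    links_by_pair[(source, target)].append(name)

for pair, names in sorted(links_by_pair.items()):
    print(pair, names)
```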

Step 3

Take all this and create a conceptual model. Some basic examples are shown below illustrating different notations and different business subject domains.

Phase 1 - The New Born

Once you have accepted that change is inevitable and that the data warehouse will evolve, the focus should be on constructing your solution so that when change happens, the impact to existing reports, code and processes is minimal or non-existent.

The aim is to separate the relationships from the data and the data from the keys.

Traditional 3NF binds the keys and relationships in with all the data. Adding a relationship, changing a key or removing an existing relationship can each cause significant rebuilding of the underlying model, database, data loads and reports.

Using internal (or surrogate) keys to provide the unique reference instead allows greater flexibility in the design and minimizes or eliminates any impact when things change.
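The effect of a surrogate key can be shown with a small sketch using Python's built-in sqlite3 module. The scenario (a source system being replaced and renumbering its customers) and all table and column names are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,  -- surrogate key, internal and stable
    customer_no TEXT UNIQUE           -- business key from the source system
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id)
);
INSERT INTO customer VALUES (1, 'LEGACY-0042');
INSERT INTO customer_order VALUES (100, 1), (101, 1);
""")

# The source system is replaced and renumbers its customers. Only the
# business key column changes; orders reference the surrogate key, so
# no relationship has to be rebuilt.
conn.execute("UPDATE customer SET customer_no = 'CRM-9001' WHERE customer_id = 1")

orders = conn.execute("""
    SELECT o.order_id, c.customer_no
    FROM customer_order o
    JOIN customer c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()
print(orders)
```

Had the orders referenced the business key directly, every order row would have needed updating along with the customer.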

Enhancing the sample conceptual models shown previously, the logical model now begins to evolve.

Remember

Naming conventions are important when building out the model.

Typically, relationships should be prefixed with the same identifier. In the example above this is simply “Rel”.

The business entity attributes/details/properties are named the same as their related business entity, except with a common suffix. In the example above this is simply “Details”.

Not everyone needing to investigate this area of the data warehouse will have access to a modelling tool. Some might be using a database query tool, others reporting products. These will typically present the user with just a list of tables or objects.

Adopting naming standards makes it easier to identify the type of entity/table/object.
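With the “Rel” prefix and “Details” suffix in place, a plain list of table names is enough to work out what each object is. The sketch below shows one way this could be done; the table names are hypothetical, and the rule is deliberately naive (an entity whose own name happened to start with “Rel” would be misclassified).

```python
def classify(table_name):
    """Classify a table by the naming convention described above."""
    if table_name.startswith("Rel"):
        return "relationship"
    if table_name.endswith("Details"):
        return "details"
    return "entity"

# Hypothetical list of tables, as a query tool might present it.
tables = ["Customer", "CustomerDetails", "Policy", "PolicyDetails", "RelCustomerPolicy"]

by_kind = {}
for name in tables:
    by_kind.setdefault(classify(name), []).append(name)

for kind, names in by_kind.items():
    print(kind, names)
```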

Phase 1+ - Growing up

Adapting to change doesn’t necessarily involve major rework or adjustments.

By adopting the basic constructs explained above, changes to the model are generally additions.

Consequently, existing code and constructs will continue to work without requiring any remedial action.

Obviously, if existing code and constructs are to make use of the new changes, they will need to be enhanced.
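The additive nature of change can be demonstrated with a small sketch using Python's built-in sqlite3 module. An existing report query is run before and after a new entity and relationship are added; the table names follow the “Rel” and “Details” conventions, but the specific tables and data are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customer (customer_id INTEGER PRIMARY KEY);
CREATE TABLE CustomerDetails (
    customer_id INTEGER REFERENCES Customer(customer_id),
    name        TEXT
);
INSERT INTO Customer VALUES (1);
INSERT INTO CustomerDetails VALUES (1, 'Acme');
""")

# An existing report, written before the change.
EXISTING_REPORT = """
    SELECT c.customer_id, d.name
    FROM Customer c
    JOIN CustomerDetails d ON d.customer_id = c.customer_id
"""
before = conn.execute(EXISTING_REPORT).fetchall()

# The change: a new entity and a new relationship are added.
# No existing table is altered.
conn.executescript("""
CREATE TABLE Account (account_id INTEGER PRIMARY KEY);
CREATE TABLE RelCustomerAccount (
    customer_id INTEGER REFERENCES Customer(customer_id),
    account_id  INTEGER REFERENCES Account(account_id)
);
INSERT INTO Account VALUES (500);
INSERT INTO RelCustomerAccount VALUES (1, 500);
""")

# The existing report runs unchanged and returns the same rows.
after = conn.execute(EXISTING_REPORT).fetchall()
print(before == after)
```

Only reports that need account information would have to be enhanced to join through the new relationship table.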

Making consistent modifications to the model is critical. Keep to the naming standards to ensure ease of use.

During this phase, master and reference data sets begin to evolve. In Anchor Modelling, these are known as knots. Consistent and unambiguous definitions of entities and their associated attributes and relationships are very important.

The approach to enhancing the data warehouse remains the same, with requirements coming in from the user community; restrictions or mandates from the source systems; and the architects and those responsible for standards trying to consistently join the two ends of the data warehouse together: “Data In and Data Out”.

Our Snapshot Recommendation

In the simplest terms, adopt:

An agile delivery approach for report and data mart design. This will help ensure the ongoing retention of users and product owners;

A disciplined waterfall approach to the formal data acquisition process from source systems, to assist with reliable and accurate provision of data; and

A hybrid of the two approaches in the middle.

Following this approach will ensure minimal rework and downtime associated with the inevitable changes to the underlying data structure.

Bibliography

Ambler, S. (2003). Agile database techniques: effective strategies for the agile software developer. Wiley Publishing.

Ambler, S. (2004). The Object Primer: Agile Model Driven Development with UML 2. Cambridge University Press.

Collier, B. (2011, June 22). Agile Data Modeling: Evolving Toward Excellence. Retrieved from TDWI: http://tdwi.org/articles/2011/06/22/agile-data-modeling.aspx

Giles, J. (2011). The Nimble Elephant. Amazon.

Inmon, B. (n.d.). Retrieved from Bill Inmon - Corporate Information Factory: http://www.inmoncif.com/home/

Kimball, R. (n.d.). Retrieved from Kimball Group: http://www.kimballgroup.com/about-kimball-group/

Linstedt, D. (n.d.). Retrieved from Dan Linstedt - Data Vault: http://danlinstedt.com/about/data-vault-basics/

Rönnbäck, L. (2011, May). Anchor Modelling with Bi-Temporal Data. Retrieved from Anchor Modelling: http://www.anchormodeling.com/wp-content/uploads/2011/05/Anchor-Modeling-with-Bitemporal-Data.pdf

Silverston, L. (2009). The Data Model Resource Book: Universal Patterns for Data Modeling. John Wiley & Sons Inc.

Silverston, L. (2012, Feb). The Las Vegas 2012 Conference. Retrieved from TDWI World Conference Series: http://events.tdwi.org/events/las-vegas-world-conference-2012/Speakers/Speaker%20Window.aspx?SpeakerId=%7B75239155-BDF6-4082-8D0E-C5403A9E72BD%7D&ID=%7B1DF417B1-CDE2-4924-9752-BC0610A76F46%7D

Simsion, G. (2007). Data Modeling Theory and Practice. Technics Publications.

EY | Assurance | Tax | Transactions | Advisory

About EY

EY is a global leader in assurance, tax, transaction and advisory services. The insights and quality services we deliver help build trust and confidence in the capital markets and in economies the world over. We develop outstanding leaders who team to deliver on our promises to all of our stakeholders. In so doing, we play a critical role in building a better working world for our people, for our clients and for our communities.

EY refers to the global organization, and may refer to one or more, of the member firms of Ernst & Young Global Limited, each of which is a separate legal entity. Ernst & Young Global Limited, a UK company limited by guarantee, does not provide services to clients. For more information about our organization, please visit ey.com.

© 2016 Ernst & Young Australia. All Rights Reserved.

This communication provides general information which is current at the time of production. The information contained in this communication does not constitute advice and should not be relied on as such. Professional advice should be sought prior to any action being taken in reliance on any of the information. Ernst & Young disclaims all responsibility and liability (including, without limitation, for any direct or indirect or consequential costs, loss or damage or loss of profits) arising from anything done or omitted to be done by any party in reliance, whether wholly or partially, on any of the information. Any party that relies on the information does so at its own risk. Liability limited by a scheme approved under Professional Standards Legislation.

eyc3.com

ey.com/analytics

EYC3 creates intelligent client organizations using data & advanced analytics.

Our team of data scientists, analysts, developers, business consultants and industry experts work with clients at all stages of their information evolution.