Datawarehousing

Data Warehousing Hennie de


Transcript of Datawarehousing

Page 1: Datawarehousing

Data Warehousing

Hennie de Nooijer

Page 2: Datawarehousing

Let’s dig in

Position

Definition

Expert debate

Architecture

Methodology

Technology

Trends

Data Warehousing

Position

Page 3: Datawarehousing

Information provisioning

Page 4: Datawarehousing

Controlled information provisioning

DWH

Information provisioning

Page 5: Datawarehousing

Business Intelligence

Data warehouse

ETL

RDBMS

Hardware

Page 6: Datawarehousing

Position

Definition

Expert debate

Architecture

Methodology

Technology

Trends

Data Warehousing

Definition

Page 7: Datawarehousing

A data warehouse is a repository of an organization's electronically stored data. Data warehouses are designed to facilitate reporting and analysis. (Inmon, W.H. Tech Topic: What is a Data Warehouse? Prism Solutions. Volume 1. 1995.) Source: en.wikipedia.org/wiki/Data_warehouse

A collection of data, from a variety of sources, organized to provide useful guidance to an organization's decision makers. Source: en.wiktionary.org/wiki/data_warehouse

An information repository from which queries and analysis may be made. Source: www.pcai.com/web/glossary/pcai_d_f_glossary.html

A separate database that is designed for reporting and querying. The data in a warehouse is derived from the data in the transaction database (Banner database) and can also include data from other sources. ... Source: www.wellesley.edu/EAIGroup/Glossary.html

A database for query and analysis, as opposed to a database for processing transactions. Separating the two functions improves flexibility and performance. Source: www.atlab.com/index.php/LIMS-Glossary-Terms-A-E.html

A computer based-information system that is home for "secondhand" data that originated from either another application or from an external system or source. A data warehouse is a read-only, integrated database designed to answer comparative and "what-if" scenarios. ... Source: www.mnhs.org/preserve/records/recordsguidelines/guidelinesglossary.html

A data warehouse is, simply put, a central place where data is stored at record or summary level for the purpose of analysis and reporting. Source: www.idph.state.ia.us/adper/data_warehouse_terms.asp

A system for storing and delivering massive quantities of data. Source: www.cs.ualberta.ca/~zaiane/courses/cmput690/glossary.html

Data Warehouse is a database specifically designed to contain historic snapshots of various operational system data, normally in an aggregated form which is used by data analysts and other end users for analyzing, reporting, tracking, and supporting strategic decisions. ... Source: www.acf.hhs.gov/programs/cb/systems/sacwis/glossary.htm

Maestro's Data Warehouse stores and manages recipient profiles and target groups stored within LISTSERV Maestro. Source: www.lsoft.com/manuals/Maestro/2.1/Admin/WebHelp/Glossary_of_Terms.htm

A collection of data pulled together primarily from operational systems and specifically structured and tuned for easy access and use for query, reporting and analysis purposes. ... Source: ais.its.psu.edu/services/bi/glossary.asp

A data warehouse combines data from multiple and varied sources into one comprehensive and easily manipulated database. It does not replace existing systems, but draws information from the systems that are currently in place and facilitates reporting and analysis of this data. Source: as.exeter.ac.uk/projects/theprojectsteam/ourcurrentprojects/bi/glossary/

A data warehouse is a database geared towards the business intelligence requirements of an organisation. The data warehouse integrates data from the various operational systems and is typically loaded from these systems at regular intervals. ... Source: www.oranz.co.uk/glossary_text.htm

A repository of well-organized corporate data for Business Analysis and Reporting. It is also a collection of data marts. Source: dotnetslackers.com/articles/sql/introduction-to-business-intelligence-important-terms-and-definitions.aspx

A data collection -- prepackaged or summarized according to specific business rules and designed to support management decision making. Data warehouses contain a wide variety of data that present a coherent picture of business information. Source: www.indiana.edu/~iuie/IUIE_HELP/ie_help_glossary.html

What’s a Data Warehouse?

Page 8: Datawarehousing

A Data Warehouse is a subject-oriented, integrated, time-variant, non-updatable collection of data used in support of decision-making processes

Page 9: Datawarehousing

Subject-oriented

Page 10: Datawarehousing

Integrated

Page 11: Datawarehousing

Time-variant

Page 12: Datawarehousing

Non-updatable

Page 13: Datawarehousing

Data Warehousing

Position

Definition

Expert debate

Architecture

Methodology

Technology

Trends

Expert debate

Page 14: Datawarehousing

Bill H. Inmon

Architecture (CIF)

Enterprise

Top down

DWH 2.0

Page 15: Datawarehousing

Ralph Kimball

Dimensional modeling

Business subject focus

Bottom up

Data bus

Page 16: Datawarehousing

Dan Linstedt

Data modeling

All data, all the time

Method of design

Data Vault

Page 17: Datawarehousing

Data Warehousing

Position

Definition

Expert debate

Architecture

Methodology

Technology

Trends

Architecture (Latin architectura, from the Greek ἀρχιτέκτων – arkhitekton, from ἀρχι- "chief" and τέκτων "builder, carpenter") can mean:

The art and science of designing and erecting buildings and other physical structures.

The practice of an architect, where architecture means to offer or render professional services in connection with the design and construction of a building, or group of buildings and the space within the site surrounding the buildings, that have as their principal purpose human occupancy or use.[1]

A general term to describe buildings and other structures.

A style and method of design and construction of buildings and other physical structures.

A wider definition may comprise all design activity, from the macro-level (urban design, landscape architecture) to the micro-level (construction details and furniture). Architecture is both the process and product of planning, designing and constructing form, space and ambience that reflect functional, technical, social, and aesthetic considerations. It requires the creative manipulation and coordination of material, technology, light and shadow. Architecture also encompasses the pragmatic aspects of realising buildings and structures, including scheduling, cost estimating and construction administration. As documentation produced by architects, typically drawings, plans and technical specifications, architecture defines the structure and/or behavior of a building or any other kind of system that is to be or has been constructed.

Architectural works are often perceived as cultural and political symbols and as works of art. Historical civilizations are often identified with their surviving architectural achievements.

Architecture sometimes refers to the activity of designing any kind of system and the term is common in the information technology world.

Architecture

Page 18: Datawarehousing

Report

Analyses

Trends

Forecasting

Dashboard

BSC

Mining

Integration

Storage

Presentation

Architecture

Sources

Data Warehouse

Information Management

Page 19: Datawarehousing

DWH

Conventional architecture

TRANSFORM

Integration Storage Presentation

Business Information Model

Current Business Demands/Wishes

STAGE

Page 20: Datawarehousing

Business Information Model

Delivery conditions

Supplier

Material type

Warehouse material requirement

Order

Delivery

Warehouse

Workday size

Workday

Size

Is placed under / concerns

Is willing to supply / can be supplied by

Receives / is placed with

Obliges to / is the fulfilment of

on

has

Consists of / is contained in

Consists of / occurs in

Concerns the willingness to supply to a / can be supplied in accordance with

Provides for / is provided for by

Is received by / receives

Consists of

Occurs in

with

Must be provided for

Page 21: Datawarehousing

STAGE

source

DWH

business

DWH

Modern architecture

TRANSFORM

Integration Storage Presentation Storage

Current Business Information Model

Current Business Demands/Wishes

ALL DATA, ALL THE TIME

Page 22: Datawarehousing

Data Warehousing

Position

Definition

Expert debate

Architecture

Methodology

Technology

Trends

A methodology is instantiated and materialized by a set of methods, techniques and tools. A tool is any instrument or apparatus that is necessary to the performance of some task. A methodology does not describe specific methods; nevertheless it does specify several processes that need to be followed. These processes constitute a generic framework. They may be broken down in sub-processes, they may be combined, or their sequence may change. However any task exercise must carry out these processes in one form or another.[3]

Methodology may be a description of process, or may be expanded to include a philosophically coherent collection of theories, concepts or ideas as they relate to a particular discipline or field of inquiry.

Methodology may refer to nothing more than a simple set of methods or procedures, or it may refer to the rationale and the philosophical assumptions that underlie a particular study relative to the scientific method. For example, scholarly literature often includes a section on the methodology of the researchers.

Methodology

Page 23: Datawarehousing

Historic

Correct

Storage

Page 24: Datawarehousing

CURRENT DATA NEW DATA

Minor

Major

Venn diagram

DELETED / ARCHIVED

NEW

UNCHANGED
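The regions of this Venn diagram can be computed by comparing the business keys of the current data with those of the newly delivered data. The following is a minimal sketch under stated assumptions: the function name `classify_delta`, the key and attribute names, and the sample rows are all hypothetical, and the slide's further split of changes into "minor" and "major" is not modelled because the deck does not define it.

```python
from typing import Dict, Set

Row = Dict[str, str]


def classify_delta(current: Dict[str, Row], new: Dict[str, Row]) -> Dict[str, Set[str]]:
    """Assign every business key to one region of the current/new Venn diagram."""
    current_keys, new_keys = set(current), set(new)
    shared = current_keys & new_keys
    return {
        # Only in the new delivery.
        "new": new_keys - current_keys,
        # Only in the current data: deleted or archived at the source.
        "deleted_or_archived": current_keys - new_keys,
        # In both, with identical attribute values.
        "unchanged": {key for key in shared if current[key] == new[key]},
        # In both, but at least one attribute differs.
        "changed": {key for key in shared if current[key] != new[key]},
    }


# Hypothetical sample data, keyed by product id.
current = {
    "P1": {"description": "Bike"},
    "P2": {"description": "Helmet"},
    "P3": {"description": "Lock"},
}
new = {
    "P1": {"description": "Bike"},
    "P2": {"description": "Helmet XL"},
    "P4": {"description": "Pump"},
}

delta = classify_delta(current, new)
```

Each region would then typically drive a different load action: insert for new keys, a history mechanism for changed keys, and an end-dating or archiving step for keys that disappeared.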

Page 25: Datawarehousing

Mechanisms:

Type 0: static data

Type I: no history-correct storage (overwrite)

Type II: history-correct storage (versioning)

Type III: semi-history-correct storage, using extra fields

Type IV: history-correct storage using historic tables

Type V: ??????

Type VI: Type I + II + III + “current” flag
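To make the difference between overwriting and keeping partial history concrete, here is a small sketch of Type I and Type III applied to a single row. The helper names `apply_type1` and `apply_type3`, the `previous_` field prefix, and the customer row are illustrative assumptions, not something the slides prescribe.

```python
def apply_type1(row: dict, attribute: str, new_value: str) -> dict:
    """Type I: overwrite the attribute; the old value is lost."""
    updated = dict(row)
    updated[attribute] = new_value
    return updated


def apply_type3(row: dict, attribute: str, new_value: str) -> dict:
    """Type III: keep exactly one previous value in an extra field."""
    updated = dict(row)
    updated[f"previous_{attribute}"] = row[attribute]
    updated[attribute] = new_value
    return updated


# Hypothetical dimension row.
customer = {"customer_id": "C1", "city": "Utrecht"}

type1_row = apply_type1(customer, "city", "Amsterdam")
type3_row = apply_type3(customer, "city", "Amsterdam")
```

With Type I the row keeps its shape and simply shows the latest value. With Type III the row grows by one field, so only the most recent change can be reconstructed.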

Page 26: Datawarehousing

Type 2

Update

Insert
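The update-plus-insert pattern of Type 2 can be sketched as follows: the update closes the currently valid version, and the insert adds the new version. This is a minimal in-memory illustration; the column names (`valid_from`, `valid_to`), the open-ended date convention, and the function name `apply_type2` are assumptions made for the example.

```python
from datetime import date
from typing import List

OPEN_END = date(9999, 12, 31)  # assumed convention for "still valid"


def apply_type2(dimension: List[dict], business_key: str, description: str, load_date: date) -> None:
    """Close the current version (update) and add a new one (insert)."""
    current = next(
        (row for row in dimension
         if row["business_key"] == business_key and row["valid_to"] == OPEN_END),
        None,
    )
    if current is not None:
        if current["description"] == description:
            return  # nothing changed, so no new version is needed
        current["valid_to"] = load_date  # UPDATE: end-date the old version
    dimension.append({  # INSERT: the new version becomes the open one
        "business_key": business_key,
        "description": description,
        "valid_from": load_date,
        "valid_to": OPEN_END,
    })


dimension: List[dict] = []
apply_type2(dimension, "PG1", "Bikes", date(2024, 1, 1))
apply_type2(dimension, "PG1", "Bicycles", date(2024, 6, 1))
```

After the second call the dimension holds two rows for `PG1`: the first is closed on the load date of the change, the second is open-ended.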

Page 27: Datawarehousing

Type 2

Artificial key
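Because Type 2 stores several versions per business key, each version needs its own artificial (surrogate) key, and facts refer to the version that was valid on their own date. The sketch below illustrates that idea; the names `add_version`, `lookup_key`, `product_key`, and the half-open validity interval are assumptions chosen for the example.

```python
from datetime import date
from typing import List, Optional

OPEN_END = date(9999, 12, 31)  # assumed convention for "still valid"


def add_version(dimension: List[dict], product_id: str, description: str, valid_from: date) -> int:
    """Close the open version of the product and add a new one with a fresh artificial key."""
    for row in dimension:
        if row["product_id"] == product_id and row["valid_to"] == OPEN_END:
            row["valid_to"] = valid_from
    # The artificial key is a meaningless sequence number, unique per version.
    product_key = max((row["product_key"] for row in dimension), default=0) + 1
    dimension.append({
        "product_key": product_key,
        "product_id": product_id,
        "description": description,
        "valid_from": valid_from,
        "valid_to": OPEN_END,
    })
    return product_key


def lookup_key(dimension: List[dict], product_id: str, on_date: date) -> Optional[int]:
    """Find the artificial key of the version valid on the given date."""
    for row in dimension:
        if row["product_id"] == product_id and row["valid_from"] <= on_date < row["valid_to"]:
            return row["product_key"]
    return None


dimension: List[dict] = []
add_version(dimension, "P1", "Bike", date(2024, 1, 1))
add_version(dimension, "P1", "City bike", date(2024, 6, 1))

key_for_march_sale = lookup_key(dimension, "P1", date(2024, 3, 15))
key_for_july_sale = lookup_key(dimension, "P1", date(2024, 7, 1))
```

A sale in March resolves to the first version and a sale in July to the second, so reports on old facts keep showing the description that applied at the time.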

Page 28: Datawarehousing

Type 2 on type 2 on …

PRODUCT

PRODUCT GROUP

PRODUCT LINE

Product group description has changed
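One way to read this slide: when versioned tables reference each other through artificial keys, a change in a parent table ripples down to every child. The sketch below assumes that each product row stores the artificial key of its product group; that direction of reference, the column names, and the sample rows are assumptions for illustration only.

```python
from typing import List


def change_group_description(groups: List[dict], products: List[dict],
                             group_id: str, new_description: str) -> int:
    """Version the product group and every product pointing at it; return rows inserted."""
    old_group = next(g for g in groups if g["group_id"] == group_id and g["is_current"])
    old_group["is_current"] = False
    new_group_key = max(g["group_key"] for g in groups) + 1
    groups.append({"group_key": new_group_key, "group_id": group_id,
                   "description": new_description, "is_current": True})
    inserted = 1

    # Every current product that referenced the old group version needs a
    # new version of its own, only to point at the new group key.
    affected = [p for p in products
                if p["is_current"] and p["group_key"] == old_group["group_key"]]
    for product in affected:
        product["is_current"] = False
        products.append({
            "product_key": max(p["product_key"] for p in products) + 1,
            "product_id": product["product_id"],
            "description": product["description"],
            "group_key": new_group_key,
            "is_current": True,
        })
        inserted += 1
    return inserted


groups = [
    {"group_key": 1, "group_id": "G1", "description": "Bikes", "is_current": True},
    {"group_key": 2, "group_id": "G2", "description": "Accessories", "is_current": True},
]
products = [
    {"product_key": 1, "product_id": "P1", "description": "City bike", "group_key": 1, "is_current": True},
    {"product_key": 2, "product_id": "P2", "description": "Race bike", "group_key": 1, "is_current": True},
    {"product_key": 3, "product_id": "P3", "description": "Helmet", "group_key": 2, "is_current": True},
]

rows_inserted = change_group_description(groups, products, "G1", "Bicycles")
```

A single changed description produces one new group row plus one new row for every product in that group, even though nothing about the products themselves changed.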

Page 29: Datawarehousing

Product Group Hub

ProductGroup Id

Product Group Sat

Description

Product Line Hub

ProductLine Id

Product Line Sat

Description

Product Hub

Product Id

Product Sat

Description

Product group description has changed
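In the hub-and-satellite layout on this slide, business keys live in hubs and descriptive attributes live in satellites, so a changed description should touch only the satellite of the entity concerned. The following sketch illustrates that under stated assumptions: the in-memory structures, the function name `load_satellite`, and the sample keys are hypothetical, and links between hubs are left out because the slide does not show them.

```python
from datetime import date
from typing import Dict, List, Set

# Hubs hold only business keys; satellites hold the descriptive history.
hubs: Dict[str, Set[str]] = {
    "product": {"P1", "P2"},
    "product_group": {"PG1"},
    "product_line": {"PL1"},
}
satellites: Dict[str, List[dict]] = {name: [] for name in hubs}


def load_satellite(entity: str, business_key: str, description: str, load_date: date) -> bool:
    """Insert-only load: append a row when the description differs from the latest one."""
    if business_key not in hubs[entity]:
        raise KeyError(f"{business_key} is not registered in the {entity} hub")
    history = [row for row in satellites[entity] if row["business_key"] == business_key]
    # Rows are appended in load order, so the last one is the latest version.
    if history and history[-1]["description"] == description:
        return False
    satellites[entity].append({
        "business_key": business_key,
        "description": description,
        "load_date": load_date,
    })
    return True


# Initial load of all satellites.
load_satellite("product", "P1", "City bike", date(2024, 1, 1))
load_satellite("product", "P2", "Race bike", date(2024, 1, 1))
load_satellite("product_group", "PG1", "Bikes", date(2024, 1, 1))
load_satellite("product_line", "PL1", "Mobility", date(2024, 1, 1))

rows_before = {name: len(rows) for name, rows in satellites.items()}

# The product group description changes.
load_satellite("product_group", "PG1", "Bicycles", date(2024, 6, 1))

rows_added = {name: len(satellites[name]) - rows_before[name] for name in satellites}
```

Only the product group satellite receives a new row; the hubs and the other satellites stay as they were.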

Page 30: Datawarehousing
Page 31: Datawarehousing

Data Warehousing

Position

Definition

Expert debate

Architecture

Methodology

Technology

Trends

Technology

Technology is the usage and knowledge of tools, techniques, crafts, systems or methods of organization. The word technology comes from the Greek technología (τεχνολογία), from téchnē (τέχνη), an 'art', 'skill' or 'craft', and -logía (-λογία), the study of something, or the branch of knowledge of a discipline.[1] The term can either be applied generally or to specific areas: examples include construction technology, medical technology, or state-of-the-art technology or high technology. Technologies can also be exemplified in a material product, for example an object can be termed state of the art.

Technologies significantly affect human as well as other animal species' ability to control and adapt to their natural environments. The human species' use of technology began with the conversion of natural resources into simple tools. The prehistorical discovery of the ability to control fire increased the available sources of food and the invention of the wheel helped humans in travelling in and controlling their environment. Recent technological developments, including the printing press, the telephone, and the Internet, have lessened physical barriers to communication and allowed humans to interact freely on a global scale. However, not all technology has been used for peaceful purposes; the development of weapons of ever-increasing destructive power has progressed throughout history, from clubs to nuclear weapons.

Technology has affected society and its surroundings in a number of ways. In many societies, technology has helped develop more advanced economies (including today's global economy) and has allowed the rise of a leisure class. Many technological processes produce unwanted by-products, known as pollution, and deplete natural resources, to the detriment of the Earth and its environment. Various implementations of technology influence the values of a society and new technology often raises new ethical questions. Examples include the rise of the notion of efficiency in terms of human productivity, a term originally applied only to machines, and the challenge of traditional norms.

Page 32: Datawarehousing

Extract Transform Load

EXTRACT

TRANSFORM LOAD
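The three steps can be sketched as three small functions chained together. This is a hand-written illustration using only the Python standard library, not a representation of any of the ETL tools named in this deck; the CSV content, table layout, and function names are assumptions made for the example.

```python
import csv
import io
import sqlite3
from typing import List, Tuple

# Hypothetical source extract; note the stray spaces around "Bike".
SOURCE_CSV = "product_id,description,price\nP1, Bike ,499.00\nP2,Helmet,59.50\n"


def extract(text: str) -> List[dict]:
    """EXTRACT: read the raw source rows as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: List[dict]) -> List[Tuple[str, str, float]]:
    """TRANSFORM: trim text fields and convert the price to a number."""
    return [
        (row["product_id"].strip(), row["description"].strip(), float(row["price"]))
        for row in rows
    ]


def load(conn: sqlite3.Connection, rows: List[Tuple[str, str, float]]) -> int:
    """LOAD: write the cleaned rows to the target table and return its row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS product ("
        "product_id TEXT PRIMARY KEY, description TEXT, price REAL)"
    )
    conn.executemany("INSERT OR REPLACE INTO product VALUES (?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]


conn = sqlite3.connect(":memory:")
rows_loaded = load(conn, transform(extract(SOURCE_CSV)))
```

Keeping the steps as separate functions means each one can be tested and replaced on its own, which is the same separation that ETL tools provide through their graphical components.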

Page 33: Datawarehousing

Microsoft

SSIS

Page 34: Datawarehousing

SAP BO DATA INTEGRATOR

Page 35: Datawarehousing

Oracle

Warehouse

Builder

Page 36: Datawarehousing

Why ETL-tools?

Page 37: Datawarehousing

Standardization

Why ETL-tools?

Page 38: Datawarehousing

Maintainability

Why ETL-tools?

Page 39: Datawarehousing

Transparency

Why ETL-tools?

Page 40: Datawarehousing

Transferability

Why ETL-tools?

Page 41: Datawarehousing

Quality control

Why ETL-tools?

Page 42: Datawarehousing

Data Warehousing

Position

Definition

Expert debate

Architecture

Methodology

Technology

Trends

Technology

Page 43: Datawarehousing
Page 44: Datawarehousing

Data Warehousing

Page 45: Datawarehousing

Having only a Data Warehouse does not help users make better decisions …

Page 46: Datawarehousing

A Data Warehouse provides a toolset that makes it possible to create better information provisioning solutions

Page 47: Datawarehousing