Data Warehousing Introduction. Text and Resources The Data Warehouse Lifecycle Toolkit, Kimball,...

Post on 24-Dec-2015

Data Warehousing

Introduction

Text and Resources

The Data Warehouse Lifecycle Toolkit, Kimball, Reeves, Ross, and Thornthwaite

Internet resources

Data Warehousing Institute

Teradata Institute

Intelligent Enterprise

Data Warehouse Approach

An old idea with a new interest:

Cheap Computing Power

Special Purpose Hardware

New Data Structures

Intelligent Software

Heightened Business Competition

Data Warehouse

“Queryable source of data in the enterprise”

Common source of consistent organizational information

Identify problems and opportunities

User focused

Retrieval focused

Goals of the Course

Understand the Data Warehouse philosophy

Dimensional modeling

Tools for Warehouse management

Business intelligence

Business practice

What To Expect

Help develop course expectations for the future

Two tests

Exercises and a semester project

Graduate Presentation

What is a data warehouse?

A database filled with large volumes of cross-indexed historical business information that users can access with various query tools.

The warehouse usually resides on its own server and is separate from the transaction-processing or “run-the-business” systems.

Purpose of a data warehouse

Provides an architecture for the flow of data from operational systems to decision support systems

DW involves many-record analysis, during which all data has to be locked

Used to discover trends and patterns

Present opportunities

Identify problems
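The flow from an operational system to a decision support query can be sketched in a few lines. This is a minimal illustration, not a method taken from the course text: it uses Python's built-in sqlite3 module with an in-memory database, and the table names (orders, dim_product, fact_sales), the function names, and the sample rows are all hypothetical.

```python
import sqlite3


def build_demo_warehouse():
    """Load a tiny, made-up operational table into a dimension and a fact table."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()

    # Operational ("run-the-business") table: one row per order. Sample data only.
    cur.execute(
        "CREATE TABLE orders ("
        "order_id INTEGER PRIMARY KEY, order_date TEXT, product TEXT, amount REAL)"
    )
    cur.executemany(
        "INSERT INTO orders VALUES (?, ?, ?, ?)",
        [
            (1, "2015-01-15", "widget", 100.0),
            (2, "2015-01-20", "gadget", 50.0),
            (3, "2015-02-03", "widget", 150.0),
            (4, "2015-02-17", "widget", 200.0),
            (5, "2015-03-09", "gadget", 75.0),
        ],
    )

    # Warehouse side: a product dimension and a sales fact table (hypothetical names).
    cur.execute(
        "CREATE TABLE dim_product ("
        "product_key INTEGER PRIMARY KEY, product_name TEXT UNIQUE)"
    )
    cur.execute(
        "CREATE TABLE fact_sales ("
        "product_key INTEGER REFERENCES dim_product(product_key), "
        "sale_month TEXT, amount REAL)"
    )

    # Extract and load: one dimension row per distinct product,
    # then one fact row per order, keyed by product and month.
    cur.execute(
        "INSERT INTO dim_product (product_name) "
        "SELECT DISTINCT product FROM orders ORDER BY product"
    )
    cur.execute(
        "INSERT INTO fact_sales (product_key, sale_month, amount) "
        "SELECT d.product_key, substr(o.order_date, 1, 7), o.amount "
        "FROM orders o JOIN dim_product d ON d.product_name = o.product"
    )
    conn.commit()
    return conn


def monthly_sales(conn):
    """Decision-support style query: total sales per product per month."""
    return conn.execute(
        "SELECT d.product_name, f.sale_month, SUM(f.amount) "
        "FROM fact_sales f JOIN dim_product d USING (product_key) "
        "GROUP BY d.product_name, f.sale_month "
        "ORDER BY d.product_name, f.sale_month"
    ).fetchall()


if __name__ == "__main__":
    for row in monthly_sales(build_demo_warehouse()):
        print(row)
```

The query runs against the warehouse tables only, so the analysis never touches the operational orders table once the load has finished. A real warehouse would add more dimensions (date, customer, store) and a scheduled load, which this sketch leaves out.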

ROI of data warehouses

New insights into customer habits

Developing new products

Selling more products

Cost savings and revenue increases

Cross-selling of products

Less mainframe computer storage

Identify and target most profitable customers

Capital outlay and development/training time can be extraordinary.

Quality of system output

Levels of risk

Intangibles

CIO.com (middle ground)

Course Outline

Introduction and basic principles

Data extraction: SQL data definition code

Warehouse Architecture: Dimensional Modeling

Data Cleansing: SAS Datastep coding

Data Presentation: MS Analysis Services