Data Warehousing – A Technology Marvel -by Swati Chawla.



Agenda

• Introduction
• Business Need Beyond Reporting
• Traditional Approaches
• Definition
• Data Classification
• Components of Data Warehouse
• Benefits
• Tools for Data Warehousing
• Data Modeling Terminologies
• Schemas
– Star Schema
– Snowflake Schema


Scenario 1

• Your company has made less profit than in the previous year.

• What could be the reason?

• How would you generate a report of your yearly sales, and how long would it take you to find the problem?

• Your manager wants the reason as early as possible…


Business Need Beyond Reporting…


Scenario 2

• You are a frequent traveler.

• You have a savings bank account with ABC Bank Pvt. Ltd.

• You use your bank’s ATM card to buy your air tickets…

• Now, one day you receive an exciting offer from the bank: a 15 percent discount on all air tickets booked using the bank’s ATM card.

• Sounds fascinating, doesn’t it?

• What do you think happened?

• How did your bank get to know about the nature of your transactions?

• Did the bank manager own a magic ball?


Traditional Approaches

• Programs were written to analyze the data stored on tapes or on mainframes.

• With the advent of personal computers, programs were run on data dumps ("data islands") stored on individual PCs in order to analyze the data.

• Decision Support Systems (DSS)

• Executive Information Systems (EIS)


Data warehousing holds the key to all these questions…


Defining Data Warehouse

According to Bill Inmon, known as the father of data warehousing, a data warehouse is a subject-oriented, integrated, time-variant, nonvolatile collection of data in support of management decisions.

A few applications of a data warehouse (DWH):

– Cloth Manufacturer: Analyze sales and product trends by location to understand customer buying patterns

– Pharma Manufacturer: Analysis of physicians and their prescribing patterns

– Retailer: Analyze sale fluctuations across different regions

– Movie Theatre Chain: Key performance indicators including average ticket price, attendance, box office ticket sales, concession sales, buttered vs. non-buttered popcorn

– Airline Industry: Analysis of airline network trends by revenue class, routes, origin-destination, point of booking


Data Classification

Data is classified as:

• Operational data, used for operational processing
• Informational data, used for analytical processing


Informational & Operational Data

How a data warehouse compares with an OLTP database:

• Typical operation
– Data warehouse: A query scans thousands or millions of rows. For example: "Find the total sales of last month."
– OLTP database: A query accesses only a handful of records. For example: "Retrieve the current order for this customer."

• Schema design
– Data warehouse: De-normalized or partially normalized schemas
– OLTP database: Fully normalized schemas

• Data modification
– Data warehouse: Updated on a regular basis. End users do not directly update the data.
– OLTP database: Always up to date, reflecting the current state of each business transaction.

• Historical data
– Data warehouse: Usually stores many months or years of data.
– OLTP database: Usually stores data from only a few weeks or months.

• Users
– Data warehouse: Knowledge worker, business analyst
– OLTP database: Clerk, IT professional

• Number of users
– Data warehouse: Hundreds
– OLTP database: Thousands
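The two "typical operation" rows can be contrasted with a small sketch. It uses Python's built-in sqlite3 module and a single hypothetical `orders` table for both queries, purely for illustration; a real warehouse would keep its own copy of the data in a separate store. The table name, column names, and sample rows are all assumptions and do not come from the slides.

```python
import sqlite3

# Hypothetical table and rows, used only to contrast the two query styles.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer_id INTEGER, "
    "order_month TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, 10, "2006-06", 50.0),
        (2, 11, "2006-07", 20.0),
        (3, 10, "2006-07", 30.0),
        (4, 12, "2006-07", 25.0),
    ],
)


def current_order(conn, customer_id):
    """OLTP-style access: fetch one customer's most recent order."""
    return conn.execute(
        "SELECT order_id, amount FROM orders "
        "WHERE customer_id = ? ORDER BY order_id DESC LIMIT 1",
        (customer_id,),
    ).fetchone()


def total_sales(conn, month):
    """Warehouse-style access: aggregate over every row of a month."""
    return conn.execute(
        "SELECT SUM(amount) FROM orders WHERE order_month = ?",
        (month,),
    ).fetchone()[0]


if __name__ == "__main__":
    print(current_order(conn, 10))       # (3, 30.0)
    print(total_sales(conn, "2006-07"))  # 75.0
```

The first function touches a handful of records for one customer. The second has to read every row for the month, which is the kind of work a warehouse takes off the operational system.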


Components of Data Warehouse

A data warehouse typically comprises the following components:

• Source Data Layer
• Data Transformation Layer
• Data Store / Warehouse Layer
• Reporting Layer
• Metadata Layer
• Operations Layer


Source Data Layer & Data Transformation Layer

ETL is the process of extracting, transforming, and loading data into a data warehouse.

• EXTRACTION: The data is extracted from the source. Data can be extracted from more than one source.

• TRANSFORMATION: Any manipulation the extracted data needs is done at this stage. This includes converting the data into a format and presentation that make it easy to understand and that help business users carry out business data analysis.

• LOADING: The transformed data is then loaded, that is, inserted into the target system, the data warehouse.
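The three ETL stages can be sketched in a few lines of Python using only the standard library. This is a minimal illustration under stated assumptions, not a production pipeline: the two CSV extracts, the `sales` table, and the function names are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import csv
import io
import sqlite3

# Hypothetical extracts from two source systems (assumed for illustration).
ORDERS_CSV = "order_id,city,amount\n1, jalandhar ,120.50\n2,DELHI,80\n"
BILLING_CSV = "order_id,city,amount\n3,Delhi,40.25\n"


def extract(*sources):
    """Extract: read rows from more than one source."""
    for text in sources:
        yield from csv.DictReader(io.StringIO(text))


def transform(rows):
    """Transform: tidy city names and convert text fields to numbers."""
    for row in rows:
        yield (
            int(row["order_id"]),
            row["city"].strip().title(),
            float(row["amount"]),
        )


def load(conn, rows):
    """Load: insert the transformed rows into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER, city TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl():
    """Run the three stages against an in-memory stand-in warehouse."""
    conn = sqlite3.connect(":memory:")
    load(conn, transform(extract(ORDERS_CSV, BILLING_CSV)))
    return conn


if __name__ == "__main__":
    warehouse = run_etl()
    query = "SELECT city, SUM(amount) FROM sales GROUP BY city ORDER BY city"
    for row in warehouse.execute(query):
        print(row)
```

After the transform step, the inconsistent spellings " jalandhar " and "DELHI" become "Jalandhar" and "Delhi", so rows from both sources group together when queried.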


Data Flow (Data Warehousing Layer)

[Figure: data flows from OLTP systems into the DWH and on to data marts. Labels in the diagram: Orders, Billing, Customer Service, Product, Customer, Marketing, Finance.]

A data mart is:

• A scaled-down version of a DWH, designed for a particular line of business.
• Focused on one subject area or only one group of users.


Reporting Layer

• Reporting is the process of development and production of business reports based on data warehouse data.

• Data mining is the process of examining data for trends and patterns that might have evaded human analysis.

• OLAP (Online Analytical Processing) is a technique by which data sourced from a data warehouse or data mart is summarized and visualized to give a multidimensional view of it.
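One way to picture the multidimensional view is a roll-up: the same records are summed along whichever dimensions the analyst picks. The sketch below is a plain-Python illustration only. The sample records and the `roll_up` helper are hypothetical, and real OLAP tools do this over far larger data with precomputed aggregates.

```python
from collections import defaultdict

# Hypothetical sales records: (product, city, month, units).
SALES = [
    ("Soft Drink", "Jalandhar", "2006-07", 500),
    ("Soft Drink", "Delhi", "2006-07", 300),
    ("Juice", "Jalandhar", "2006-07", 120),
    ("Soft Drink", "Jalandhar", "2006-08", 450),
]
DIMENSIONS = ("product", "city", "month")


def roll_up(records, *dims):
    """Sum units grouped by the chosen dimensions (none = grand total)."""
    positions = [DIMENSIONS.index(d) for d in dims]
    totals = defaultdict(int)
    for record in records:
        key = tuple(record[i] for i in positions)
        totals[key] += record[-1]
    return dict(totals)


if __name__ == "__main__":
    print(roll_up(SALES, "city"))
    # {('Jalandhar',): 1070, ('Delhi',): 300}
    print(roll_up(SALES, "product", "month"))
    print(roll_up(SALES))
    # {(): 1370}
```

Calling `roll_up` with different dimension names gives different summaries of the same data without changing the records.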


Data Warehousing – End to End


Benefits

Data Warehouse –

• Queries do not impact operational systems
• Provides quick response to queries for reporting
• Enables subject area orientation
• Integrates data from multiple, diverse sources
• Enables multiple interpretations of the same data by different users or groups
• Provides thorough analysis of data over a period of time
• Accuracy of operational systems can be checked
• Provides analysis capabilities to decision makers


Tools Available for Data Warehousing


Fact: 500

1. Bottles of soft drink
2. Sold in the month of July 2006
3. Sold in Jalandhar City


Data Modeling Terminologies

• A fact table consists of the measurements, metrics, or facts of a business process.

• A dimension table is one of a set of companion tables to a fact table; it holds the descriptive attributes of the facts.

• A schema is a collection of database objects, including tables, views, indexes, and synonyms.
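As a rough illustration of these terms, the sketch below stores the earlier example (500 bottles of soft drink sold in July 2006 in Jalandhar) as one fact row with three dimension tables, using Python's built-in sqlite3 module. The table and column names (`fact_sales`, `dim_product`, and so on) are assumptions chosen for the example, not names from the slides.

```python
import sqlite3


def build_tables(conn):
    """Create three dimension tables and one fact table, then add one fact."""
    conn.executescript(
        """
        CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_time (time_id INTEGER PRIMARY KEY, month TEXT, year INTEGER);
        CREATE TABLE dim_location (location_id INTEGER PRIMARY KEY, city TEXT);
        CREATE TABLE fact_sales (
            product_id INTEGER REFERENCES dim_product(product_id),
            time_id INTEGER REFERENCES dim_time(time_id),
            location_id INTEGER REFERENCES dim_location(location_id),
            units_sold INTEGER  -- the measurement (the "fact")
        );
        INSERT INTO dim_product VALUES (1, 'Soft Drink');
        INSERT INTO dim_time VALUES (1, 'July', 2006);
        INSERT INTO dim_location VALUES (1, 'Jalandhar');
        INSERT INTO fact_sales VALUES (1, 1, 1, 500);
        """
    )


def describe_facts(conn):
    """Join each fact row to its dimensions to make it readable."""
    return conn.execute(
        """
        SELECT p.name, t.month, t.year, l.city, f.units_sold
        FROM fact_sales f
        JOIN dim_product p ON p.product_id = f.product_id
        JOIN dim_time t ON t.time_id = f.time_id
        JOIN dim_location l ON l.location_id = f.location_id
        """
    ).fetchall()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    build_tables(conn)
    print(describe_facts(conn))
    # [('Soft Drink', 'July', 2006, 'Jalandhar', 500)]
```

The fact table holds only keys and the number 500. The descriptive text lives in the dimension tables.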


Data Warehouse Schemas

– Star Schema

• A star schema is a relational database schema for representing multidimensional data. At its center is a large fact table, which references the surrounding dimension tables.

– Snowflake Schema

• A snowflake schema is a variation on the star schema in which very large dimension tables are normalized into multiple tables. Dimensions with hierarchies can be decomposed into a snowflake structure when the dimension tables need to be normalized to save space. The snowflake approach increases the number of joins, which can slow data retrieval.
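The extra join can be seen in a small sketch that models one location dimension both ways, using Python's built-in sqlite3 module. All table names and sample rows here are hypothetical. The example shows only the difference in query shape and makes no claim about timing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Star: one de-normalized location dimension.
    CREATE TABLE star_dim_location (
        location_id INTEGER PRIMARY KEY, city TEXT, state TEXT
    );
    -- Snowflake: the same dimension normalized into city and state tables.
    CREATE TABLE snow_dim_state (state_id INTEGER PRIMARY KEY, state TEXT);
    CREATE TABLE snow_dim_city (
        location_id INTEGER PRIMARY KEY, city TEXT,
        state_id INTEGER REFERENCES snow_dim_state(state_id)
    );
    CREATE TABLE fact_sales (location_id INTEGER, units_sold INTEGER);

    INSERT INTO star_dim_location VALUES
        (1, 'Jalandhar', 'Punjab'), (2, 'Ludhiana', 'Punjab');
    INSERT INTO snow_dim_state VALUES (1, 'Punjab');
    INSERT INTO snow_dim_city VALUES (1, 'Jalandhar', 1), (2, 'Ludhiana', 1);
    INSERT INTO fact_sales VALUES (1, 500), (2, 200);
    """
)

# Star schema: one join from the fact table reaches the state name.
STAR_QUERY = """
    SELECT d.state, SUM(f.units_sold)
    FROM fact_sales f
    JOIN star_dim_location d ON d.location_id = f.location_id
    GROUP BY d.state
"""

# Snowflake schema: the same question needs a second join.
SNOWFLAKE_QUERY = """
    SELECT s.state, SUM(f.units_sold)
    FROM fact_sales f
    JOIN snow_dim_city c ON c.location_id = f.location_id
    JOIN snow_dim_state s ON s.state_id = c.state_id
    GROUP BY s.state
"""

if __name__ == "__main__":
    print(conn.execute(STAR_QUERY).fetchall())       # [('Punjab', 700)]
    print(conn.execute(SNOWFLAKE_QUERY).fetchall())  # [('Punjab', 700)]
```

Both queries return the same total. The snowflake layout stores 'Punjab' once instead of once per city, at the cost of the additional join.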


Example of a Star Schema


Example of a Snowflake Schema


Thank You