Data Warehouse Fundamentals

Data Warehouse Fundamentals Rabie A. Ramadan, PhD 2

Transcript of Data Warehouse Fundamentals

Page 1: Data Warehouse Fundamentals

Data Warehouse Fundamentals

Rabie A. Ramadan, PhD

2

Page 2: Data Warehouse Fundamentals

What did you do in your assignment?

For an airline company, how can strategic information increase the number of frequent flyers? Discuss, giving specific details.

You are a Senior Analyst in the IT department of a company manufacturing automobile parts. The marketing heads are complaining about the poor response by IT in providing strategic information. Draft a proposal to them explaining the reasons for the problems and why a data warehouse would be the only viable solution.

Page 3: Data Warehouse Fundamentals

What did you do in the project?

Egypt Election System

• Governorates' database system

• Multiple databases on multiple servers

• Summarization system

• Metadata

• Data warehouse server

• Web page with query-based system
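
The components listed above can be sketched in a few lines of Python. This is only a minimal illustration, assuming each governorate keeps its own ballot table and that the summarization step reduces it to per-candidate totals before loading the warehouse. The table names, column names, governorates, and vote counts are all hypothetical, and in-memory SQLite databases stand in for the separate servers.

```python
import sqlite3

# Hypothetical per-governorate source data: (polling_station, candidate, votes).
# In the project these would be separate databases on separate servers.
SOURCES = {
    "Cairo": [("C-01", "A", 120), ("C-01", "B", 80), ("C-02", "A", 60)],
    "Giza": [("G-01", "A", 40), ("G-01", "B", 90)],
}


def build_warehouse(sources):
    """Load summarized vote totals from every source into one warehouse table."""
    warehouse = sqlite3.connect(":memory:")
    warehouse.execute(
        "CREATE TABLE vote_summary (governorate TEXT, candidate TEXT, votes INTEGER)"
    )
    for governorate, rows in sources.items():
        # Each source gets its own in-memory database to mimic a separate server.
        source_db = sqlite3.connect(":memory:")
        source_db.execute(
            "CREATE TABLE ballots (station TEXT, candidate TEXT, votes INTEGER)"
        )
        source_db.executemany("INSERT INTO ballots VALUES (?, ?, ?)", rows)
        # Summarization step: aggregate station-level rows per candidate.
        summary = source_db.execute(
            "SELECT candidate, SUM(votes) FROM ballots GROUP BY candidate"
        ).fetchall()
        warehouse.executemany(
            "INSERT INTO vote_summary VALUES (?, ?, ?)",
            [(governorate, candidate, total) for candidate, total in summary],
        )
        source_db.close()
    warehouse.commit()
    return warehouse


def national_totals(warehouse):
    """Query the warehouse the way a web front end might."""
    rows = warehouse.execute(
        "SELECT candidate, SUM(votes) FROM vote_summary "
        "GROUP BY candidate ORDER BY candidate"
    ).fetchall()
    return dict(rows)


warehouse = build_warehouse(SOURCES)
totals = national_totals(warehouse)
print(totals)
```

The query page never touches the source databases directly; it reads only the summarized warehouse table, which is the point of separating the two layers.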

Page 4: Data Warehouse Fundamentals

http://www.inf.unibz.it/dis/teaching/DWDM/index.html

Page 5: Data Warehouse Fundamentals

Definitions & Motivations

Why Data Mining?

Explosive growth of data: from terabytes to petabytes

Data collection and data availability

• Crawlers, database systems, Web, etc.

Sources

• Business: Web, e-commerce, transactions, etc.

• Science: remote sensing, bioinformatics, etc.

• Society and everyone: news, YouTube, etc.

Page 6: Data Warehouse Fundamentals

Why Data Mining?

Problem: We are drowning in data, but starving for knowledge!

Solution: Use Data Mining tools for Automated Analysis of massive data sets

Page 7: Data Warehouse Fundamentals

What is Data Mining?

Data mining (knowledge discovery from data)

• Extraction of interesting (non-trivial, implicit, previously unknown, and potentially useful) patterns or knowledge from huge amounts of data

Page 8: Data Warehouse Fundamentals

What is Data Mining?

Alternative names

• Knowledge discovery (mining) in databases (KDD)

• Knowledge extraction

• Data/pattern analysis

• Data archeology

• Data dredging

• Information harvesting

• Business intelligence

• etc.

Page 9: Data Warehouse Fundamentals

Knowledge Discovery (KDD) Process

Page 10: Data Warehouse Fundamentals

Knowledge Discovery (KDD) Process

Page 11: Data Warehouse Fundamentals

Typical Architecture of a Data Mining System

Page 12: Data Warehouse Fundamentals

Confluence of Multiple Disciplines

Page 13: Data Warehouse Fundamentals

Why Confluence of Multiple Disciplines?

Tremendous amount of data

• Scalable algorithms to handle terabytes of data (e.g., Flickr had 5 billion images in September 2010 [http://blog.flickr.net/en/2010/09/19/5000000000/])

High dimensionality of data

• Data can have tens of thousands of features (e.g., DNA microarray)
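
As one small illustration of what "scalable" can mean here, the sketch below computes a mean and sample variance in a single pass with constant memory, using Welford's method. It is an assumed example rather than anything taken from the slides: the `readings` generator is a hypothetical stand-in for a data source too large to load at once.

```python
def running_mean_variance(values):
    """One pass, constant memory: Welford's method for mean and sample variance."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)
    variance = m2 / (count - 1) if count > 1 else 0.0
    return count, mean, variance


def readings():
    """Hypothetical stand-in for a data source too large to hold in memory."""
    for i in range(1, 1_000_001):
        yield float(i % 10)


count, mean, variance = running_mean_variance(readings())
print(count, mean, variance)
```

Because the function only iterates over its input, the same code works on a list, a file reader, or a database cursor without ever materializing the full data set.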

Page 14: Data Warehouse Fundamentals

Why Confluence of Multiple Disciplines?

Page 15: Data Warehouse Fundamentals

Different Views of Data Mining

Data view

• Kinds of data to be mined

Knowledge view

• Kinds of knowledge to be discovered

Method view

• Kinds of techniques utilized

Application view

• Kinds of applications

Page 16: Data Warehouse Fundamentals

Data to Be Mined

In principle, data mining should be applicable to any data repository

We will have examples about:

• Relational databases

• Data warehouses

• Transactional databases

• Advanced database systems

Page 17: Data Warehouse Fundamentals

Relational Databases

Page 18: Data Warehouse Fundamentals

Data Warehouses

Page 19: Data Warehouse Fundamentals

Transactional Databases

Page 20: Data Warehouse Fundamentals

Advanced Database Systems (1)

Page 21: Data Warehouse Fundamentals

Advanced Database Systems (2)

Page 22: Data Warehouse Fundamentals

Knowledge to be Discovered

Page 23: Data Warehouse Fundamentals

Characterization and Discrimination

Page 24: Data Warehouse Fundamentals

Characterization and Discrimination (1)

Page 25: Data Warehouse Fundamentals

Class Activity

• Differentiate between data mining and data warehousing.

Data warehousing only extracts data from different sources, cleans it, and stores it in the warehouse, whereas data mining aims to examine or explore that data using queries.

• What are the different problems that data mining can solve?

Data mining can be used in a variety of fields and industries, such as marketing, advertising of goods, products, and services, AI, and government intelligence.

• How do data mining and data warehousing work together?

Data warehousing can be used for analyzing business needs by storing data in a meaningful form. Using data mining, one can forecast business needs, and the data warehouse can act as the source for this forecasting.
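
The division of labor described in these answers can be sketched as two small functions: a cleaning step that plays the warehousing role and a pattern-finding step that plays the mining role. This is a hedged illustration only. The transactions are invented, and frequent item pairs are used as a simple stand-in for mining, not the forecasting mentioned in the answer.

```python
from collections import Counter
from itertools import combinations

# Hypothetical raw records as they might arrive from different sources:
# inconsistent casing, stray whitespace, and duplicate items.
RAW_TRANSACTIONS = [
    ["Bread ", "milk", "bread"],
    ["milk", "Eggs"],
    ["bread", "MILK", "eggs"],
    ["bread", "butter"],
]


def clean(transactions):
    """Warehousing side: normalize items and drop duplicates within a record."""
    return [sorted({item.strip().lower() for item in t}) for t in transactions]


def frequent_pairs(transactions, min_support):
    """Mining side: keep item pairs found in at least min_support of the records."""
    counts = Counter()
    for items in transactions:
        # Items are sorted, so each pair is always counted under the same key.
        counts.update(combinations(items, 2))
    total = len(transactions)
    return {
        pair: count / total
        for pair, count in counts.items()
        if count / total >= min_support
    }


warehouse = clean(RAW_TRANSACTIONS)
patterns = frequent_pairs(warehouse, min_support=0.5)
print(patterns)
```

Without the cleaning step, "Bread " and "bread" would be counted as different items, which is why the mining step is run on the warehouse copy and not on the raw sources.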

Page 26: Data Warehouse Fundamentals

Frequent Patterns, Associations, Correlations

Page 27: Data Warehouse Fundamentals

Classification and Prediction

Page 28: Data Warehouse Fundamentals

Cluster Analysis

Page 29: Data Warehouse Fundamentals

Outlier Analysis

Page 30: Data Warehouse Fundamentals

Evolution Analysis

Page 31: Data Warehouse Fundamentals

Techniques Utilized

Page 32: Data Warehouse Fundamentals

Applications Adapted

Page 33: Data Warehouse Fundamentals

Major Challenges in Data Mining

Page 34: Data Warehouse Fundamentals

Summary