Data Mining and WareHousing

Post on 20-Jun-2015


description

This is a presentation giving an introductory overview of data warehousing.

Transcript of Data Mining and WareHousing

By:

Ms. Vishakha Agarwal

(B.Tech 3rd yr, CSE)

Roll no: 1035210106

We will discuss:

Data Warehousing and its need

Data Warehouse

DBMS Vs. Data Warehouses

Architecture

Data Warehouse Components

Data Warehouse data characteristics

Data Warehousing tools

Advantages

Disadvantages

Data Warehousing and its need

Definition: It is a technology that allows us to gather, store and present data in a form suitable for human exploration.

Need: It was needed by big organizations for data analysis.

Data Warehouse

It is a subject-oriented, integrated, time-variant, non-volatile collection of data that supports the decision-making process of an enterprise.

It is a multi-dimensional model.
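
To make "multi-dimensional model" concrete, here is a minimal sketch in Python. The fact rows, the three dimension names and the `roll_up` helper are all hypothetical and chosen only for illustration; real warehouses use dedicated OLAP or SQL engines for this. The idea shown is that one measure (sales) can be summarised along any combination of dimensions.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, quarter, sales_amount).
facts = [
    ("laptop", "north", "Q1", 1200.0),
    ("laptop", "south", "Q1", 800.0),
    ("phone", "north", "Q1", 500.0),
    ("phone", "north", "Q2", 700.0),
]

# Names of the dimensions, in the same order as the tuple fields above.
DIMENSIONS = ("product", "region", "quarter")


def roll_up(rows, keep):
    """Sum the sales measure over every dimension not listed in `keep`."""
    idx = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[i] for i in idx)
        totals[key] += row[3]  # field 3 is the sales amount
    return dict(totals)


# The same data viewed along different dimensions.
by_product = roll_up(facts, ["product"])
by_region_quarter = roll_up(facts, ["region", "quarter"])
```

Here `by_product` collapses region and quarter, while `by_region_quarter` collapses product; both are views of the same underlying facts.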

DBMS Vs. Data Warehouses

DBMS

Focuses on the present.

Uses atomic data.

Each transaction accesses only a small amount of data.

Supports day-to-day operations.

Changing, incomplete data.

Data Warehouse

Focuses on past, present and future.

Uses aggregate data.

Most analysis targets large amounts of data.

Supports information analysis.

Static, historic data.
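
The difference between atomic and aggregate access can be sketched with two queries. This example uses Python's built-in `sqlite3` module purely for convenience; the `orders` table and its rows are hypothetical, and a real warehouse would normally live in a separate system from the operational database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders ("
    "order_id INTEGER PRIMARY KEY, customer TEXT, year INTEGER, amount REAL)"
)
# Hypothetical sample rows.
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "asha", 2013, 250.0),
        (2, "ravi", 2013, 100.0),
        (3, "asha", 2014, 400.0),
        (4, "ravi", 2014, 150.0),
    ],
)

# DBMS-style access: one transaction reads a single atomic record.
single_order = conn.execute(
    "SELECT customer, amount FROM orders WHERE order_id = ?", (3,)
).fetchone()

# Warehouse-style access: the analysis scans many rows and aggregates them.
yearly_totals = conn.execute(
    "SELECT year, SUM(amount) FROM orders GROUP BY year ORDER BY year"
).fetchall()

conn.close()
```

The first query answers "what happened in order 3?", which is a day-to-day operational question. The second answers "how did sales develop per year?", which is an analysis question over historic data.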

Architecture

Data Warehouse data characteristics

There are five characteristics of data in data warehouses.

Subject-oriented Data

Integrated Data

Time-variant Data

Non-volatile Data

Granular Data
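
Two of these characteristics, time-variant and non-volatile, can be illustrated with a small append-only store. This is only a sketch under assumed names: `PriceSnapshot` and `SnapshotStore` are hypothetical classes, not part of any warehousing product. Records are never updated in place (non-volatile), and every record carries the date it refers to (time-variant), so past states stay queryable.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)  # frozen: a loaded record cannot be modified
class PriceSnapshot:
    product: str
    price: float
    as_of: date  # the date this value became valid


class SnapshotStore:
    """Append-only store: rows are added, never updated or deleted."""

    def __init__(self):
        self._rows = []

    def load(self, snapshot):
        self._rows.append(snapshot)

    def price_on(self, product, day):
        """Return the price valid on `day`, or None if none was loaded yet."""
        candidates = [
            r for r in self._rows if r.product == product and r.as_of <= day
        ]
        if not candidates:
            return None
        return max(candidates, key=lambda r: r.as_of).price


store = SnapshotStore()
store.load(PriceSnapshot("laptop", 900.0, date(2015, 1, 1)))
store.load(PriceSnapshot("laptop", 850.0, date(2015, 4, 1)))
```

Loading the April price does not overwrite the January one, so `store.price_on("laptop", date(2015, 2, 15))` still reports the older value.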

Data Warehousing tools

Extract, transform, load (ETL) tools
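
The three ETL steps can be sketched in a few lines of Python using only the standard library. Everything specific here is an assumption made for illustration: the CSV content, the `sales` table, and the cleaning rules (trim names, upper-case country codes, skip rows whose amount is not a number). Production ETL tools add scheduling, logging and error handling on top of this basic flow.

```python
import csv
import io
import sqlite3

# Hypothetical source data, as it might arrive from an operational system.
RAW_CSV = """customer,amount,country
 Asha ,250.50,in
Ravi,not-a-number,IN
Meera,99.5,us
"""


def extract(text):
    """Extract: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalise text fields and drop rows with a bad amount."""
    cleaned = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows that cannot be parsed
        cleaned.append(
            (row["customer"].strip().title(), amount, row["country"].strip().upper())
        )
    return cleaned


def load(conn, rows):
    """Load: write the cleaned rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL, country TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
loaded = conn.execute(
    "SELECT customer, amount, country FROM sales ORDER BY customer"
).fetchall()
```

After the run, `loaded` holds only the two valid rows in cleaned form; the row with the unparseable amount was dropped during the transform step.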

Other tools…

Data cleaning

Integration tool

Quality-management tool

Query tool

Reporting tool

Advantages

Very large storage.

Enhances end-user access to a wide variety of data.

Potentially lower computing costs and increased productivity.

Provides a place to combine related data from separate sources.

Security: data and access.

Query processing: multiple options.

Disadvantages

It is a costly method.

Data warehouses can get outdated relatively quickly.

Lack of flexibility.

Data warehouses are not the optimal environment for unstructured data.

Difficult to accommodate changes in data types and ranges, data source schema, indexes and queries.