BI Dimensional Modeling
Uploaded by mahendran-ranganathan
-
5/28/2018 BI Dimensional Modeling
1/22
Dimensional Modeling
Chapter 2
-
The Dimensional Data Model
An alternative to the normalized data model
  Present information as simply as possible (easier to understand)
  Return queries as quickly as possible (efficient for queries)
  Track the underlying business processes (process focused)
-
The Dimensional Data Model
Contains the same information as the normalized model
Has far fewer tables
  Grouped in coherent business categories
Pre-joins hierarchies and lookup tables, resulting in fewer join paths and fewer intermediate tables
Normalized fact table with denormalized dimension tables
-
GB Video E-R Diagram
Entities (# marks the key attribute):
  Customer: #Cust No, F Name, L Name, Ads1, Ads2, City, State, Zip, Tel No, CC No, Expire
  Rental: #Rental No, Date, Clerk No, Pay Type, CC No, Expire, CC Approval
  Line: #Line No, Due Date, Return Date, OD charge, Pay type
  Video: #Video No, One-day fee, Extra days, Weekend
  Title: #Title No, Name, Vendor No, Cost
Relationship labels: Requestor of, Owner of, Name for, Holder of
-
GB Video Data Mart
  Customer: CustID, Cust No, F Name, L Name
  Rental: RentalID, Rental No, Clerk No, Store, Pay Type
  Line: LineID, OD Charge, OneDayCharge, ExtraDaysCharge, WeekendCharge, DaysReserved, DaysOverdue, CustID, AddressID, RentalID, VideoID, TitleID, RentalDateID, DueDateID, ReturnDateID
  Video: VideoID, Video No
  Title: TitleID, TitleNo, Name, Cost, Vendor Name
  Rental Date: RentalDateID, SQLDate, Day, Week, Quarter, Holiday
  Due Date: DueDateID, SQLDate, Day, Week, Quarter, Holiday
  Return Date: ReturnDateID, SQLDate, Day, Week, Quarter, Holiday
  Address: AddressID, Address1, Address2, City, State, Zip, AreaCode, Phone
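
To make the shape of the data mart more concrete, here is a minimal sketch of a cut-down version in an in-memory SQLite database. It assumes that Line plays the role of the fact table and keeps only a handful of its columns. The lowercase table names, the sample rows and the choice of SQLite are assumptions made for illustration and do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two denormalized dimensions and one fact table (columns trimmed for brevity).
conn.executescript("""
CREATE TABLE title (
    TitleID INTEGER PRIMARY KEY, TitleNo TEXT, Name TEXT, VendorName TEXT);
CREATE TABLE rental_date (
    RentalDateID INTEGER PRIMARY KEY, SQLDate TEXT, Quarter INTEGER, Holiday INTEGER);
CREATE TABLE line (
    LineID INTEGER PRIMARY KEY,
    TitleID INTEGER REFERENCES title(TitleID),
    RentalDateID INTEGER REFERENCES rental_date(RentalDateID),
    OneDayCharge REAL,
    ODCharge REAL);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO title VALUES (?, ?, ?, ?)", [
    (1, "T-100", "Casablanca", "Acme"),
    (2, "T-200", "Vertigo", "Acme"),
])
conn.executemany("INSERT INTO rental_date VALUES (?, ?, ?, ?)", [
    (20060926, "2006-09-26", 3, 0),
    (20061225, "2006-12-25", 4, 1),
])
conn.executemany("INSERT INTO line VALUES (?, ?, ?, ?, ?)", [
    (1, 1, 20060926, 3.0, 0.0),
    (2, 1, 20061225, 3.0, 1.5),
    (3, 2, 20061225, 4.0, 0.0),
])

# A typical star query: one join per dimension, then group by
# descriptive attributes and sum the additive facts.
rows = conn.execute("""
SELECT t.Name, d.Quarter, SUM(l.OneDayCharge + l.ODCharge) AS revenue
FROM line AS l
JOIN title AS t ON t.TitleID = l.TitleID
JOIN rental_date AS d ON d.RentalDateID = l.RentalDateID
GROUP BY t.Name, d.Quarter
ORDER BY t.Name, d.Quarter
""").fetchall()
print(rows)
```

Each dimension is reached through a single join from the fact table, which is the "fewer join paths" property mentioned earlier.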
-
Fact Table
Measurements associated with a specific business process
Grain: level of detail of the table
Process events produce fact records
Facts (attributes) are usually
  Numeric
  Additive
Derived facts included
Foreign (surrogate) keys refer to dimension tables (entities)
Classification values help define subsets
-
Dimension Tables
Entities describing the objects of the process
Conformed dimensions cross processes
Attributes are descriptive
  Text
  Numeric
Surrogate keys
Less volatile than facts (1:m with the fact table)
Null entries
Date dimensions
Produce "by" questions (e.g. rentals by title, by quarter)
-
Bus Architecture
An architecture that permits aggregating data across multiple marts
Conformed dimensions and attributes
Drill Down vs. Drill Across
Bus matrix
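
One way to picture drilling across is this: each mart's fact table is summarized separately, grouped by the same conformed dimension attribute, and the result sets are then merged on that attribute. The sketch below assumes two hypothetical marts (rentals and purchases) that share a product dimension. All names and figures are invented for illustration.

```python
from collections import defaultdict

# Conformed dimension shared by both marts (hypothetical): ProductKey -> Category.
product_dim = {1: "Drama", 2: "Comedy", 3: "Drama"}

# Fact rows from two different business processes, both keyed by ProductKey.
rental_facts = [(1, 3.00), (2, 4.00), (3, 3.50)]    # (ProductKey, RentalAmount)
purchase_facts = [(1, 12.00), (2, 9.00)]            # (ProductKey, PurchaseCost)


def total_by_category(facts):
    """Summarize one fact table by the conformed Category attribute."""
    totals = defaultdict(float)
    for product_key, amount in facts:
        totals[product_dim[product_key]] += amount
    return dict(totals)


def drill_across(*result_sets):
    """Merge separately computed result sets on their shared row labels."""
    labels = sorted(set().union(*result_sets))
    return {label: tuple(rs.get(label, 0.0) for rs in result_sets)
            for label in labels}


report = drill_across(total_by_category(rental_facts),
                      total_by_category(purchase_facts))
print(report)  # {'Comedy': (4.0, 9.0), 'Drama': (6.5, 12.0)}
```

The merge only lines up because both marts label their rows with the same Category values. That shared labeling is the role of the conformed dimension.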
-
Keys and Surrogate Keys
A surrogate key is a unique identifier for data
warehouse records that replaces source
primary keys (business/natural keys)
Protect against changes in source systems
Allow integration from multiple sources
Enable rows that do not exist in source data
Track changes over time (e.g. new customer
instances when addresses change)
Replace text keys with integers for efficiency
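
A minimal sketch of surrogate key assignment during loading, assuming a simple in-memory lookup. The class name `SurrogateKeyMap`, the source system names and the natural keys are all hypothetical. A real warehouse would typically persist this mapping in a table.

```python
class SurrogateKeyMap:
    """Assigns integer surrogate keys to (source system, natural key) pairs."""

    def __init__(self):
        self._keys = {}
        self._next_key = 1

    def key_for(self, source, natural_key):
        """Return the existing surrogate key for the pair, or assign a new one."""
        pair = (source, natural_key)
        if pair not in self._keys:
            self._keys[pair] = self._next_key
            self._next_key += 1
        return self._keys[pair]


keys = SurrogateKeyMap()

# The same customer number format can differ between source systems;
# each (source, natural key) pair still gets one stable integer key.
billing_key = keys.key_for("billing", "CUST-0042")
crm_key = keys.key_for("crm", "0042")
billing_again = keys.key_for("billing", "CUST-0042")

# A row with no counterpart in any source, e.g. an "unknown customer" entry.
unknown_key = keys.key_for("warehouse", "UNKNOWN")

print(billing_key, crm_key, billing_again, unknown_key)  # 1 2 1 3
```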
-
Slowly Changing Dimensions
Attributes in a dimension that change more slowly than the fact granularity
  Type 1: Current only
  Type 2: All history
  Type 3: Most recent few (rare)
Note: rapidly changing dimensions usually indicate the presence of a business process that should be tracked as a separate dimension or as a fact table
-
Customer dimension row before the change:
CustKey  BKCustID  CustName    CommDist  Gender  HomOwn?
1552     31421     Jane Rider  3         F       N

Fact Table
Date       CustKey               ProdKey  Item Count  Amount
1/7/2004   1552                  95       1           1,798.00
3/2/2004   1552                  37       1           27.95
5/7/2005   1552                  87       2           320.26
2/21/2006  2387 (replaces 1552)  42       1           19.95

Dimension with a slowly changing attribute
CustKey  BKCustID  CustName    CommDist  Gender  HomOwn?  Eff       End
1552     31421     Jane Rider  3         F       N        1/7/2004  1/1/2006
2387     31421     Jane Rider  31        F       N        1/2/2006  12/31/9999
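
The Type 2 change shown above can be sketched as follows: the current row is closed by setting its End date, and a new row with a new surrogate key is appended. The function names, the dictionary layout and the date-based key lookup are assumptions for illustration. The slide only shows the resulting rows.

```python
from datetime import date, timedelta

OPEN_END = date(9999, 12, 31)  # "still current" marker, as on the slide

customer_dim = [
    {"CustKey": 1552, "BKCustID": 31421, "CustName": "Jane Rider",
     "CommDist": 3, "Gender": "F", "HomOwn": "N",
     "Eff": date(2004, 1, 7), "End": OPEN_END},
]


def apply_type2_change(dim, bk_id, changes, effective, new_key):
    """Close the current row for bk_id and append a new version of it."""
    current = next(r for r in dim
                   if r["BKCustID"] == bk_id and r["End"] == OPEN_END)
    current["End"] = effective - timedelta(days=1)
    new_row = {**current, **changes,
               "CustKey": new_key, "Eff": effective, "End": OPEN_END}
    dim.append(new_row)
    return new_row


def key_on(dim, bk_id, day):
    """Find the surrogate key that was in effect for bk_id on a given day."""
    for row in dim:
        if row["BKCustID"] == bk_id and row["Eff"] <= day <= row["End"]:
            return row["CustKey"]
    raise LookupError(f"no row for {bk_id} on {day}")


# Jane Rider's CommDist changes from 3 to 31, effective 1/2/2006.
apply_type2_change(customer_dim, 31421, {"CommDist": 31},
                   effective=date(2006, 1, 2), new_key=2387)

print(key_on(customer_dim, 31421, date(2005, 5, 7)))   # 1552
print(key_on(customer_dim, 31421, date(2006, 2, 21)))  # 2387
```

Facts dated before the change keep pointing at key 1552, while the 2/21/2006 fact picks up key 2387, which matches the fact table on the slide.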
-
Date Dimensions
One row for every day for which you expect to
have data for the fact table (perhaps generated
in a spreadsheet and imported)
Usually use a meaningful integer surrogate key
(such as yyyymmdd 20060926 for Sep. 26,
2006). Note: this order sorts correctly.
Include rows for missing or future dates to be added later.
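
Instead of a spreadsheet, the rows could also be generated with a short script. The sketch below assumes the column names used in the GB Video date dimensions (SQLDate, Day, Week, Quarter, Holiday). The function name, the ISO week numbering and the holiday set are assumptions.

```python
from datetime import date, timedelta


def build_date_dimension(start, end, holidays=frozenset()):
    """Return one row per calendar day from start to end, inclusive."""
    rows = []
    day = start
    while day <= end:
        rows.append({
            # Meaningful integer key in yyyymmdd form; sorts chronologically.
            "DateKey": day.year * 10000 + day.month * 100 + day.day,
            "SQLDate": day.isoformat(),
            "Day": day.strftime("%A"),
            "Week": day.isocalendar()[1],       # ISO week number (assumption)
            "Quarter": (day.month - 1) // 3 + 1,
            "Holiday": day in holidays,
        })
        day += timedelta(days=1)
    return rows


date_dim = build_date_dimension(date(2006, 1, 1), date(2006, 12, 31),
                                holidays={date(2006, 12, 25)})

sep_26 = next(r for r in date_dim if r["SQLDate"] == "2006-09-26")
print(len(date_dim), sep_26["DateKey"], sep_26["Quarter"])  # 365 20060926 3
```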
-
Degenerate Dimensions
Dimensions without attributes. (Such as
a transaction number or order number.)
Put the attribute value into the fact table even though it is not an additive fact.
-
Snowflaking (Outrigger Dimensions or Reference Dimensions)
Connects entities to dimension tables rather than the fact table
Complicates coding and requires additional processing for retrievals
Makes type 2 slowly changing dimensions harder to maintain
Useful for seldom used lookups
-
M:N Multivalued Dimensions
Fact to Dimension
Dimension to Dimension
Try to avoid these. Solutions can be
very misleading.
-
Multivalued Dimensions
  ORDERS (FACT): SalesRepKey, ProductKey, SalesRepGrpKey, CustomerKey, OrderQty
  SALESREP: SalesRepKey, Name, Address
  SALESREP-ORDER-BRIDGE: SalesRepKey, SalesRepGroupKey, Weight = (1/NumReps)
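
A small sketch of how the bridge table's weight might be used, and of why results can mislead without it. It assumes each order points at a sales rep group and that the bridge lists the group's members with Weight = 1/NumReps. The sample groups and quantities are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction

# Hypothetical groups: SalesRepGrpKey -> member SalesRepKeys.
groups = {10: [1, 2], 11: [3]}

# Bridge rows: one per (group, rep), weighted so each group's weights sum to 1.
bridge = [
    {"SalesRepGrpKey": grp, "SalesRepKey": rep, "Weight": Fraction(1, len(reps))}
    for grp, reps in groups.items()
    for rep in reps
]

orders = [
    {"SalesRepGrpKey": 10, "OrderQty": 8},
    {"SalesRepGrpKey": 11, "OrderQty": 5},
]


def qty_by_rep(orders, bridge, weighted=True):
    """Allocate each order's quantity to the reps in its group."""
    totals = defaultdict(Fraction)
    for order in orders:
        for link in bridge:
            if link["SalesRepGrpKey"] == order["SalesRepGrpKey"]:
                factor = link["Weight"] if weighted else 1
                totals[link["SalesRepKey"]] += order["OrderQty"] * factor
    return dict(totals)


weighted = qty_by_rep(orders, bridge)
unweighted = qty_by_rep(orders, bridge, weighted=False)

print(sum(weighted.values()))    # 13, matches the total ordered quantity
print(sum(unweighted.values()))  # 21, the shared order is counted twice
```

The unweighted totals credit each rep with the full order, so summing them overstates the quantity actually ordered.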
-
Hierarchies
Group data within dimensions:
  SalesRep
  Region
  State
  County
  Neighborhood
Problem structures
  Variable depth
  Frequently changing
-
Heterogeneous Products
Several different kinds of entry with
different attributes for each
(The sub-class problem)
-
Aggregate Dimensions
Dimensions that represent data at different levels of granularity
  Remove a dimension
  Roll up the hierarchy (provide a new shrunken dimension with a new surrogate key that represents the rolled-up data)
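
Rolling up a hierarchy can be sketched like this: a shrunken dimension is derived from one attribute of the detailed dimension, given its own surrogate keys, and the facts are re-keyed and summed at the coarser grain. The title data, the Category attribute and the function names are hypothetical.

```python
from collections import defaultdict

# Detailed dimension (hypothetical): TitleKey -> attributes.
title_dim = {
    101: {"Name": "Casablanca", "Category": "Drama"},
    102: {"Name": "Vertigo", "Category": "Thriller"},
    103: {"Name": "Rebecca", "Category": "Drama"},
}

# Detailed facts: (TitleKey, RentalCount).
rental_facts = [(101, 2), (102, 1), (103, 4), (101, 1)]


def build_shrunken_dimension(dim, attribute):
    """Return (shrunken dimension, detailed key -> shrunken key map)."""
    shrunken, key_map, key_by_value = {}, {}, {}
    for detail_key, row in sorted(dim.items()):
        value = row[attribute]
        if value not in key_by_value:
            key_by_value[value] = len(key_by_value) + 1  # new surrogate key
            shrunken[key_by_value[value]] = {attribute: value}
        key_map[detail_key] = key_by_value[value]
    return shrunken, key_map


def roll_up(facts, key_map):
    """Sum the facts at the grain of the shrunken dimension."""
    totals = defaultdict(int)
    for detail_key, count in facts:
        totals[key_map[detail_key]] += count
    return dict(totals)


category_dim, title_to_category = build_shrunken_dimension(title_dim, "Category")
category_facts = roll_up(rental_facts, title_to_category)

print(category_dim)    # {1: {'Category': 'Drama'}, 2: {'Category': 'Thriller'}}
print(category_facts)  # {1: 7, 2: 1}
```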
-
Junk Dimensions
Miscellaneous attributes that don't belong to another entity, usually representing processing levels
  Flags
  Categories
  Types
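
One common way to populate a junk dimension is to enumerate every combination of the miscellaneous attributes once and give each combination a surrogate key, so the fact table carries a single key instead of several flag columns. The sketch below takes that approach. The attribute names and their values are hypothetical.

```python
from itertools import product

# Hypothetical low-cardinality attributes with no natural home.
flag_domains = {
    "PayType": ["Cash", "Credit"],
    "IsWeekend": [False, True],
    "RentalType": ["New release", "Catalog"],
}


def build_junk_dimension(domains):
    """Return surrogate key -> one combination of attribute values."""
    names = list(domains)
    return {
        key: dict(zip(names, combo))
        for key, combo in enumerate(product(*domains.values()), start=1)
    }


junk_dim = build_junk_dimension(flag_domains)

# Reverse lookup used at load time: attribute values -> junk key.
junk_key = {tuple(row.values()): key for key, row in junk_dim.items()}

print(len(junk_dim))                         # 8 combinations
print(junk_key[("Credit", True, "Catalog")])  # 8
```

When the attribute domains are large, generating only the combinations that actually occur in the source data may be preferable to the full cross product.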
-
Fact Tables
Transaction
  Track processes at discrete points in time when they occur
Periodic snapshot
  Cumulative performance over specific time intervals
Accumulating snapshot
  Constantly updated over time. May include multiple dates representing stages.
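
The difference between a transaction fact table and an accumulating snapshot can be sketched with a video rental that passes through two stages. Each event appends a new transaction row, while the snapshot keeps one row per rental and fills in its stage dates over time. The stage names, the DaysOut measure and the in-memory structures are assumptions. A periodic snapshot is not shown.

```python
from datetime import date

transactions = []  # transaction grain: one row per event, never updated
snapshots = {}     # accumulating snapshot: one row per rental, updated in place


def record_event(rental_no, stage, day):
    """Log the event and update the rental's accumulating snapshot row."""
    transactions.append({"RentalNo": rental_no, "Stage": stage, "Date": day})

    row = snapshots.setdefault(
        rental_no, {"RentalDate": None, "ReturnDate": None, "DaysOut": None})
    row[stage + "Date"] = day  # stage is "Rental" or "Return"
    if row["RentalDate"] and row["ReturnDate"]:
        row["DaysOut"] = (row["ReturnDate"] - row["RentalDate"]).days


record_event(1, "Rental", date(2006, 9, 26))
record_event(2, "Rental", date(2006, 9, 27))
record_event(1, "Return", date(2006, 9, 29))

print(len(transactions))        # 3 event rows
print(snapshots[1]["DaysOut"])  # 3
print(snapshots[2])             # return stage still empty
```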
-
Aggregates
Precalculated summary tables
Improve performance
Record data at a coarser granularity