Dimensional Fact Model

Stefano Cazzella (@StefanoCazzella)
http://caccio.blogdns.net
http://bimodeler.com
stefano.cazzella{at}gmail.com

BI ACADEMY Launch@Germany - Stuttgart, 26/11/2014


Complexity in SE and IS development


The art of programming is the art of organizing complexity, of mastering multitude and avoiding its bastard chaos as effectively as possible.

– Edsger Dijkstra, “Notes on Structured Programming”


Project Layers

Business
• User requirements
• Conceptual model

Design
• Technical choices
• Logical model

Build
• Technology
• Physical model


Civil Engineering Example

Business

What the client wants

Design

The technical blueprint

Build

The desired building


Model-driven engineering

PIM (platform-independent model)
• Business centric
• No technical details

→ Model transformation

PSM (platform-specific model)
• Technical design
• System architecture

→ Model transformation

Build
• Technical deliverables
• System realization


Project Layers for Data Mart

Business
• DFM

Design
• Relational model

Build
• DBMS-specific DDL


Why Dimensional Fact Model?

1. Formal language → well-specified syntax and an unequivocal interpretation (semantics) based on a sound algebraic definition

2. Simple and effective graphical notation (representation)

3. Specifically defined to represent multi-dimensional models

4. Does not imply any technical/implementation choice


DFM Notation Compendium


Data Mart building process

Business user's needs
→ Requirements definition →
Multidimensional data model (Dimensional Fact Model)
→ Model transformation →
Logical data model (relational model: tables, columns, etc.)
→ Model transformation →
Physical data model (DDL with indexes, partitions, etc.)
→ Deployment →
Data Mart

Inputs combined with the models during the transformations:
• Technical specifications
• Implementation strategy

Formalize the user's needs in a conceptual (business-centric) model, then transform it into a logical model integrating the technical specifications, and transform it again into a physical model that realizes the business requirements.


Business - From requirement to DFM

• Context: weblog analytics, i.e. the analysis of visits to several websites belonging to different domains (e.g. Google Analytics)

• Requirement: monitor and analyze the number of visits and their monthly and daily average duration for each page of the websites, or for each domain, broken down by the geographic region of the visitors' IP addresses.

The fact schema is complemented by:
✓ Domain definition
✓ Aggregation rules
✓ Optional dependencies
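As a rough illustration of what the requirement above could look like once formalized, the following Python sketch encodes a fact schema with measures, hierarchies, domains, aggregation rules and an optional dependency. It is not the DFM graphical notation and not the BI Modeler format: the class names, the attribute names (Domain, Month, Region), the domain labels and the choice of which dependency is optional are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Attribute:
    """A dimensional attribute; rolls_up_to points to the next coarser level."""
    name: str
    domain: str
    rolls_up_to: Optional["Attribute"] = None
    optional: bool = False  # True marks an optional dependency


@dataclass
class Measure:
    name: str
    domain: str
    aggregations: List[str]  # aggregation rules allowed for this measure


@dataclass
class FactSchema:
    name: str
    measures: List[Measure]
    dimensions: List[Attribute]

    def hierarchy(self, dimension: str) -> List[str]:
        """List attribute names from the finest level up to the coarsest."""
        attr = next(d for d in self.dimensions if d.name == dimension)
        path = []
        while attr is not None:
            path.append(attr.name)
            attr = attr.rolls_up_to
        return path


# Hypothetical encoding of the weblog requirement (names are assumptions).
page = Attribute("Page", "Text", rolls_up_to=Attribute("Domain", "Text"))
day = Attribute("Day", "Date", rolls_up_to=Attribute("Month", "Text"))
viewer = Attribute("Viewer", "Text",
                   rolls_up_to=Attribute("Region", "Text", optional=True))

visits = FactSchema(
    name="Visits",
    measures=[Measure("Number of visits", "Number", ["SUM"]),
              Measure("Duration", "Number", ["AVG"])],
    dimensions=[page, day, viewer],
)

print(visits.hierarchy("Page"))
```

Keeping the hierarchy as a chain of attributes keeps the sketch free of any table or key decision, which matches the idea that the conceptual model implies no implementation choice.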


Design choices

Reference ROLAP model:
• Star schema (denormalized dimension tables)
• Snowflake (hierarchies implemented by tables in 3NF)

Hierarchy implementation strategy (for every dimension):
• Use natural key (the dimension attribute → PK column)
• Use surrogate key (add a new column with no business meaning)
• Use slowly changing dimension (SCD) of type 2
• Use implicit dimension (no dimension table, only a column in the fact table)

Domain ↔ data type association:
• Text → VARCHAR(250); Currency → NUMBER(9,2); etc.

Standard naming conventions and abbreviations:
• Table name prefix (D for Dimensions, F for Facts); Number → NBR; etc.
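The domain/data type association and the naming conventions listed above are essentially lookup tables applied during the transformation. A minimal Python sketch of that idea follows; the two type mappings, the D/F prefixes and the NBR abbreviation come from the slide, while the function names, the underscore separator and the upper-casing are assumptions.

```python
# Domain -> data type association (values taken from the slide).
DOMAIN_TYPES = {"Text": "VARCHAR(250)", "Currency": "NUMBER(9,2)"}

# Naming conventions: table prefixes and standard abbreviations.
PREFIXES = {"dimension": "D", "fact": "F"}
ABBREVIATIONS = {"NUMBER": "NBR"}


def physical_name(business_name: str) -> str:
    """Upper-case the business name, abbreviate known words, join with '_'."""
    words = business_name.upper().split()
    return "_".join(ABBREVIATIONS.get(word, word) for word in words)


def table_name(kind: str, business_name: str) -> str:
    """Prefix the physical name with D (dimension) or F (fact)."""
    return f"{PREFIXES[kind]}_{physical_name(business_name)}"


def column_type(domain: str) -> str:
    """Look up the data type associated with a domain."""
    return DOMAIN_TYPES[domain]


print(table_name("fact", "Visits"))       # F_VISITS
print(physical_name("Number of visits"))  # NBR_OF_VISITS
print(column_type("Currency"))            # NUMBER(9,2)
```

Because the mappings are plain data, switching to another target type system only means replacing the dictionaries, not the model.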


Transform the DFM into a relational model

Model transformation

Technical design choices:
• Reference ROLAP model → star schema
• Hierarchy Viewer → use surrogate key
• Hierarchy Page → SCD type 2

Slide callouts: fact grain; surrogate key; SCD-2 start date and end date
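To make the effect of these design choices concrete, here is a small Python sketch of how a hierarchy could be turned into a denormalized dimension table of a star schema, with a surrogate key and, for SCD type 2, validity dates. It only illustrates the idea and is not the transformation implemented by any tool: the function names, the `_SK` suffix, the attribute names (IP, REGION, DOMAIN) and treating Day as an implicit dimension are assumptions.

```python
from typing import List, Tuple

Table = Tuple[str, List[str]]  # (table name, column names)


def dimension_table(name: str, attributes: List[str], strategy: str) -> Table:
    """Build one denormalized dimension table for a hierarchy."""
    columns: List[str] = []
    if strategy in ("surrogate_key", "scd2"):
        # SCD type 2 keeps several versions of the same business key,
        # so it also needs a surrogate key to identify each version.
        columns.append(f"{name}_SK")
    columns.extend(attributes)  # star schema: all levels in one table
    if strategy == "scd2":
        columns.extend(["START_DATE", "END_DATE"])  # validity interval
    return f"D_{name}", columns


def fact_table(name: str, dimensions: List[Table],
               implicit_dimensions: List[str], measures: List[str]) -> Table:
    """The fact table references the first (key) column of every dimension."""
    foreign_keys = [columns[0] for _, columns in dimensions]
    return f"F_{name}", foreign_keys + implicit_dimensions + measures


viewer = dimension_table("VIEWER", ["IP", "REGION"], "surrogate_key")
page = dimension_table("PAGE", ["PAGE", "DOMAIN"], "scd2")
visits = fact_table("VISITS", [viewer, page], ["DAY"], ["DURATION"])

for table, columns in (viewer, page, visits):
    print(table, columns)
```

The combination of foreign keys and implicit dimensions in the fact table is what defines the fact grain in this sketch.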


Build choices

Choose the DBMS:
• SQL Server, Oracle, Hive / Hadoop

Generate constraints?
• Generate unique keys / primary keys / integrity constraints (foreign keys)

Add specific indexes:
• Add clustered indexes / columnstore indexes / bitmap indexes / etc.

Define table partitions:
• Organize fact tables in partitions (by hash, value, range, etc.)

Distribute data over multiple volumes:
• Define filegroups / tablespaces for tables, partitions, indexes


Physical model and DDL (1)

Implementation choices & best practices:
• DBMS → SQL Server
• Fact F_VISITS partitioned by year
• Columnstore index on day and duration
• 2 distinct filegroups for tables and indexes

Slide callouts: partition scheme and functions; columnstore index; filegroups
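The DDL shown on the slide did not survive in this transcript. As a hedged illustration of what DDL generated for these choices might look like, the Python sketch below assembles T-SQL statements as strings. Only F_VISITS and the four choices come from the slide; the column names (YEAR_NBR, DAY_DT, DURATION), the filegroup names (FG_TABLES, FG_INDEXES) and the object names are hypothetical, and the statements have not been run against a server. Using two partition schemes on the same partition function is one possible way to keep the index partition-aligned while storing it in a separate filegroup.

```python
from typing import List


def sqlserver_ddl(fact: str, years: List[int]) -> List[str]:
    """Build illustrative T-SQL for a fact table partitioned by year."""
    boundaries = ", ".join(str(year) for year in years)
    function = f"PF_{fact}_YEAR"
    return [
        # RANGE RIGHT: each boundary value is the first value of a partition.
        f"CREATE PARTITION FUNCTION {function} (INT) "
        f"AS RANGE RIGHT FOR VALUES ({boundaries});",
        # Two schemes on the same function: one maps the partitions to the
        # table filegroup, the other maps them to the index filegroup.
        f"CREATE PARTITION SCHEME PS_{fact}_DATA "
        f"AS PARTITION {function} ALL TO ([FG_TABLES]);",
        f"CREATE PARTITION SCHEME PS_{fact}_IDX "
        f"AS PARTITION {function} ALL TO ([FG_INDEXES]);",
        # Hypothetical columns, reduced to what the example needs.
        f"CREATE TABLE {fact} (YEAR_NBR INT NOT NULL, DAY_DT DATE NOT NULL, "
        f"DURATION INT NOT NULL) ON PS_{fact}_DATA (YEAR_NBR);",
        # Columnstore index on day and duration, stored via the index scheme.
        f"CREATE NONCLUSTERED COLUMNSTORE INDEX CX_{fact} "
        f"ON {fact} (DAY_DT, DURATION) ON PS_{fact}_IDX (YEAR_NBR);",
    ]


ddl = sqlserver_ddl("F_VISITS", [2013, 2014])
print("\n".join(ddl))
```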


Physical model and DDL (2)

Implementation choices & best practices:
• DBMS → Oracle
• Fact F_VISITS partitioned by year
• Bitmap index on viewer dimension
• 2 distinct tablespaces for tables and indexes

Slide callouts: table partitions; bitmap index; tablespaces
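The same logical model leads to different DDL on Oracle. The Python sketch below assembles an illustrative version of it as strings; again only F_VISITS and the listed choices come from the slide, while the column names (YEAR_NBR, VIEWER_SK, DURATION), the tablespace names (TS_TABLES, TS_INDEXES) and the index name are hypothetical, and the statements have not been run against a database. The bitmap index is declared LOCAL because Oracle requires bitmap indexes on partitioned tables to be local.

```python
from typing import List


def oracle_ddl(fact: str, years: List[int]) -> List[str]:
    """Build illustrative Oracle DDL for a fact table partitioned by year."""
    # One range partition per year: P2013 holds values below 2014, and so on.
    partitions = ",\n".join(
        f"  PARTITION P{year} VALUES LESS THAN ({year + 1})" for year in years
    )
    create_table = (
        f"CREATE TABLE {fact} (\n"
        "  YEAR_NBR NUMBER(4) NOT NULL,\n"
        "  VIEWER_SK NUMBER NOT NULL,\n"
        "  DURATION NUMBER\n"
        ") TABLESPACE TS_TABLES\n"
        "PARTITION BY RANGE (YEAR_NBR) (\n"
        f"{partitions}\n"
        ");"
    )
    # Bitmap index on the viewer key, stored in the index tablespace.
    create_index = (
        f"CREATE BITMAP INDEX BX_{fact}_VIEWER ON {fact} (VIEWER_SK) "
        "LOCAL TABLESPACE TS_INDEXES;"
    )
    return [create_table, create_index]


ddl = oracle_ddl("F_VISITS", [2013, 2014])
print("\n\n".join(ddl))
```

Comparing this output with the SQL Server variant shows why the physical layer is kept separate: partitions, indexes and storage are expressed with DBMS-specific constructs, while the relational model stays the same.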


BI Modeler

• In order to apply a model-driven approach, BI project teams need a software tool to:
✓ Manage (draw) all the models: DFM, relational, etc.
✓ Support (and drive) the model transformation process

• There were (and still are) not many tools able to do that, so in 2006 I started working on the development of …

http://bimodeler.com


DEMO

Create a DFM about SALES from scratch
• Define the fact schema and its measures
• Add some dimensions / hierarchies
• Define and associate domains to attributes and measures

Transform a DFM into a relational data model
• Define an implementation strategy for hierarchies
• Associate data types to domains
• Apply a naming convention

Add physical properties to the relational model
• Choose a DBMS
• Create partitions
• Create indexes

Generate DDL