Automation of MultiDimensional DB Design (poster)

Project Goal

Full-featured solution for multidimensional database design

19TH ACM CONFERENCE ON MANAGEMENT OF DATA, COMAD@AHMEDABAD, INDIA, 2013

References

E. F. Codd, S. B. Codd, and C. T. Salley. Providing OLAP to user-analysts: An IT mandate. 1993.

Alfredo Cuzzocrea and Rim Moussa. Multidimensional Database Design via Schema Transformation: Turning TPC-H into the TPC-H*d Multidimensional Benchmark. COMAD, 2013.

Further Information

https://sites.google.com/site/rimmoussa/auto_multidimensional_dbs

AUTO-MDB: A FRAMEWORK FOR AUTOMATED MULTIDIMENSIONAL DATABASE DESIGN VIA SCHEMA TRANSFORMATION

ALFREDO CUZZOCREA, ICAR-CNR & UNIV. OF CALABRIA, ITALY

RIM MOUSSA, HEJER AKAICHI, LATICE LAB., UNIV. OF TUNIS & ESTI, UNIV. OF CARTHAGE, TUNISIA

Auto-MDB Framework

Simple rules for turning business queries into OLAP cubes:

Measures definition

Fact table definition

Dimensions definition

Example: turning business query Q8 of the TPC-H benchmark into an OLAP cube
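The poster names three rule families (measures, fact table, dimensions) without spelling them out. As a minimal sketch, assuming the simplest plausible reading (aggregated expressions become measures, the table owning the aggregated columns becomes the fact table, and grouping or filtering attributes are grouped into dimensions by owning table), the mapping for a simplified Q8 could look like the following. The `BusinessQuery` and `CubeDesign` structures, the `derive_cube` function, and the three rules as coded are illustrative assumptions, not the Auto-MDB implementation.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class BusinessQuery:
    """Hand-written summary of a SQL business query (hypothetical structure)."""
    name: str
    aggregates: dict[str, str]      # measure name -> aggregated SQL expression
    aggregate_columns: list[str]    # columns used inside the aggregates
    analysis_columns: list[str]     # GROUP BY and filter columns
    column_owner: dict[str, str]    # column -> owning table


@dataclass
class CubeDesign:
    name: str
    fact_table: str
    measures: dict[str, str]
    dimensions: dict[str, list[str]] = field(default_factory=dict)


def derive_cube(query: BusinessQuery) -> CubeDesign:
    # Assumed rule 1 (measures): every aggregated expression becomes a measure.
    measures = dict(query.aggregates)

    # Assumed rule 2 (fact table): the single table owning the aggregated columns.
    owners = {query.column_owner[c] for c in query.aggregate_columns}
    if len(owners) != 1:
        raise ValueError("sketch only handles aggregates over a single table")
    fact_table = owners.pop()

    # Assumed rule 3 (dimensions): analysis columns grouped by owning table.
    dimensions: dict[str, list[str]] = {}
    for column in query.analysis_columns:
        dimensions.setdefault(query.column_owner[column], []).append(column)

    # Cube Ck is named after query Qk, following the poster's C8/Q8 pairing.
    return CubeDesign(f"C{query.name[1:]}", fact_table, measures, dimensions)


# Simplified reading of TPC-H Q8 (national market share); joins are omitted.
q8 = BusinessQuery(
    name="Q8",
    aggregates={"volume": "SUM(l_extendedprice * (1 - l_discount))"},
    aggregate_columns=["l_extendedprice", "l_discount"],
    analysis_columns=["o_orderdate", "n_name", "r_name", "p_type"],
    column_owner={
        "l_extendedprice": "lineitem", "l_discount": "lineitem",
        "o_orderdate": "orders", "n_name": "nation",
        "r_name": "region", "p_type": "part",
    },
)
cube = derive_cube(q8)
print(cube.name, cube.fact_table, sorted(cube.dimensions))
```

Under these assumptions the fact table is `lineitem` and the dimensions come from `orders`, `nation`, `region`, and `part`; a real design would also need hierarchies (for example year within order date) and the join paths between tables.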

Motivations

1. Advantages of On-Line Analytical Processing (OLAP):

Presentation: visual OLAP, user interaction

Ease of maintenance: data is stored as it is viewed

Performance: aggregated data computation

2. The BI market is booming: according to research from market watchers such as Pringle & Company and Gartner, the market for BI platforms will remain one of the fastest-growing software markets in most regions.

3. The MDB design milestone is often neglected: OLAP cubes are defined in a haphazard way, without regard for performance and maintenance cost.

Questions of Developers of BI Solutions

How to define cubes? Will there be a single cube or multiple cubes?

Which optimizations are the most suitable for running the workload?

Data fragmentation and parallel OLAP?

Derived data (aggregate tables, indexes, derived attributes, data synopses)?
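To make the derived-data question concrete, here is a minimal sketch of an aggregate table, assuming a toy fact table in an in-memory SQLite database. The `sales` schema, its rows, and the `agg_sales_year_nation` name are invented for illustration and come from neither TPC-H nor the poster.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE sales (order_year INTEGER, nation TEXT, volume REAL);
    INSERT INTO sales VALUES
        (1995, 'BRAZIL', 100.0), (1995, 'BRAZIL', 50.0),
        (1995, 'CANADA', 30.0),  (1996, 'BRAZIL', 70.0);
    -- Derived data: pre-aggregate once at the (year, nation) grain.
    CREATE TABLE agg_sales_year_nation AS
        SELECT order_year, nation, SUM(volume) AS volume
        FROM sales
        GROUP BY order_year, nation;
""")


def volume_by_year(table: str) -> list:
    # The table name is interpolated only because it is fixed in this script.
    query = (
        f"SELECT order_year, SUM(volume) FROM {table} "
        "GROUP BY order_year ORDER BY order_year"
    )
    return conn.execute(query).fetchall()


from_fact = volume_by_year("sales")                       # scans 4 fact rows
from_aggregate = volume_by_year("agg_sales_year_nation")  # scans 3 aggregate rows
print(from_fact == from_aggregate)
```

Both queries return the same yearly totals because SUM can be rolled up from the finer (year, nation) grain; the trade-off the poster's question points at is the extra storage and refresh cost of the aggregate table against the smaller scan.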

TPC-H*d Benchmark

A truly OLAP variant of the TPC-H benchmark, the most prominent decision support system benchmark.

The TPC-H SQL workload is translated into MDX (MultiDimensional eXpressions).

The workload is composed of 23 MDX statements for OLAP cubes and 23 MDX statements for OLAP business queries.

[Figure: screenshots of the C8 and Q8 pivot tables and the corresponding MDX statements]
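As an illustration of what an SQL-to-MDX translation targets, the sketch below assembles a basic two-axis MDX SELECT statement from a cube name, a measure, and a row level. The `build_mdx` helper and the measure, dimension, and member names are hypothetical placeholders, not the identifiers used in TPC-H*d; only the cube name C8 is taken from the poster.

```python
from typing import Optional, Sequence


def bracket(*parts: str) -> str:
    """Join identifier parts in MDX bracket notation, e.g. [Measures].[Volume]."""
    return ".".join(f"[{part}]" for part in parts)


def build_mdx(
    cube: str,
    measure: str,
    row_level: Sequence[str],
    slicer: Optional[Sequence[str]] = None,
) -> str:
    # Measure on the column axis, all members of one level on the row axis.
    lines = [
        f"SELECT {{{bracket('Measures', measure)}}} ON COLUMNS,",
        f"       {{{bracket(*row_level)}.Members}} ON ROWS",
        f"FROM {bracket(cube)}",
    ]
    if slicer:
        # Optional slicer restricting the cube to a single member.
        lines.append(f"WHERE ({bracket(*slicer)})")
    return "\n".join(lines)


mdx = build_mdx(
    cube="C8",
    measure="Volume",                  # hypothetical measure name
    row_level=("Order Date", "Year"),  # hypothetical dimension and level
    slicer=("Region", "AMERICA"),      # hypothetical dimension and member
)
print(mdx)
```

The generated text follows the standard `SELECT ... ON COLUMNS, ... ON ROWS FROM [cube] WHERE (...)` shape; calculated members, which a market-share ratio such as Q8's would likely need, are left out of this sketch.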

Future Work

Application to the TPC-DS benchmark

Advanced virtual cube design

Investigation of more derived data strategies, such as data synopsis computation