Cubes 1.0 Overview

Cubes 1.0 Overview: light data warehouse and conceptual modelling. Štefan Urbánek, @Stiivi, November 2014

description

Overview of the new features in Cubes 1.0, a lightweight Python OLAP framework and pluggable data warehouse. Video: https://www.youtube.com/watch?v=-FDTK80zsXc GitHub sources: https://github.com/databrewery/cubes

Transcript of Cubes 1.0 Overview

Page 1: Cubes 1.0 Overview

Cubes 1.0 Overview

light data warehouse and conceptual modelling

Štefan Urbánek, @Stiivi, November 2014

Page 2: Cubes 1.0 Overview

understanding through metadata

Page 3: Cubes 1.0 Overview

datamodel

reporting apps / modules

metadata

Page 4: Cubes 1.0 Overview

logical

physical

Page 5: Cubes 1.0 Overview

Categorical Data

∑ =

Page 6: Cubes 1.0 Overview

OLAP (online analytical processing)

lightweight framework for

conceptual modelling and analytics

Page 7: Cubes 1.0 Overview

Original Cubes

before 1.0

Page 8: Cubes 1.0 Overview

Workspace (process or server): 1 × store, 1 × model

Page 9: Cubes 1.0 Overview

We needed more!

Page 10: Cubes 1.0 Overview

Models: file, API, database

multiple model parts, different sources

Stores: Postgres, Mongo, API

multiple data sources, heterogeneous

Page 11: Cubes 1.0 Overview

Cubes 1.0

Page 12: Cubes 1.0 Overview

Python ≥ 3.4

works with ≥ 2.7 too, for the “two” series

Page 13: Cubes 1.0 Overview

■ analytical workspace

■ model providers

■ new and improved backends

■ better extensibility

■ authorisation

Page 14: Cubes 1.0 Overview

Analytical Workspace

Page 15: Cubes 1.0 Overview

Cubes

Model Providers

Stores

sales, churn, events, activations

Static Model Provider

API Model Provider

BI Data (Postgres)

BI Data 2 (Mongo)

Events (API)

Page 16: Cubes 1.0 Overview

Workspace

Cubes

Model Providers

Stores

sales, churn, events, activations

Static Model Provider

BI Data (Postgres)

BI Data 2 (Mongo)

crm sales events

[workspace]
models_path: /var/lib/cubes/models

[models]
crm: crm.cubesmodel
sales: sales.cubesmodel
events: events.cubesmodel

[store crm]
type: sql
url: postgresql://localhost/crm

[store events]
type: mongo
host: localhost
collection: events
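The workspace configuration above is an INI-style file, so it can be inspected with Python's standard `configparser`. The following is a minimal sketch only: the helper name `read_workspace_config` and the way stores are collected into a dictionary are illustrative assumptions, not how Cubes itself loads the file.

```python
import configparser

# The configuration from the slide, one option per line.
SLICER_INI = """
[workspace]
models_path: /var/lib/cubes/models

[models]
crm: crm.cubesmodel
sales: sales.cubesmodel
events: events.cubesmodel

[store crm]
type: sql
url: postgresql://localhost/crm

[store events]
type: mongo
host: localhost
collection: events
"""


def read_workspace_config(text):
    """Return (models, stores) parsed from an INI-style workspace config.

    Hypothetical helper: every "[store NAME]" section becomes an entry
    in the `stores` dictionary keyed by NAME.
    """
    parser = configparser.ConfigParser()
    parser.read_string(text)
    models = dict(parser["models"])
    stores = {}
    for section in parser.sections():
        if section.startswith("store "):
            stores[section[len("store "):]] = dict(parser[section])
    return models, stores


models, stores = read_workspace_config(SLICER_INI)
print(sorted(models))
print(stores["crm"]["url"])
```

Each model and each store gets its own name, which is what lets one workspace combine several model parts with several heterogeneous data sources.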

Page 17: Cubes 1.0 Overview

BYOB

bring your own backend

Slicer

Page 18: Cubes 1.0 Overview

Backend

Page 19: Cubes 1.0 Overview

Browser

Store

Provider

Page 20: Cubes 1.0 Overview

Logical Physical

physical data store (database or API)

Browser

Store

Provider

∑ aggregate

connect

create model

model

cubes

dimensions

model

backend objects

Page 21: Cubes 1.0 Overview

Model Provider

model

cubes

dimensions

Page 22: Cubes 1.0 Overview
Page 23: Cubes 1.0 Overview

Model Provider

■ metadata on-the-fly

■ local or external source

■ might be linked to a store

model

cubes

dimensions

Page 24: Cubes 1.0 Overview

Model

required

automatic

automatic

automatic

required

Slicer cube dimension

key/attribute

property

column (table)

Dimensions

dimension

Cubes / Facts

metric

table

collection

event

Google Analytics

Mixpanel

MongoDB

SQL

Backend

Page 25: Cubes 1.0 Overview

Model Improvements

Page 26: Cubes 1.0 Overview

Model

■ measures → aggregates

■ more front-end metadata: cube categories, dimension role and cardinality

■ customised dimension linking

Page 27: Cubes 1.0 Overview
Page 28: Cubes 1.0 Overview

"measures": [
    { "name": "amount", "label": "Sales Amount" },
    { "name": "vat", "label": "VAT" }
]

"aggregates": [
    { "name": "total_sales", "label": "Total Sales Amount", "measure": "amount", "function": "sum" },
    { "name": "total_vat", "label": "Total VAT", "measure": "vat", "function": "sum" },
    { "name": "item_count", "label": "Item Count", "function": "count" }
]

Page 29: Cubes 1.0 Overview

Aggregates

■ custom name

■ can refer to other aggregates (post-aggregation calculations)

■ functions are backend-specific; SQL aggregations: sum, count, count_nonempty, count_distinct, min, max, avg, stddev, variance, …
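To make the split between measures and aggregates concrete, here is a minimal pure-Python sketch that applies aggregate definitions like the ones above to a list of in-memory facts. It is an illustration under stated assumptions: the sample facts, the `FUNCTIONS` table and the `aggregate` helper are hypothetical, and a real backend would translate the definitions into its own query language instead.

```python
import statistics

# Hypothetical sample facts; each fact carries the two measures.
facts = [
    {"amount": 100.0, "vat": 20.0},
    {"amount": 250.0, "vat": 50.0},
    {"amount": 50.0, "vat": 10.0},
]

# Aggregate definitions in the shape shown on the slide (labels omitted).
aggregates = [
    {"name": "total_sales", "measure": "amount", "function": "sum"},
    {"name": "total_vat", "measure": "vat", "function": "sum"},
    {"name": "item_count", "function": "count"},
]

# Assumed mapping of function names to Python callables for this sketch.
FUNCTIONS = {
    "sum": sum,
    "min": min,
    "max": max,
    "avg": statistics.mean,
}


def aggregate(facts, aggregates):
    """Compute every aggregate over all facts and return a summary dict."""
    result = {}
    for agg in aggregates:
        if agg["function"] == "count":
            # "count" needs no measure: it counts the facts themselves.
            result[agg["name"]] = len(facts)
        else:
            values = [fact[agg["measure"]] for fact in facts]
            result[agg["name"]] = FUNCTIONS[agg["function"]](values)
    return result


summary = aggregate(facts, aggregates)
print(summary)
```

The aggregate's custom name, not the measure name, is what ends up in the result.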

Page 30: Cubes 1.0 Overview

Contextual Dimensions

{
    "measures": [ … ],
    "dimensions": [
        {"name": "date", "hierarchies": ["ym", "yqm"]},
        {"name": "date", "alias": "contract_date"}
    ],
    …
}

customisable linking properties: alias, hierarchies, exclude_hierarchies, default_hierarchy_name, cardinality, nonadditive

Page 31: Cubes 1.0 Overview

Dimension Roles

dimension.role: time

level.role: year, month, day, …

hint for reporting applications or backends

Page 32: Cubes 1.0 Overview

Cardinality

dimension.cardinality

tiny < low < medium < high

level.cardinality


overload precautions

Page 33: Cubes 1.0 Overview

Browser

Page 34: Cubes 1.0 Overview

Browser

■ uses logical model

■ implements aggregation

■ builds queries

■ retrieves data

Logical Physical

physical data store (database or API)

Browser

Store

∑ aggregate

model

Page 35: Cubes 1.0 Overview

Browser Methods

■ features()

■ aggregate(cell, drilldown,…)

■ members(cell, dimension, …)

■ facts(cell, …)

■ fact(id)

■ cell_details(cell, drilldown, …)

Page 36: Cubes 1.0 Overview

Split Cell

__within_split__ (generated dimension): True, False

aggregate(split=cell)
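A minimal sketch of what a split does, in plain Python: every fact is tagged by whether it falls inside the split cell, and the aggregation is grouped by that generated `__within_split__` value. The cell is simplified here to a dictionary of attribute values, and `aggregate_with_split` and the sample facts are hypothetical names used only for illustration.

```python
from collections import defaultdict

# Hypothetical sample facts.
facts = [
    {"status": 1, "amount": 10},
    {"status": 2, "amount": 20},
    {"status": 1, "amount": 30},
]


def aggregate_with_split(facts, split):
    """Aggregate facts grouped by membership in the `split` cell.

    `split` is a simplified cell: a dict mapping attribute names to the
    value a fact must have to count as being within the split.
    """
    groups = defaultdict(lambda: {"record_count": 0, "amount_sum": 0})
    for fact in facts:
        within = all(fact.get(key) == value for key, value in split.items())
        group = groups[within]
        group["record_count"] += 1
        group["amount_sum"] += fact["amount"]
    # Emit the True group first, then the False group.
    return [
        {"__within_split__": within, **values}
        for within, values in sorted(groups.items(), reverse=True)
    ]


rows = aggregate_with_split(facts, {"status": 1})
for row in rows:
    print(row)
```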

Page 37: Cubes 1.0 Overview

Post-aggregation

■ computed on aggregation result in Python

■ moving averages, deviation, variance: wma, sma, sms, smstd, smsrd, smsvar

■ aggregate property: window_size

“statutils”
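As an illustration of a post-aggregation calculation, the sketch below computes a simple moving average over already aggregated rows, controlled by a `window_size`. It is not the Cubes implementation: the function name, the `amount_sma` field and the handling of the first rows (averaging over however many values are available so far) are assumptions of this example.

```python
def simple_moving_average(rows, source, target, window_size):
    """Return copies of `rows` with `target` set to the moving average.

    The average covers the current row and up to `window_size - 1`
    preceding rows; early rows use the shorter window available.
    """
    window = []
    result = []
    for row in rows:
        window.append(row[source])
        if len(window) > window_size:
            window.pop(0)  # drop the oldest value once the window is full
        result.append({**row, target: sum(window) / len(window)})
    return result


# Hypothetical aggregation result, one row per month.
monthly = [
    {"month": 1, "amount_sum": 10},
    {"month": 2, "amount_sum": 20},
    {"month": 3, "amount_sum": 30},
    {"month": 4, "amount_sum": 40},
]

smoothed = simple_moving_average(monthly, "amount_sum", "amount_sma", window_size=2)
for row in smoothed:
    print(row["month"], row["amount_sma"])
```

Because the calculation runs on the aggregation result rather than on the facts, it works the same way regardless of which backend produced the rows.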

Page 38: Cubes 1.0 Overview

Store

Page 39: Cubes 1.0 Overview

Store

■ provides database or API connection

■ might provide a model

■ slicer tool actions (future): validation, schema, optimization, ...

Logical Physical

physical data store (database or API)

Browser

Store

connect

Page 40: Cubes 1.0 Overview

SQL Backend

also known as ROLAP

or SQL query generator

Page 41: Cubes 1.0 Overview

SQL Overview

■ new query builder

■ join optimisation

■ support for outer-joins

■ support for “split” dimension

■ new aggregate functions
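Outer-join support comes down to choosing which side of a join keeps its unmatched rows. The sketch below shows this in plain Python for a master (fact) table and a detail (dimension) table. It assumes that the join methods are named `match`, `master` and `detail` and behave like an inner join, a left outer join and a right outer join respectively; the `join` helper and the sample rows are hypothetical.

```python
def join(master_rows, detail_rows, master_key, detail_key, method="match"):
    """Join master rows to detail rows on the given keys.

    match  - keep only rows whose keys exist on both sides
    master - keep every master row, matched or not
    detail - keep every detail row, matched or not
    Columns missing from an unmatched row are simply absent here,
    where SQL would return NULL.
    """
    if method not in ("match", "master", "detail"):
        raise ValueError(f"unknown join method: {method!r}")
    details = {row[detail_key]: row for row in detail_rows}
    master_keys = {row[master_key] for row in master_rows}
    joined = []
    for row in master_rows:
        detail = details.get(row[master_key])
        if detail is not None:
            joined.append({**row, **detail})
        elif method == "master":
            joined.append(dict(row))
    if method == "detail":
        for key, detail in details.items():
            if key not in master_keys:
                joined.append(dict(detail))
    return joined


# Hypothetical data: one fact has no matching date, one date has no fact.
fact_contracts = [
    {"contract_date_id": 1, "amount": 10},
    {"contract_date_id": 3, "amount": 30},
]
dim_date = [
    {"id": 1, "year": 2013},
    {"id": 2, "year": 2014},
]

for method in ("match", "master", "detail"):
    rows = join(fact_contracts, dim_date, "contract_date_id", "id", method)
    print(method, len(rows))
```

With the `detail` method every date appears in the result even when no fact refers to it, which is useful for reports that must not skip empty periods.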

Page 42: Cubes 1.0 Overview

fact table

join optimisation

Page 43: Cubes 1.0 Overview

Join methods between facts (master) and date (detail): match, detail, master

"joins": [
    {
        "master": "fact_contracts.contract_date_id",
        "detail": "dim_date.id",
        "method": "detail"
    }
]

Page 44: Cubes 1.0 Overview

Authentication and Authorisation

Page 45: Cubes 1.0 Overview
Page 46: Cubes 1.0 Overview
Page 47: Cubes 1.0 Overview

{
    "lidia": {
        "allowed_cubes": ["sales"],
        "cube_restrictions": { "sales": ["store:3"] }
    },
    "martin": {
        "allowed_cubes": ["sales"],
        "cube_restrictions": { "sales": ["store:5"] }
    }
}

Page 48: Cubes 1.0 Overview

[workspace]
authorization: simple

[authorization]
rights_file: access_rights.json

Authorizer
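The access-rights file above maps each user to the cubes they may see and to restriction cuts applied per cube. A minimal sketch of how such a file could be evaluated follows. The `authorize` function is a hypothetical stand-in written for this example, not the authorizer that ships with Cubes.

```python
import json

# The rights from the slide, as they might appear in access_rights.json.
ACCESS_RIGHTS = json.loads("""
{
    "lidia": {
        "allowed_cubes": ["sales"],
        "cube_restrictions": {"sales": ["store:3"]}
    },
    "martin": {
        "allowed_cubes": ["sales"],
        "cube_restrictions": {"sales": ["store:5"]}
    }
}
""")


def authorize(rights, user, cube):
    """Return the restriction cuts for `user` on `cube`.

    Raises PermissionError when the user is unknown or the cube is not
    in the user's list of allowed cubes.
    """
    user_rights = rights.get(user)
    if user_rights is None or cube not in user_rights.get("allowed_cubes", []):
        raise PermissionError(f"{user!r} may not access cube {cube!r}")
    return user_rights.get("cube_restrictions", {}).get(cube, [])


print(authorize(ACCESS_RIGHTS, "lidia", "sales"))
print(authorize(ACCESS_RIGHTS, "martin", "sales"))
```

The returned cuts would then be combined with whatever cell the user asks for, so that each user only ever sees data for their own store.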

Page 49: Cubes 1.0 Overview

Slicer server

Page 50: Cubes 1.0 Overview
Page 51: Cubes 1.0 Overview

Model Queries

■ GET /cubes (overview of cubes from all providers)

■ GET /cube/sales/model (detailed cube model with described dimensions)

Page 52: Cubes 1.0 Overview

Browser Queries

■ GET /cube/name/aggregate

■ GET /cube/name/members/dim

■ GET /cube/name/facts

■ GET /cube/name/fact

■ GET /cube/name/cell

Page 53: Cubes 1.0 Overview

Aggregate

GET /cube/sales/aggregate?cut=date:2010&split=status:1&drilldown=date|region&page=10&page_size=100
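A request like the one above can be assembled with the standard library. This is a sketch under assumptions: the base URL `http://localhost:5000` and the helper name `aggregate_url` are hypothetical, and the parameter names are taken from the slide as written.

```python
from urllib.parse import urlencode


def aggregate_url(base, cube, **params):
    """Build an aggregate request URL for `cube` from keyword parameters.

    ":" and "|" are kept unescaped so the cut and drilldown strings stay
    readable; everything else is percent-encoded as usual.
    """
    query = urlencode(params, safe=":|")
    return f"{base}/cube/{cube}/aggregate?{query}"


url = aggregate_url(
    "http://localhost:5000",  # assumed server address
    "sales",
    cut="date:2010",
    split="status:1",
    drilldown="date|region",
    page=10,
    page_size=100,
)
print(url)
```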

Page 54: Cubes 1.0 Overview

{
    "cell": [],
    "total_cell_count": 2,
    "drilldown": [
        { "record_count": 31, "amount_sum": 550840, "date.year": 2009 },
        { "record_count": 31, "amount_sum": 566020, "date.year": 2010 }
    ],
    "summary": { "record_count": 62, "amount_sum": 1116860 }
}
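The reply is plain JSON, so a client needs nothing beyond a JSON parser. The sketch below reads the response shown above, indexes the drilldown rows by year and checks that they add up to the summary; the helper name `totals_match` is hypothetical.

```python
import json

# The aggregate reply from the slide.
RESPONSE = """
{
    "cell": [],
    "total_cell_count": 2,
    "drilldown": [
        {"record_count": 31, "amount_sum": 550840, "date.year": 2009},
        {"record_count": 31, "amount_sum": 566020, "date.year": 2010}
    ],
    "summary": {"record_count": 62, "amount_sum": 1116860}
}
"""


def totals_match(result):
    """Check that the drilldown rows sum up to the summary values."""
    return all(
        sum(row[key] for row in result["drilldown"]) == value
        for key, value in result["summary"].items()
    )


result = json.loads(RESPONSE)
by_year = {row["date.year"]: row["amount_sum"] for row in result["drilldown"]}
print(by_year)
print(totals_match(result))
```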

Page 55: Cubes 1.0 Overview

Special Characters

“category:10\-24” → “10-24”

“city:Nové\ Mesto\ nad\ Váhom” → “Nové Mesto nad Váhom”
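The escaping shown above can be sketched as a pair of helper functions. The slide only demonstrates escaping of `-` and of spaces; the wider set of special characters used here (backslash, `:`, `|` and `,` in addition) is an assumption of this example, and both function names are hypothetical.

```python
import re

# Characters assumed to need a backslash inside a cut value.
_SPECIAL = re.compile(r"([\\\-:|, ])")


def escape_cut_value(value):
    """Prefix every special character in `value` with a backslash."""
    return _SPECIAL.sub(r"\\\1", value)


def unescape_cut_value(value):
    """Remove the escaping backslashes added by escape_cut_value()."""
    return re.sub(r"\\(.)", r"\1", value)


print("category:" + escape_cut_value("10-24"))
print("city:" + escape_cut_value("Nové Mesto nad Váhom"))
```

Only the value is escaped; the `:` that separates the dimension name from the value stays as it is.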

Page 56: Cubes 1.0 Overview

Relative Time

date:yesterday

date:90daysago-today

expiration_date:lastmonth-next2months

uses dimension roles and Calendar
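To illustrate how relative time expressions turn into concrete dates, here is a deliberately small sketch that handles only the day-based forms `today`, `yesterday` and `Ndaysago`, plus a `start-end` range. Month-based forms such as `lastmonth` or `next2months` are left out, and the function names and the fixed reference date are assumptions of this example rather than the Cubes calendar implementation.

```python
import re
from datetime import date, timedelta

# Offsets in days for the fixed tokens supported by this sketch.
_FIXED_DAYS = {"today": 0, "yesterday": -1}


def resolve_day(token, today):
    """Resolve a day-based relative token against the date `today`."""
    if token in _FIXED_DAYS:
        return today + timedelta(days=_FIXED_DAYS[token])
    match = re.fullmatch(r"(\d+)daysago", token)
    if match:
        return today - timedelta(days=int(match.group(1)))
    raise ValueError(f"unsupported relative date: {token!r}")


def resolve_range(expression, today):
    """Resolve "start-end" (or a single token) to a (first, last) pair."""
    start, separator, end = expression.partition("-")
    first = resolve_day(start, today)
    last = resolve_day(end, today) if separator else first
    return first, last


today = date(2014, 11, 15)  # fixed reference date for a repeatable example
print(resolve_range("90daysago-today", today))
print(resolve_range("yesterday", today))
```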

Page 57: Cubes 1.0 Overview

Output Format

format=csv

format=json

format=json_lines

*for facts and members

*
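The difference between the CSV and JSON lines formats is easiest to see side by side. The sketch below serialises the same rows both ways using only the standard library; the sample facts and the helper names are hypothetical, and the exact columns a server would emit are not specified here.

```python
import csv
import io
import json


def to_json_lines(rows):
    """Serialise rows as one JSON object per line."""
    return "".join(json.dumps(row) + "\n" for row in rows)


def to_csv(rows, fields):
    """Serialise rows as CSV with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical facts.
facts = [
    {"id": 1, "amount": 10},
    {"id": 2, "amount": 20},
]

print(to_json_lines(facts), end="")
print(to_csv(facts, ["id", "amount"]), end="")
```

Both formats can be produced and consumed one row at a time, which suits long lists of facts or members better than a single JSON document.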

Page 58: Cubes 1.0 Overview

Deployment

reporting for your app or stand-alone

Page 59: Cubes 1.0 Overview

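Slicer server: a public HTML & JS Application sends an HTTP request and receives a JSON reply; the server uses the model and the store

Slicer Flask App under WSGI: a public HTML & JS Application sends an HTTP request and receives a JSON reply; the app uses the model and the store

Cubes Python API inside Django, Flask, …: the public application serves HTML and JSON replies; it uses the model and the store

Slicer Blueprint inside Flask: the public application serves HTML; it uses the model and the store

Internal Slicer server behind a public Web Application (PHP, RoR, Django): the web application serves HTML and talks to the server by HTTP request and JSON reply; the server uses the model and the store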

Page 60: Cubes 1.0 Overview

Front-ends

generic ad-hoc reporting

Page 61: Cubes 1.0 Overview
Page 62: Cubes 1.0 Overview

✂Slicer

Page 63: Cubes 1.0 Overview
Page 64: Cubes 1.0 Overview
Page 65: Cubes 1.0 Overview
Page 66: Cubes 1.0 Overview
Page 67: Cubes 1.0 Overview

Cubes Viewer

Jose Juan Montes, jjmontesl/cubesviewer

Page 68: Cubes 1.0 Overview

checkgermany.de

Front-end by: Felix Ebert (@femeb)

Data by: Friedrich Lindenberg (@pudo)

Cubes 0.10.2

Page 69: Cubes 1.0 Overview

Summary & Future

Page 70: Cubes 1.0 Overview

Summary

■ heterogeneous pluggable environment

■ easier to extend

■ better SQL query generator

Page 71: Cubes 1.0 Overview

Not Mentioned

■ localisation

■ namespaces

■ calendar

■ query logging

Page 72: Cubes 1.0 Overview

Incubated

■ non-additive properties

■ periods-to-date

■ modeler app

■ cubes.js

Page 73: Cubes 1.0 Overview

Future

■ arithmetic expressions

■ SQL improvements

■ improved API for custom browsers

■ cubes.js

Page 74: Cubes 1.0 Overview

Nutrition Facts

Serving Size 1 cube

Total Fat 0g

Trans Fat 0g

Amount Per Serving

Saturated Fat 0g

% Daily Value

Total Carbohydrate 0g

Sugars 0g

Dietary Fiber 0g

0%

0%

Page 75: Cubes 1.0 Overview

Want to contribute?

#TODO, #FIXME, Issue #

https://github.com/DataBrewery/cubes/issues

Page 76: Cubes 1.0 Overview

Credits

Page 77: Cubes 1.0 Overview

Thanks for 1.0

Robin Thomas, Ryan Berlew

Jose Juan Montes, Squarespace

and all contributors on Github

Page 78: Cubes 1.0 Overview

Thank You

@Stiivi

Page 79: Cubes 1.0 Overview

cubes.databrewery.org

github.com/DataBrewery/cubes