Cubes 1.0 Overview

Post on 09-Jul-2015


New feature overview of Cubes 1.0 – lightweight Python OLAP and pluggable data warehouse. Video: https://www.youtube.com/watch?v=-FDTK80zsXc Github sources: https://github.com/databrewery/cubes

Transcript of Cubes 1.0 Overview

Cubes 1.0 Overview: light data warehouse and conceptual modelling

Štefan Urbánek, @Stiivi stefan.urbanek@gmail.com November 2014

understanding through metadata

data model

reporting apps / modules

metadata

logical

physical

Categorical Data

∑ =

OLAP (online analytical processing)

lightweight framework for

conceptual modelling and analytics

Original Cubes: before 1.0

process or server

store


Workspace: 1 × 1 × model

We needed more!

Models

Stores

file

Postgres Mongo API

API database

multiple model parts, different sources

multiple data sources, heterogeneous

Cubes 1.0

Python ≥ 3.4 (works with ≥ 2.7 too for the “two” series)

■ analytical workspace

■ model providers

■ new and improved backends

■ better extensibility

■ authorisation

Analytical Workspace

Cubes

Model Providers

Stores

sales churn events activations

Static Model Provider

API Model Provider

BI Data (Postgres)

BI Data 2 (Mongo)

Events (API)

Workspace

Cubes

Model Providers

Stores

sales churn events activations

Static Model Provider

BI Data (Postgres)

BI Data 2 (Mongo)

crm sales events

[workspace]
models_path: /var/lib/cubes/models

[models]
crm: crm.cubesmodel
sales: sales.cubesmodel
events: events.cubesmodel

[store crm]
type: sql
url: postgresql://localhost/crm

[store events]
type: mongo
host: localhost
collection: events
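The workspace configuration above is a plain INI file, so its layout can be inspected with Python's standard `configparser`. A minimal sketch, assuming the slide's options are written one per line; the variable names are illustrative, and this is not a claim about how Cubes itself loads the file:

```python
import configparser

# Slide configuration, written out one option per line.
SLICER_INI = """
[workspace]
models_path: /var/lib/cubes/models

[models]
crm: crm.cubesmodel
sales: sales.cubesmodel
events: events.cubesmodel

[store crm]
type: sql
url: postgresql://localhost/crm

[store events]
type: mongo
host: localhost
collection: events
"""

config = configparser.ConfigParser()
config.read_string(SLICER_INI)

# Model files registered in the workspace, keyed by name.
models = dict(config["models"])

# Every "[store NAME]" section describes one named data store.
stores = {
    section.split(None, 1)[1]: dict(config[section])
    for section in config.sections()
    if section.startswith("store ")
}

for name, options in stores.items():
    print(name, options["type"])
```

Each store keeps its own connection options, which is what lets one workspace mix SQL and Mongo sources.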

BYOB: bring your own backend

Slicer

Backend

Browser

Store

Provider

Logical Physical

physical data store (database or API)

Browser

Store

Provider

∑ aggregate

connect

create model

model

cubes

dimensions

model

backend objects

Model Provider

model

cubes

dimensions

Model Provider

■ metadata on-the-fly

■ local or external source

■ might be linked to a store

model

cubes

dimensions

Model

required

automatic

automatic

automatic

required

Slicer cube dimension

key/attribute

property

column (table)

Dimensions

dimension

Cubes / Facts

metric

table

collection

event

Google Analytics

Mixpanel

MongoDB

SQL

Backend

Model Improvements

Model

■ measures → aggregates

■ more front-end metadata: cube categories, dimension role and cardinality

■ customised dimension linking

"measures": [
    { "name": "amount", "label": "Sales Amount" },
    { "name": "vat", "label": "VAT" }
]

"aggregates": [
    { "name": "total_sales", "label": "Total Sales Amount", "measure": "amount", "function": "sum" },
    { "name": "total_vat", "label": "Total VAT", "measure": "vat", "function": "sum" },
    { "name": "item_count", "label": "Item Count", "function": "count" }
]

Aggregates

■ custom name

■ can refer to other aggregates: post-aggregation calculations

■ functions are backend-specific; SQL aggregations: sum, count, count_nonempty, count_distinct, min, max, avg, stddev, variance, …
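The split between measures and aggregates can be illustrated with a toy in-memory aggregation. This is only a sketch: the fact rows are made up, and the `FUNCTIONS` table and `aggregate` helper are hypothetical stand-ins for what a real backend would translate into its own query language.

```python
import statistics

# Hypothetical fact rows; only the measure names come from the slide.
facts = [
    {"amount": 100.0, "vat": 20.0},
    {"amount": 250.0, "vat": 50.0},
    {"amount": 50.0, "vat": 10.0},
]

# Aggregate definitions in the shape shown on the slide (labels omitted).
aggregates = [
    {"name": "total_sales", "measure": "amount", "function": "sum"},
    {"name": "total_vat", "measure": "vat", "function": "sum"},
    {"name": "item_count", "function": "count"},
]

# Toy function table; a real backend maps these names to its own operations.
FUNCTIONS = {
    "sum": sum,
    "min": min,
    "max": max,
    "avg": statistics.mean,
}


def aggregate(rows, aggregates):
    """Compute every named aggregate over the given fact rows."""
    result = {}
    for agg in aggregates:
        if agg["function"] == "count":
            # "count" needs no measure: it counts fact rows.
            result[agg["name"]] = len(rows)
        else:
            values = [row[agg["measure"]] for row in rows]
            result[agg["name"]] = FUNCTIONS[agg["function"]](values)
    return result


summary = aggregate(facts, aggregates)
print(summary)
```

Note that `item_count` has no measure at all, which is why aggregates are declared separately from measures.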

Contextual Dimensions

{
    "measures": [ … ],
    "dimensions": [
        {"name": "date", "hierarchies": ["ym", "yqm"]},
        {"name": "date", "alias": "contract_date"}
    ],
    …
}

customisable linking properties: alias, hierarchies, exclude_hierarchies, default_hierarchy_name, cardinality, nonadditive

Dimension Roles

dimension.role

level.role

hint for reporting applications or backends

time

year, month, day, …

Cardinality

dimension.cardinality

tiny < low < medium < high

level.cardinality


overload precautions
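One way a reporting application might use the cardinality hint as an overload precaution is sketched below. The `needs_paging` helper, its threshold and the example dimensions are all hypothetical; only the ordering of the cardinality values comes from the slide.

```python
# Ordering taken from the slide: tiny < low < medium < high.
CARDINALITIES = ["tiny", "low", "medium", "high"]


def needs_paging(cardinality, threshold="high"):
    """Return True when a member listing should be paged or refused."""
    return CARDINALITIES.index(cardinality) >= CARDINALITIES.index(threshold)


# Hypothetical dimensions and their cardinality hints.
dimensions = {"status": "tiny", "region": "low", "customer": "high"}

for name, cardinality in dimensions.items():
    print(name, needs_paging(cardinality))
```

A front-end could, for example, render a drop-down for a tiny dimension and a search box for a high-cardinality one.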

Browser

Browser

■ uses logical model

■ implements aggregation

■ builds queries

■ retrieves data

Logical Physical

physical data store (database or API)

Browser

Store

∑ aggregate

model

Browser Methods

■ features()

■ aggregate(cell, drilldown,…)

■ members(cell, dimension, …)

■ facts(cell, …)

■ fact(id)

■ cell_details(cell, drilldown, …)

Split Cell

True False

__within_split__: generated dimension

aggregate(split=cell)
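The idea of the split cell can be sketched in plain Python: every fact is tagged with whether it falls inside the split cell, and that flag becomes a generated dimension named `__within_split__`. The fact rows, the dictionary-based cell and both helpers are hypothetical simplifications, not the Cubes implementation.

```python
from collections import defaultdict

# Hypothetical facts; "status" and "amount" are illustrative names.
facts = [
    {"status": 1, "amount": 10},
    {"status": 1, "amount": 30},
    {"status": 2, "amount": 5},
]


def in_cell(fact, cell):
    """A cell is simplified here to a dict of dimension -> required value."""
    return all(fact[dim] == value for dim, value in cell.items())


def aggregate_with_split(rows, split):
    """Aggregate rows grouped by membership in the split cell."""
    totals = defaultdict(lambda: {"record_count": 0, "amount_sum": 0})
    for row in rows:
        key = in_cell(row, split)  # value of the generated dimension
        totals[key]["record_count"] += 1
        totals[key]["amount_sum"] += row["amount"]
    ordered = sorted(totals.items(), key=lambda item: item[0], reverse=True)
    return [{"__within_split__": key, **values} for key, values in ordered]


result = aggregate_with_split(facts, split={"status": 1})
for row in result:
    print(row)
```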

Post-aggregation

■ computed on aggregation result in Python

■ moving averages, deviation, variance: wma, sma, sms, smstd, smsrd, smsvar

■ aggregate property: window_size

“statutils”
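A post-aggregation calculation runs over rows that the backend has already aggregated. The sketch below computes a simple moving average with a `window_size`; the function, the row layout and the choice to use a shorter window at the start of the series are assumptions made for illustration, not the behaviour of the Cubes statutils module.

```python
def simple_moving_average(values, window_size):
    """Average each value with up to window_size - 1 preceding values."""
    result = []
    for index in range(len(values)):
        window = values[max(0, index - window_size + 1): index + 1]
        result.append(sum(window) / len(window))
    return result


# Hypothetical aggregation result, one row per month.
rows = [
    {"date.month": 1, "amount_sum": 10},
    {"date.month": 2, "amount_sum": 20},
    {"date.month": 3, "amount_sum": 60},
    {"date.month": 4, "amount_sum": 40},
]

averages = simple_moving_average([r["amount_sum"] for r in rows], window_size=2)

# Attach the computed values to the aggregated rows.
for row, value in zip(rows, averages):
    row["amount_sma"] = value
    print(row)
```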

Store

Store

■ provides database or API connection

■ might provide a model

■ slicer tool actions (future): validation, schema, optimization, ...

Logical Physical

physical data store (database or API)

Browser

Store

connect

SQL Backend: also known as ROLAP or SQL query generator

SQL Overview

■ new query builder

■ join optimisation

■ support for outer-joins

■ support for “split” dimension

■ new aggregate functions

fact table

join optimisation

[Diagram: the three join methods between the facts table (master) and the date table (detail): match, detail, master]

"joins": [
    { "master": "fact_contracts.contract_date_id", "detail": "dim_date.id", "method": "detail" }
]

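How a join declaration of this shape might turn into SQL is sketched below. The mapping of methods to join types (match as inner join, master as left outer join, detail as right outer join) is an assumption based on the Cubes documentation, and `join_clause` is a hypothetical helper, not the query builder itself.

```python
# Assumed mapping of join methods to SQL join types.
JOIN_TYPES = {
    "match": "INNER JOIN",
    "master": "LEFT OUTER JOIN",
    "detail": "RIGHT OUTER JOIN",
}


def join_clause(join):
    """Build a FROM ... JOIN clause from a master/detail declaration."""
    master_table, master_column = join["master"].split(".")
    detail_table, detail_column = join["detail"].split(".")
    join_type = JOIN_TYPES[join.get("method", "match")]
    return (
        f"FROM {master_table} {join_type} {detail_table} "
        f"ON {master_table}.{master_column} = {detail_table}.{detail_column}"
    )


# Join declaration from the slide.
join = {
    "master": "fact_contracts.contract_date_id",
    "detail": "dim_date.id",
    "method": "detail",
}

sql = join_clause(join)
print(sql)
```

With the detail method every date is kept in the result, even when no contract falls on it, which is useful for reports that must not skip empty days.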
Authentication and Authorisation

{
    "lidia": {
        "allowed_cubes": ["sales"],
        "cube_restrictions": { "sales": ["store:3"] }
    },
    "martin": {
        "allowed_cubes": ["sales"],
        "cube_restrictions": { "sales": ["store:5"] }
    }
}

[workspace]
authorization: simple

[authorization]
rights_file: access_rights.json
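The access rights file above can be read as: a user may see only the listed cubes, and every query on a cube is additionally restricted by the listed cuts. A minimal sketch of that check follows; the `authorize` function is hypothetical and only mirrors the structure of the JSON, not the actual authorizer class.

```python
import json

# Access rights from the slide, with plain ASCII quotes.
ACCESS_RIGHTS = json.loads("""
{
    "lidia": {
        "allowed_cubes": ["sales"],
        "cube_restrictions": {"sales": ["store:3"]}
    },
    "martin": {
        "allowed_cubes": ["sales"],
        "cube_restrictions": {"sales": ["store:5"]}
    }
}
""")


def authorize(user, cube):
    """Return the cuts to enforce for the user, or raise if access is denied."""
    rights = ACCESS_RIGHTS.get(user)
    if rights is None or cube not in rights["allowed_cubes"]:
        raise PermissionError(f"{user!r} may not access cube {cube!r}")
    return rights.get("cube_restrictions", {}).get(cube, [])


print(authorize("lidia", "sales"))
print(authorize("martin", "sales"))
```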


Authorizer

Slicer server

Model Queries

■ GET /cubes: overview of cubes from all providers

■ GET /cube/sales/model: detailed cube model with described dimensions

Browser Queries

■ GET /cube/name/aggregate

■ GET /cube/name/members/dim

■ GET /cube/name/facts

■ GET /cube/name/fact

■ GET /cube/name/cell

Aggregate

GET /cube/sales/aggregate?cut=date:2010&split=status:1&drilldown=date|region&page=10&page_size=100

{
    "cell": [],
    "total_cell_count": 2,
    "drilldown": [
        { "record_count": 31, "amount_sum": 550840, "date.year": 2009 },
        { "record_count": 31, "amount_sum": 566020, "date.year": 2010 }
    ],
    "summary": { "record_count": 62, "amount_sum": 1116860 }
}
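From a client's point of view, an aggregate request is a URL with query parameters and the reply is plain JSON. The sketch below builds the request URL with the standard library and reads the reply shown on the slide; the server address is a made-up placeholder and no HTTP request is actually sent.

```python
import json
from urllib.parse import urlencode

# Hypothetical server address; the path and parameters follow the slide.
BASE_URL = "http://localhost:5000"

params = {
    "cut": "date:2010",
    "split": "status:1",
    "drilldown": "date|region",
    "page": 10,
    "page_size": 100,
}
# Keep ":" and "|" readable instead of percent-encoding them.
url = f"{BASE_URL}/cube/sales/aggregate?{urlencode(params, safe=':|')}"
print(url)

# Reply copied from the slide; a real client would read it from the response.
REPLY = """
{
    "cell": [],
    "total_cell_count": 2,
    "drilldown": [
        {"record_count": 31, "amount_sum": 550840, "date.year": 2009},
        {"record_count": 31, "amount_sum": 566020, "date.year": 2010}
    ],
    "summary": {"record_count": 62, "amount_sum": 1116860}
}
"""

reply = json.loads(REPLY)
drilldown_total = sum(row["amount_sum"] for row in reply["drilldown"])
for row in reply["drilldown"]:
    print(row["date.year"], row["amount_sum"])
```

The drilldown rows add up to the summary, which gives a quick consistency check when paging is not cutting the result.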

Special Characters

“category:10\-24” → “10-24”

“city:Nové\ Mesto\ nad\ Váhom” → “Nové Mesto nad Váhom”
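Escaping of special characters in cut values can be sketched as a pair of helper functions. The set of characters treated as special is an assumption (the slide only shows the hyphen and the space), and `escape` and `unescape` are hypothetical names.

```python
# Assumed set of characters with special meaning in a cut string.
SPECIAL = set("\\ -:|,;")


def escape(value):
    """Prefix every special character with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in value)


def unescape(value):
    """Drop each escaping backslash and keep the character after it."""
    result = []
    chars = iter(value)
    for ch in chars:
        if ch == "\\":
            ch = next(chars, "")
        result.append(ch)
    return "".join(result)


print("category:" + escape("10-24"))
print("city:" + escape("Nové Mesto nad Váhom"))
```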

Relative Time

date:yesterday

date:90daysago-today

expiration_date:lastmonth-next2months

uses dimension roles and Calendar
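How relative time expressions might resolve to concrete dates is sketched below for a small subset: `today`, `yesterday` and `Ndaysago`. Month-based tokens such as `lastmonth` or `next2months` are deliberately left out, and both helper functions are hypothetical; the real resolution relies on dimension roles and the Calendar object.

```python
import re
from datetime import date, timedelta


def resolve_day(token, today):
    """Resolve a day-based relative token against a reference date."""
    if token == "today":
        return today
    if token == "yesterday":
        return today - timedelta(days=1)
    match = re.fullmatch(r"(\d+)daysago", token)
    if match:
        return today - timedelta(days=int(match.group(1)))
    raise ValueError(f"unsupported relative date: {token!r}")


def resolve_range(expression, today):
    """Resolve 'start-end' (or a single token) into a pair of dates."""
    start, _, end = expression.partition("-")
    return resolve_day(start, today), resolve_day(end or start, today)


# Fixed reference date so the example is reproducible.
reference = date(2014, 11, 30)
print(resolve_range("yesterday", reference))
print(resolve_range("90daysago-today", reference))
```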

Output Format

format=csv

format=json

format=json_lines

* for facts and members

Deployment: reporting for your app or stand-alone

Public

store

Slicer server

HTML & JS Application

HTTP request

JSON reply

model

Public

store

WSGI

HTML & JS Application

HTTP request

JSON reply

Slicer Flask App

model

Public

store

JSON reply

Cubes Python API

Django, Flask, …

HTML

model

Public

store

Flask

HTML

Slicer Blueprint

model

Internal

Public

store

Slicer server

Web Application (PHP, RoR, Django)

HTTP request

JSON reply

model

HTML

Front-ends: generic ad-hoc reporting

✂Slicer

Jose Juan Montes, GitHub: jjmontesl/cubesviewer

Cubes Viewer

checkgermany.de

Front-end by: Felix Ebert (@femeb)

Data by: Friedrich Lindenberg (@pudo) Cubes 0.10.2


Summary & Future

Summary

■ heterogeneous pluggable environment

■ easier to extend

■ better SQL query generator

Not Mentioned

■ localisation

■ namespaces

■ calendar

■ query logging

Incubated

■ non-additive properties

■ periods-to-date

■ modeler app

■ cubes.js

Future

■ arithmetic expressions

■ SQL improvements

■ improved API for custom browsers

■ cubes.js

Nutrition Facts

Serving Size 1 cube

Total Fat 0g

Trans Fat 0g

Amount Per Serving

Saturated Fat 0g

% Daily Value

Total Carbohydrate 0g

Sugars 0g

Dietary Fiber 0g

0%

0%

Want to contribute?

#TODO, #FIXME, Issue #

https://github.com/DataBrewery/cubes/issues

Credits

Thanks for 1.0

Robin Thomas, Ryan Berlew

Jose Juan Montes, Squarespace

and all contributors on Github

Thank You

@Stiivi

cubes.databrewery.org

github.com/DataBrewery/cubes