Cube Intro
Uploaded by: rafe-gardner
Decision Making
Effective decision making
- Goal: a choice that moves an organization closer to an agreed-on set of goals in a timely manner
- Needs support from:
  - Goals: need to be specific
    - Not to hit an iceberg (for the Titanic or other boats)
  - Measures: need to be concrete
    - Stay away from any visible iceberg
  - Feedback: needs to be timely and accurate
    - "We hit an iceberg", rather than "we saw one", is not timely enough
    - "The boat can sink, actually" is again too late to find out
BI (Business Intelligence)
- BI is the delivery of accurate, useful information to the appropriate decision makers within the necessary timeframe to support effective decision making
- It is a superset of what can be provided by data mining
- SQL Server 2014 is a decent tool for this and supports cubes/data warehouses as well as data mining
OLAP/Cube
- Term coined in the mid-1990s when database people needed a new research direction
- Main goal: support ad hoc but complex querying performed by business analysts
- Interactive process of creating, managing, analyzing and reporting on data
- Extends spreadsheet-like analysis to work with huge amounts of data in a data warehouse
Known vs. Unknown Questions
- When using SQL statements or even cubes, we know the question and can (most of the time) formulate the SQL statements to dig out the answers
  - Layout-led answers: SQL
  - Data-led answers: the drill-down in a cube dimension (if the NW sales figure is low, which state is the main contributor?)
- When we do not know what to ask, we use data mining (the book's definition is less accurate)
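The drill-down question above (NW sales are low, so which state is responsible?) can be sketched in a few lines of Python. This is only an illustration: the rows, the state names and the `total_by` helper are made up for the example and do not come from any real cube.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, state, amount). All values are invented.
SALES = [
    ("NW", "WA", 120.0),
    ("NW", "OR", 15.0),
    ("NW", "ID", 95.0),
    ("SE", "FL", 210.0),
    ("SE", "GA", 190.0),
]


def total_by(rows, key_index):
    """Sum the amount column, grouped by the column at key_index."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[key_index]] += row[2]
    return dict(totals)


# Top level of the dimension: sales by region. NW stands out as low.
by_region = total_by(SALES, 0)

# Drill down one level: keep only the NW rows and regroup by state.
nw_rows = [row for row in SALES if row[0] == "NW"]
nw_by_state = total_by(nw_rows, 1)

# The state with the smallest total is the main contributor to the low figure.
weakest_state = min(nw_by_state, key=nw_by_state.get)

print(by_region)
print(nw_by_state)
print(weakest_state)
```

The second grouping was only chosen after looking at the first result, which is what "data-led" means here.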
What should be the sources
- OLTP is the source
  - Online Transaction Processing
  - These are the things we have learned in database classes
- But OLTP, the direct data store, is not suitable for BI because:
  - BI should not fight for the computing resources of OLTP systems
  - OLTP may not have all the history readily available
  - BI may have to use data from several OLTP systems
- The solution is generally a data mart
Data marts
- Data marts are related sets of data that are grouped together and separated out from the main body of data
- A data mart can be a repository of OLTP data or a portion of the data from a data warehouse
- Data in a data mart are not meant for supporting real-time operations
- Data in a data mart have generally gone through the data cleansing process
More on OLAP
- Place key performance indicators (measures) into context (dimensions)
  - measures are pre-aggregated
  - data retrieval is significantly faster
  - modeled in a DW
- The processed "cube" is made available to business analysts, who can browse the data using a variety of tools for ad hoc, interactive analytical processing
OLAP versus OLTP
Measures and dimensions
- Measures: key performance indicators, such as sales amount, that you would like to evaluate
  - typically numerical, including volume, sales, and costs
  - a rule of thumb: if a number makes (business/social) sense when aggregated, then it is a measure
  - examples:
    - aggregate daily volume to month, quarter and year
    - aggregating telephone numbers would not make sense; therefore, telephone numbers are not measures
    - postcode: not a measure, but can be a dimension (?)
    - totals in all the transactions would be a measure
    - number of high school students by school, county, state should be a measure
Measures and dimensions (continued)
- Dimensions: categories of data analysis
  - typical dimensions include product, time, region
  - a rule of thumb: when a report is requested to be ordered "by" something, that something is usually a dimension
  - e.g. in a sales report: view sales by month, by region, so the two dimensions needed are time and region
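As a rough illustration of the "by month, by region" rule of thumb, the sketch below sums one measure (sales amount) over two dimensions (time and region). The transactions and the `sales_by_month_and_region` function are hypothetical and exist only for this example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transactions: (sale date, region, sales amount).
TRANSACTIONS = [
    (date(2014, 1, 5), "East", 100.0),
    (date(2014, 1, 20), "East", 50.0),
    (date(2014, 1, 7), "West", 80.0),
    (date(2014, 2, 3), "East", 70.0),
    (date(2014, 2, 14), "West", 30.0),
    (date(2014, 2, 28), "West", 45.0),
]


def sales_by_month_and_region(transactions):
    """Aggregate the amount measure along the time and region dimensions."""
    totals = defaultdict(float)
    for sale_date, region, amount in transactions:
        month = sale_date.strftime("%Y-%m")  # daily dates roll up to months
        totals[(month, region)] += amount
    return dict(totals)


report = sales_by_month_and_region(TRANSACTIONS)
for (month, region), amount in sorted(report.items()):
    print(month, region, amount)
```

The amount is summed because summing it makes business sense. The month and region only label the groups, which is what makes them dimensions rather than measures.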
Star Schema
- Dimensions and measures are physically represented by a star schema (typically)
  - arrange the dimension tables around a central fact table that contains the measures
  - a fact table contains a column for each measure as well as a column for each dimension
Example
- Dimension tables
  - Market (Store_ID, City, Region)
  - Product (Product_ID, Name, Category, Price)
  - Time (Time_ID, Week, Month, Quarter, Year)
- Fact table
  - Sales (Store_ID, Product_ID, Time_ID, Amount)
- Notes
  - Amount should be the measure
  - Time_ID is preprocessed to show at week or month level
  - This can be modified into a snowflake schema
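The example schema can be tried out with Python's built-in `sqlite3` module. This is a minimal sketch, not how SQL Server would store a cube: the table and column names follow the slide, while the column types, the sample rows and the query are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Three dimension tables arranged around one central fact table (Sales).
conn.executescript("""
CREATE TABLE Market  (Store_ID INTEGER PRIMARY KEY, City TEXT, Region TEXT);
CREATE TABLE Product (Product_ID INTEGER PRIMARY KEY, Name TEXT,
                      Category TEXT, Price REAL);
CREATE TABLE "Time"  (Time_ID INTEGER PRIMARY KEY, Week INTEGER,
                      Month INTEGER, Quarter INTEGER, Year INTEGER);
CREATE TABLE Sales (
    Store_ID   INTEGER REFERENCES Market(Store_ID),
    Product_ID INTEGER REFERENCES Product(Product_ID),
    Time_ID    INTEGER REFERENCES "Time"(Time_ID),
    Amount     REAL  -- the measure
);
""")

# Invented sample data.
conn.executemany("INSERT INTO Market VALUES (?, ?, ?)",
                 [(1, "Seattle", "NW"), (2, "Portland", "NW"), (3, "Miami", "SE")])
conn.executemany("INSERT INTO Product VALUES (?, ?, ?, ?)",
                 [(1, "Milk", "Dairy", 2.5), (2, "Bread", "Bakery", 3.0)])
conn.executemany('INSERT INTO "Time" VALUES (?, ?, ?, ?, ?)',
                 [(1, 1, 1, 1, 2014), (2, 6, 2, 1, 2014)])
conn.executemany("INSERT INTO Sales VALUES (?, ?, ?, ?)",
                 [(1, 1, 1, 10.0), (1, 2, 2, 6.0), (2, 1, 1, 5.0),
                  (3, 2, 1, 9.0), (3, 1, 2, 7.5)])

# Sales by region and month: join the fact table to two of its dimensions.
rows = conn.execute("""
    SELECT m.Region, t.Month, SUM(s.Amount)
    FROM Sales AS s
    JOIN Market AS m ON m.Store_ID = s.Store_ID
    JOIN "Time" AS t ON t.Time_ID = s.Time_ID
    GROUP BY m.Region, t.Month
    ORDER BY m.Region, t.Month
""").fetchall()
print(rows)
```

Each report "by something" becomes one join from the fact table to the matching dimension table plus a `GROUP BY` on the chosen level (here Region and Month).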
Example
- Supermarket sales
  - roll-up: summaries over all cities, regions
  - slice the cube to select sales only in day 3
  - dice the cube to select sales only in week 2 (days 8-14), and group by regions
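The three operations can be imitated on a small in-memory fact list. The sketch below is only meant to show what each operation selects; the rows and the `group_sum` helper are hypothetical, and a real cube would answer these from pre-aggregated data rather than by scanning rows.

```python
from collections import defaultdict

# Hypothetical supermarket facts: (day, city, region, amount).
FACTS = [
    (3, "Seattle", "NW", 10.0),
    (3, "Miami", "SE", 20.0),
    (8, "Seattle", "NW", 5.0),
    (9, "Portland", "NW", 7.0),
    (14, "Miami", "SE", 4.0),
    (15, "Miami", "SE", 100.0),
]


def group_sum(rows, key):
    """Sum the amount column over groups defined by key(row)."""
    totals = defaultdict(float)
    for row in rows:
        totals[key(row)] += row[3]
    return dict(totals)


# Roll-up: summarize cities up to regions, then regions up to a grand total.
by_region = group_sum(FACTS, lambda r: r[2])
grand_total = sum(r[3] for r in FACTS)

# Slice: fix one dimension at a single value (day 3).
day3_total = sum(r[3] for r in FACTS if r[0] == 3)

# Dice: restrict to a range (week 2 = days 8-14), then group by region.
week2 = [r for r in FACTS if 8 <= r[0] <= 14]
week2_by_region = group_sum(week2, lambda r: r[2])

print(by_region, grand_total)
print(day3_total)
print(week2_by_region)
```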
The TWO models
- Multi-dimensional model
  - What we mostly mean when referring to a cube or a data warehouse
- Tabular model
  - Complicated in SQL Server
  - May be accomplished easily with other tools such as Vertica
SSMS and SSDT
- SSDT (SQL Server Data Tools), which used to be the BI studio, is really Visual Studio 2013
- Give a demo