Dax en
SQL Saturday #264
Ancona, Italy
Introduction to DAX
Sponsors & Media Partners
Marco Pozzan
- Working with SQL Server since version 2000
- Working in BI since 2005
- President of the 1nn0va community (www.innovazionefvg.net)
- Project manager at Servizi CGN (www.cgn.it)
- SQL Server and DWH consultant
- References:
  - twitter: @marcopozzan
  - email: [email protected]
  - site: www.marcopozzan.it
Agenda
- What is PowerPivot?
- PowerPivot demo
- What is DAX?
- Calculated columns and calculated fields
- Evaluation context
- CALCULATE
- DAX demo
What is PowerPivot?
- Free add-in for Microsoft Excel 2010 and 2013
- Different versions for 32/64 bit (4GB limit)
- Does not need SQL Server or other prerequisites
- Very powerful analysis engine based on SSAS in SQL Server 2012
- No API available to control it
- No security available
- Always impersonates the current user
Versions
- Where it is installed:
  - Client-side: inside Excel
  - Server-side: built on SharePoint 2012 or SQL Server 2012 (Tabular)
- In the client-side version, the SSAS engine runs in-process with Excel
SSAS 2012
- Uses the VertiPaq engine, a columnar database with high compression
- Works completely in memory
- No I/O, aggregates, or other…
- IMBI = a new way of thinking about the algorithms
PowerPivot
- Import data
- Relationships between tables
- Slicers
Advantages (PowerPivot)
- Fast
  - No ETL (Power Query)
  - Metadata (model)
  - Integration of heterogeneous sources
- Sharing
  - Especially with SharePoint
- Expressiveness
  - Relationships and DAX
Disadvantages (PowerPivot)
- ETL is needed to clean the data
- Quality of the data?
- Data size
- Please note that these are not problems!
What is DAX?
- Designed to work within a PivotTable
- The programming language of Tabular and PowerPivot
- Resembles Excel (so they say)
- No concept of «row» and «column»
- Different type system
- A mix of MDX, SQL, and Excel
Dax Types
- Non-numerical:
  - String
  - Binary objects (Power View)
- Numerical:
  - Currency
  - Integer
  - Real
  - DateTime (integer part: days since 30/12/1899; decimal part: fraction of a day)
  - Boolean
Type Handling
- Operators are not strongly typed ("1" + 1)
- Operator overloading (be careful)
- Example:
  1 & 2 = "12"
  "1" + "2" = 3
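The two conversions on this slide can be checked with a small DAX query. This is only a sketch: it assumes a client that can run DAX queries against a Tabular model, and the output column names Concatenated and Added are made up for the illustration.

```dax
-- & always concatenates, so the numbers are converted to text.
-- + always adds, so text that looks like a number is converted to a number.
EVALUATE
ROW (
    "Concatenated", 1 & 2,    -- text "12"
    "Added", "1" + "2"        -- number 3
)
```

The operator decides the conversion, not the type of the operands, which is why mixing types silently can hide mistakes.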
Columns in DAX 1/2
'TableName'[ColumnName]
- Quotes can be omitted if the table name does not contain spaces (don't do it)
=FactInternetSales[OrderDate]
Columns in DAX 2/2
- The table name can be omitted; the column is then looked up in the current table
  - Don't do it: it makes the formulas hard to understand
- Brackets cannot be omitted
=[OrderDate]
Calculated Columns
- Computed using DAX and persisted in the database
- Use other columns
- Always computed for the current row
- FactInternetSales[OrderDate] means:
  - The value of the OrderDate column
  - In the FactInternetSales table
  - For the current row
  - Different for each row
Measures (Calculated Fields)
- Do not work row by row
- Written using DAX
- Not stored in the database
- Use tables and aggregators
- Do not have a «current row»
- The following formula cannot be written as a measure:
  =FactInternetSales[OrderDate]
- The column must be aggregated instead:
  :=SUM(FactInternetSales[OrderDate])
Define the right name of the column
- If you rename a column, the measures that use it must be changed manually
- So define the right names from the start
Calculated columns and Measures
- Suppose you want to calculate the margin with a calculated column:
  =FactInternetSales[SalesAmount]-FactInternetSales[TotalProductCost]
- The Margin column can then be aggregated with a measure:
  SUMofMargin:=SUM(FactInternetSales[Margin])
Calculated columns and Measures
- Margin as a percentage of sales (Margine%), as a calculated column:
  =FactInternetSales[Margin] / FactInternetSales[SalesAmount]
- This expression is not correct once it is aggregated
- A measure must be used instead:
  Margine%:=SUM(FactInternetSales[Margin]) / SUM(FactInternetSales[SalesAmount])
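A small worked case shows why the per-row ratio cannot be aggregated. Take two hypothetical sales: one of 100 with a margin of 50 (50%) and one of 300 with a margin of 30 (10%). Adding the row percentages gives 60%, while the margin on total sales is 80 / 400 = 20%. The sketch below contrasts the two; MarginPct, SumOfMarginPct and MarginPctOfSales are names invented for the illustration.

```dax
-- MarginPct is a hypothetical calculated column holding the per-row ratio:
-- =FactInternetSales[Margin] / FactInternetSales[SalesAmount]

-- Misleading: adds up the per-row percentages (50% + 10% = 60% in the example).
SumOfMarginPct := SUM ( FactInternetSales[MarginPct] )

-- Intended: ratio of the two totals (80 / 400 = 20% in the example).
MarginPctOfSales :=
SUM ( FactInternetSales[Margin] ) / SUM ( FactInternetSales[SalesAmount] )
```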
Measures rules and convention
- Define the table in which the measure is stored
- Measures are global to the model
- There cannot be two measures with the same name, even in different tables
- A measure can be moved from one table to another; this cannot be done with calculated columns
- Do not reference a measure with its table name: it would be confused with a calculated column
Summary 1/2
Columns consume memory and measures consume CPU
- They are calculated at different times
- They have different purposes
- They are structured differently
- They are managed in different ways
Summary 2/2
- Use measures (90%) when you need to:
  - Calculate ratios
  - Calculate percentages
  - Compute complex aggregations
- Use columns (10%) when:
  - The values are needed in slicers or filters
  - The expression is calculated on the current row
Counting Values
- COUNTROWS: rows in a table
- COUNTBLANK: counts blanks
- COUNTA: counts anything but blanks
- COUNT: only for numeric columns
- Compatibility with Excel
- DISTINCTCOUNT: performs a distinct count
  - Multidimensional -> measure group with a distinct count measure... slow as a snail
Errors in DAX 1/2
- 1+2 always works
- [SalesAmount]/[Margin] might fail
- Causes of errors:
  - Conversion errors
  - Arithmetical operations
  - Empty or missing values
- ISERROR (Expression) returns TRUE or FALSE, depending on whether an error occurs during evaluation
Errors in DAX 2/2
- IFERROR (Expression, Alternative) returns Alternative in case of error. Useful to avoid writing the expression twice
- Both IFERROR and ISERROR are very slow, so be careful when you use them in calculated columns
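As a sketch of the difference between the two functions, the calculated columns below handle a conversion error, the first cause listed above. Import[Code] is a hypothetical text column that is expected to contain numbers; it does not come from the slides.

```dax
-- Import[Code] is a hypothetical text column that should hold numbers.

-- With ISERROR the conversion has to be written twice.
=IF ( ISERROR ( VALUE ( Import[Code] ) ), BLANK (), VALUE ( Import[Code] ) )

-- With IFERROR the conversion is written once.
=IFERROR ( VALUE ( Import[Code] ), BLANK () )
```

Both versions return a blank for rows whose text cannot be converted. Given the performance warning above, cleaning the data before it is loaded is usually preferable where possible.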
Aggregation Functions
- Work only on numeric columns
- Aggregation functions: SUM, AVERAGE, MIN, MAX
- Aggregate columns only, not expressions:
  - Valid: SUM(Orders[Quantity])
  - Not valid: SUM(Orders[Quantity] * Orders[Quantity])
The X aggregation functions 1/2
- Iterate over the table and evaluate the expression for each row
- Always take two parameters: the table to iterate and the formula to evaluate
- SUMX, AVERAGEX, MINX, MAXX
SUMX (Sales, Sales[Price] * Sales[Quantity])
The X aggregation functions 2/2
- First the expression is evaluated for each row, then the sum is computed
- The columns must all be in the same table, or you must use RELATED (if there is a relationship)
- They are very slow, but they do not use memory
Alternative to the X functions
- Create a calculated column
- Aggregate on that column
- Very fast, but it uses memory
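The two options can be put side by side as follows. This is a sketch under assumptions: the Sales table and its Price and Quantity columns come from the SUMX slide, while LineAmount, TotalSalesX and TotalSales are names invented here.

```dax
-- Option 1: iterate at query time (uses CPU, no extra memory).
TotalSalesX := SUMX ( Sales, Sales[Price] * Sales[Quantity] )

-- Option 2: store the row-level result in a calculated column of Sales
-- (hypothetical column name: LineAmount), which uses memory ...
=Sales[Price] * Sales[Quantity]

-- ... and aggregate the stored column with a plain SUM.
TotalSales := SUM ( Sales[LineAmount] )
```

Both measures return the same number; the choice is a trade-off between memory for the stored column and CPU for the iteration.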
Logical Functions
- AND (little used) or &&
- OR (little used) or ||
- IF
- IFERROR
- NOT (little used)
- SWITCH
Switch
Color := IF(DimProduct[Color] = "R", "Red",
         IF(DimProduct[Color] = "Y", "Yellow", "Other"))
Color := SWITCH(
    DimProduct[Color], "R", "Red", "Y", "Yellow", "Other"
)
Information Function
- Completely useless (they do not accept expressions, only columns): ISNUMBER, ISTEXT, ISNONTEXT
- Useful: ISBLANK, ISERROR
- If we, who created the column, do not know whether it holds a number or text, who should know? (Alberto Ferrari)
DIVIDE Function
- Checks that the denominator is not 0
IF(Sales[Price] <> 0, Sales[Quantity] / Sales[Price], 0)
DIVIDE(Sales[Quantity], Sales[Price], 0)
Date Function
Many useful functions:
DATE, DATEVALUE, DAY, EDATE, EOMONTH, HOUR, MINUTE, MONTH, NOW, SECOND, TIME, TIMEVALUE, TODAY (interesting!!!), WEEKDAY, WEEKNUM, YEAR, YEARFRAC
Time intelligence functions
Evaluation Context 1/3
- Sets DAX apart from any other language
- Similar to the "where clause" of an MDX query in SSAS
- The contexts under which a formula is evaluated:
  - Filter context, row context
Evaluation context 2/3
- Filter context:
  - Set of active rows for the computation
  - The filter that comes from the PivotTable
  - Defined by slicers, filters, columns, rows
  - One for each cell of the PivotTable
Evaluation context 3/3
- Row context:
  - Contains a single row
  - The current row during iterations
  - Defined by an X function or by a calculated column definition, not by pivot tables
  - This concept is new compared with MDX, which does not work leaf by leaf but only on the context
The two contexts always exist
- Filter context:
  - Filters tables
  - Might be empty (all the tables are visible)
  - It is used by aggregate functions
  - In a calculated column it includes all the tables, because there is no pivot table
- Row context:
  - Iterates the active rows in the filter context
  - Might be empty (there is no iteration running)
With more tables?
Evaluation Context
- Filter context:
  - Is propagated through relationships, from the one side to the many side
  - The direction of the relationship is very important
  - Is different from SQL (inner, left, ...)
  - Applies only once (+ performance)
- Row context:
  - Does not propagate over relationships
  - Use RELATED (opens a new row context on the target)
  - Applies for each row (- performance)
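A minimal sketch of RELATED inside an iterator, assuming the relationship between FactInternetSales and DimProduct used elsewhere in the deck. OrderQuantity is assumed to be a column of FactInternetSales and SalesAtListPrice is an invented measure name.

```dax
-- The row context on FactInternetSales does not reach DimProduct by itself:
-- RELATED follows the relationship from the many side to the one side
-- and reads the list price of the product of the current sales row.
SalesAtListPrice :=
SUMX (
    FactInternetSales,
    FactInternetSales[OrderQuantity] * RELATED ( DimProduct[ListPrice] )
)
```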
Example of a Filter Context
Table Function
- FILTER (adds new conditions; it is an iterator!)
- ALL (removes all conditions from a table; returns all the rows of a table)
  - Useful to calculate ratios and percentages
  - Removes all filters from the specified columns of the table
- VALUES (the values of a column, including blanks)
- RELATEDTABLE (all the rows related to the current row)
- All of these functions return a table
Filter
All
Mixing Filters
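The three cases (FILTER, ALL, and the two mixed) can be sketched with measures like the following. The measure names and the threshold of 1000 are invented for the illustration; only FactInternetSales[SalesAmount] comes from the slides.

```dax
-- FILTER iterates the table and keeps the rows that satisfy the condition.
LargeSales :=
SUMX (
    FILTER ( FactInternetSales, FactInternetSales[SalesAmount] > 1000 ),
    FactInternetSales[SalesAmount]
)

-- ALL returns every row of the table, ignoring the filter context,
-- so the denominator is always the grand total.
PctOfAllSales :=
SUM ( FactInternetSales[SalesAmount] )
    / SUMX ( ALL ( FactInternetSales ), FactInternetSales[SalesAmount] )

-- Mixing: ALL removes the existing filters, then FILTER adds a new condition.
LargeSalesAll :=
SUMX (
    FILTER ( ALL ( FactInternetSales ), FactInternetSales[SalesAmount] > 1000 ),
    FactInternetSales[SalesAmount]
)
```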
VALUES
- Returns a table with a single column containing all the values of the column that are visible in the current context
- When the result has one column and one row, it can be used as a scalar
SelectedYear:=COUNTROWS(VALUES(Dati[Year]))
RELATEDTABLE
- Returns only the rows of sales (Dati) related to the current store (Store)
=COUNTROWS (RELATEDTABLE(Dati))
Considerations
- We have seen that we can:
  - Add a filter on a column
  - Remove the filter on a whole table
  - Mix filters
- ...but we still need to:
  - Ignore only a part of the filter context and not all of it
  - Add a condition to the filter context or modify an existing condition
Calculate
- The simplest function to write, but complex to understand
CALCULATE( Expression, Filter1, …, FilterN )
- The filters are computed first (combined with AND), then the expression
- All filters are processed in parallel and are independent of each other
- Replaces the filter context (for a whole table or for a single column)
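A sketch of the behaviours listed above, using the tables of this deck. The measure names and the literal values ("Red", 1000) are invented for the illustration.

```dax
-- Replaces any existing filter on DimProduct[Color]:
-- the result is the sales of red products, whatever color is selected.
RedSales :=
CALCULATE (
    SUM ( FactInternetSales[SalesAmount] ),
    DimProduct[Color] = "Red"
)

-- Removes only the filter on DimProduct[Color]; the other filters stay.
SalesAllColors :=
CALCULATE (
    SUM ( FactInternetSales[SalesAmount] ),
    ALL ( DimProduct[Color] )
)

-- Several filters are evaluated independently and combined with AND.
RedExpensiveSales :=
CALCULATE (
    SUM ( FactInternetSales[SalesAmount] ),
    DimProduct[Color] = "Red",
    DimProduct[ListPrice] > 1000
)
```

The second measure covers the case where only part of the filter context must be ignored; the first and third add or modify a condition.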
Calculate
Calculate with filter
- The filter is a Boolean condition that works on a single column (e.g. DimProduct[Color] = "White" or DimProduct[ListPrice] > 1000)
- So this formula is not correct:
  ProductLMC := CALCULATE( SUM(FactInternetSales[SalesAmount]); DimProduct[ListPrice] > DimProduct[StandardCost] )
- In this case there are too many columns in the filter (ListPrice and StandardCost)
- Use FILTER:
  ProductLMC := CALCULATE( SUM(FactInternetSales[SalesAmount]); FILTER(DimProduct; DimProduct[ListPrice] > DimProduct[StandardCost]) )
Calculate: pay attention to the filter context
ProductM100 :=
CALCULATE (
    SUM(FactInternetSales[SalesAmount]),
    FILTER(
        DimProduct,
        DimProduct[ListPrice] >= 100 ) )
- DimProduct is evaluated in the original filter context, before CALCULATE is evaluated
- Original filter context: Color = silver
- Resulting filter context: Color = silver and ListPrice >= 100
Calculate: pay attention to the filter context
ProductM100_Bis :=
CALCULATE (
    SUM(FactInternetSales[SalesAmount]),
    FILTER(
        ALL(DimProduct),
        DimProduct[ListPrice] >= 100 ) )
- In the new filter context, SUM sees all the rows with ListPrice >= 100, because "Color = silver" was removed
- Original filter context: Color = silver
- Resulting filter context: ListPrice >= 100, all columns …..
Earlier
- Returns a value from the previous row context
- In a row context only one variable is available
FOR A = 1 TO 5
    FOR B = 1 TO 5
        IF A < B THEN
    NEXT
NEXT
FOR A = 1 TO 5
    FOR A = 1 TO 5
        IF EARLIER (A) < A THEN
    NEXT
NEXT
=SUMX(
    FILTER(
        Sales;
        Sales[Date] <= EARLIER(Sales[Date])
            && YEAR(Sales[Date]) = YEAR(EARLIER(Sales[Date]))
    );
    Sales[Value]
)
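Another common use of the same pattern is a ranking column. The sketch below is a calculated column in DimProduct; the column name PriceRank is invented, and commas are used as the list separator where some locales require semicolons.

```dax
-- Calculated column in DimProduct (hypothetical name: PriceRank).
-- FILTER opens a second row context on DimProduct;
-- EARLIER ( DimProduct[ListPrice] ) reads the value from the outer
-- row context, that is, from the row being computed.
=COUNTROWS (
    FILTER (
        DimProduct,
        DimProduct[ListPrice] > EARLIER ( DimProduct[ListPrice] )
    )
) + 1
```

For the most expensive product the filter returns no rows, COUNTROWS returns a blank, and blank + 1 gives rank 1.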
Calculate – Context transition
= SUM(FactInternetSales[SalesAmount])
= CALCULATE(SUM(FactInternetSales[SalesAmount]))
- As calculated columns in DimProduct, are the two expressions the same?
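The two expressions give different results. The annotated sketch below describes what each one returns when it is defined as a calculated column in DimProduct, assuming the relationship between DimProduct and FactInternetSales used throughout the deck.

```dax
-- Without CALCULATE: a row context does not filter anything,
-- so every product row shows the grand total of all sales.
=SUM ( FactInternetSales[SalesAmount] )

-- With CALCULATE: the current row is turned into a filter context
-- (context transition) that propagates to FactInternetSales,
-- so each row shows the sales of that product only.
=CALCULATE ( SUM ( FactInternetSales[SalesAmount] ) )
```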
ABC and Pareto Analysis
- 80% of effects come from 20% of the causes
- 80% of sales come from 20% of customers
- Pareto analysis is the basis of the ABC classification
- Class A contains items for >=70% of total value
- Class B contains items for >=20% and <70% of total value
- Class C contains items for <20% of total value
ABC and Pareto Analysis
- For each row, calculate TotalSales:
  =CALCULATE( SUM(FactInternetSales[SalesAmount]))
- Sum the TotalSales of all products whose total sales are greater than or equal to those of the current row (RunningTotalSales):
  = SUMX( FILTER( DimProduct; DimProduct[TotalSales] >= EARLIER(DimProduct[TotalSales]) ); DimProduct[TotalSales] )
ABC and Pareto Analysis
- Calculate the running percentage of the total sales (RunningPct):
  =DimProduct[RunningTotalSales] / SUM(DimProduct[TotalSales])
- Show the labels A, B, C:
  =IF( DimProduct[RunningPct] <= 0.7; "A"; IF( DimProduct[RunningPct] <= 0.9; "B"; "C" ) )
ABC and Pareto Analysis
- Count the number of products that generate those sales:
=COUNTROWS(DimProduct)
ABC and Pareto Analysis
Links and Books
- PowerPivot: http://www.powerpivot.com
- SQLBI: http://www.sqlbi.com
- WebCast (PowerPivot 1.0): http://www.presentation.ialweb.it/p29261115/
Book
Q & A
Thanks!
#sqlsat264 #sqlsatancona