Dax en

SQL Saturday #264 Ancona, Italy Introduction DAX


Page 1: Dax en

SQL Saturday #264

Ancona, Italy

Introduction DAX

Page 2: Dax en

Sponsors & Media Partners

Page 3: Dax en

Marco Pozzan

Working with SQL Server since version 2000
Working in BI since 2005
President of the 1nn0va community (www.innovazionefvg.net)
Project manager at Servizi CGN (www.cgn.it)
SQL Server and DWH consultant

References:
twitter: @marcopozzan
email: [email protected]
site: www.marcopozzan.it

Page 4: Dax en

Agenda

What is PowerPivot?
Demo of PowerPivot
What is DAX?
Calculated columns and calculated fields
Evaluation context
Calculate
Demo in DAX

Page 5: Dax en

What is Powerpivot?

Free add-in for Microsoft Excel 2010 and 2013
Different versions for 32/64 bit (4GB limit)
Does not need SQL Server or other prerequisites
Very powerful analysis engine based on the SSAS of SQL Server 2012
No API available to control it
No security available
Always impersonates the current user

Page 6: Dax en

Versions

Where it is installed:
Client-side: inside Excel
Server-side: built on SharePoint 2012 or SQL Server 2012 (Tabular)

It is the SSAS engine; in the client-side version the engine runs in-process with Excel

Page 7: Dax en

SSAS 2012

Uses the VertiPaq engine, a columnar database with high compression
Works completely in memory
No I/O, aggregates, or other…
IMBI = a new way of thinking about the algorithms

Page 8: Dax en

Powerpivot

Import data
Relationships between tables
Slicers

Page 9: Dax en

Advantages (PowerPivot)

Fast
  No ETL (Power Query)
  Metadata (model)
  Integration of heterogeneous sources
Sharing
  Especially with SharePoint
Expressiveness
  Relationships and DAX

Page 10: Dax en

Disadvantages (PowerPivot)

No ETL to clean the data
Quality of data?
Data size

Please note that these are not problems!

Page 11: Dax en

What is DAX?

Designed to work within a PivotTable
Programming language of Tabular and PowerPivot
Resembles Excel (so they say)
  No concept of «row» and «column»
  Different type system
Mix between MDX, SQL and Excel

Page 12: Dax en

Dax Types

Non-numerical:
  String
  Binary Objects (Power View)
Numerical:
  Currency
  Integer
  Real
  DateTime (integer part: days since 30/12/1899; decimal part: fraction of a day)
  Boolean

Page 13: Dax en

Type Handling

Operators are not strongly typed ("1"+1)
Operator overloading (warning)
Example:
1 & 2 = "12"
"1" + "2" = 3
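The two conversions in the example can be checked with a one-row query. This is only a sketch: it assumes a tool that can run DAX queries against a Tabular or PowerPivot model (DAX Studio, for instance), and the column labels are names chosen for illustration.

```dax
EVALUATE
ROW (
    "Concatenated", 1 & 2,     -- & converts both operands to text: "12"
    "Added", "1" + "2"         -- + converts both operands to numbers: 3
)
```

The operator, not the operand type, decides the conversion, which is why the slide attaches a warning to operator overloading.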

Page 14: Dax en

Columns in DAX 1/2

'TableName'[ColumnName]

Quotes can be omitted if the table name does not contain spaces (don't do it)

=FactInternetSales[OrderDate]

Page 15: Dax en

Columns in DAX 2/2

The table name can be omitted; the column is then looked up in the current table
Avoid doing this, because it makes the formulas hard to understand

Brackets cannot be omitted

=[OrderDate]

Page 16: Dax en

Calculated Columns

Computed using DAX and persisted in the database
Can use other columns
Always computed for the current row
FactInternetSales[OrderDate] means:
  The value of the OrderDate column
  In the FactInternetSales table
  For the current row
  Different for each row

Page 17: Dax en

Measures (Calculated Fields)

Do not work row by row
Written using DAX
Not stored in the database
Use tables and aggregators
Do not have a «current row»

The following formula cannot be written as a measure:

=FactInternetSales[OrderDate]

The column has to be aggregated:

:=SUM(FactInternetSales[OrderDate])

Page 18: Dax en

Define the right name of the column

If you rename a column, the measures that use it must be changed manually
So define the right names from the start

Page 19: Dax en

Calculated columns and Measures

Suppose you want to calculate the margin with a calculated column:

=FactInternetSales[SalesAmount]-FactInternetSales[TotalProductCost]

I can then aggregate the Margin column with a measure:

SUMofMargin:=SUM(FactInternetSales[Margin])

Page 20: Dax en

Calculated columns and Measures

Margin as a percentage of sales (Margine%)

This calculated-column expression is not correct once it is aggregated:

=FactInternetSales[Margin] / FactInternetSales[SalesAmount]

I must use this measure instead:

Margine%:=SUM(FactInternetSales[Margin]) / SUM(FactInternetSales[SalesAmount])
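A small worked case shows why the row-level ratio cannot simply be aggregated. The two sales rows in the comments are invented figures used only for the arithmetic; the measure itself is the one from the slide.

```dax
-- Hypothetical rows:   SalesAmount   Margin   Margin / SalesAmount
--                          100          10          0.10
--                          300          90          0.30
-- Summing the column of ratios gives 0.10 + 0.30 = 0.40, which is not a margin percentage.
-- The ratio of the sums gives (10 + 90) / (100 + 300) = 0.25, the actual margin on sales.
Margine% := SUM ( FactInternetSales[Margin] ) / SUM ( FactInternetSales[SalesAmount] )
```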

Page 21: Dax en

Measures rules and convention

Define the table to which the measure belongs
Measures are global to the model
  There cannot be two measures with the same name in different tables
  A measure can be moved from one table to another; this cannot be done with calculated columns
Do not refer to a measure with its table name: it can be confused with a calculated column

Page 22: Dax en

Summary 1/2

Columns consume memory and measures consume CPU
They are calculated at different times
They have different purposes
They are structured differently
They are managed in different ways

Page 23: Dax en

Summary 2/2

Use measures (90%) when you:
  Calculate ratios
  Calculate percentages
  Need complex aggregations
Use columns (10%) when:
  You need slicer or filter values
  The expression is calculated on the current row

Page 24: Dax en

Counting Values

COUNTROWS: rows in a table
COUNTBLANK: counts blanks
COUNTA: counts anything except blanks
COUNT: only for numeric columns
  Compatibility with Excel
DISTINCTCOUNT: performs a distinct count
  Multidimensional -> measure group with a distinct count measure… slow as a snail
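As an illustration of the counting functions, the measures below apply each one to DimProduct. This is a sketch: the measure names are hypothetical, and it assumes DimProduct[Color] is a text column and DimProduct[ListPrice] a numeric one, as the other slides suggest.

```dax
ProductRows    := COUNTROWS ( DimProduct )                -- rows of the table
BlankColors    := COUNTBLANK ( DimProduct[Color] )        -- rows where Color is blank
FilledColors   := COUNTA ( DimProduct[Color] )            -- rows where Color is not blank
PricedProducts := COUNT ( DimProduct[ListPrice] )         -- non-blank values of a numeric column
DistinctColors := DISTINCTCOUNT ( DimProduct[Color] )     -- number of different colors
```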

Page 25: Dax en

Errors in DAX 1/2

1+2 always works
[SalesAmount]/[Margin] might fail
Causes of errors:
  Conversion errors
  Arithmetical operations
  Empty or missing values
ISERROR (Expression) returns true or false, depending on whether an error occurs during evaluation

Page 26: Dax en

Errors in DAX 2/2

IFERROR (Expression, Alternative) returns Alternative if evaluating Expression raises an error; otherwise it returns the value of Expression
Useful to avoid writing the expression twice
Both IFERROR and ISERROR are very slow, so be careful when using them in calculated columns
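The point about not writing the expression twice can be seen by putting the two forms side by side. A sketch only, assuming two calculated columns in FactInternetSales and reusing the ratio from the previous slide; returning BLANK () as the alternative value is an arbitrary choice.

```dax
-- With IFERROR the expression appears once
= IFERROR ( FactInternetSales[SalesAmount] / FactInternetSales[Margin], BLANK () )

-- With ISERROR the same expression has to be repeated
= IF (
    ISERROR ( FactInternetSales[SalesAmount] / FactInternetSales[Margin] ),
    BLANK (),
    FactInternetSales[SalesAmount] / FactInternetSales[Margin]
)
```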

Page 27: Dax en

Aggregation Functions

Work only on numeric columns
Aggregation functions:
  SUM
  AVERAGE
  MIN
  MAX
Aggregate columns only, not expressions:
  Valid: SUM(Orders[Quantity])
  Not valid: SUM(Orders[Quantity] * Orders[Quantity])

Page 28: Dax en

The X aggregation functions 1/2

Iterate over the table and evaluate the expression for each row
Always take two parameters: the table to iterate and the formula to evaluate
SUMX, AVERAGEX, MINX, MAXX

SUMX ( Sales, Sales[Price] * Sales[Quantity] )

Page 29: Dax en

The X aggregation functions 2/2

First the inner expression is evaluated for each row, then the results are summed
The columns must all be in the same table, or use RELATED (if there is a relationship)
They are very slow but do not use memory
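When the columns sit in different tables, RELATED can be used inside the iterated expression. The measure below is a minimal sketch: its name is hypothetical, and it assumes that FactInternetSales has an OrderQuantity column (not shown in these slides) and a many-to-one relationship to DimProduct.

```dax
SalesAtListPrice :=
SUMX (
    FactInternetSales,                              -- table to iterate
    FactInternetSales[OrderQuantity]                -- column of the iterated table
        * RELATED ( DimProduct[ListPrice] )         -- column reached through the relationship
)
```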

Page 30: Dax en

Alternative to the X functions

As an alternative to the X functions:
  Create a calculated column
  Aggregate on that column
  Very fast, but uses memory

Page 31: Dax en

Logical Functions

AND (little used) or &&
OR (little used) or ||
IF
IFERROR
NOT (little used)
SWITCH

Page 32: Dax en

Switch

Color :=IF(DimProduct[Color] = "R", "Red",
        IF(DimProduct[Color] = "Y", "Yellow", "Other"))

Color :=SWITCH (
    DimProduct[Color],
    "R", "Red",
    "Y", "Yellow",
    "Other"
)

Page 33: Dax en

Information Functions

Completely useless (they take only columns, not expressions):
  ISNUMBER
  ISTEXT
  ISNONTEXT
Useful:
  ISBLANK
  ISERROR

If we, who created the column, do not know whether it holds a number or a text, who should? (Alberto Ferrari)

Page 34: Dax en

DIVIDE Function

Checks that the denominator is not 0

IF( Sales[Price] <> 0, Sales[Quantity] / Sales[Price], 0 )

DIVIDE( Sales[Quantity], Sales[Price], 0 )

Page 35: Dax en

Date Functions

Many useful functions:

DATE, DATEVALUE, DAY, EDATE, EOMONTH, HOUR, MINUTE, MONTH, NOW, SECOND, TIME, TIMEVALUE, TODAY (interesting!!!), WEEKDAY, WEEKNUM, YEAR, YEARFRAC

Time intelligence functions

Page 36: Dax en

Evaluation Context 1/3

Distinguishes DAX from any other language
Similar to the “where clause” of an MDX query in SSAS
The contexts under which a formula is evaluated:
  Filter context
  Row context

Page 37: Dax en

Evaluation context 2/3

Filter context:
  Set of active rows for the computation
  The filter that comes from the PivotTable
  Defined by slicers, filters, columns, rows
  One for each cell of the PivotTable

Page 38: Dax en

Evaluation context 3/3

Row context:
  Contains a single row
  The current row during iterations
  Defined by an X function or a calculated column definition, not by PivotTables
  This concept is new compared with MDX, which does not work leaf by leaf but only on the context

Page 39: Dax en

The two contexts always exist

Filter context:
  Filters tables
  Might be empty (all the tables are visible)
  It is used by aggregate functions
  In a calculated column it covers all the tables, because there is no PivotTable
Row context:
  Iterates the active rows in the filter context
  Might be empty (there is no iteration running)

Page 40: Dax en

With multiple tables?

Page 41: Dax en

Evaluation Context

Filter context:
  Is propagated through relationships, from the one side to the many side
  The direction of the relationships is very important; this is different from SQL (inner, left, ...)
  Applied only once (+ performance)
Row context:
  Does not propagate over relationships
  Use RELATED (opens a new row context on the target)
  Applied for each row (- performance)
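Because the row context does not follow relationships by itself, a calculated column has to ask for the related data explicitly. The two calculated columns below are a sketch, assuming a many-to-one relationship from FactInternetSales to DimProduct.

```dax
-- Calculated column in FactInternetSales (many side):
-- RELATED fetches the value from the single related product row
= RELATED ( DimProduct[Color] )

-- Calculated column in DimProduct (one side):
-- RELATEDTABLE returns the sales rows related to the current product
= COUNTROWS ( RELATEDTABLE ( FactInternetSales ) )
```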

Page 42: Dax en

Example of a Filter Context

Page 43: Dax en

Table Function

FILTER (adds new conditions; it is an iterator!!!)
ALL (removes all conditions from a table; returns all rows of a table)
  Useful to calculate ratios and percentages
  Removes all filters from the specified columns in the table
VALUES (the values of a column, including blanks)
RELATEDTABLE (all the rows related to the current row)
All of these functions return a table
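A typical use of ALL for percentages is to divide the value in the current filter context by the value over all rows. The measure below is a sketch with a hypothetical name; it uses ALL as a table function inside SUMX, so the denominator ignores whatever the PivotTable is filtering.

```dax
SalesPctOfTotal :=
SUM ( FactInternetSales[SalesAmount] )                                      -- sales in the current cell
    / SUMX ( ALL ( FactInternetSales ), FactInternetSales[SalesAmount] )   -- sales over all rows
```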

Page 44: Dax en

Filter

Page 45: Dax en

All

Page 46: Dax en

Mixing Filters

Page 47: Dax en

VALUES

Returns a table with a single column containing all the values of the column that are visible in the current context

When the result has one column and one row, it can be used as a scalar

SelectedYear:=COUNTROWS(VALUES(Dati[Year]))
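The scalar use of VALUES is usually guarded by the row count, so that the conversion is attempted only when exactly one value is visible. A minimal sketch with a hypothetical measure name, reusing the Dati[Year] column from the example above:

```dax
SelectedYearValue :=
IF (
    COUNTROWS ( VALUES ( Dati[Year] ) ) = 1,   -- exactly one year visible in the current context
    VALUES ( Dati[Year] ),                     -- one row, one column: usable as a scalar
    BLANK ()                                   -- several years visible: no single value to return
)
```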

Page 48: Dax en

RELATEDTABLE

Returns only the sales rows (Dati) related to the current store (Store)

=COUNTROWS (RELATEDTABLE(Dati))

Page 49: Dax en

Considerations

We have seen that we can:
  Add a filter on a column
  Remove the filter from a whole table
  Mix filters
…but we still cannot:
  Ignore only a part of the filter context rather than all of it
  Add a condition to the filter context or modify an existing condition

Page 50: Dax en

Calculate

The simplest function to write, but complex to understand

CALCULATE( Expression, Filter1, …, FilterN )

The filters are computed first (combined with AND), then the expression
All filters are processed in parallel and are independent of each other
Replaces the filter context (for a whole table or a single column)
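The three behaviours listed above can be sketched with a few measures. The measure names are hypothetical and the colour and price values are arbitrary; the tables and columns are the ones used elsewhere in the slides.

```dax
-- Replaces any existing filter on DimProduct[Color] with Color = "Red"
RedSales :=
CALCULATE (
    SUM ( FactInternetSales[SalesAmount] ),
    DimProduct[Color] = "Red"
)

-- Several filter arguments are combined with AND
RedExpensiveSales :=
CALCULATE (
    SUM ( FactInternetSales[SalesAmount] ),
    DimProduct[Color] = "Red",
    DimProduct[ListPrice] > 1000
)

-- ALL on a table removes the filters from the whole table
AllProductSales :=
CALCULATE (
    SUM ( FactInternetSales[SalesAmount] ),
    ALL ( DimProduct )
)
```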

Page 51: Dax en

Calculate

Page 52: Dax en

Calculate with filter

The filter is a boolean condition that works on a single column (Ex: DimProduct[Color] = "White" or DimProduct[ListPrice] > 1000)

So this formula is not correct:

ProductLMC := CALCULATE( SUM(FactInternetSales[SalesAmount]); DimProduct[ListPrice] > DimProduct[StandardCost] )

In this case there are too many columns in the filter (ListPrice and StandardCost)

Use FILTER:

ProductLMC := CALCULATE( SUM(FactInternetSales[SalesAmount]); FILTER(DimProduct; DimProduct[ListPrice] > DimProduct[StandardCost] ))

Page 53: Dax en

Calculate – pay attention to the filter context

ProductM100:=
CALCULATE (
    SUM(FactInternetSales[SalesAmount]),
    FILTER(
        DimProduct,
        DimProduct[ListPrice] >= 100
    )
)

DimProduct is evaluated in the original filter context, before CALCULATE is evaluated

Original filter context: Color = silver
Resulting filter context: Color = silver, ListPrice >= 100

Page 54: Dax en

Calculate – pay attention to the filter context

ProductM100_Bis:=
CALCULATE (
    SUM(FactInternetSales[SalesAmount]),
    FILTER(
        ALL(DimProduct),
        DimProduct[ListPrice] >= 100
    )
)

In the new filter context SUM sees all the rows with ListPrice >= 100, because “Color = silver” was removed

Original filter context: Color = silver
Resulting filter context: ListPrice >= 100, all columns…

Page 55: Dax en

Earlier

Returns a value from the previous row context:

In a row context only one variable is available

FOR A = 1 TO 5
    FOR B = 1 TO 5
        IF A < B THEN
    NEXT
NEXT

FOR A = 1 TO 5
    FOR A = 1 TO 5
        IF EARLIER (A) < A THEN
    NEXT
NEXT

=SUMX(
    FILTER(
        Sales;
        Sales[Date] <= EARLIER(Sales[Date])
            && YEAR(Sales[Date]) = YEAR(EARLIER(Sales[Date]))
    );
    Sales[Value]
)

Page 56: Dax en

Calculate – Context transition

= SUM(FactInternetSales[SalesAmount])

= CALCULATE(SUM(FactInternetSales[SalesAmount]))

Written in DimProduct, are the two expressions the same?
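They are not the same, and the difference is the context transition. The sketch below assumes both expressions are calculated columns in DimProduct and that FactInternetSales is related to DimProduct.

```dax
-- Calculated column in DimProduct: the filter context is empty and SUM ignores
-- the row context, so every product row shows the grand total of sales
= SUM ( FactInternetSales[SalesAmount] )

-- CALCULATE turns the current row into a filter context (context transition),
-- so each product row shows only the sales of that product
= CALCULATE ( SUM ( FactInternetSales[SalesAmount] ) )
```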

Page 57: Dax en

ABC and Pareto Analysis

80% of effects come from 20% of the causes
80% of sales come from 20% of customers

Pareto analysis is the basis of the ABC classification

Class A contains the items that make up the first 70% of total value
Class B contains the items that make up the next 20% (from 70% to 90%)
Class C contains the items that make up the remaining 10%

Page 58: Dax en

ABC and Pareto Analysis

For each row calculate TotalSales:

=CALCULATE( SUM(FactInternetSales[SalesAmount]))

RunningTotalSales: sum the total sales of all products whose total sales are greater than or equal to those of the current row:

= SUMX(
    FILTER(
        DimProduct;
        DimProduct[TotalSales] >= EARLIER(DimProduct[TotalSales])
    );
    DimProduct[TotalSales]
)

Page 59: Dax en

ABC and Pareto Analysis

Calculate the running percentage of sales of each product over the total sales (RunningPct):

=DimProduct[RunningTotalSales] / SUM(DimProduct[TotalSales])

Display the labels A, B, C:

=IF( DimProduct[RunningPct] <= 0.7; "A";
    IF( DimProduct[RunningPct] <= 0.9; "B"; "C" )
)

Page 60: Dax en

ABC and Pareto Analysis

the number of products that generate those sales

=COUNTROWS(DimProduct)

Page 61: Dax en

ABC and Pareto Analysis

Page 62: Dax en

Link and Book

PowerPivot http://www.powerpivot.com

SQLBI http://www.sqlbi.com

WebCast (Powerpivot 1.0) http://www.presentation.ialweb.it/p29261115/

Book

Page 63: Dax en

Q & A

Page 64: Dax en

Thanks!

#sqlsat264 #sqlsatancona