Dataflow Integration Solution for Power BI Reza... · Dataflow is a Power Query process that runs...

Reza Rad Dataflow Integration Solution for Power BI



Reza Rad

Dataflow Integration Solution for Power BI


Reza Rad

Consultant, Trainer

RADACAD

• Consultant, Mentor, Trainer, Speaker

• Microsoft Regional Director

• Microsoft Data Platform MVP

• Author of SQL Server and BI books

• Author of Power BI from Rookie to Rock Star book

• Power BI Trainer for thousands of Developers

• Microsoft Certified Trainer

• Microsoft Certified Professional

• Co-Leader of NZ BI User Group & Difinity

/rezarad @rad_reza rezaradf


Agenda

• What is Dataflow?

• Scenarios of Use Cases

• Creating Dataflows

• Entity Types

• Licensing

• Common Data Model

• CDM Folder Structure


What is Dataflow?


What is Dataflow?

Dataflow is a Power Query process that runs in the cloud independently from any Power BI reports.


Where Is the Output Stored?

Dataflow stores the data in Azure Data Lake storage.


But I Don’t Have an Azure Data Lake Subscription!

• Dataflow manages the Data Lake configurations internally. You won’t need anything except your Power BI accounts and subscriptions.


Power BI Can Get Data from Dataflow


What are the benefits of using Dataflow?

Sample Scenarios


Using One Power Query Table in Multiple Power BI Reports


Using One Power Query Table in Multiple Power BI Reports

• Tables or queries that are reused across multiple Power BI files are among the best candidates for Dataflow.


Different Data Sources with Different Refresh Schedules

• Dataflow can run the extract, transform, and load (ETL) process on a different schedule for every query (or table).


Centralized Data Warehouse

• Dataflow can be the ETL engine that feeds the centralized data warehouse in Azure Data Lake storage.


Versioning Data from a Data Source

• Dataflow can be used for versioning the data from the source into multiple destination tables.


Creating Dataflow


Prerequisites

• Developing or editing Dataflows is possible only through the Power BI service (not the Desktop)

• Dataflow is only available in an app workspace (not in “My workspace”)

Administrative Control


Dataflow Demo


Computed Entities

• Created when you reference another query

• The main query should have “Enable Load” turned on


Computed Entities

• Can be good for performance

• The main table is stored in Power BI Dataflow storage, and the new table queries the stored main table (not the data source)
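
As a rough illustration of why this can help performance, the Python sketch below models a main entity that is loaded from the source once and stored, and a computed entity that reads only the stored copy. It is not Power Query and says nothing about how the service is implemented; `SourceSystem`, `load_main_entity`, and `computed_entity` are hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass
class SourceSystem:
    """Stand-in for a slow external data source (hypothetical)."""
    rows: list
    hits: int = 0  # counts how many times the source was queried

    def query(self) -> list:
        self.hits += 1
        return list(self.rows)


def load_main_entity(source: SourceSystem) -> list:
    """Main entity: queries the source once; the result is what gets stored."""
    return source.query()


def computed_entity(stored_rows: list) -> dict:
    """Computed entity: transforms the stored main entity, never the source."""
    totals = {}
    for row in stored_rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    return totals


source = SourceSystem(rows=[
    {"region": "North", "amount": 10},
    {"region": "South", "amount": 5},
    {"region": "North", "amount": 7},
])
stored = load_main_entity(source)          # the only trip to the source
sales_by_region = computed_entity(stored)  # works on the stored copy
print(sales_by_region, "source hits:", source.hits)
```

The aggregation runs against the stored rows, so the source is queried only once no matter how many computed entities are built on top of the main table.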


Linked Entity

• Different from Computed Entity

• No transformation

• No storage

• Read only

• Just a LINK


Premium Requirement


Common Data Model


Silos of Data: Integration Challenge


Shared Data Model


Common Data Model

• Shared data model

• More than 250 entities

• Started with Dynamics 365


CDM folders: Enabling low friction collaboration among Data + AI professionals

[Diagram; labels recovered from the slide]

• Business analysts: low/no code

• Data scientists, data engineers: low to high code

• Power BI dataflows

• Dynamics 365, CDS for Apps data

• Office 365, Office Substrate

• Adobe Customer Experience Platform

• SAP C/4HANA, S/4HANA

• Azure IoT

• ISV partners

• Custom LOB + Developer resources

• Power BI

• Azure Data Services: data cataloging, data prep, AI, machine learning, data warehousing


Benefit of CDM

• Decoupling applications from data sources


Industry Accelerators

• Pre-packaged applications working with CDM


Example: Higher-Education Accelerator


CDM Internals


Data Storage for CDM

• Azure Data Lake

• CDM Folders


Folder Structure


What is the structure of a CDM Folder?

• Metadata: Model.json

• Data Files: CSV
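
To make the two parts concrete, here is a minimal Python sketch that writes a toy CDM-style folder (one model.json plus one CSV data file) into a temporary directory and reads it back by combining the column names from the metadata with the rows from the CSV. The entity, the file names, and the relative `location` path are assumptions made for illustration; a folder written by Power BI carries much more metadata and typically points to data files by full URL.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_cdm_folder(root: Path) -> Path:
    """Create a toy CDM-style folder: model.json plus one CSV data file."""
    entity_dir = root / "Customer"
    entity_dir.mkdir(parents=True)
    # Data file: rows only; column names live in the metadata.
    with (entity_dir / "Customer.csv").open("w", newline="", encoding="utf-8") as f:
        csv.writer(f).writerows([[1, "Contoso"], [2, "Fabrikam"]])

    model = {
        "name": "SampleDataflow",  # hypothetical dataflow name
        "entities": [{
            "name": "Customer",
            "attributes": [{"name": "Id"}, {"name": "Name"}],
            # Relative path is a simplification for this sketch.
            "partitions": [{"location": "Customer/Customer.csv"}],
        }],
    }
    (root / "model.json").write_text(json.dumps(model, indent=2), encoding="utf-8")
    return root


def read_entity(root: Path, entity_name: str) -> list:
    """Use model.json to find an entity's columns and data files, then load rows."""
    model = json.loads((root / "model.json").read_text(encoding="utf-8"))
    entity = next(e for e in model["entities"] if e["name"] == entity_name)
    columns = [a["name"] for a in entity["attributes"]]
    rows = []
    for partition in entity["partitions"]:
        with (root / partition["location"]).open(newline="", encoding="utf-8") as f:
            rows.extend(dict(zip(columns, record)) for record in csv.reader(f))
    return rows


with tempfile.TemporaryDirectory() as tmp:
    folder = write_cdm_folder(Path(tmp))
    customers = read_entity(folder, "Customer")
print(customers)
```

Note that the CSV reader returns every value as a string; a real consumer would apply the data types declared in the metadata.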


Model.json

• Root elements: description, last modified time, data culture

• Entity information:

• Reference models

• Relationships

• Annotations

• pbi:mashup: Transformations

• https://docs.microsoft.com/en-us/common-data-model/model-json
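
The elements listed above can be explored with a few lines of Python. The sketch below parses a small hand-written sample and summarizes its entities and relationships. The sample is illustrative only: its field names follow my reading of the model.json documentation linked above and should be checked against it, and the `pbi:mashup` section is left out to keep the example short.

```python
import json

# Hand-written sample for illustration; not exported from a real dataflow.
SAMPLE_MODEL_JSON = """
{
  "name": "SalesDataflow",
  "description": "Illustrative sample only",
  "version": "1.0",
  "culture": "en-US",
  "modifiedTime": "2019-01-01T00:00:00+00:00",
  "entities": [
    {"$type": "LocalEntity", "name": "Customer",
     "attributes": [{"name": "Id", "dataType": "int64"},
                    {"name": "Name", "dataType": "string"}]},
    {"$type": "LocalEntity", "name": "Order",
     "attributes": [{"name": "CustomerId", "dataType": "int64"}]}
  ],
  "referenceModels": [],
  "relationships": [
    {"$type": "SingleKeyRelationship",
     "fromAttribute": {"entityName": "Order", "attributeName": "CustomerId"},
     "toAttribute": {"entityName": "Customer", "attributeName": "Id"}}
  ],
  "annotations": [{"name": "owner", "value": "hypothetical"}]
}
"""


def summarize(model: dict) -> dict:
    """Collect root info, entity columns, and relationships from a parsed model."""
    entities = {
        e["name"]: [a["name"] for a in e.get("attributes", [])]
        for e in model.get("entities", [])
    }
    relationships = [
        "{}.{} -> {}.{}".format(
            r["fromAttribute"]["entityName"], r["fromAttribute"]["attributeName"],
            r["toAttribute"]["entityName"], r["toAttribute"]["attributeName"],
        )
        for r in model.get("relationships", [])
    ]
    return {
        "culture": model.get("culture"),
        "modified": model.get("modifiedTime"),
        "entities": entities,
        "relationships": relationships,
    }


summary = summarize(json.loads(SAMPLE_MODEL_JSON))
print(json.dumps(summary, indent=2))
```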


Dataflow and REST API
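
The slide does not say which calls were demonstrated. As one possible example, the Python sketch below builds (but does not send) a request to the Power BI REST API operation that triggers a dataflow refresh. The workspace ID, dataflow ID, and access token are placeholders, acquiring the token is out of scope, and `build_refresh_request` is a hypothetical helper; check the endpoint and body against the current REST API reference before relying on them.

```python
import json
import urllib.request

API_ROOT = "https://api.powerbi.com/v1.0/myorg"


def build_refresh_request(group_id: str, dataflow_id: str,
                          access_token: str) -> urllib.request.Request:
    """Build the POST request that asks the service to refresh one dataflow."""
    url = f"{API_ROOT}/groups/{group_id}/dataflows/{dataflow_id}/refreshes"
    body = json.dumps({"notifyOption": "MailOnFailure"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )


# Placeholders only; replace with real IDs and a valid Azure AD token.
request = build_refresh_request("<workspace-id>", "<dataflow-id>", "<token>")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the sketch runnable without credentials and makes the URL easy to inspect.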


Licensing?


Summary

• What is Dataflow?

• Scenarios of Use Cases

• Creating Dataflows

• Entity Types

• Licensing

• Common Data Model

• CDM Folder Structure


Power BI Book

• https://www.apress.com/gp/book/9781484240144#otherversion=9781484240151


References to Study More

Power BI from Rookie to Rock Star book: FREE

http://www.radacad.com/online-book-power-bi-from-rookie-to-rockstar

Reza Rad’s series on Dataflow

• What are the Use Cases of Dataflow for You in Power BI?

• Getting Started With Dataflow in Power BI – Part 2 of Dataflow Series

• What is the Common Data Model and Why Should I Care? Part 3 of Dataflow Series in Power BI

• Linked Entities and Computed Entities; Dataflows in Power BI Part 4

Matthew Roche’s series on Dataflow

Matthew has written a great series and resources on dataflows; read this post.

Microsoft Documentation: Dataflow

https://docs.microsoft.com/en-us/power-bi/service-dataflows-overview
