AI-Powered Data Cataloging Virtual Summit Data Cataloging for … · With Maersk & Informatica....


AI-Powered Data Cataloging Virtual Summit

Data Cataloging for Data Governance

With Maersk & Informatica


Requirements:

Detailed Technical Lineage

Intelligent Glossary Associations

Agile Data Curation

Data Certifications

Key Data Element Discovery

Data Governance ensures that enterprise data is trusted, governed and protected. Intelligent data cataloging plays a critical role in enabling agile data governance at scale.

Data Cataloging for Data Governance


Enterprise Data Catalog: AI, Human Knowledge and Collaboration

AI-powered automatic discovery, enrichment and curation

Business context via intelligent business term association

Collaboration & social curation to tap into shared data knowledge


Enterprise Data Catalog

Informatica: PowerCenter | DQ | MDM | BDM | DIH | BG | ILM | Axon | Informatica Cloud

Databases: Oracle | DB2 | DB2 for z/OS | SQL Server | Sybase | Teradata | Netezza | JDBC | SQL Scripts | SAP HANA | Stored Procedures

Applications: SAP R/3 | Salesforce | Oracle | Workday

Big Data: HIVE (Cloudera, Hortonworks, MapR, IBM BigInsights, EMR, HDI) | HDFS | MapRFS | Cloudera Navigator | Atlas

Cloud Platforms: AWS S3 | AWS Redshift | Azure SQL DB | Azure SQL DW | Azure ADLS | Azure Blob | Google BigQuery | ADLS Gen 2

File Formats: CSV | Delimited | XML | JSON | Avro | Parquet | MS Excel | Adobe PDF | Flat File | MS PowerPoint | MS Word

Business Intelligence: Tableau | IBM Cognos | SAP BusinessObjects | MicroStrategy | OBIEE

Other: Microsoft SSIS | Erwin Models | PowerDesigner | Oracle Data Integrator | IBM DataStage | Custom Scanner Framework


Speakers

David Falder, Senior Technical Specialist

Gaurav Pathak, Senior Director, Product Management

Dharma Kuthanur, Senior Director, Product Marketing


Classification: Public


MAERSK

A TRANSFORMATIONAL JOURNEY

David Falder

Virtual Summit – Aug 2019


David Falder (UK)

Senior Technical Specialist

Speaker


Our Transformational Themes

Develop new, in-house capabilities

Change the way we work

Create new, innovative products

• Scrum teams, experts in data engineering

• Develop existing & hire new capabilities

• Decrease number of external consultants

• Train employees

• Test & learn, fail-fast

• Reduced governance, bureaucracy & handovers

• Embed BI in the business

• Reduce "time to market"

• E2E responsibility for Scrum Teams

• Automate standard reports & provide self serve capabilities

• Reduce reporting legacy

• Democratise data

• Partner with IT & Digital to enable development of analytical products


Future BI Requirements

Going Agile

• Classic agile (Prod. Owner + Scrum Team)

• Operating model must be able to handle net new ideas and software

• Use OKRs for improved alignment on planning in development teams

• Scalable BI platform, that supports the ambitions for Maersk BI

• Don’t compromise reliability of reports or data

From Service Delivery to Partner

• Invest in POs so that they can work in close proximity with business stakeholders

• Enable more flexibility in the organization (Network-structure and PoC teams)

• Implement Product- and Market Mindset

Reduce Time to Market

• Paradox in expectation to be proactive and knowledgeable to earn a seat at the table

• Change perception of BI with the business stakeholders

• Involve the business more through a rapid release cycle and an MVP approach - test and learn, fail-fast

• We want business and IT to partner up and work closer together, to ensure value and relevance


Data Platform Strategy – Non SQL / non-Technical

[Diagram labels: Front End Consumption / Visualization (MicroStrategy / MSBI / Power BI / etc.); Search / Find / Understand; Axon (business data assets, business lineage, ownership, classification, quality; business assets governance); EDC (tech assets, tech lineage; technical assets management); EDP (Prepare: Ontology, Cleansed, Raw; Projects; Mashups); IDQ (data quality management, rate data, DQ profiles); Sources (RDBMS, Data Lake, Local .txt, xls); Search/Find/Add**; Search/Find/Select]

1 – Load Catalogue (Others)

2 – Search, Find, Understand

3* – Rate (crowd source)

4 – Search & Create Project for Prep

5 – Import to project

6 – Prepare data

7 – Publish dataset

8 – Visualize in tool of choice

9 – Publish viz & share visual

10 – Consumers access

3* - Optional

** Integration into MSTR on road map

EDC – Enterprise Data Catalog

EDP – Enterprise Data Preparation

IDQ – Informatica Data Quality

Axon – Axon Data Governance


Data Governance Strategy

Future scope


Self Service BI

Secure@source - Future scope


EDC Current Load Status


EDC Development Tasks

DataBricks | Azure Data Factory | Power BI | MSBI | PowerCenter


• Data Migration

  • Data Engineers are saving considerable effort in understanding the transformations as they migrate data from Teradata to ADLS.

  • Engineers can generate Lineage Diagrams quickly, so they can understand data flows and transformations sooner.

• Continuous Development and Deployment

  • Engineers are utilising EDC to perform impact analysis when making changes. They can quickly understand the impact of a change on downstream consumers.

• Helping with New Ideas

  • Engineers are starting to utilise EDP to fast-track new ideas, with self-service MicroStrategy in use as the visualisation tool.


Additional EDC Use Cases
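The impact-analysis use case can be illustrated with a minimal sketch. It assumes lineage is available as a simple directed graph and walks it breadth-first to list everything downstream of a changed asset. The asset names, the `LINEAGE` mapping and the `downstream_assets` helper are hypothetical illustrations and are not part of the EDC product or its API.

```python
from collections import deque

# Hypothetical lineage edges: each asset maps to the assets that read from it.
LINEAGE = {
    "teradata.sales_raw": ["adls.sales_cleansed"],
    "adls.sales_cleansed": ["adls.sales_agg", "report.daily_sales"],
    "adls.sales_agg": ["report.weekly_sales"],
}


def downstream_assets(lineage, changed):
    """Return every asset reachable downstream of `changed`, breadth-first."""
    seen = set()
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in seen:  # visit each asset once, even with shared paths
                seen.add(child)
                order.append(child)
                queue.append(child)
    return order


# Assets potentially affected by a change to the cleansed sales table.
impacted = downstream_assets(LINEAGE, "adls.sales_cleansed")
print(impacted)
```

A breadth-first walk lists the nearest consumers first, which is usually the order in which an engineer would want to check them.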


Sample Lineage Diagram


Sample Lineage Diagram (zoomed)


Sample Lineage Diagram – Transformation Logic


Embedding Informatica Toolset


• Enterprise Adoption

• Additional Source Applications to facilitate lineage

• Increased focus on Operational Reporting – Include Operational DBs

• MicroStrategy Integration


The Future


Demo

© Informatica. Proprietary and Confidential.

Learn More

1. Don't miss Keynotes and Deep-Dives at the AI-Powered Data Cataloging Virtual Summit:

• Market and Analyst Perspectives featuring New York Life, Tableau, and Amalgam Insights

• Data Cataloging Solution Theaters featuring Maersk, Nissan, Rabobank and Biogen

2. Stop by an Informatica World Tour near you:

• Chicago Sept-11 | Washington, DC Oct-15

• Frankfurt Oct-8 | London Oct-9 | Paris Oct-10

3. Watch a Product Webinar:

• Advancing Analytics Maturity with an Intelligent Data Catalog: with Mattel and Aberdeen

• Meet the Expert PM Webinar: EDC 10.2.2 Release Deep-Dive & Demo


Thank You