Spark Summit Europe 2016 Keynote - Databricks CEO

Democratizing AI with Apache Spark
Ali Ghodsi, Co-Founder and CEO

Transcript of Spark Summit Europe 2016 Keynote - Databricks CEO

Page 1: Spark Summit Europe 2016 Keynote  - Databricks CEO

Democratizing AI with Apache Spark

Ali Ghodsi, Co-Founder and CEO

Page 2: Spark Summit Europe 2016 Keynote  - Databricks CEO

AI is changing the world

Why now?

AlphaGo

Siri/assistants

Self-driving cars

Page 3: Spark Summit Europe 2016 Keynote  - Databricks CEO

Data is the catalyst

AI hasn’t been democratized

Better training, tuning, validation

More data

Clickstreams

Sensor data (IoT)

Video

Speech

Handwriting

Page 4: Spark Summit Europe 2016 Keynote  - Databricks CEO

The hardest part of AI isn’t AI

“Hidden Technical Debt in Machine Learning Systems”, Google, NIPS 2015

How do we democratize AI?

Page 5: Spark Summit Europe 2016 Keynote  - Databricks CEO

“Hidden Technical Debt in Machine Learning Systems”, Google, NIPS 2015

+ AI

FLEXIBLE FAST BIG DATA

Page 6: Spark Summit Europe 2016 Keynote  - Databricks CEO

Some gaps remain

1. Manage Data Infrastructure

• Create, configure, monitor resilient big data clusters.

• Securely access silos of disparate data sources.

• Enforce proper data governance.

2. Empower Teams to Be Productive

• Interactively explore data and prototype ideas.

• Securely share big data clusters among analysts.

• Debug, troubleshoot, version-control big data applications.

3. Establish Production-Ready Applications

• Set up robust ML data pipelines for ETL/ELT.

• Productionize real-time applications with HA, FT.

• Build, serve, maintain advanced machine learning models.

Page 7: Spark Summit Europe 2016 Keynote  - Databricks CEO

Databricks: Closing the gap

1. Just-in-Time Data Platform

Agile + Low TCO

• Separate compute & storage

• Integrate existing data stores

• Efficient cache on first access

2. Integrated Workspace

Accelerate Time to Value

• Interactive notebooks, dashboards, reports

• Real-time exploration, machine learning, graph use cases

3. Automated Spark Management

Performance

• Workflow scheduler for ML, streaming, SQL, ETL

• Performance-optimized, high availability, fault-tolerant

Page 8: Spark Summit Europe 2016 Keynote  - Databricks CEO

Enterprise AI use-cases

Predict credit score, credit limit, anomalies

Predict energy demand based on massive weather data

Natural language processing to extract author graph

Predict player churn, predicting network outages

Predict machine equipment failure

Page 9: Spark Summit Europe 2016 Keynote  - Databricks CEO

New Frontier of AI: Deep Learning

Detect cancer

Understand speech

Infer location

Identify landmarks in photos

Recognize Mandarin and English

Improve cancer detection

Page 10: Spark Summit Europe 2016 Keynote  - Databricks CEO

Faster and easier deep learning with Databricks

GPUs

• TensorFlow: The most popular deep learning framework.

• TensorFrames: Makes TensorFlow computations faster and easier to program on Spark.

TensorFlow on

TensorFrames and GPU support out of the box

Massive parallelism

Page 11: Spark Summit Europe 2016 Keynote  - Databricks CEO

Deep Learning on Databricks

Data Ingest

Feature extraction

Model Training

Productionize

Clusters

Jobs & Workflows

TensorFrames + GPUs

Interactive exploration

Just-in-time data platform

Automated management

Page 12: Spark Summit Europe 2016 Keynote  - Databricks CEO

Thank you.