Spark & Hadoop at Production at Scale


Transcript of Spark & Hadoop at Production at Scale

Page 1: Spark & Hadoop at Production at Scale

© 2015 MapR Technologies

Taking Your Spark To Production Scale

Anil Gadre, SVP Product Management, MapR Technologies

June 15, 2015

Page 2: Spark & Hadoop at Production at Scale


The Journey To Production Scale

Trials, science projects

Large, mission-critical, operational deployments


Page 3: Spark & Hadoop at Production at Scale


Companies with Spark & MapR in Production

GLOBAL TELECOM

HEALTHCARE

GLOBAL FINANCIAL SERVICES

Page 4: Spark & Hadoop at Production at Scale


Key Issues To Plan For

1. Spark stack support?

2. Real-time?

3. Enterprise reliability & security?

4. Open-ended agility?

Page 5: Spark & Hadoop at Production at Scale


Global Managed Security Services delivered on Hadoop

Spark Stream processing used to first check for known threats

Data next processed on Hadoop using MLlib and GraphX

Additional SQL querying done via Spark SQL

Security Intelligence Operations
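
The first stage described on this slide, checking incoming events against known threats before handing the rest to deeper analysis, can be sketched roughly as below. This is a plain-Python stand-in rather than Spark Streaming code, and every name in it (Event, KNOWN_BAD_IPS, triage, the sample addresses) is a hypothetical chosen for illustration, not something taken from the deployment.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

# Hypothetical indicator list; a real deployment would load this from a threat feed.
KNOWN_BAD_IPS: Set[str] = {"203.0.113.7", "198.51.100.23"}


@dataclass(frozen=True)
class Event:
    """Assumed minimal shape of a security log event."""
    source_ip: str
    action: str


def triage(events: Iterable[Event], known_bad: Set[str]) -> Tuple[List[Event], List[Event]]:
    """Split events into (known threats, everything else), preserving order."""
    flagged: List[Event] = []
    remaining: List[Event] = []
    for event in events:
        if event.source_ip in known_bad:
            flagged.append(event)    # matched a known indicator
        else:
            remaining.append(event)  # left for later, deeper analysis
    return flagged, remaining


events = [
    Event("203.0.113.7", "login"),
    Event("192.0.2.10", "download"),
    Event("198.51.100.23", "scan"),
]
flagged, remaining = triage(events, KNOWN_BAD_IPS)
```

In a setup like the one on the slide, the flagged events would presumably be acted on immediately, while the remaining events would feed the later machine learning, graph, and SQL stages.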

Page 6: Spark & Hadoop at Production at Scale

Delivers Lightning Fast Analytics for Clients

Building largest Hadoop cluster in Australia

Real-time analytics using Spark on MapR, reducing data loading time from hours to minutes

Leverages multi-tenancy, high performance, and reliability of MapR

Page 7: Spark & Hadoop at Production at Scale


Next-Gen Genomics

Develop a flexible platform to keep up with fast-changing research techniques

POSIX file access lets bioinformaticians use existing tools alongside open source tools (Spark)

Graph manipulations can be done reliably and at scale using Spark

Page 8: Spark & Hadoop at Production at Scale


Real-Time Customer Analytics

• MapR Data Lake stores both online and archive data

• Spark on MapR reduced ETL processing

• NFS moved data into the cluster seamlessly

• 1/10th Total Cost of Ownership vs. old way

• New customer onboarding cut from months to weeks

Page 9: Spark & Hadoop at Production at Scale


Databricks & MapR Strategic Partnership (since April 2014)

Support for the complete Spark stack

Engineering & roadmap collaboration

Back-end support

Page 10: Spark & Hadoop at Production at Scale


The Most Complete Spark Environment

Spark SQL (SQL)

Spark Streaming (Streaming)

MLlib (Machine learning)

GraphX (Graph computation)

Foundation For Enterprise-Grade Spark

Page 11: Spark & Hadoop at Production at Scale


Operations + Analytics on One Hadoop Platform with SQL Access

DB Operations

Real-Time and Actionable Analytics

Real Time

Mobile application server

Customer 360 dashboard

Churn analysis

Product/service optimization and personalization

Real-time ad targeting

Web application server

Data exploration (SQL)

• User profiles and state

• User interactions

• Real-time location data

• Web and mobile session state

• Comments/rankings

Page 12: Spark & Hadoop at Production at Scale


Spark + MapR = Ready For Production Success

High Performance: World-record performance on disk

Reliability for Production: SLA-driven applications • High availability • Data protection • Disaster recovery

24/7 Best-in-class Global Support: Strategic partnership with Databricks to ensure enterprise support for the entire stack

Operational Data Store: MapR-DB + Spark = real-time analytics

Page 13: Spark & Hadoop at Production at Scale


Free On-Demand Training

www.mapr.com/training

Page 14: Spark & Hadoop at Production at Scale


Self-Service Data Exploration

Data Agility with Less IT Required

Single SQL Interface for Structured and Semi-Structured Data

Page 15: Spark & Hadoop at Production at Scale


MapR Introduces 3 New Spark-Based Quick Start Solutions

Real-Time Security Log Analytics

Time Series Analytics

Genome Sequencing

Page 16: Spark & Hadoop at Production at Scale


Get Your Tattoo In The MapR Booth!

Show off your Kickstart My Heart skills and enter to win Xbox 360 & Guitar Hero

Page 17: Spark & Hadoop at Production at Scale


Top-Ranked NoSQL

Top-Ranked Hadoop Distribution

Top-Ranked SQL-on-Hadoop Solution