Evolving a data-driven company from MapReduce to Spark - Ferran Galí Reniu @ PAPIs Connect
Transcript of Evolving a data-driven company from MapReduce to Spark - Ferran Galí Reniu @ PAPIs Connect
The Big Data problem: Hadoop
Hardware: 8 nodes
Storage: HDFS - Hadoop Distributed File System
Resource Manager: YARN
Processing: Job, Application
Making the business flow
Business Intelligence
Search engine
Mailing
Push Notifications
Online Media Buying
Spark Core API: RDDs
filter() {...}
join() {...}
filter() {...}
groupByKey() {...}
collect() write() count()
...
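The slide above chains transformations (filter, join, groupByKey) and ends in actions (collect, write, count). As a rough illustration of that shape, here is a minimal single-machine sketch in plain Python. It is not the Spark API: `ToyRDD`, `of`, `group_by_key` and the sample records are hypothetical names and data invented for this example, and `write()` is not modelled. The one Spark behaviour it tries to mirror is that transformations only describe work, while actions trigger it.

```python
from collections import defaultdict


class ToyRDD:
    """Toy, single-machine stand-in for an RDD (hypothetical, not the Spark API)."""

    def __init__(self, compute):
        # compute is a zero-argument function returning an iterable of records.
        # Nothing runs until an action calls it.
        self._compute = compute

    @classmethod
    def of(cls, records):
        data = list(records)
        return cls(lambda: iter(data))

    # Transformations: each returns a new ToyRDD and does no work yet.
    def filter(self, predicate):
        return ToyRDD(lambda: (r for r in self._compute() if predicate(r)))

    def join(self, other):
        def compute():
            # Index the right side by key, then pair it with the left side.
            right = defaultdict(list)
            for key, value in other._compute():
                right[key].append(value)
            for key, value in self._compute():
                for other_value in right.get(key, []):
                    yield key, (value, other_value)
        return ToyRDD(compute)

    def group_by_key(self):
        def compute():
            groups = defaultdict(list)
            for key, value in self._compute():
                groups[key].append(value)
            return iter(groups.items())
        return ToyRDD(compute)

    # Actions: these trigger the whole chain of transformations.
    def collect(self):
        return list(self._compute())

    def count(self):
        return sum(1 for _ in self._compute())


# Made-up sample data: (ad_id, description) and (ad_id, clicks).
ads = ToyRDD.of([(1, "flat in Barcelona"), (2, "car in Madrid"), (3, "job in Girona")])
clicks = ToyRDD.of([(1, 5), (2, 0), (1, 2), (3, 7)])

# Same shape as the slide: filter -> join -> filter -> groupByKey.
pipeline = (
    clicks.filter(lambda kv: kv[1] > 0)
    .join(ads)
    .filter(lambda kv: "Madrid" not in kv[1][1])
    .group_by_key()
)

result = pipeline.collect()  # action: runs the pipeline
total = pipeline.count()     # action: runs it again and counts the groups
```

Building `pipeline` does no work by itself; only `collect()` and `count()` walk the data. In real Spark the same chain would be distributed across the cluster nodes rather than run in a single process.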
The Big Data problem: Spark stack (with DataFrames)
Spark Core (RDDs)
Spark DataFrames API
Spark SQL
Machine Learning Library
Streaming
GraphX
The Big Data problem: Easy integration with Hadoop
Hardware: 8 nodes
Storage: HDFS - Hadoop Distributed File System
Resource Manager: YARN
Processing: Job, Application, Job
Evolving a data-driven company from MapReduce to Spark
Ferran Galí i Reniu
@ferrangali
Icons made by Freepik from Flaticon is licensed by CC BY 3.0