Deep Dive: How Spark Uses Memory

Wenchen Fan 2017-5-19

Agenda

• Memory Usage Overview

• Memory Contention

• Tungsten Memory Format

• Cache-aware Computation

• Future Plans

Where Spark Uses Memory

• storage: memory used to cache data that will be used later.

(controlled by memory manager)

• execution: memory used for computation in shuffles, joins,

sorts and aggregations. (controlled by memory manager)

• others: user data structure, internal metadata, objects

created by UDF, etc.

Iterator

4, 3, 5, 1, 6,

1 2 3 4 5 6 Iterator

1, 2, 3, 4, 5,

6 take(3)

Execution

Memory

What if I want the sorted values again?

Iterator

4, 3, 5, 1, 6,

1, 2, 3, 4, 5,

6 take(3)

Iterator

4, 3, 5, 1, 6,

1, 2, 3, 4, 5,

6 take(4)

Iterator

4, 3, 5, 1, 6,

1, 2, 3, 4, 5,

Execution

Memory

1 2 3 4 5 6

take(3) take(4) take(5) …

Storage

Memory

Memory Contention

• How to arbitrate memory between execution and storage?

• How to arbitrate memory across tasks running in parallel?

• How to arbitrate memory across operators running within the

same task?

Challenge #1

How to arbitrate memory between

execution and storage?

Easy, static assignment!

total available memory

execution storage

Spill to

execution storage

Evict LRU blocks to

Inefficient memory usage

leads to bad performance

execution storage

Execution can only use a fraction of the

memory, even when there is no storage!

execution storage

Efficient use of memory required user tuning

Unified Memory Management

execution storage

What happens if there is already

storage?

execution storage

Evict LRU blocks to

execution storage

Design Considerations

• Why evict storage, not execution?

• Spilled execution data will always be read back from disk, where as

cached data may not.

• What if the application relies on cache?

execution storage

This is bad!

Design Considerations

• Why evict storage, not execution?

• Spilled execution data will always be read back from disk, where as

cached data may not.

• What if the application relies on cache?

• allow users to specify a minimum unevictable amount of cached

data(not a reservation!)

Challenge #2

How to arbitrate memory across tasks

running in parallel?

Task 1

Worker machine has 4

cores Each task gets ¼ of the total

memory

Task 2 Task 3 Task 4

Dynamic Assignment

Task 1

The share of each task depends on the

number of actively running tasks

Dynamic Assignment

Task 1

The share of each task depends on the

number of actively running tasks

Dynamic Assignment

Task 1

Now another task comes along, the

first task have to spill to free up

memory

Dynamic Assignment

Task 1

Each task is now assigned ½ of the

total memory

Task 2

Dynamic Assignment

Each task is now assigned ¼ of the

total memory

Task 1 Task 2 Task 3 Task 4

Dynamic Assignment

Task 3

Last remaining task gets all the

memory

Static vs Dynamic Assignment

• Both are fair and starvation free

• Static Assignment is simpler

• Dynamic assignment handles stragglers better

Challenge #3

How to arbitrate memory across

operators running within the same task?

SELECT age, AVG(height) FROM students GROUP BY age ORDER BY AVG(height)

students.groupBy("age") .avg("height") .orderBy("avg(height)") .collect() Scan

Project

Aggregate

Project

Aggregate

The task has 6

pages of memory

Project

Aggregate

Map { // age → (total,

count)

20 → (483, 3)

21 → (935, 5)

22 → (172, 1)

Project

Aggregate

All 6 pages were

used by

Aggregate,

leaving no

memory for Sort!

Project

Aggregate

Sort Solution #1

Reserve a page

for each operator

Project

Aggregate

Sort Solution #1

Reserve a page

for each operator

Starvation free, but still not fair…

What if there were more operators?

Project

Aggregate

Sort Solution #2

Cooperative spilling

Project

Aggregate

Sort Solution #2

Project

Aggregate

Sort Solution #2

Sort forces Aggregate to

spill a page to free

memory

Project

Aggregate

Sort Solution #2

Sort needs more memory so

it forces Aggregate to spill

another page(and so on)

Project

Aggregate

Sort Solution #2

Sort finishes with 3 pages

Aggregate does not have to

spill its remaining pages

Recap: three source of contention

How to arbitrate memory …

• between execution and storage?

• across tasks running in parallel?

• across operators running with the same task?

Instead of statically reserving memory in advance, deal

with memory contention when it raises by forcing

members to spill

How Spark keep data in memory

• Put data as objects on the heap and operate on these

objects.

• Data caching is simply using a list to keep data objects.

Data objects? No!

• It is hard to monitor and control the memory usage when we have a lot of objects.

• Garbage collection will be the killer.

• Java objects has notable space overhead.

• High serialization cost when transfer data inside cluster.

Keep data as binary and operate on

binary

source

1010100001010 0010001001001 1010001000111 1010100100101

binary format

Cache Manager

Memory Manager

operators

allocate memor

memory pages allocate

memory memor

y pages

Java Objects Based Row Format

BoxedInteger(123)

String(“data”)

String(“bricks”)

• 5+ objects

• high space overhead

• slow value accessing

• expensive hashCode()

Efficient Binary Format

“bricks” 0x0 123 32 40 “data” 4 6

null tracking

bitmap

(123, “data”, “bricks”)

offset and length of

Efficient Binary Format

“data

6 2 “da”

substring

How to process binary

data more efficient?

Understanding CPU Cache

Memory is becoming

slower and slower

than CPU, we should

keep the frequently

accessed data in

CPU cache.

Understanding CPU Cache

Pre-fetch data into

CPU cache, with

cache line boundary.

The most 2 important

techniques in big data

are ...

Sort and Hash!

Naive Sort

record

pointer

Naive Sort

record

pointer

Naive Sort

record

pointer

Naive Sort

record

pointer

Naive Sort

Each comparison needs to access 2 different

memory regions, which makes it hard for CPU cache

to pre-fetch data, poor cache locality!

Cache-aware Sort

record

key-prefix

Cache-aware Sort

record

key-prefix

Cache-aware Sort

record

key-prefix

Cache-aware Sort

Most of the time, just go through the key-prefixes in a

linear fashion, good cache locality!

Naive Hash Map

key pointer value

Naive Hash Map

key pointer value

lookup

Naive Hash Map

key pointer value

hash(key) %

Naive Hash Map

key pointer value

compare these 2

Naive Hash Map

key pointer value

linear

probing

Naive Hash Map

key pointer value

compare these 2

Naive Hash Map

Each lookup needs many pointer dereferences and

key comparison when hash collision happens, and

jumps between 2 memory regions, bad cache locality!

full hash

Cache-aware Hash Map

key value

full hash

key value

lookup

full hash

key value

hash(key) % size, and

compare the full hash

full hash

key value

linear probing, and

compare the full hash

full hash

key value

compare these 2

Each lookup mostly only needs one pointer

dereference and key comparison(full hash collision is

rare), and access data in a single memory region,

better cache locality!

Recap: Cache-aware data structure

How to improve cache locality …

• store key-prefix with pointer.

• store key full hash with pointer.

Store extra information to try to keep the

memory accessing in a single region.

What’s next

• Standard binary format, may use Apache Arrow.

• SPARK-19489

• SPARK-13534

• Columnar execution engine.

• SPARK-15687

Thank You

We Are Hiring!!!

Send your resume to

wenchen@databricks.com

Work at Hangzhou,

full time Spark developer!

Deep Dive: How Spark Uses Memory -...

Transcript of Deep Dive: How Spark Uses Memory -...

Deep Dive: How Spark Uses Memory -...

Documents

Transcript of Deep Dive: How Spark Uses Memory -...

ALLUXIO (formerly Tachyon): Unify Data at Memory Speed - Effective using Spark with Alluxio at Spark Summit Boston 2017

About Intellipaat · Flexi Scheduling Attend multiple batches for lifetime & stay updated. ... Spark installation guide and Spark configuration Memory management ... Nokia Airscale

Power product innovation with Big Data technologies · Start Spark Shell • Set proper number of executors and memory per executor Convert, load, cache data • Spark >=1.6v: memory

MANAGED SOLUTIONS Apache Spark€¦ · 01/10/2015 · Collocated Data Engine Performance Monitoring Unified Memory Management Reliability Automated Provisioning of Spark Jobserver

"Spark Search" - In-memory, Distributed Search with Lucene, Spark, and Tachyon: Presented by Taka Shinagawa, SparkSearch.org

Using Apache Spark - Spark Summit · PDF fileWhat is Spark? Efﬁcient • General execution graphs • In-memory storage Usable • Rich APIs in Java, Scala, Python • Interactive

Big Data Extensions Admin Guide - KNIME...Spark Job Server uses the spark-submit command to run the Spark driver program and executes Spark jobs submitted via REST. Therefore, it must

StreamSQL on Spark: Manipulating · PDF fileKafka Messaging/ Queue Spark Streaming In-Memory Tables Spark-SQL Stream Processing Ram Store HDFS Tables Persistent Storage Online Analysis

Geo-Analytics with Apache Spark and In-Memory Data Grids

What is SPARK? - UHgabriel/courses/cosc6339_s17/BDA_11_Spark.pdf · What is SPARK? •In-Memory Cluster ... –Hadoop, –Mesos, •Spark ... Spark Essentials •Spark program has

How Totango uses Apache Spark

Main Memory Background Random Access Memory (vs. Serial Access Memory) Cache uses SRAM: Static Random Access Memory –No refresh (6 transistors/bit vs.

GPU in-memory processing using Spark for iterative computation - …hvcl.unist.ac.kr/wp-content/papercite-data/pdf/hong... · 2018-08-25 · GPU in-memory processing using Spark for

Unlock the value of big data with the DX2000 from NEC · Apache Spark—the analytics tool we used in testing—uses in-memory processing, which helps shorten the time to analyze

In-Memory Computation with Spark Lecture BigData Analytics

Anatomy of in memory processing in Spark

In-Memory Computation with Spark … · Concepts Architecture Computation Managing Jobs Examples Higher-Level AbstractionsSummary Overview to Spark [10, 12] In-memory processing (and

In-Memory Computation with Spark - uni- · PDF fileIn-Memory Computation with Spark ... Concepts Architecture Computation Managing Jobs Examples Higher-Level AbstractionsSummary Outline

Vitalii Bondarenko HDinsight: spark. advanced in memory big-data analytics with microsoft azure

Data processing in Apache Spark - ut · Spark •Directed acyclic graph (DAG) engine supports cyclic data flow and in-memory computing. •Spark works with Scala, Java and Python