SparkMeetup_10_10_2015

7/25/2019 SparkMeetup_10_10_2015


2015 IBM Corporation

RDD Deep Dive

RDD Basics
How to create
RDD Operations
Lineage
Partitions
Shuffle
Type of RDDs
Extending RDD
Caching in RDD



RDD Basics

RDD (Resilient Distributed Dataset)
  Distributed collection of objects
  Resilient - Ability to re-compute missing partitions (node failure)
  Distributed - Split across multiple partitions
  Dataset - Can contain any type: Python/Java/Scala object or user-defined object
Fundamental unit of data in Spark



RDD Basics - How to create

Two ways
  Loading external datasets
    Spark supports a wide range of sources
    Access HDFS data through the InputFormat and OutputFormat of Hadoop; supports custom Input/Output formats
  Parallelizing a collection in the driver program

val lineRDD = sc.textFile("hdfs:///path/to/Readme.md")

textFile("/my/directory/*") or textFile("/my/directory/*.gz")
SparkContext.wholeTextFiles returns (filename, content) pairs

val listRDD = sc.parallelize(List("spark", "meetup", "deepdive"))



RDD Operations

Two types of operations
  Transformation
  Action
Transformations are lazy; nothing actually happens until an action is called.
An action triggers the computation.



Lazy Evaluation

Transformations on an RDD are not performed immediately
Spark internally records metadata to track the operation
Loading data into an RDD is also lazily evaluated
Lazy evaluation reduces the number of passes over the data by grouping operations
MapReduce - burden on the developer to merge the operations, complex map.
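A plain-Scala sketch of the same idea, using an Iterator as a stand-in for an RDD. This is an analogy, not Spark code, and the names LazyDemo, evaluated and pipeline are made up for illustration. Building the pipeline only records the work; nothing runs until a terminal call such as toList, which plays the role of an action and makes a single pass over the data.

```scala
object LazyDemo {
  // Counts how many elements the map step has actually processed.
  var evaluated = 0

  // Builds the pipeline. Iterator.map and Iterator.filter are lazy,
  // so no element is touched while this method runs.
  def pipeline(data: Seq[Int]): Iterator[Int] =
    data.iterator
      .map { x => evaluated += 1; x * 2 }
      .filter(_ % 4 == 0)

  def main(args: Array[String]): Unit = {
    val it = pipeline(Seq(1, 2, 3, 4))
    println(s"after building the pipeline: evaluated = $evaluated")
    val result = it.toList // the "action": forces one pass over the data
    println(s"result = $result, evaluated = $evaluated")
  }
}
```

Both steps run together, element by element, in one pass, which is the grouping of operations the slide refers to.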



RDD In Action

sc.textFile("hdfs://file.txt")
  .flatMap(line => line.split(" "))
  .map(word => (word, 1))
  .reduceByKey(_ + _)
  .collect()

Input lines:
  I scream you scream lets all scream for icecream!
  I wish I were what I was when I wished I were what I am.
Words: I, scream, you, scream, lets, all, scream, for, icecream
Pairs: (I,1), (scream,1), (you,1), (scream,1), (lets,1), (all,1), (scream,1), (icecream,1)
Counts: (icecream,1), (scream,3), (you,1), (lets,1), (I,1), (all,1)
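The same word-count flow can be tried without a cluster. The sketch below uses plain Scala collections, not Spark, so groupBy followed by a sum stands in for reduceByKey; the object and method names are made up for illustration.

```scala
object WordCountSketch {
  // Local stand-in for the RDD pipeline: split lines into words,
  // pair each word with 1, then sum the 1s per word.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split(" "))   // one element per word
      .map(word => (word, 1))  // (word, 1) pairs
      .groupBy(_._1)           // gather the pairs that share a key
      .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

  def main(args: Array[String]): Unit = {
    val counts = wordCount(Seq("I scream you scream lets all scream for icecream"))
    counts.foreach { case (word, n) => println(s"($word,$n)") }
  }
}
```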



RDD Partition

Partition Definition
  Fragments of an RDD
  Fragmentation allows Spark to execute in parallel.
  Partitions are distributed across the cluster (Spark workers)
Partitioning
  Impacts parallelism
  Impacts performance



Importance of partition Tuning

Too few partitions
  Less concurrency, unused cores.
  More susceptible to data skew
  Increased memory pressure for groupBy, reduceByKey, sortByKey, etc.
Too many partitions
  Framework overhead (more scheduling latency than the time needed for the actual task.)
  Many CPU context switches
Need a "reasonable number" of partitions
  Commonly between 100 and 10,000 partitions
  Lower bound: At least ~2x number of cores in the cluster



Partition - Input Data

Spark uses the same classes as Hadoop to perform Input/Output
sc.textFile("hdfs://...") invokes Hadoop TextInputFormat
Below are the knobs which define #partitions
  dfs.block.size - default 128MB (Hadoop 2.0)
  numPartitions - can be used to increase the number of partitions; default is 0, which means 1 partition
  mapreduce.input.fileinputformat.split.minsize - default 1KB

Partition Size = Max(minSize, Min(goalSize, blockSize))
goalSize = totalInputSize / numPartitions

32MB, 0, 1KB, 64MB total size - Defaults
  Max(1KB, Min(64MB, 32MB)) = 2 partitions
32MB, 30, 1KB, 64MB total size - Want more partitions
  Max(1KB, Min(32MB, 32MB)) = 32 partitions
32MB, 5, 1KB = Max(1KB, Min(12MB, 32MB)) = 2
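A small sketch of the partition-size formula, worked for the defaults case only (32MB block, numPartitions 0, 1KB minsize, 64MB input). It is a simplified model under stated assumptions: it ignores any slack Hadoop allows on the last split, it treats numPartitions 0 as 1 as the slide says, and all names are made up for illustration.

```scala
object SplitSizeSketch {
  val KB: Long = 1024L
  val MB: Long = 1024L * KB

  // goalSize = totalInputSize / numPartitions, where 0 is treated as 1.
  def goalSize(totalInputSize: Long, numPartitions: Int): Long =
    totalInputSize / math.max(numPartitions, 1)

  // Partition Size = Max(minSize, Min(goalSize, blockSize))
  def partitionSize(minSize: Long, goal: Long, blockSize: Long): Long =
    math.max(minSize, math.min(goal, blockSize))

  // Number of partitions, rounding up for a trailing partial partition.
  def partitionCount(totalInputSize: Long, size: Long): Long =
    (totalInputSize + size - 1) / size

  def main(args: Array[String]): Unit = {
    val size = partitionSize(1 * KB, goalSize(64 * MB, 0), 32 * MB)
    println(s"partition size = ${size / MB}MB, partitions = ${partitionCount(64 * MB, size)}")
  }
}
```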



Partition - Shuffle transformations

All shuffle transformations provide a parameter for the desired number of partitions
Default behavior - Spark uses HashPartitioner.
  If spark.default.parallelism is set, takes that as the # of partitions
  If spark.default.parallelism is not set, takes the largest upstream RDD's number of partitions
    Reduces chances of out of memory
Shuffle transformations: groupByKey, reduceByKey, aggregateByKey, sortByKey, join, cogroup, repartition, ...
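Hash partitioning boils down to taking the key's hash code modulo the number of partitions and keeping the result non-negative. The sketch below is a plain-Scala model of that idea rather than Spark's own source; the object and method names are made up for illustration.

```scala
object HashPartitionSketch {
  // Non-negative modulo, so negative hash codes still map to a valid partition.
  def nonNegativeMod(x: Int, mod: Int): Int = {
    val raw = x % mod
    if (raw < 0) raw + mod else raw
  }

  // Partition index for a key; null keys go to partition 0 in this sketch.
  def partitionFor(key: Any, numPartitions: Int): Int =
    if (key == null) 0 else nonNegativeMod(key.hashCode, numPartitions)

  def main(args: Array[String]): Unit = {
    Seq("spark", "meetup", "deepdive").foreach { word =>
      println(s"$word -> partition ${partitionFor(word, 4)}")
    }
  }
}
```

Equal keys always land in the same partition, which is what lets a shuffle bring all values for a key together.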



Partition - Repartitioning

RDD provides two operators
repartition(numPartitions)
  Can increase/decrease the number of partitions
  Internally does a shuffle
  Expensive due to the shuffle
  For decreasing partitions use coalesce
coalesce(numPartitions, shuffle: [true/false])
  Decreases partitions
  Goes for narrow dependencies
  Avoids shuffle
  In case of drastic reduction, pass shuffle = true so the upstream work stays parallel
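Why reducing partitions without a shuffle is cheap can be seen in a toy model where an RDD is just a sequence of partitions. The sketch merges neighbouring partitions and never re-hashes or moves a single element; it is a simplification (a real cluster also has to consider where partitions live), and the names are made up for illustration.

```scala
object CoalesceSketch {
  // Merges neighbouring partitions so that at most `target` remain.
  // Each output partition depends on a fixed group of input partitions,
  // which is the local analogue of a narrow dependency.
  def coalesce[T](partitions: Seq[Seq[T]], target: Int): Seq[Seq[T]] = {
    require(target > 0, "target must be positive")
    if (partitions.size <= target) partitions
    else {
      val groupSize = math.ceil(partitions.size.toDouble / target).toInt
      partitions.grouped(groupSize).map(_.flatten).toList
    }
  }

  def main(args: Array[String]): Unit = {
    val parts = Seq(Seq(1, 2), Seq(3), Seq(4, 5), Seq(6))
    println(coalesce(parts, 2))
  }
}
```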



Custom Partitioner

Partition the data according to use case and data structure
Provides control over the number of partitions and the distribution of data
Extends the Partitioner class; need to implement getPartition and numPartitions
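A hedged sketch of a custom partitioner. To keep it runnable without Spark on the classpath, it defines a local trait KeyPartitioner with the same two members that Spark's Partitioner requires (numPartitions and getPartition); in a real job the class would extend org.apache.spark.Partitioner instead. The first-letter rule is a hypothetical use case.

```scala
object CustomPartitionerSketch {
  // Local stand-in for the two abstract members of Spark's Partitioner.
  trait KeyPartitioner {
    def numPartitions: Int
    def getPartition(key: Any): Int
  }

  // Hypothetical rule: string keys starting with a-m go to partition 0,
  // every other key goes to partition 1.
  class FirstLetterPartitioner extends KeyPartitioner {
    override def numPartitions: Int = 2
    override def getPartition(key: Any): Int = key match {
      case s: String if s.nonEmpty && s.head.toLower >= 'a' && s.head.toLower <= 'm' => 0
      case _ => 1
    }
  }

  def main(args: Array[String]): Unit = {
    val p = new FirstLetterPartitioner
    Seq("apple", "spark").foreach(k => println(s"$k -> ${p.getPartition(k)}"))
  }
}
```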



Partitioning Demo



Shuffle - GroupByKey Vs ReduceByKey

// rdd here must already hold (word, count) pairs
val wordCountsWithGroup = rdd.groupByKey()
  .map(t => (t._1, t._2.sum))
  .collect()



Shuffle - GroupByKey Vs ReduceByKey

val wordPairsRDD = rdd.map(word => (word, 1))
val wordCountsWithReduce = wordPairsRDD
  .reduceByKey(_ + _)
  .collect()
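Both versions give the same counts; what differs is how many records cross the shuffle. The sketch below models that with plain Scala collections, where each inner Seq is one partition. It is a toy model, not Spark, and the names are made up for illustration.

```scala
object ShuffleVolumeSketch {
  type Pair = (String, Int)

  // Sums the values per key within one collection of pairs.
  def sumByKey(pairs: Seq[Pair]): Map[String, Int] =
    pairs.groupBy(_._1).map { case (k, vs) => (k, vs.map(_._2).sum) }

  // groupByKey style: every pair crosses the shuffle, summing happens afterwards.
  // Returns the final counts and the number of shuffled records.
  def groupStyle(partitions: Seq[Seq[Pair]]): (Map[String, Int], Int) = {
    val shuffled = partitions.flatten
    (sumByKey(shuffled), shuffled.size)
  }

  // reduceByKey style: each partition is combined first, so at most one
  // pair per key per partition crosses the shuffle.
  def reduceStyle(partitions: Seq[Seq[Pair]]): (Map[String, Int], Int) = {
    val shuffled = partitions.flatMap(p => sumByKey(p).toSeq)
    (sumByKey(shuffled), shuffled.size)
  }

  def main(args: Array[String]): Unit = {
    val parts = Seq(
      Seq(("a", 1), ("a", 1), ("b", 1)),
      Seq(("a", 1), ("b", 1), ("b", 1)))
    println(s"groupByKey style:  ${groupStyle(parts)}")
    println(s"reduceByKey style: ${reduceStyle(parts)}")
  }
}
```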



The Shuffle

Redistribution of data among partitions between stages.
Most of the performance, reliability and scalability issues in Spark occur within the shuffle.
Like MapReduce, the Spark shuffle uses a pull model.
Consistently evolved and still an area of research in Spark



Shuffle Overview

Spark runs a job stage by stage.
Stages are built up by the DAGScheduler according to the RDD's ShuffleDependency
  e.g. ShuffledRDD / CoGroupedRDD will have a ShuffleDependency
Many operators will create a ShuffledRDD / CoGroupedRDD under the hood.
  repartition / combineByKey / groupBy / reduceByKey / cogroup
Many other operators will further call into the above operators
  e.g. the various join operators will call cogroup.



You have seen this

[Figure: RDDs A to G linked by groupBy, map, union and join, split into Stage 1, Stage 2 and Stage 3]



Shuffle is Expensive

When doing a shuffle, data no longer stays in memory only; it gets written to disk.
For Spark, the shuffle process might involve
  Data partition: which might involve very expensive data sorting work etc.
  Data ser/deser: to enable data to be transferred through the network or across processes.
  Data compression: to reduce IO bandwidth etc.
  Disk IO: probably multiple times on one single data block
    E.g. Shuffle Spill, Merge combine



Shuffle History

The shuffle module in Spark has evolved over time.
Spark (0.6-0.7)
  Same code path as RDD's persist method. MEMORY_ONLY, DISK_ONLY options available.
Spark (0.8-0.9)
  Separate code for shuffle; ShuffleBlockManager and BlockObjectWriter for shuffle only.
  Shuffle optimization - Consolidate Shuffle Write.
Spark 1.0 - Introduced pluggable shuffle framework
Spark 1.1 - Sort based Shuffle implementation
Spark 1.2 - Netty transfer implementation. Sort based shuffle is the default now.
Spark 1.2+ - External shuffle service etc.



Understanding Shuffle

Input Aggregation
Types of Shuffle
  Hash based
    Basic Hash Shuffle
    Consolidate Hash Shuffle
  Sort Based Shuffle



Input Aggregation

Like MapReduce, Spark involves aggregation (Combiner) on the map side.
Aggregation is done in ShuffleMapTask using
  AppendOnlyMap (in-memory hash table combiner)
    Keys are never removed, values get updated
  ExternalAppendOnlyMap (in-memory and disk hash table combiner)
    Hash map which can spill to disk
    Append-only map that spills data to disk if there is insufficient memory
Shuffle file In-Memory Buffer - Shuffle writes to an in-memory buffer before writing to a shuffle file.
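The "keys are never removed, values get updated" behaviour can be modelled with an ordinary mutable hash map. This is a plain-Scala illustration of map-side combining, not Spark's AppendOnlyMap; it keeps everything in memory and does not model spilling, and the names are made up.

```scala
import scala.collection.mutable

object CombinerSketch {
  // In-memory hash table used as a map-side combiner: keys are only ever
  // added, and the value for an existing key is updated in place.
  def combine[K, V](records: Iterator[(K, V)])(merge: (V, V) => V): mutable.HashMap[K, V] = {
    val table = mutable.HashMap.empty[K, V]
    records.foreach { case (k, v) =>
      table.get(k) match {
        case Some(old) => table.update(k, merge(old, v))
        case None      => table.update(k, v)
      }
    }
    table
  }

  def main(args: Array[String]): Unit = {
    val combined = combine(Iterator(("a", 1), ("b", 1), ("a", 1)))(_ + _)
    println(combined)
  }
}
```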



Shuffle Types - Basic Hash Shuffle

Hash based shuffle (spark.shuffle.manager). Hash partitions the data for the reducers
Each map task writes each bucket to a file.
#Map Tasks = M
#Reduce Tasks = R
#Shuffle Files = M*R, #In-Memory Buffers = M*R



Shuffle Types - Basic Hash Shuffle

Problem
  Let's use 100KB as the buffer size
  We have 10000 reducers
  10 mapper tasks per executor
  In-memory buffer size will be 100KB*10000*10
  Buffer need will be 10GB/Executor
  This huge amount of buffer is not acceptable, and this implementation can't support 10000 reducers.
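The arithmetic behind the 10GB figure, as a short sketch. It assumes decimal units (1KB = 1000 bytes) so that the result matches the round number on the slide; the function name is made up for illustration.

```scala
object HashShuffleBufferSketch {
  // Total buffer bytes on one executor: one buffer per
  // (running map task, reducer) pair.
  def bufferBytesPerExecutor(bufferBytes: Long, reducers: Long, mapTasksPerExecutor: Long): Long =
    bufferBytes * reducers * mapTasksPerExecutor

  def main(args: Array[String]): Unit = {
    // 100KB buffers, 10000 reducers, 10 map tasks per executor.
    val total = bufferBytesPerExecutor(100L * 1000L, 10000L, 10L)
    println(s"${total / 1000000000L} GB per executor")
  }
}
```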



Shuffle Types - Consolidate Hash Shuffle

Solution to decrease the in-memory buffer size and the number of files.
Within an executor, map tasks write each bucket to a segment of the file.
#Shuffle files/Executor = #Reducers, #In-Memory Buffers/Executor = #R (Reducers)
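A quick comparison of file counts under the two hash-based schemes, using the formulas from these slides. The cluster size in main (1000 map tasks, 200 reducers, 10 executors) is a hypothetical example, and the names are made up for illustration.

```scala
object ShuffleFileCountSketch {
  // Basic hash shuffle: every map task writes one file per reducer.
  def basicHashFiles(mapTasks: Long, reducers: Long): Long =
    mapTasks * reducers

  // Consolidated hash shuffle, as stated above: one file per reducer on each executor.
  def consolidatedFiles(executors: Long, reducers: Long): Long =
    executors * reducers

  def main(args: Array[String]): Unit = {
    println(s"basic:        ${basicHashFiles(1000L, 200L)} files")
    println(s"consolidated: ${consolidatedFiles(10L, 200L)} files")
  }
}
```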



Shuffle Types - Sort Based Shuffle

Consolidate Hash Shuffle needs one file for each reducer.
  Total C*R intermediate files, C = # of executors running map tasks
  Still too many files (e.g. ~10k reducers)
  Needs significant memory for compression and serialization buffers.
  Too many open files issue.
Sort Based Shuffle is similar to the map-side shuffle from MapReduce
Introduced in Spark 1.1, now it's the default shuffle



Shuffle Types - Sort Based Shuffle

Map output records from each task are kept in memory as long as they fit.
Once memory is full, the data gets sorted by partition and spilled to a single file.
Each map task generates 1 data file and one index file
Utilizes the external sorter to do the sort work
If a map side combiner is required, data will be sorted by key and partition, otherwise only by partition
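The "one data file plus one index file" layout can be modelled in memory: sort the map output by partition id, then record the offset where each partition starts. This is a toy model under simplifying assumptions (no spilling, no map-side combine, vectors instead of files), and all names are made up for illustration.

```scala
object SortShuffleSketch {
  // Output of one map task: a single data "file" plus an index that holds,
  // for each partition, the offset where its records start.
  case class MapOutput[K, V](data: Vector[(K, V)], index: Vector[Int])

  // partitionOf must return ids in the range 0 until numPartitions.
  def write[K, V](records: Seq[(K, V)], numPartitions: Int)(partitionOf: K => Int): MapOutput[K, V] = {
    // Sort by partition id only; sortBy is stable, so input order is kept within a partition.
    val sorted = records.sortBy { case (k, _) => partitionOf(k) }.toVector
    val counts = Array.fill(numPartitions)(0)
    sorted.foreach { case (k, _) => counts(partitionOf(k)) += 1 }
    // index(p) is the start offset of partition p; the last entry is the total length.
    val index = counts.scanLeft(0)(_ + _).toVector
    MapOutput(sorted, index)
  }

  // What the reducer for partition p would read from the data file.
  def read[K, V](out: MapOutput[K, V], p: Int): Vector[(K, V)] =
    out.data.slice(out.index(p), out.index(p + 1))

  def main(args: Array[String]): Unit = {
    val out = write(Seq((3, "c"), (0, "a"), (2, "b"), (1, "d")), 2)(k => k % 2)
    println(s"index = ${out.index}, partition 1 = ${read(out, 1)}")
  }
}
```

A reducer only needs two index entries to find its slice, which is why a single data file per map task is enough.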



Shuffle Reader

On the reader side, both Sort and Hash Shuffle use the Hash Shuffle Reader
On the reducer side, a set of threads fetch remote map output blocks
Once a block arrives, its records are de-serialized and passed into a result queue.
Records are passed to ExternalAppendOnlyMap; for ordering operations like sortByKey, records are passed to ExternalSorter.

[Figure: Buckets of map output are fetched by each ReduceTask and merged through an Aggregator]



Type of RDDs - RDD Interface

Base for all RDDs (RDD.scala), consists of
  A set of partitions ("splits" in Hadoop)
  A list of dependencies on parent RDDs
  A function to compute the partition from its parents
  Optional preferred locations for each partition
  A Partitioner that defines the strategy for partitioning (hash/range)
Basic operations like map, filter, persist etc

Lineage: Partitions, Dependencies, Compute
Optimized execution: PreferredLocations, Partitioner
Operations: map, filter, persist
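To make the interface concrete, here is a heavily simplified, Spark-free model of it. MiniRDD, SliceRDD and MappedRDD are hypothetical names; the real RDD class has different signatures and much more machinery. The point is only that a child RDD can be defined entirely by its partitions, its parents and a compute function.

```scala
object MiniRddSketch {
  // Simplified stand-in for the five properties listed above.
  trait MiniRDD[T] {
    def partitions: Seq[Int]                  // partition ids
    def dependencies: Seq[MiniRDD[_]]         // parent RDDs
    def compute(partition: Int): Iterator[T]  // how to build one partition
    def preferredLocations(partition: Int): Seq[String] = Nil
    def partitioner: Option[Any => Int] = None

    // A transformation: returns a child RDD and computes nothing yet.
    def map[U](f: T => U): MiniRDD[U] = new MappedRDD(this, f)
    // An action: computes every partition and gathers the results.
    def collect(): List[T] = partitions.toList.flatMap(p => compute(p))
  }

  // Source RDD over in-memory slices; it has no parents.
  class SliceRDD[T](slices: Vector[Seq[T]]) extends MiniRDD[T] {
    def partitions: Seq[Int] = slices.indices
    def dependencies: Seq[MiniRDD[_]] = Nil
    def compute(partition: Int): Iterator[T] = slices(partition).iterator
  }

  // Child RDD with a one-to-one dependency on its parent.
  class MappedRDD[T, U](parent: MiniRDD[T], f: T => U) extends MiniRDD[U] {
    def partitions: Seq[Int] = parent.partitions
    def dependencies: Seq[MiniRDD[_]] = Seq(parent)
    def compute(partition: Int): Iterator[U] = parent.compute(partition).map(f)
    override def preferredLocations(partition: Int): Seq[String] =
      parent.preferredLocations(partition)
  }

  def main(args: Array[String]): Unit = {
    val base = new SliceRDD(Vector(Seq(1, 2), Seq(3)))
    println(base.map(_ * 2).collect())
  }
}
```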



Example: HadoopRDD

partitions = one per HDFS block
dependencies = none
compute(partition) = read corresponding block
preferredLocations(part) = HDFS block location
partitioner = none



Example: MapPartitionsRDD

partitions = parent partitions
dependencies = "one-to-one" on parent RDD
compute(partition) = apply map on parent
preferredLocations(part) = none (ask parent)
partitioner = none



Example: CoGroupedRDD

partitions = one per reduce task
dependencies = could be narrow or wide dependency
compute(partition) = read and join shuffled data
preferredLocations(part) = none
partitioner = HashPartitioner(numTasks)



Extending RDDs

Extend RDDs to
  Add transformations/actions
    Allows developers to express domain specific calculations in a cleaner way
    Improves code readability
    Easy to maintain
  Custom RDD for Input Source, Domain
    Way to add a new input data source
    Better way to express domain specific data
    Better control on partitioning and distribution



How to Extend

Add custom operators to RDD
  Use Scala implicits
  Feels and works like a built-in operator
  You can add an operator to a specific RDD or to all
Custom RDD
  Extend the RDD API to create our own RDD
  Implement the compute and getPartitions abstract methods



Adding new Operator to RDD

We will use the Scala implicit feature to add a new operator to an existing RDD
This operator will show up only in our RDD
Implicit conversions are handled by Scala
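The implicit mechanism itself can be shown without Spark. In the sketch below a Seq[Sale] stands in for an RDD of a domain type; with Spark the wrapped type would be an RDD of that record instead. Sale, SalesOps and totalSales are hypothetical names chosen for illustration.

```scala
object ImplicitOperatorSketch {
  // Hypothetical domain record.
  case class Sale(item: String, amount: Double)

  // Implicit wrapper that adds totalSales to any Seq[Sale].
  // Collections of other element types do not get the operator.
  implicit class SalesOps(val sales: Seq[Sale]) {
    def totalSales: Double = sales.map(_.amount).sum
  }

  def main(args: Array[String]): Unit = {
    val sales = Seq(Sale("book", 12.5), Sale("pen", 2.5))
    println(sales.totalSales) // reads like a built-in method
  }
}
```

Because the wrapper is keyed on the element type, the new operator appears only on collections of Sale, which matches the "only in our RDD" point above.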



Custom RDD Implementation

Extending RDD allows you to create your own custom RDD structure
A custom RDD allows control on computation, and lets you change partition and locality information



Caching in RDD

Spark allows caching/persisting an entire dataset in memory
Persisting an RDD in cache
  The first time it is computed, it will be kept in memory
  Reuse the cached partitions in the next set of operations
  Fault-tolerant, recomputed in case of failure
Caching is a key tool for interactive and iterative algorithms
Persist supports different storage levels
  Storage level - In memory, Disk or both, Tachyon
  Serialized Vs Deserialized



Caching In RDD

SparkContext tracks persistent RDDs
Block Manager puts a partition in memory when it is first evaluated
Cache is lazily evaluated; no caching without an action.
Shuffle also keeps its data in cache after shuffle operations.
  We still need to cache shuffle RDDs
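"Cache is lazily evaluated" can be illustrated with a tiny memoizing wrapper: nothing is computed when the wrapper is created, the first use computes and stores the result, and later uses reuse it. This is a plain-Scala analogy, not Spark's block manager, and the Cached class is a made-up name.

```scala
object CacheSketch {
  // Wraps a computation so that it runs on first use only;
  // later uses return the stored result.
  class Cached[T](compute: () => T) {
    var computations = 0
    private lazy val value: T = { computations += 1; compute() }
    def get: T = value
  }

  def main(args: Array[String]): Unit = {
    val cached = new Cached(() => (1 to 5).map(_ * 2).sum)
    println(s"before first use: computations = ${cached.computations}")
    println(s"first use: ${cached.get}, second use: ${cached.get}")
    println(s"after two uses: computations = ${cached.computations}")
  }
}
```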


Caching Demo
