Distributed Systems In One Lesson

75
Distributed Systems One Lesson in @tlberglund

Transcript of Distributed Systems In One Lesson

Page 1: Distributed Systems In One Lesson

DistributedSystems

OneLessonin

@tlberglund

Page 2: Distributed Systems In One Lesson

What  are ?Distributed  Systems

Any  system  too  large  to  fit  on  one  computer.

Page 3: Distributed Systems In One Lesson

What  are

• Storage  • Transac6ons  • Computa6on  • Coordina6on

?Distributed  Systems

Page 4: Distributed Systems In One Lesson

Distributed  Storage

Page 5: Distributed Systems In One Lesson

Distributed  Storage

• Read  replica6on  • Sharding  • Consistent  hashing  • Distributed  filesystems

Page 6: Distributed Systems In One Lesson

Distributed  Storage

• MongoDB  • Cassandra  • HDFS

Page 7: Distributed Systems In One Lesson

When  Life  is  Good

Page 8: Distributed Systems In One Lesson

When  Life  is  Good

Tim:Americano

Page 9: Distributed Systems In One Lesson

When  Life  is  Good

Tim:Americano

Page 10: Distributed Systems In One Lesson

When  Life  is  Good

AmericanoTim?

Page 11: Distributed Systems In One Lesson

When  Life  is  Good

Americano

Page 12: Distributed Systems In One Lesson

When  Life  is  Busier

Tim:Triple  Skinny

Page 13: Distributed Systems In One Lesson

When  Life  is  Busier

Tim:Triple  Skinny

Page 14: Distributed Systems In One Lesson

When  Life  is  Busier

Tim:Triple  SkinnyTim:Triple  Skinny

Page 15: Distributed Systems In One Lesson

When  Life  is  Busier

Tim:Triple  Skinny

Tim:Triple  Skinny

Page 16: Distributed Systems In One Lesson

When  Life  is  Busier

Tim?

Page 17: Distributed Systems In One Lesson

ReplicaCon  Problems

• Complexity  • Consistency

Page 18: Distributed Systems In One Lesson

When  Life  is  REALLY  busy

Tim:Triple  SkinnyAaron-­‐  Faye

Page 19: Distributed Systems In One Lesson

When  Life  is  REALLY  busy

Tim:Triple  SkinnyAaron-­‐  Faye

Faye-­‐  Nancy

Page 20: Distributed Systems In One Lesson

When  Life  is  REALLY  busy

Tim:Triple  SkinnyAaron-­‐  Faye

Faye-­‐  Nancy

Nancy-­‐  Zed

Page 21: Distributed Systems In One Lesson

Sharding  Problems

• More  Complexity  • Broken  data  model  • Limited  data  access  paEerns

Page 22: Distributed Systems In One Lesson

Consistent  Hashing

2000

4000

6000

8000

A000

C000

E000

0000

Page 23: Distributed Systems In One Lesson

Consistent  Hashing

2000

4000

6000

8000

A000

C000

E000

0000

Tim:Americano

Page 24: Distributed Systems In One Lesson

Consistent  Hashing

2000

4000

6000

8000

A000

C000

E000

0000

9F72:Americano

Page 25: Distributed Systems In One Lesson

Consistent  Hashing

2000

4000

6000

8000

A000

C000

E000

0000

Tim?

Page 26: Distributed Systems In One Lesson

Consistent  Hashing

2000

4000

6000

8000

A000

C000

E000

0000

9F72?

Page 27: Distributed Systems In One Lesson

Consistent  Hashing

2000

4000

6000

8000

A000

C000

E000

0000

9F72:Americano

Page 28: Distributed Systems In One Lesson

Consistent  Hashing

2000

4000

6000

8000

A000

C000

E000

0000

Page 29: Distributed Systems In One Lesson

CAP  Theorem

C

PA

Page 30: Distributed Systems In One Lesson

CAP  Theorem

• Consistency  • Availability  • Par66on  Tolerance

Page 31: Distributed Systems In One Lesson

CAP  Theorem

• Shared  wri6ng  project  • Coffee  shop  closes  • Synchronizing  over  the  phone  • BaEery  dies  • Status  report!

Page 32: Distributed Systems In One Lesson

CAP  Theorem

C

PA

Page 33: Distributed Systems In One Lesson

Distributed  TransacCons

Page 34: Distributed Systems In One Lesson

Distributed  TransacCons

Page 35: Distributed Systems In One Lesson

ACID  TransacCons• Atomic  • Consistent  • Isolated  • Durable  • One  barista  • Mul6ple  baristas

Page 36: Distributed Systems In One Lesson

Ordering  Coffee• Receive  order  • Process  payment  • Enqueue  order  • Make  coffee  • Deliver  drink

Page 37: Distributed Systems In One Lesson

Ordering  Coffee• Receive  order  • Process  payment  • Enqueue  order  • Make  coffee  • Deliver  drink

{Diffe

rent

Actors

Page 38: Distributed Systems In One Lesson

Ordering  Coffee

• Why  split  up  order  processing?  • What  can  fail?  • What  are  the  consequences  of  failure?  • How  do  we  repair  failure?

Page 39: Distributed Systems In One Lesson

Ordering  Coffee

• How  can  we  design  a  coffee  shop  with  atomic  transac6ons?  

• How  does  that  limit  the  business?  • Why  give  up  atomicity?

Page 40: Distributed Systems In One Lesson

Distributed  ComputaCon

Page 41: Distributed Systems In One Lesson

Distributed  ComputaCon

• ScaEer/Gather  • MapReduce  • Hadoop  • Spark

Page 42: Distributed Systems In One Lesson

MapReduce

• All  computa6on  in  two  func6ons:  Map  and  Reduce  • Keep  data  (mostly)  where  it  is  • Move  compute  to  data

Page 43: Distributed Systems In One Lesson

MapReduce

Mapper

Reducer

(k ,v) [(k ,v), (k ,v), (k ,v), …]

Shuffle(k , [v, v, v, …])

[(k ,v), (k ,v), (k ,v), …]

Page 44: Distributed Systems In One Lesson

MapReduce

Once upon a midnight dreary, while I pondered, weak and weary, Over many a quaint and curious volume of forgotten lore,

While I nodded, nearly napping, suddenly there came a tapping, As of some one gently rapping, rapping at my chamber door. "'Tis some visitor," I muttered, "tapping at my chamber door-

Only this, and nothing more."

Ah, distinctly I remember it was in the bleak December, And each separate dying ember wrought its ghost upon the floor.

Eagerly I wished the morrow;- vainly I had sought to borrow From my books surcease of sorrow- sorrow for the lost Lenore-

poems/raven.txt:

Page 45: Distributed Systems In One Lesson

MapReduceIt shall clasp a sainted maiden whom the angels name Lenore-

Clasp a rare and radiant maiden whom the angels name Lenore." Quoth the Raven, "Nevermore."

"Be that word our sign in parting, bird or fiend," I shrieked, upstarting- "Get thee back into the tempest and the Night's Plutonian shore! Leave no black plume as a token of that lie thy soul hath spoken!

Leave my loneliness unbroken!- quit the bust above my door! Take thy beak from out my heart, and take thy form from off my door!"

Quoth the Raven, "Nevermore."

And the Raven, never flitting, still is sitting, still is sitting On the pallid bust of Pallas just above my chamber door;

And his eyes have all the seeming of a demon's that is dreaming, And the lamp-light o'er him streaming throws his shadow on the floor;

And my soul from out that shadow that lies floating on the floor Shall be lifted- nevermore!

poems/raven.txt:

Page 46: Distributed Systems In One Lesson

MapReduce

Once

upon

a

midnightdrearywhile

pondered

suddenly

there

came a

tapping

gently

rapping

rapping

tapping

at

my

chamber

door

Map

Page 47: Distributed Systems In One Lesson

MapReduce

Once:1

upon:1

a:1

midnight:1dreary:1while:1

pondered:1

suddenly:1

there:1

came:1 a:1

tapping:1

gently:1

rapping:1

rapping:1

tapping:1

at:1

my:1

chamber:1

door:1

Map

Page 48: Distributed Systems In One Lesson

MapReduce

Once:1

upon:1a:1

midnight:1

dreary:1while:1

pondered:1suddenly:1 there:1came:1a:1

tapping:1gently:1

rapping:1

rapping:1 tapping:1

at:1

my:1chamber:1

door:1

Shuffle

Page 49: Distributed Systems In One Lesson

MapReduce

Once:[1]

upon:[1]

midnight:[1]

dreary:[1]while:[1]

pondered:[1]suddenly:[1] there:[1]came:[1]a:[1,1]

gently:[1]

rapping:[1,1]

tapping:[1,1]

at:[1]

my:[1]chamber:[1]

door:[1]

Reduce

Page 50: Distributed Systems In One Lesson

MapReduce

Once:1

upon:1

midnight:1

dreary:1while:1

pondered:1suddenly:1 there:1came:1a:2

gently:1

rapping:2

tapping:2

at:1

my:1chamber:1

door:1

Reduce

Page 51: Distributed Systems In One Lesson

Real-­‐World  MapReduce• Hadoop  • Cloudera,  Hortonworks,  MapR  • Nobody  write  map,  reduce  func6ons  • Hive  (SQL-­‐like  interface)  • Integra6on  with  BI  front-­‐ends

Page 52: Distributed Systems In One Lesson

Hadoop

• MapReduce  API  • Job  Tracker,  Task  Tracker  • Distributed  Filesystem  (HDFS)  • Enormous  ecosystem

Page 53: Distributed Systems In One Lesson

Spark

Page 54: Distributed Systems In One Lesson

Spark

• ScaEer/gather  paradigm  (similar  to  MapReduce)  • More  general  data  model  (RDDs)  • More  general  programming  model  (transform/ac6on)  • Storage  agnos6c

Page 55: Distributed Systems In One Lesson

Spark  Architecture2000

4000

6000

8000

A000

C000

E000

0000

Spark Client Spark

Context

Driver

Page 56: Distributed Systems In One Lesson

Spark Client Spark

ContextJob

Spark  Architecture

Page 57: Distributed Systems In One Lesson

Spark Client Spark

Context

Job

Cluster Manager

Spark  Architecture

Page 58: Distributed Systems In One Lesson

Spark Client Spark

Context

Cluster Manager

JobTaskTaskTaskTask

Spark  Architecture

Page 59: Distributed Systems In One Lesson

Spark Client Spark

Context

Job

TaskTask

Task

Task

Spark  ArchitectureCluster Manager

Page 60: Distributed Systems In One Lesson

Executor

Task Task

Cache

Spark  Worker  Node

Page 61: Distributed Systems In One Lesson

Executor

Task Task

Cache

Executor

Task Task

CacheRDD RDD

RDD

RDDRDD

RDD

Spark  Worker  Node

Page 62: Distributed Systems In One Lesson

What’s  an  RDD?

Page 63: Distributed Systems In One Lesson

What’s  an  RDD?

This is

Page 64: Distributed Systems In One Lesson

What’s  an  RDD?• Bigger  than  a  computer  • Read  from  an  input  source  • Output  of  a  pure  func6on  • Immutable  • Typed  • Ordered  • Lazily  evaluated  • Par66oned  • Collec6on  of  things

Page 65: Distributed Systems In One Lesson

Distributed,  how?

Page 66: Distributed Systems In One Lesson

5444 5676 0686 8389: List(xn-1, xn-2, xn-3)

4532 4569 7030 1191: List(xn-4, xn-5, xn-6)

5444 5607 6517 6027: List(xn-7, xn-8, xn-9)

4532 4577 0122 2189: List(xn-10, xn-11, xn-12)

Hash these

Distributed,  how?

Page 67: Distributed Systems In One Lesson

Distributed  CoordinaCon

Page 68: Distributed Systems In One Lesson

Distributed  CoordinaCon

• NTP  • Epidemic  Protocols  • Paxos

Page 69: Distributed Systems In One Lesson

NTP

• Distributed  6me  synchroniza6on  • A  network  of  6ered  clocks  • Synchroniza6on  algorithm  • O`en  accurate  to  ±10ms

Page 70: Distributed Systems In One Lesson

Gossip2000

4000

6000

8000

A000

C000

E000

0000

Page 71: Distributed Systems In One Lesson

Paxos

• Distributed  log  protocol  • Assumes  distributed,  unreliable  nodes  • Performs  leader  elec6on

Page 72: Distributed Systems In One Lesson
Page 73: Distributed Systems In One Lesson
Page 74: Distributed Systems In One Lesson

Review

• Storage  • Transac6ons  • Computa6on  • Coordina6on

Page 75: Distributed Systems In One Lesson

ThankYou!@tlberglund