Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh...

37
Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy Zwaenepoel, EPFL

Transcript of Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh...

Page 1: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks

Jiaqing Du, EPFLSameh Elnikety, Microsoft ResearchAmitabha Roy, EPFLWilly Zwaenepoel, EPFL

Page 2: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

2

Key-Value Data Store API

• Read operation– value = get( key )

• Write operation– put( key, value)

• Read transaction– <value1, value2, …> = mget ( key1, key2, … )

Page 3: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

3

Partitioning

• Divide data set into several partitions. • A server manages each partition.

Partition 1 Partition 2 Partition N…

Page 4: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

4

• Data set is partitioned

Inside a Data Center

Partition 1

Application

Partition 2 Partition N…

client

Application Application

client client…Application

tier

Data tier

Page 5: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

5

Geo-Replication

Data Center E

Data Center B Data Center C

Data Center F

• Data close to end users• Tolerates disasters

Data Center A

Page 6: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

6

Scalable Causal Consistency in Orbe

• Partitioned and replicated data store• Parallel asynchronous update propagation• Efficient implementation of causal consistency

Partition 1 Partition 2 Partition N…

Partition 1 Partition 2 Partition N…Replica A

Replica B

Page 7: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

7

Consistency Models

• Strong consistency– Total order on propagated updates– High update latency, no partition tolerance

• Causal consistency– Propagated updates are partially ordered– Low update latency, partition tolerance

• Eventual consistency– No order among propagated updates– Low update latency, partition tolerance

Page 8: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

8

• If A depends on B, then A appears after B.

Causal Consistency (1/3)

Photo

Comment: Great weather! Comment: Great weather!

Update

Propagation

Alice

Alice

Page 9: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

9

• If A depends on B, then A appears after B.

Causal Consistency (2/3)

Photo

Comment: Nice photo! Comment: Nice photo!

Update

Propagation

Alice

Bob

Page 10: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

10

Causal Consistency (3/3)

• Partitioned and replicated data stores

Partition 1 Partition 2 Partition N…

Partition 1 Partition 2 Partition N…Replica A

Replica B

Client

Read(A) Read(B) Write(C, A+B)

Propagate (C)How to guarantee A and B appear first?

Page 11: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

11

Existing Solutions

• Version vectors– Only work for purely replicated systems

• COPS [Lloyd’11]– Explicit dependency tracking at client side– Overhead is high under many workloads

• Our work– Extends version vectors to dependency matrices– Employs physical clocks for read-only transactions– Keeps dependency metadata small and bounded

Page 12: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

12

Outline

• DM protocol• DM-Clock protocol• Evaluation• Conclusions

Page 13: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

13

Dependency Matrix (DM)

• Represents dependencies of a state or a client session • One integer per server• An integer represents all dependencies from a partition

Partition 1 Partition 2

Replica A

Partition 1 Partition 2

Replica B

9 5

00DM

first 9 updates

first 5 updates07Partition 3

Partition 3

Page 14: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

14

DM Protocol: Data Structures

Partition 1 of Replica A

Client Dependency matrix (DM)

0 0

0 00 0DM =

Page 15: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

15

DM Protocol: Data Structures

Partition 1 of Replica A

Client

3 8

Dependency matrix (DM)

Version vector(VV)

0 0

0 00 0

VV =

DM =

Page 16: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

16

DM Protocol: Data Structures

Partition 1 of Replica A

Client

3 8Item A, rid = A, ut = 2, dm =

Item B, rid = B, ut = 5, dm =

Dependency matrix (DM)

Version vector(VV)

Update timestamp(UT)

Source replica id(RID)

0 0

0 00 0

1 4

0 00 0

0 5

0 01 0

VV =

DM =

Page 17: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

17

DM Protocol: Read and Write

• Read item– Client <-> server– Includes read item in client DM

• Write item– Client <-> server– Associates client DM to updated item– Resets client DM (transitivity of causality)– Includes updated item in client DM

Page 18: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

18

0 0

0 01 0

4 0

0 00 0

Replica A

Partition 1

(v, rid = A, ut = 4)

Example: Read and Write

Partition 2

Client

DM = DM = DM =

read(photo) write(comment, )

VV = [7, 0]

(ut = 1)

VV = [0, 0] VV = [1, 0]

Partition 3VV = [0, 0]

0 0

0 00 0

4 0

0 00 0

Page 19: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

19

DM Protocol: Update Propagation

• Propagate an update– Server <-> server– Asynchronous propagation– Compares DM with VVs of local partitions

Page 20: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

20

Replica A

Partition 1

Example: Update Propagation

Replica BPartition 1

Partition 2

Partition 2

VV = [7, 0]

VV = [0, 0]

VV = [3, 0]

VV = [0, 0]

VV = [4, 0]

VV = [1, 0]

check dependency

VV = [1, 0]

Partition 3VV = [0, 0]

Partition 3VV = [0, 0]

replicate(comment, ut = 1, )0 00 04 0

Page 21: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

21

Complete and Nearest Dependencies

• Transitivity of causality– If B depends on A, C depends on B, then C depends on A.

• Tracking nearest dependencies– Reduces dependency metadata size– Does not affect correctness

C: write Comment 2

A: writePhoto

B: writeComment 1

Complete Dependencies

Nearest Dependencies

Page 22: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

22

DM Protocol: Benefits

• Keeps dependency metadata small and bounded– Only tracks nearest dependencies by

resetting the client DM after each update– Number of elements in a DM is fixed– Utilizes sparse matrix encoding

Page 23: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

23

Outline

• DM protocol• DM-Clock protocol• Evaluation• Conclusions

Page 24: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

24

Read Transaction on Causal SnapshotAlbum: PublicBob 1Photo

Bob 2

Album: Public Only close friends!

Bob 3

Photo

Bob 4

Replica A Replica B

Album: Public

Photo

Album: Public Only close friends!

Photo

Mom 1

Mom 2

Page 25: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

25

• Provides causally consistent read-only transactions• Requires loosely synchronized clocks (NTP)• Data structures

DM-Clock Protocol (1/2)

Timestamps from physical clocksPartition 0

Client

3 8Item A, rid = A, ut = 2, dm = , put = 27

Item B, rid = B, ut = 5, dm = , put = 35

0 0

0 00 0

1 4

0 00 0

0 5

0 01 0

VV =

DM = PDT = 0

Page 26: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

26

DM-Clock Protocol (2/2)

• Still tracks nearest dependencies• Read-only transaction– Obtains snapshot timestamp from local physical clock– Reads latest versions created “before” snapshot time

• A cut of the causal relationship graph

A0

B2

C1

B0

C0

D3

E0

snapshot timestamp

Page 27: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

27

Outline

• DM protocol• DM-Clock protocol• Evaluation• Conclusions

Page 28: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

28

Evaluation

• Orbe– A partitioned and replicated key-value store– Implements the DM and DM-Clock protocols

• Experiment Setup– A local cluster of 16 servers– 120 ms update latency

Page 29: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

29

Evaluation Questions

1. Does Orbe scale out?2. How does Orbe compare to eventual consistency?3. How does Orbe compare to COPS

Page 30: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

30

Throughput over Num. of PartitionsWorkload: Each client accesses two partitions.

Orbe scales out as the number of partitions increases.

Page 31: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

31

Throughput over Varied Workloads Workload: Each client accesses three partitions.

Orbe incurs relatively small overhead for tracking dependencies under many workloads.

Page 32: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

32

Orbe Metadata Percentage

Dependency metadata is relatively small and bounded.

Page 33: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

33

Orbe Dependency Check Messages

The number of dependency check messages is relatively small and bounded.

Page 34: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

34

Orbe & COPS: Throughput overClient Inter-Operation Delays

Workload: Each client accesses three partitions.

Page 35: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

35

Orbe & COPS: Number of Dependencies per Update

Orbe only tracks nearest dependencies when supporting read-only transactions.

Page 36: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

36

In the Paper

• Protocols– Conflict detection– Causal snapshot for read transaction– Garbage collection

• Fault-tolerance and recovery• Dependency cleaning optimization• More experimental results– Micro-benchmarks & latency distribution– Benefits of dependency cleaning

Page 37: Orbe: Scalable Causal Consistency Using Dependency Matrices & Physical Clocks Jiaqing Du, EPFL Sameh Elnikety, Microsoft Research Amitabha Roy, EPFL Willy.

37

Conclusions

• Orbe provides scalable causal consistency– Partitioned and replicated data store

• DM protocol– Dependency matrices

• DM-Clock protocol– Dependency matrices + physical clocks– Read-only transactions (causally consistency)

• Performance– Scale out, low overhead, comparison to EC & COPS