Replication: optimistic approaches

Replication: optimistic approaches Marc Shapiro with Yasushi Saito (HP Labs) Cambridge Distributed Systems Group


Page 1: Replication: optimistic approaches

Replication: optimistic approaches

Marc Shapiro with Yasushi Saito (HP Labs)

Cambridge Distributed Systems Group

Page 2: Replication: optimistic approaches

Bases de Données Avancées -- 2002-10-23 Replication: optimistic approaches 2

Motivations for this work

Peer-to-peer, decentralised write sharing

Lessons and commonalities

Understand limitations

Different solutions: spectrum or discrete points?

Simple formal model

Page 3: Replication: optimistic approaches

Optimistic replication

Replicas of shared objects on sites

Without synchronisation: peer-to-peer read and update!

Consistency: a posteriori, offline
  Merge independent updates

Applications:
  high-latency networks
  disconnected operation
  cooperative work

Improves availability & performance

Page 4: Replication: optimistic approaches

Example: cooperative engineering with CVS

CVS: developing shared code

Local, disconnected replica: no interference

Conflicts:
  Write same file = syntactic
  Overlap in file = violates edit semantics
  Doesn’t compile, test = violates application semantics

Both sides of a conflict are excluded

Manual repair

Page 5: Replication: optimistic approaches

Example: Bayou

General-purpose database

Any replica can update, log actions
  action = { dependency check, operation, merge-procedure }

Optimistic replication:
  epidemic exchange of logs
  { roll-back, replay }*; commit
  dep-check: semantic check for conflict
  merge-proc: semantic repair

Page 6: Replication: optimistic approaches

Basic vocabulary

While isolated: tentative updates

When connected, reconcile:
  Propagate & collect updates
  (Conceptually) Restart from initial state
  Replay updates (if possible)

Overriding goal: consistency

Page 7: Replication: optimistic approaches

1. Consistency

Study component issues of consistency

Page 8: Replication: optimistic approaches

What is consistency?

Consistent with user intents
  apply operations according to user scenario

Consistent with data invariants
  dependent actions
  pre- and post-conditions
  conflict resolution

Replicas consistent with each other
  converge towards same values

Page 9: Replication: optimistic approaches

Consistency: problem taxonomy

1. Objects & updates
  Internal vs. external consistency
  Value / value log / operation log
  Single master / multi-master

2. Detecting dependence vs. concurrency

3. Concurrency control

4. Laziness of concurrency control
  Pessimistic / advanced concurrency / optimistic

5. Convergence

Page 10: Replication: optimistic approaches

Operation-based reconciliation

Updates: concurrent, unsynchronised

Local log of actions = operation descriptions
  object identifier, method, arguments

Multi-log collects local + remote logs

Reconciliation schedule: merge multi-log & run sequentially

Scheduling issues:
  Include vs. exclude
  Execution order
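A minimal sketch of the reconciliation loop described on this slide, under stated assumptions: the `Action` fields, the `(seq, site)` ordering policy and the `include` callback are illustrative names, not part of any system named in the talk. It collects the logs into a multi-log, picks an execution order, and replays from the initial state.

```python
from dataclasses import dataclass
from typing import Any, Callable

@dataclass
class Action:
    site: str                            # originating site (hypothetical field)
    seq: int                             # position in that site's local log
    obj: str                             # object identifier
    method: Callable[[Any, Any], Any]    # (old value, argument) -> new value
    arg: Any

def reconcile(initial, logs, include=lambda action, state: True):
    """Merge the logs into a multi-log, then run one sequential schedule."""
    multilog = [a for log in logs for a in log]
    # Execution-order policy (an assumption): sort by (seq, site), which
    # preserves the order of each local log.
    schedule = sorted(multilog, key=lambda a: (a.seq, a.site))
    state = dict(initial)                # conceptually restart from the initial state
    executed = []
    for a in schedule:
        if not include(a, state):        # include-vs-exclude policy
            continue
        state[a.obj] = a.method(state.get(a.obj), a.arg)
        executed.append(a)
    return state, executed

def add(old, n):
    return (old or 0) + n

log_a = [Action("A", 1, "x", add, 5)]
log_b = [Action("B", 1, "x", add, -3), Action("B", 2, "y", add, 1)]
final_state, executed = reconcile({"x": 10}, [log_a, log_b])
```

Both scheduling issues from the slide show up as parameters here: `include` decides which actions are kept, and the sort key decides the execution order.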

Page 11: Replication: optimistic approaches

Operation-based model

[Figure: operation-based model]

Page 12: Replication: optimistic approaches

Dependence vs. concurrency

Two actions either have a dependency or are commutative / concurrent

Dependent actions:
  do not conflict
  must be scheduled in dependence order

Concurrent actions:
  potentially conflict

Dependence / concurrency detection is a fundamental mechanism

Page 13: Replication: optimistic approaches

Concurrency control

Concurrent & no conflict: commute; execute both, arbitrary order

Conflict detection options

Conflict resolution options

Page 14: Replication: optimistic approaches


Convergence

Liveness: sites receive same/all actions

Safety: given same actions, sites compute the same value

Stability: actions eventually not undone

Page 15: Replication: optimistic approaches

2. Dependency & Concurrency

Mechanisms to detect if actions are dependent or concurrent

Page 16: Replication: optimistic approaches

Scalar clocks and timestamps

Wall clock, Lamport clock
  Wall clock: total order
  Lamport clock: total order, consistent with causal dependence

Schedule in timestamp order

Can’t detect concurrency
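A short sketch of a Lamport clock, to make the "total order, but no concurrency detection" point concrete. The class and method names are illustrative; breaking ties with the site identifier is one common way to get a total order.

```python
class LamportClock:
    """Scalar logical clock: one counter per site."""

    def __init__(self):
        self.time = 0

    def tick(self):
        """Local event (also used when sending a message)."""
        self.time += 1
        return self.time

    def receive(self, msg_time):
        """Advance past the timestamp carried by an incoming message."""
        self.time = max(self.time, msg_time) + 1
        return self.time

def stamp(clock_value, site_id):
    """Total order on events: compare (clock, site id) tuples."""
    return (clock_value, site_id)

p, q = LamportClock(), LamportClock()
a = stamp(p.tick(), "p")       # event on p
b = stamp(q.tick(), "q")       # independent event on q
m = p.tick()                   # p sends a message carrying m
c = stamp(q.receive(m), "q")   # q receives it
```

Here `a` and `b` are causally unrelated, yet `a < b` holds in timestamp order: the scalar stamps alone cannot tell a reader that the two events were concurrent.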

Page 17: Replication: optimistic approaches

Happens-before

$e_1 \to e_2$ if:
  e1 precedes e2 in the same process, or
  e1 sends and e2 receives

$\neg(e_1 \to e_2) \land \neg(e_2 \to e_1) \Rightarrow e_1 \parallel e_2$

$e_1 \parallel e_2$: e1 does not cause e2

$e_1 \to e_2$: e1 might cause e2

Partial order, consistent with causal dependence

Schedule consistent with $\to$

Page 18: Replication: optimistic approaches

Syntactic vs. semantic mechanisms

Scalar timestamps
  no concurrency detection
  very conservative approx. of causality

Vector timestamps
  detect concurrency
  conservative approx. of causality

Alternative: explicit semantic constraints
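As an illustration of how vector timestamps detect concurrency, here is a small comparison function. Representing a vector as a dict from site name to counter, with missing entries read as 0, is an assumption of this sketch.

```python
def compare(v1, v2):
    """Compare two vector timestamps given as {site: counter} dicts."""
    sites = set(v1) | set(v2)
    le = all(v1.get(s, 0) <= v2.get(s, 0) for s in sites)
    ge = all(v1.get(s, 0) >= v2.get(s, 0) for s in sites)
    if le and ge:
        return "equal"
    if le:
        return "before"       # v1 happens-before v2
    if ge:
        return "after"        # v2 happens-before v1
    return "concurrent"       # neither dominates: v1 || v2

r1 = compare({"A": 1}, {"A": 1, "B": 1})
r2 = compare({"A": 2}, {"A": 1, "B": 1})
```

Unlike a scalar timestamp, the result can be "concurrent", but the verdict is still conservative: "before" only says that one action might have caused the other.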

Page 19: Replication: optimistic approaches

Locks as semantic constraints

Read(x) depends on
  previous Write(x) in same process, or
  previously-received Write(x),
  whichever is later

Write(x) depends on
  previous Read(*) in same process

More semantic information than Happens-Before

Step in the right direction, but still too coarse

Page 20: Replication: optimistic approaches

IceCube: Primitive constraints

Declarative (“static”):
  MustHave: $a \triangleright b$
    if $a \in s$ and $a \triangleright b$ then $b \in s$
    (not necessarily contiguous nor in order)
  Order: $a \to b$
    if $a, b \in s$ and $a \to b$ then a before b in s
    (not necessarily both nor contiguous)

Within log, across logs

Imperative (dynamic): preCondition (State)

Page 21: Replication: optimistic approaches

Log constraints

[Figure: parcel, predecessor-successor, alternatives]

Express user intents:
  Predecessor/successor: $a \to b \land b \triangleright a$
    b uses effect of a; “a causes b”
  Parcel: $a \triangleright b \land b \triangleright a$
    transaction
  Alternatives: $a \to b \land b \to a$
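A sketch of how the three log constraints can be built from the two primitive constraints. The `Constraints` class and its method names are illustrative. The operator symbols were lost from the transcript, so which primitives each log constraint combines is an assumption here: predecessor/successor as Order(a, b) plus MustHave(b, a), parcel as MustHave in both directions, alternatives as Order in both directions.

```python
from dataclasses import dataclass, field

@dataclass
class Constraints:
    # (a, b) in must_have: if a is in the schedule, b must be too
    must_have: set = field(default_factory=set)
    # (a, b) in order: if both are in the schedule, a runs before b
    order: set = field(default_factory=set)

    def predecessor_successor(self, a, b):
        """b uses the effect of a: a must run first, and b needs a."""
        self.order.add((a, b))
        self.must_have.add((b, a))

    def parcel(self, a, b):
        """All-or-nothing group: each action requires the other."""
        self.must_have.add((a, b))
        self.must_have.add((b, a))

    def alternatives(self, a, b):
        """An Order cycle: no schedule can contain both a and b."""
        self.order.add((a, b))
        self.order.add((b, a))

c = Constraints()
c.predecessor_successor("mkdir", "write")
c.parcel("debit", "credit")
c.alternatives("book_mon", "book_tue")
```

The alternatives encoding works because no sequence can place a before b and b before a at once, so any sound schedule has to drop at least one of the two.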

Page 22: Replication: optimistic approaches

3. Concurrency control & scheduling

Policies for dealing with concurrent actions

Page 23: Replication: optimistic approaches

Optimistic concurrency control & scheduling

Two actions are either:
  dependent: schedule in dependence order
  concurrent and non-conflicting, or commutative: schedule in any order
  concurrent and conflicting: schedule in non-conflicting order, or exclude one, the other, or both

Page 24: Replication: optimistic approaches

Concurrency control

Concurrent & no conflict: commute; execute both, arbitrary order

Conflict detection options: 2 concurrent actions conflict
  only if operate on same object
  only if both write
  only if violate semantic invariant

Conflict resolution options:
  exclude both
  exclude 1st, include 2nd (or vice-versa)
  execute both in favorable order
  (rewrite and execute both)

Page 25: Replication: optimistic approaches

What is a conflict?

1 site executes code + pre/post-conditions
  Pre/post-conditions often unknown
  Dependency between successive actions

Schedule execution must satisfy pre/post-conditions
  Violation: conflict

[Figure: three actions with their pre/post-conditions]
  x1 := f(x0), with pre(x0), post(x0, f(x0))
  x’1 := g(x0), with pre(x0), post(x’1, g(x0))
  x2 := g(x1), with pre(x1), post(x1, g(x1))

Page 26: Replication: optimistic approaches

Thomas’ Write Rule

Pre- / post-conditions unknown

Scalar clocks
  no concurrency detection
  implicit concurrency control
  schedule in clock order
  a later action excludes earlier ones

Lost updates

Delete ambiguity: “tombstone” state
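A minimal last-writer-wins register in the spirit of Thomas' Write Rule. `LWWRegister`, the `(clock, site id)` stamp and the `TOMBSTONE` sentinel are illustrative choices for this sketch, not taken from the talk.

```python
TOMBSTONE = object()   # marks a deleted value so that late writes can still be ordered

class LWWRegister:
    def __init__(self):
        self.value = None
        self.stamp = (0, "")           # (scalar clock, site id); site id breaks ties

    def apply(self, value, stamp):
        """Keep the write only if it is later than what the replica holds."""
        if stamp > self.stamp:
            self.value, self.stamp = value, stamp
            return True
        return False                   # earlier write is excluded: a lost update

    def delete(self, stamp):
        return self.apply(TOMBSTONE, stamp)

    def read(self):
        return None if self.value is TOMBSTONE else self.value

r1, r2 = LWWRegister(), LWWRegister()
w_a, w_b = ("a", (1, "A")), ("b", (2, "B"))
r1.apply(*w_a); r1.apply(*w_b)         # delivery order: a, b
r2.apply(*w_b); r2.apply(*w_a)         # delivery order: b, a
r1.delete((3, "A"))
late_write_accepted = r1.apply("c", (2, "C"))
```

Both replicas end up with "b" whatever the delivery order, at the price of silently dropping "a". The tombstone keeps the delete's timestamp, so the late write stamped `(2, "C")` is rejected instead of resurrecting the object.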

Page 27: Replication: optimistic approaches

Value-based Version Vector concurrency control

Pre- / post-conditions unknown

Independent objects
  actions to different objects commute
  VV = per-object vector timestamp
  any concurrent writes to object conflict

Resolution:
  Manual
  Values: “Resolver” per data type

Page 28: Replication: optimistic approaches

Bayou scheduling

Disjoint databases; 1 primary / database

Transaction: single database

Action = { dependency check, operation, merge-procedure }

Optimistic replication:
  epidemic exchange of logs
  { roll-back, replay }*; commit

Conflict: dependency check fails, so the merge-procedure runs

Page 29: Replication: optimistic approaches

Bayou dependency checks

Write-write conflicts: on replay check that data unchanged

Read-write conflicts: check input data
  can detect concurrent updates
  semantic: only relevant changes

Application-specific checks
  bank account balance > £100
  fine grain
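The action triple and the application-specific check can be sketched as follows. This is not Bayou's actual API: `Write`, `replay` and `withdraw` are hypothetical names, and reading "balance > £100" as a precondition tested before each withdrawal is an assumption.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Write:
    dep_check: Callable[[dict], bool]    # semantic check for conflict
    operation: Callable[[dict], None]    # the update itself
    merge_proc: Callable[[dict], None]   # semantic repair when the check fails

def replay(initial, log):
    """Roll back to the initial state, then replay the log in order."""
    db = dict(initial)
    for w in log:
        if w.dep_check(db):
            w.operation(db)
        else:
            w.merge_proc(db)
    return db

def withdraw(amount, floor=100):
    def check(db):                       # application-specific dependency check
        return db["balance"] > floor
    def op(db):
        db["balance"] -= amount
    def merge(db):                       # repair: record the refusal instead
        db.setdefault("rejected", []).append(amount)
    return Write(check, op, merge)

result = replay({"balance": 150}, [withdraw(60), withdraw(60)])
```

The second withdrawal fails its check on replay (the balance is 90 by then), so its merge procedure runs instead of its operation.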

Page 30: Replication: optimistic approaches

IceCube: Object constraints

Shared data type advertises static semantics
  mutually exclusive: $a \to b \land b \to a$
  best order (e.g. bank: credits before debits): $a \to b$

Only between concurrent actions

Also: dynamic constraints

[Figure: commute, best order, mutually exclusive]

Page 31: Replication: optimistic approaches

IceCube scheduling

Insight: a conflict is a choice of which action to exclude
  maximise value

Page 32: Replication: optimistic approaches

IceCube execution model

[Figure: IceCube execution model, combining log constraints, object constraints and dynamic constraints]

Page 33: Replication: optimistic approaches

Search vs. syntactic order

[Figure: solution size vs. number of actions (5 to 250); series: Optimal, Concatenate, IceCube, Single log]

Page 34: Replication: optimistic approaches

Performance of IceCube heuristics

[Figure: execution time (ms) vs. number of actions (1000 to 10000); series: Total]

Page 35: Replication: optimistic approaches

4. Convergence

Can a peer-to-peer system converge?

Hard in the general case

Formalise to understand limitations, trade-offs

Page 36: Replication: optimistic approaches

Convergence

Liveness: sites receive same/all operations
  epidemic multicast
  quickly

Safety: sites compute the same value
  equivalent schedules

Stability: actions eventually not undone
  stable schedules
  Users, external world dependency
  Garbage collection

Page 37: Replication: optimistic approaches

Schedule soundness & equivalence

s sound:
  Closed for MustHave: $a \in s \land a \triangleright b \Rightarrow b \in s$
  Consistent with Order: $(a, b \in s \land a \to b) \Rightarrow$ a before b in s

Equivalence: $s \equiv t$
  s, t sound
  $a \in s \Leftrightarrow a \in t$
  ordering is irrelevant!
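The two soundness conditions and the equivalence test translate directly into code. In this sketch a schedule is a list of distinct action names and each constraint set holds `(a, b)` pairs; that representation is an assumption.

```python
def is_sound(schedule, must_have, order):
    """Check closure under MustHave and consistency with Order."""
    pos = {a: i for i, a in enumerate(schedule)}
    # Closed for MustHave: if a is scheduled and a MustHave b, b is scheduled too.
    closed = all(b in pos for (a, b) in must_have if a in pos)
    # Consistent with Order: if both are scheduled, a runs before b.
    ordered = all(pos[a] < pos[b] for (a, b) in order if a in pos and b in pos)
    return closed and ordered

def equivalent(s, t, must_have, order):
    """Two sound schedules are equivalent when they contain the same actions."""
    return (is_sound(s, must_have, order)
            and is_sound(t, must_have, order)
            and set(s) == set(t))

must_have = {("b", "a")}   # b MustHave a
order = {("a", "b")}       # a before b
checks = [is_sound(s, must_have, order)
          for s in (["a", "b"], ["a"], ["b"], ["b", "a"])]
```

Note that `equivalent` never compares positions: two sound schedules holding the same actions count as equivalent whatever their order.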

Page 38: Replication: optimistic approaches

Stability

Peer-to-peer, indefinite tentative update + advisory reconciliation OK

But stability needed:
  Users, external world depend on it
  Garbage collect multilog

Stable: eventually decisions not changed
  committed: definitely included in all schedules
  aborted: definitely excluded

Page 39: Replication: optimistic approaches

Correctness of stability

Actions known to be stable at site i:
  $\mathit{stable}_i = \mathit{committed}_i \cup \mathit{aborted}_i$

Live: $\forall$ action $a$, $\forall$ site $i$: eventually $a \in \mathit{stable}_i$

Safe:
  $\forall$ site $i$, schedule $s_i$: $s_i$ sound $\land$ $\mathit{committed}_i \subseteq s_i$
  $\forall$ sites $i, k$: $\mathit{committed}_i \cap \mathit{aborted}_k = \emptyset$

Safety invariant: strong, global!

Page 40: Replication: optimistic approaches

Maintaining disjointness

$\forall$ sites $i, k$: $\mathit{committed}_i \cap \mathit{aborted}_k = \emptyset$

Different possibilities:
  Unilateral abort
    TWR, Holliday 2000
  Unilateral commit
  Deterministic abort / commit rule
    TWR
  Primary (only one) site decides
    Bayou, CVS
  Consensus before deciding
    Deno, Holliday 2000-2002

Page 41: Replication: optimistic approaches

Maintaining soundness

$\forall$ site $i$, schedule $s_i$: $s_i$ sound $\land$ $\mathit{committed}_i \subseteq s_i$

When aborting a, also abort actions that MustHave a

When committing a, also abort uncommitted actions that are ‘Order’ed before a

Maintain both soundness and disjointness.

Peer-to-peer commitment is hard!
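The two rules on this slide can be sketched as cascading set updates at a single site. The function names and the decision to raise an error when a cascade would reach an already-committed action are assumptions. The sketch does not decide what to do with the actions that a itself MustHave.

```python
def abort(a, committed, aborted, must_have):
    """Return `aborted` extended with a and, transitively, every action that MustHave a."""
    aborted = set(aborted)
    stack = [a]
    while stack:
        x = stack.pop()
        if x in aborted:
            continue
        if x in committed:
            raise ValueError(f"cannot abort {x!r}: already committed")
        aborted.add(x)
        # (d, needed) in must_have means d MustHave needed
        stack.extend(d for (d, needed) in must_have if needed == x)
    return aborted

def commit(a, committed, aborted, must_have, order):
    """Commit a, then abort uncommitted actions that are Ordered before a."""
    if a in aborted:
        raise ValueError(f"cannot commit {a!r}: already aborted")
    committed = set(committed) | {a}
    for (b, later) in order:           # (b, later) in order means b before later
        if later == a and b not in committed:
            aborted = abort(b, committed, aborted, must_have)
    return committed, aborted

must_have = {("c", "b")}               # c MustHave b
order = {("b", "a")}                   # b before a
committed, aborted = commit("a", set(), set(), must_have, order)
```

Committing a forces b out because b could no longer run before a, and c follows because it requires b. The resulting sets stay disjoint locally, but nothing here coordinates with other sites, which is the hard part the slide points to.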

Page 42: Replication: optimistic approaches

Stability with TWR

Independent objects

Independent writes (no MustHave nor Order)

All sites take same decision:
  Given two writes to same object, abort the earlier
  Whether concurrent or not
  Write stable when seen by all sites

Disjointness: committedi =

Soundness: no MustHave (no transactions)

Page 43: Replication: optimistic approaches

Stability in Bayou

Databases:
  Disjoint
  Independent: no multi-DB transaction
  1 primary / database

Log constraints: transactions, time order

Disjointness: Only 1 site decides about a: the primary for the database that a updates

Soundness: whole transaction commits or aborts

Page 44: Replication: optimistic approaches

Holliday’s pre-commit protocol

Log constraints:
  multi-object transactions
  happens-before order

Read transactions commit locally

Read-Write transactions: consensus to commit
  convert locks to intentions
  pre-commit, vote
  commit if quorum ‘yes’
  abort if anti-quorum ‘no’ or conflict with committed
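The vote-counting step can be illustrated with a small decision function. It is a sketch under assumptions, not the protocol as published: the quorum defaults to a simple majority, and the anti-quorum is taken to be the number of 'no' votes that makes a quorum unreachable. The "conflict with committed" abort condition is left out.

```python
def decide(votes, n_sites, quorum=None):
    """votes maps site -> True ('yes') or False ('no'); sites not yet heard from are absent."""
    if quorum is None:
        quorum = n_sites // 2 + 1          # assumption: simple majority
    anti_quorum = n_sites - quorum + 1     # enough 'no' votes to block any quorum
    yes = sum(1 for v in votes.values() if v)
    no = sum(1 for v in votes.values() if not v)
    if yes >= quorum:
        return "commit"
    if no >= anti_quorum:
        return "abort"
    return "pending"                       # keep collecting votes

d1 = decide({"A": True, "B": True, "C": False}, n_sites=5)
d2 = decide({"A": True, "B": True, "C": False, "D": True}, n_sites=5)
d3 = decide({"A": False, "B": False, "C": False}, n_sites=5)
```

With five sites the quorum and anti-quorum are both 3, so the two outcomes can never be reached for the same transaction.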

Page 45: Replication: optimistic approaches

Trade-offs

Deterministic rule
  fast, inflexible

Partition + primary
  single point of failure
  no MustHave across partition boundaries

Consensus
  slow
  scalability
  impossibility of consensus in asynchronous systems with failure

Page 46: Replication: optimistic approaches

5. Conclusions

Page 47: Replication: optimistic approaches

Need for OR not going away

“Network technology improving: keep everything consistent pessimistically.”

True, but:
  Constant latency; unavailable bandwidth
  Mobile access: unbounded latency
  Increasing numbers of replicas

“Conflicts are rare.”

True, but:
  Do occur
  Very high cost

Page 48: Replication: optimistic approaches

OR pros & cons

Peer-to-peer read/write sharing

OR accepts more updates:
  Performance despite latency
  Availability despite failures

Increased complexity
  Semantic information
  Not transparent

Bottleneck moved to commit
  Hard to make peer-to-peer
  Unless (unacceptable?) restrictions

Unavoidable

Page 49: Replication: optimistic approaches


The end