Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency...

50
Consistency and Replication

Transcript of Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency...

Page 1: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Consistency and Replication

Page 2: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Outline

• Introduction (what’s it all about)

• Data-centric consistency

• Client-centric consistency

• Replica management

• Consistency protocols

Page 3: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Why Replicate?• Reliability

– If one goes down, the others can stay up.– Corrupted data

• Performance– Divide the work (single server vs multiple servers)– Place data closer to place it is used.

• What is the challenge?– Consistency (when updates are needed)– Consider a web cache in your browser.

Page 4: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Costs• As a scaling technique (for performance problem)• Trade-off: keep copies up to date require bandwidth.

P

Access replica N times per second

Update replica M times per second

• As a scaling technique, may not always be applicable. What if N << M?

Page 5: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

• Assume synchronous replication

• A dilemma:– Scalability can be alleviated by replication and caching.– Keep copies consistent is scaling problem (and require

bandwidth too)• E.g, consistency requires global synchronization!

– real solution is to relax consistency requirements.

WAN

Withdraw $50

Withdraw $50

Page 6: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Recap on Synchronization• Do some synch mechanisms apply here?

– Prefect clock synch?– Message ordering?

• Total ordered multicast• Causal relationship?

– Mutual exclusion• Centralized or decentrilized?• Distributed • A leader or multiple leaders

Page 7: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.
Page 8: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Outline

• Introduction (what’s it all about)

• Data-centric consistency

• Client-centric consistency

• Replica management

• Consistency protocols

Page 9: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Consistency Models

• Enforcing absolute ordering is too expensive, especially with replication and caching.

• So we need to allow for “mis-ordering”.– We could just do it casually. Tell programmers, “Well, you

always see things in exact order”.– They would say, “What do you mean?”

• So we need an exact, very precise way of specifying the kinds of inconsistencies that the application might see.

• That is the purpose and point of having consistency models.

Page 10: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Data Stores• Consistency is viewed as read/write ops on shared data.• A consistency model is a contract between the processes and

the data store (Shared memory, database, file systems).• A read operation on a data item returns a value of last write.

(nothing can claim as a best solution)

Page 11: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Example:

• A warehouse data item• Distributed reads/writes.

• A middleware provide a function call that ensures the consistency model

Page 12: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Outline

• Data-centric consistencyContinuous Consistency

Sequential Consistency

Causal Consistency

Grouping Operations

Page 13: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.
Page 14: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Continuous Consistency (1)

A 3 ops, B 2 ops

A not see 1 B’s op, the value is 5

B not see 3 A’s op, the MAX value is 6

Page 15: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

So application decides the consistency or tolerate level of inconsistency

Page 16: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Continuous Consistency (2) Choosing the appropriate granularity for a conit.

e.g, two replica can differ in one item

(a) Two updates lead to update propagation.

Page 17: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Continuous Consistency (3)

• But here (b), no update propagation is needed (yet).

Page 18: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Outline

• Data-centric consistencyContinuous Consistency

Sequential Consistency

Causal Consistency

Grouping Operations

Page 19: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Notation

• Processes execute to the right as time progresses.• The notation W1(x)a means that the process wrote the

value ‘a’ to the variable x.• The notation R2(x)a means that the process read the

value ‘a’ from the variable x.• The subscript is often dropped.

Page 20: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Sequential Consistency• The result of any execution is the same as if

the (read and write) operations by all processes on the data store were executed in some sequential order and the operations of each individual process appear in this sequence in the order specified by its program.

– There is some global order.

– We’re talking about interleaved executions: there is some total ordering for all operations taken together.

• not time

Page 21: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Program A

A-OP1A-OP2A-OP3

Program B

B-OP1B-OP2B-OP3

Global Order 1

A-OP1A-OP2A-OP3B-OP1B-OP2B-OP3

Global Order 2

A-OP1B-OP1A-OP2B-OP2B-OP3A-OP3

Global Order 3

A-OP1B-OP1A-OP2B-OP3B-OP2A-OP3

no

Page 22: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Sequential Consistency (3)

Write b happened before write a

-- sequential

Write b happened before write a

-- not sequential

Page 23: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Sequential Consistency (4)

• 6! = 720 possible execution sequence. • 90 valid

Page 24: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Sequential Consistency (5)

• Figure 7-7. Four valid execution sequences for the processes of Fig. 7-6. The vertical axis is time.

Four sample valid interleaving. Reorder with signature: p1, p2, p3

64 of them (not all are allowed) under sequence consistency

The contract says: given the program, these many outputs are correct.

Page 25: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Causal Consistency

• For a data store to be considered causally consistent, it is necessary that the store obeys the following condition:

• Writes that are potentially causally related must be seen by all processes in the same order. Concurrent writes may be seen in a different order on different machines.– causally related: If event b is caused by event a, then a must

be first, then b – Concurrent - not casually related– Weaker than sequential consistency– Deal with writes

Page 26: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Causal Consistency (2)

• W2(x)b and W1(x)c are concurrent• This sequence is allowed with a causally-consistent store,

but not with a sequentially consistent store.

Page 27: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Causal Consistency

• Causally consistent?

a and b are related, so incorrect

a and b are not related, so correct

• W1(x)a and W2(x)b causally related

Page 28: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Outline

• Data-centric consistencyContinuous Consistency

Sequential Consistency

Causal Consistency

Grouping Operations

Page 29: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Grouping Operations

• Instead of read and write ops, what about many operations that applications perform under control of syn– E.g, Mutual exclusion. – How do we handle consistency in Multi Thread programs?– Use locks.– E.g, ENTER_CS and LEAVE_CS

• As viewed by an external, data-centric process, what do locks do?– Many read and write: turn non-atomic operations into atomic

ones (functionally).– In other words, they group them.

Page 30: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Synchronization Variables

• Operations are grouped via synchronization variables (locks).

• Each sync var protects an associated data.• Each kind of sync var has some associated properties.

• Each sync var has a current owner,• A non-owner needs to send msg to the owner for

ownership (and data value)• A sync var can be owned by many processors

nonexclusively.

Page 31: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Grouping Operations• Entry Consistency: Necessary criteria for correct

synchronization:1. An acquire access of a synchronization variable is not allowed to

perform until all updates to guarded shared data have been performed with respect to that process. - Given up an ownership means finishing previous updates

2. Before exclusive mode access to synchronization variable by a process is allowed to perform with respect to that process, no other process may hold the synchronization variable, not even in nonexclusive mode. - Writing must be exclusive

3. After exclusive mode access to a synchronization variable has been performed, any other process’ next nonexclusive mode access to that synchronization variable may not be performed until it has performed with respect to that variable’s owner. - otherwise, no guarantee on consistency if one does a nonexclusive

mode

Page 32: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Grouping Operations (2)

• A valid event sequence for entry consistency.

Page 33: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Summary

• Data-centric consistency– Continues consistency– Consistent ordering of operations

• Sequential consistency• Causal consistency• Grouping operations

Page 34: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Outline

• Introduction (what’s it all about)• Data-centric consistency• Client-centric consistency

– Eventual consistency– Monotonic reads– Monotonic writes– Read your writes– Writes follow reads

• Replica management• Consistency protocols

Page 35: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Weaker Models

• Sometimes strong models are needed, if the result of race conditions are very bad.– Banks

• Sometimes the result of races are just inefficiency, or inconvenience, etc.– DNS, web caches,

• How strong is Orbitz’s model?– If it shows a ticket available, is it really?– How does it prevent two people from reserving the

same seat?

Page 36: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Eventual Consistency

• One kind of weaker model is eventual consistency– It eventually becomes consistent if updates

are not frequent. • Updates eventually propagate to all

– Write-write conflicts are relatively easy to solve (infrequent, by a small portion of nodes)

– Read-write conflicts are handled.

Page 37: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Mobile users (short time)

• How well does EC work for mobile clients? – If replica is location related

• Client-centric is for this. Consistent for a single client.

Page 38: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Outline

• Data-centric consistency

• Client-centric consistency

Goal: perhaps avoid system wide consistency, by concentrating on what specific clients want, instead of what should be maintained by

servers.Eventual Consistency

Monotonic Reads

Monotonic Writes

Read Your Writes

Writes Follow Reads

Page 39: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Data Stores• Local read/write• Eventually propagate to all

Page 40: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Notation

• xi[t] is the version of x at local copy Li at time t.

• Version xi[t] is the result of a series of write operations at Li that took place since initialization. This is WS(xi[t]).

• If operations in WS(xi[t]) have also been performed at local copy Lj at a later time t2, we write WS(xi[t1];xj[t2]).

Page 41: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Monotonic Reads

• A data store is said to provide monotonic-read consistency if the following condition holds:

– If a process reads the value of a data item x any successive read operation on x by that process will always return that same value or a more recent value.

Page 42: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Monotonic Reads

• For one processor p

WS(x1) sent to L2, is monotonic

WS(x1) not sent to L2, not monotonic

Page 43: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.
Page 44: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Monotonic Writes

• In a monotonic-write consistent store, the following condition holds:

– A write operation by a process on a data item x is completed before any successive write operation on x by the same process.

• if needed, a new write will wait for old ones to finish,

Page 45: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Monotonic Writes

WS(x1) sent to L2, is monotonic

WS(x1) not sent to L2, not monotonic

Page 46: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.
Page 47: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Read Your Writes

• A data store is said to provide read-your-writes consistency, if the following condition holds:

• The effect of a write operation by a process on data item x will always be seen by a successive read operation on x by the same process.– No matter where the location of the read is

• Suppose your web browser has a cache.– You update your web page on the server.– Before refresh your browser… – After refresh your browser.– Do you have read-your-writes consistency?

Page 48: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Read Your Writes (2)

W(x1) is part of WS (x1,x2), is read your writes

The read doesn’t include the W(x1), not R-Y-W

• i.e. updating your Web page and guaranteeing that your Web browser shows the newest version instead of its cached copy.

Page 49: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Writes Follow Reads (1)

• A data store is said to provide writes-follow-reads consistency, if the following holds:

• A write operation by a process on a data item x following a previous read operation on x by the same process is guaranteed to take place on the same or a more recent value of x that was read

Example: See reactions to posted articles only if you have the original posting (a read “pulls in” the corresponding write operation).

Respond to blog article, or chatting group.

Page 50: Consistency and Replication. Outline Introduction (what’s it all about) Data-centric consistency Client-centric consistency Replica management Consistency.

Writes Follow Reads (2)

is writes follow reads

Not writes follow reads