1

"Do you want a database that goes down or one that serves wrong data?"

Why distributed databases suck, and what to do about it - Regaining consistency

2

■ NoSQL team lead at Trifork, Aarhus, Denmark

■ Working with databases since '97

■ NoSQL since 2008

■ Danish Shared Medication Record

■ Migrating data from MySQL to Riak

■ Developing Riak clients

■ NoSQL architect on various international projects

About the speaker

Rune Skou Larsen

3

■ Part 1: Working with eventual consistency

– NoSQL persistence landscape

– What is consistency

– Eventual vs. sequential consistency

– Conflicts and how to handle them

– CRDTs

– Consistency models of current OLTP databases

■ Part 2: Stronger consistency in distributed, fault-tolerant systems

– Consensus

– Delta consistency

– Dynamic delta consistency

Agenda

4

Polyglot persistence landscape

■ In-memory: Neo4j, VoltDB, Redis

■ OLTP: Riak, Cassandra, Voldemort, CouchBase

■ Analytics: Hadoop

■ EasyDB: MongoDB, CouchDB

5

■ Redundancy

■ Availability

■ Scaling

■ Getting closer to your users

Why distributed databases?

6

■ Consistency: All nodes see the same data at the same time

■ Eventual consistency → Autonomous consistency

■ Sequential consistency → Bureaucratic consistency

What is consistency?

7

■ Eventual consistency

Support disconnected operations

– Better to read a stale value than nothing

– Better to save writes somewhere than nothing

Potentially anomalous application behavior

– Stale reads and conflicting writes…

■ Sequential consistency

Requires highly available connections

Not suitable for certain scenarios:

– Disconnected clients (e.g. your phone)

– Apps might prefer potential inconsistency to loss of availability

When to be Consistent with what

8

Conflicting updates

[Figure: User A writes value A and User B writes value B; the replicas exchange the values through asynchronous synchronization.]

9

■ Assign a timestamp to all objects

■ Simple but fragile – depends on precise synchronization of clocks

■ Data is lost

Last Write Wins (LWW)

[Figure: User A writes A at t=t0 and User B writes B at t=t1; the replicas then synchronize asynchronously.]
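
The LWW rule above can be sketched in a few lines of Python. This is an illustration only, not any database's implementation: the `Versioned` type, the `lww_merge` function, and the tie-break on the value are assumptions made for the example. It shows both properties from the slide: the losing write disappears, and a skewed clock picks the wrong winner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Versioned:
    """A value tagged with the wall-clock time of the write (hypothetical type)."""
    value: str
    timestamp: float  # seconds, taken from the writer's own clock


def lww_merge(a: Versioned, b: Versioned) -> Versioned:
    """Keep the write with the larger timestamp; the other one is discarded."""
    # Ties are broken on the value so every replica picks the same winner.
    return max(a, b, key=lambda v: (v.timestamp, v.value))


# User A writes at t0, user B writes at t1 > t0.
a = Versioned("A", timestamp=100.0)
b = Versioned("B", timestamp=101.0)
winner = lww_merge(a, b)  # B wins; A's update is silently lost

# If A's clock runs two seconds fast, A wins even though B wrote later.
a_skewed = Versioned("A", timestamp=102.0)
skewed_winner = lww_merge(a_skewed, b)
```

The merge is symmetric, so replicas converge regardless of the order in which they see the two writes; what they converge on depends entirely on the clocks.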

10

Google Spanner

‘As a distributed-systems developer, you’re taught from — I want to say childhood — not to trust time. What we did is find a way that we could trust time — and understand what it meant to trust time.’

— Andrew Fikes

11

■ Assign a vector clock to objects

■ Ancestors are removed – descendants remain

Detecting conflicts using Vector Clocks (1)

[Figure: value A carries vclock a:1 and value B carries vclock a:1,b:1; the replicas synchronize asynchronously. B descends from A.]

12

■ Spawn siblings when the causality chain is broken

Detecting conflicts using Vector Clocks (2)

[Figure: value A carries vclock a:1 and value B carries vclock b:1; the replicas synchronize asynchronously. Neither clock descends from the other.]
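
The two vector clock cases can be told apart with a simple comparison. The sketch below is a generic illustration, not Riak's implementation; the names `descends` and `compare` and the dict representation of a clock are assumptions for the example.

```python
from typing import Dict

VClock = Dict[str, int]  # actor id -> number of updates seen from that actor


def descends(a: VClock, b: VClock) -> bool:
    """True if clock `a` has seen everything clock `b` has seen."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a: VClock, b: VClock) -> str:
    """Classify the relation of clock `a` to clock `b`."""
    a_ge_b, b_ge_a = descends(a, b), descends(b, a)
    if a_ge_b and b_ge_a:
        return "equal"
    if a_ge_b:
        return "descendant"   # a replaces b
    if b_ge_a:
        return "ancestor"     # b replaces a
    return "concurrent"       # causality chain is broken: keep both as siblings


case_1 = compare({"a": 1}, {"a": 1, "b": 1})  # slide (1): A is an ancestor of B
case_2 = compare({"a": 1}, {"b": 1})          # slide (2): siblings
```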

13

■ Keep both values as siblings

■ User does the merging

■ The only solution if you need to do "intelligent" merging or start outside processes.

Semantic resolution

[Figure: User A writes A and User B writes B; after asynchronous synchronization both values are kept as siblings, and they are merged into C.]
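
One possible shape of semantic resolution, sketched under stated assumptions: siblings are flat records, fields that do not disagree are merged automatically, and fields that do disagree are handed back to the user. The `resolve_siblings` function and the sample contact records are hypothetical and only illustrate the idea of application-level merging.

```python
from typing import Dict, List, Tuple

Record = Dict[str, str]


def resolve_siblings(siblings: List[Record]) -> Tuple[Record, List[str]]:
    """Merge sibling records field by field.

    Returns the automatically merged record plus the names of the fields
    whose values differ between siblings and need a human decision.
    """
    merged: Record = {}
    conflicts: List[str] = []
    for field in sorted({f for s in siblings for f in s}):
        values = {s[field] for s in siblings if s.get(field)}
        if len(values) == 1:
            merged[field] = values.pop()   # all siblings agree, or only one has it
        elif len(values) > 1:
            conflicts.append(field)        # real conflict: ask the user
    return merged, conflicts


sibling_a = {"name": "Ann", "phone": "111"}
sibling_b = {"name": "Ann", "phone": "222", "email": "ann@example.org"}
merged, conflicts = resolve_siblings([sibling_a, sibling_b])
```

Here `merged` plays the role of the partially built value C, and `conflicts` lists what still has to be decided outside the database.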

14

■ The data structure intrinsically merges objects

■ Limited applicability

Conflict-free Replicated Data Types

[Figure: User A writes A and User B writes B; after asynchronous synchronization both replicas hold the merged value AB.]

15

■ Convergent (CvRDT)

– State is replicated

– Moves towards one value

■ Commutative (CmRDT)

– Operations on the state are replicated

– The order of operations is insignificant: a*b = b*a

■ CvRDT and CmRDT can emulate each other

Conflict-free Replicated Data Types
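
As an illustration of the convergent (state-based) flavour, here is a minimal grow-only counter. It is a textbook-style sketch, not taken from the talk; the function names and the dict layout are assumptions. Each replica only increments its own slot, and merging takes the element-wise maximum, so merges can arrive in any order and any number of times.

```python
from typing import Dict

Counts = Dict[str, int]  # replica id -> increments made at that replica


def increment(state: Counts, replica: str, amount: int = 1) -> Counts:
    """Return a new state in which `replica` has counted `amount` more."""
    new = dict(state)
    new[replica] = new.get(replica, 0) + amount
    return new


def merge(a: Counts, b: Counts) -> Counts:
    """Element-wise maximum: commutative, associative and idempotent."""
    return {r: max(a.get(r, 0), b.get(r, 0)) for r in a.keys() | b.keys()}


def value(state: Counts) -> int:
    """The counter's value is the sum over all replicas."""
    return sum(state.values())


# Two replicas increment independently, then exchange state.
state_a = increment(increment({}, "a"), "a")  # replica a counted twice
state_b = increment({}, "b")                  # replica b counted once
converged = merge(state_a, state_b)
```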

16

CRDT examples: G-set and 2P-Set

[Figure: a tombstone ("RIP"), illustrating how removed elements are marked.]
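
A 2P-Set can be built from two G-Sets (grow-only sets): one for additions and one for tombstones. The sketch below is a plain Python illustration with a hypothetical `TwoPhaseSet` class, not a library API. Because a tombstone is never deleted, a removed element cannot be added back.

```python
from dataclasses import dataclass, field
from typing import Set


@dataclass
class TwoPhaseSet:
    added: Set[str] = field(default_factory=set)    # G-Set of additions
    removed: Set[str] = field(default_factory=set)  # G-Set of tombstones

    def add(self, item: str) -> None:
        self.added.add(item)

    def remove(self, item: str) -> None:
        # Only an element that has been added can be removed.
        if item in self.added:
            self.removed.add(item)

    def __contains__(self, item: str) -> bool:
        return item in self.added and item not in self.removed

    def merge(self, other: "TwoPhaseSet") -> "TwoPhaseSet":
        # Both halves are grow-only, so merging is a plain set union.
        return TwoPhaseSet(self.added | other.added, self.removed | other.removed)


replica_1 = TwoPhaseSet()
replica_1.add("x")
replica_1.add("y")

replica_2 = TwoPhaseSet()
replica_2.add("x")
replica_2.remove("x")        # leaves a tombstone for "x"

merged = replica_1.merge(replica_2)
merged.add("x")              # no effect: the tombstone wins
```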

17

■ CRDTs: Consistency without concurrency control. Institut National de Recherche en Informatique et en Automatique (INRIA), 2009

■ A comprehensive study of Convergent and Commutative Replicated Data Types. Institut National de Recherche en Informatique et en Automatique (INRIA), 2011

■ Sean Cribbs - Eventually Consistent Data Structures. http://vimeo.com/43903960

CRDT References

18

■ Last Write Wins

– Easy

– Data is lost

– Depends on timestamps

■ Semantic resolution

– Requires application/user involvement

– Generic solution

■ Conflict-free Data Types

– Data structure has built-in convergence

– Limited ability to model real-world problems

Methods for handling conflicts

19

■ Last write wins: Riak, CouchDB/CouchBase, Cassandra

■ User-resolvable conflicts: Riak, Voldemort, CouchDB/CouchBase (but unreliable)

■ Active anti-entropy: Riak (soon)

■ Hinted handoff with sloppy quorums (highest write availability): Riak, Cassandra

■ Strong consistency (read your own writes + strict quorums): Riak, Voldemort, Cassandra, CouchBase, MongoDB, traditional SQL databases (Oracle, MySQL, etc.)

Consistency models of OLTP databases

20

"Consistency pH"

ACID (Atomic, Consistent, Isolated, Durable) vs. BASE (Basically Available, Soft state, Eventual consistency)

[Figure: a scale running between availability and consistency.]

21

Consensus

■ Protocol for agreeing on a decision

■ More than half the nodes must be in agreement (⌊n/2⌋ + 1)

■ Tolerates the remaining nodes being down, slow, or out of date.

[Figure: availability-consistency scale with consensus marked on it.]

22

Example: Ensuring idempotence using consensus

■ Communication protocols are unreliable and requests can be resent even when they have already completed.

■ Clients assign requestID.

■ If a request is resent, we should return the first answer instead of processing it again.

■ vnodes serialize writes in Riak.

■ We use Riak with N=3 and PW=quorum to ensure strict quorums. (*)

(*) Riak has a bug in the P checks, but we have deemed it insignificant for our use.
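
The request-ID idea can be sketched independently of the storage layer. In the example below a plain dict stands in for the replicated store, so it shows only the replay logic and none of the fault tolerance that strict quorums provide; `IdempotentProxy`, `create_prescription`, and the payloads are hypothetical names, and no Riak client API is used.

```python
from typing import Callable, Dict, List


class IdempotentProxy:
    """Replays the stored answer when a request ID has been seen before."""

    def __init__(self, handler: Callable[[str], str]) -> None:
        self._handler = handler
        # Stand-in for the replicated store (Riak with N=3, PW=quorum in the
        # slides). A plain dict is an assumption and is NOT fault tolerant.
        self._answers: Dict[str, str] = {}

    def handle(self, request_id: str, payload: str) -> str:
        if request_id in self._answers:
            return self._answers[request_id]   # resent request: first answer
        answer = self._handler(payload)
        self._answers[request_id] = answer     # record the answer before replying
        return answer


calls: List[str] = []


def create_prescription(payload: str) -> str:
    """Hypothetical backend operation that must not run twice."""
    calls.append(payload)
    return f"created #{len(calls)}: {payload}"


proxy = IdempotentProxy(create_prescription)
first = proxy.handle("xyz", "aspirin")
again = proxy.handle("xyz", "aspirin")   # client resends the same request
```

The check-then-store step is only safe if writes for one request ID are serialized, which is the role the slide assigns to vnodes in Riak.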

23

Example: Ensuring idempotence using consensus

[Figure: the doctor system and the pharmacy system send requests through request idempotence proxy instances.]

24

Example: Ensuring idempotence using consensus

[Figure: the doctor system and the pharmacy system send a request with reqid=xyz through a request idempotence proxy instance; reqid=xyz is shown on three nodes, one of which is marked "Down".]

We tolerate one node down at a time.

Assuming n <= nodes:

n=3: quorum=2, maxdown=1

n=4: quorum=3, maxdown=1

n=5: quorum=3, maxdown=2

n=6: quorum=4, maxdown=2

n=7: quorum=4, maxdown=3
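
The quorum table follows directly from the majority rule: a quorum is one more than half the replicas (rounded down), and the nodes that may be down are whatever is left. A small sketch, with function names chosen for this example:

```python
from typing import List, Tuple


def quorum(n: int) -> int:
    """Smallest number of replicas that is more than half of n."""
    return n // 2 + 1


def max_down(n: int) -> int:
    """Replicas that can be unavailable while a quorum is still reachable."""
    return n - quorum(n)


# (n, quorum, maxdown) for n = 3..7, matching the table on the slide.
table: List[Tuple[int, int, int]] = [(n, quorum(n), max_down(n)) for n in range(3, 8)]
```

Note that going from an odd n to the next even n raises the quorum without raising the number of tolerated failures.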

25

Example: Ensuring idempotence using consensus

[Figure: the doctor system and the pharmacy system send a request with reqid=xyz through a request idempotence proxy instance; reqid=xyz is shown on all three nodes.]

26

Delta consistency

■ An update will propagate through the system, and all replicas will be consistent after a fixed time period δ

■ Easy for customers to understand

[Figure: availability-consistency scale showing consensus and delta consistency.]

27

Example: Delta Consistency with prescription replication

We guarantee that prescriptions are replicated from Oracle to Riak within 20 minutes.

[Figure: prescription server, Oracle master, Oracle MView, three Riak nodes, and drug medication server; the replication path is labeled "Max 20 minutes".]

28

Dynamic Delta consistency

■ Same as delta consistency, but users can directly monitor how far behind we are

■ Define one or more authorities, and track how far behind they are.

■ Every response is annotated with information on how up to date the data is for each authority.

■ Useful when the delay is normally low (sub-second) but can be high in times of degraded service.

■ Useful for CQRS or temporarily offline systems

■ Pro/Con: Users have to understand what data delay means.

[Figure: availability-consistency scale showing consensus, delta consistency, and dynamic delta consistency.]
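
One way to attach "updatedness" to every response is an envelope that carries one timestamp per authority. The sketch below is an assumption about how this might look; the `Response` type, the `annotate` helper, and the authority names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Response:
    data: dict
    updated_until: Dict[str, float]  # authority name -> epoch seconds

    def staleness(self, now: float) -> Dict[str, float]:
        """Seconds by which the data lags behind each authority at time `now`."""
        return {name: max(0.0, now - ts) for name, ts in self.updated_until.items()}


def annotate(data: dict, sync_state: Dict[str, float]) -> Response:
    """Wrap a result together with a snapshot of the current sync state."""
    return Response(data=data, updated_until=dict(sync_state))


# Hypothetical sync state: the last point in time each authority is covered up to.
sync_state = {"oracle": 1000.0, "dc2": 990.5}
response = annotate({"drug": "aspirin"}, sync_state)
lag = response.staleness(now=1001.0)
```

The client decides what to do with the lag: show it, warn about it, or refuse to act on data that is too old.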

29

■ When beginning a sync, note the time on the authority

■ After completing a sync, store the time of the last sync on one or both sides.

■ Expose the updatedness of the data.

Example: Dynamic Delta Consistency using mobile device

[Figure: a mobile device, a relay server, and Riak nodes, connected by a link labeled "RiakSync".]
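
The bookkeeping for a sync can be sketched as follows. The start time is the safe value to store, because changes made on the authority while the sync is running may not have been copied. `SyncTracker` and the injected clock and transfer callables are hypothetical names for this illustration.

```python
from typing import Callable, Optional


class SyncTracker:
    """Remembers up to which point in time local data reflects the authority."""

    def __init__(self, authority_clock: Callable[[], float]) -> None:
        self._authority_clock = authority_clock
        self.updated_until: Optional[float] = None  # None until the first sync

    def sync(self, transfer: Callable[[], None]) -> None:
        started = self._authority_clock()  # note the time BEFORE copying
        transfer()                         # if this raises, nothing is stored
        self.updated_until = started       # data is complete up to `started`


# A fake clock makes the example deterministic.
ticks = iter([100.0, 160.0])
tracker = SyncTracker(lambda: next(ticks))
tracker.sync(lambda: None)                 # succeeds at authority time 100.0


def broken_transfer() -> None:
    raise ConnectionError("link dropped")


try:
    tracker.sync(broken_transfer)          # fails: the old value is kept
except ConnectionError:
    pass
```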

30

■ Commands trigger async events

■ Events update views

■ Expose the timestamp of the oldest waiting event as updated_until on the view, or the current time if no events are queued.

Example: Dynamic Delta Consistency using CQRS

[Figure: an event log feeding a view.]
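
The updated_until rule for a CQRS view fits in a few lines. This is a sketch under assumptions: events are (timestamp, key, value) tuples in a FIFO queue, and the `View` class is invented for the example.

```python
from collections import deque
from typing import Deque, Dict, Tuple

Event = Tuple[float, str, int]  # (timestamp, key, new value)


class View:
    def __init__(self) -> None:
        self.rows: Dict[str, int] = {}
        self.pending: Deque[Event] = deque()  # events not yet applied, oldest first

    def enqueue(self, event: Event) -> None:
        self.pending.append(event)

    def apply_next(self) -> None:
        """Apply the oldest waiting event to the view."""
        _, key, value = self.pending.popleft()
        self.rows[key] = value

    def updated_until(self, now: float) -> float:
        """Timestamp of the oldest waiting event, or `now` if none are queued."""
        return self.pending[0][0] if self.pending else now


view = View()
view.enqueue((10.0, "x", 1))
view.enqueue((12.0, "y", 2))
before = view.updated_until(now=15.0)   # oldest waiting event
view.apply_next()
middle = view.updated_until(now=15.0)   # next waiting event
view.apply_next()
after = view.updated_until(now=15.0)    # queue is empty, so the view is current
```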

31

■ The setup is multiple datacenters: every datacenter replicates with every other datacenter at intervals ("full sync").

■ When a full sync is done, save the sync data in each datacenter. Example: DC1 is done syncing with DC2; the sync started at time t.

■ When a datacenter is internally consistent (no pending handoffs, for instance), it can expose the time of its sync with each of the other authorities as an updated_until timestamp.

Example: Dynamic Delta Consistency using multiple authorities

[Figure: three datacenters, DC1, DC2, and DC3, replicating with each other.]

32

Thank you!

Rune Skou Larsen