Cassandra summit LWTs

58
LWTs in practice Christopher Batey @chbatey The Last Pickle

Transcript of Cassandra summit LWTs

Page 1: Cassandra summit  LWTs

LWTs in practice

Christopher Batey@chbatey

The Last Pickle

Page 2: Cassandra summit  LWTs
Page 3: Cassandra summit  LWTs
Page 4: Cassandra summit  LWTs

Overview

Review of Cassandra’s consistency model

What are LWTs?

Why do we need them?

How do they work?

How do you use them?

Page 5: Cassandra summit  LWTs

Writes

Client A

C

Page 6: Cassandra summit  LWTs

Concurrent writes

Client A

C

Client B

C

Page 7: Cassandra summit  LWTs

QUORUM based consistency

Client A

C

Client B

C

Page 8: Cassandra summit  LWTs

QUORUM based consistency

Client A

C

Client B

C

Page 9: Cassandra summit  LWTs

QUORUM based consistency

Client A

C

Client B

C

Page 10: Cassandra summit  LWTs

Voucher example

CREATE TABLE vouchers_mutable (

name text PRIMARY KEY,

sold int

)

Page 11: Cassandra summit  LWTs

Read and write race condition with Quorum

● Client A read the number of ticket sales at 299● Client B read the number of ticket sales at 299● Client A sells ticket 300● Client B sells ticket 300

Page 12: Cassandra summit  LWTs

Compare and set

Page 13: Cassandra summit  LWTs

Enter Light Weight Transactions

● Client A read the number of ticket sales at 299● Client B read the number of ticket sales at 299● Client A sells ticket 300 if total sold is 299● Client B sells ticket 300 if total sold is 299

Page 14: Cassandra summit  LWTs

Examples

Page 15: Cassandra summit  LWTs

Uniqueness

CREATE TABLE users (

user_name text PRIMARY KEY,

email text,

password text

)

INSERT INTO users (user_name, password, email )

VALUES ( 'chbatey', 'different',

'[email protected]' ) IF NOT EXISTS;

Page 16: Cassandra summit  LWTs

Finite resource

CREATE TABLE vouchers_mutable (

name text PRIMARY KEY,

sold int

)

UPDATE vouchers_mutable SET sold = 1

WHERE name = 'free tv' IF sold = 0;

Page 17: Cassandra summit  LWTs

Immutable events

CREATE TABLE vouchers (

name text,

when timeuuid,

who text,

PRIMARY KEY (name, when)

);

Page 18: Cassandra summit  LWTs

Batches + LWTs

CREATE TABLE vouchers (

name text,

when timeuuid,

sold int static,

who text,

PRIMARY KEY (name, when)

);

INSERT INTO vouchers (name, sold) VALUES

( 'free tv', 0);

Page 19: Cassandra summit  LWTs

Batches + LWTs

BEGIN BATCH

UPDATE vouchers SET sold = 1 WHERE name = 'free tv' IF sold = 0

INSERT INTO vouchers (name, when, who) VALUES ( 'free tv', now(), 'chris')

APPLY BATCH ;

[applied]

-----------

True

Page 20: Cassandra summit  LWTs

Batches + LWTs

BEGIN BATCH

UPDATE vouchers SET sold = 1 WHERE name = 'free tv' IF sold = 0

INSERT INTO vouchers (name, when, who) VALUES ( 'free tv', now(), 'charlie')

APPLY BATCH ;

[applied] | name | when | sold

-----------+---------+------+------

False | free tv | null | 1

Page 21: Cassandra summit  LWTs

How they work

Page 22: Cassandra summit  LWTs

LWTs be puzzling

1. Why does a LWT have two consistency levels?2. What is this SERIAL consistency I keep hearing about?3. What are SERIAL reads?4. Why does my LWT fail but the value still get written?5. Why are they so damn slow?

Page 23: Cassandra summit  LWTs

Consensus for a partition

Page 24: Cassandra summit  LWTs

Consensus for a partition

Page 25: Cassandra summit  LWTs

Stages of a LWT

● Prepare and promise● Read existing value● Propose and accept● Commit

Page 26: Cassandra summit  LWTs

Consensus for a partition

Promised

I want the value to be 5, as long as it currently 4

prepare proposeAccepted

commitCommittedCondition

met

read

Page 27: Cassandra summit  LWTs

Prepare and promise 1

Client

partition table promised accepted committed

Prepare 1

Page 28: Cassandra summit  LWTs

Prepare and promise 1

Client

partition table promised accepted committed

A vouchers 1

Page 29: Cassandra summit  LWTs

Prepare and promise 2

Client

partition table promised accepted committed

A vouchers 1Prepare 2

Page 30: Cassandra summit  LWTs

Prepare and promise 2

Client

partition table promised accepted committed

A vouchers 2

Page 31: Cassandra summit  LWTs

Prepare and promise - rejection

Client

partition table promised accepted committed

A vouchers 2Prepare 1

Page 32: Cassandra summit  LWTs

Prepare and promise - rejection

Client

partition table promised accepted committed

A vouchers 2

Rejected- ClientRequest.CASWrite.contentions

Prepare 1

Page 33: Cassandra summit  LWTs

Prepare and promise - trace

Parsing insert into users (user_name, password, email ) values ( 'chbatey', 'chrisrocks', '[email protected]' ) if not exists; [SharedPool-Worker-1] | 2016-08-22 12:38:44.132000 | 127.0.0.1 | 1125

Sending PAXOS_PREPARE message to /127.0.0.3 [MessagingService-Outgoing-/127.0.0.3] | 2016-08-22 12:38:44.141000 | 127.0.0.1 | 10414

Sending PAXOS_PREPARE message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2016-08-22 12:38:44.142000 | 127.0.0.1 | 10908

Promising ballot fb282190-685c-11e6-71a2-e0d2d098d5d6 [SharedPool-Worker-1] | 2016-08-22 12:38:44.147000 | 127.0.0.3 | 4325

Page 34: Cassandra summit  LWTs

Prepare and promise - trace

Promising ballot fb282190-685c-11e6-71a2-e0d2d098d5d6 [SharedPool-Worker-1] | 2016-08-22 12:38:44.147000 | 127.0.0.3 | 4325

Promising ballot fb282190-685c-11e6-71a2-e0d2d098d5d6 [SharedPool-Worker-3] | 2016-08-22 12:38:44.166000 | 127.0.0.1 | 35282

Page 35: Cassandra summit  LWTs

Read

LOCAL_SERIAL => LOCAL_QUORUM

SERIAL => QUORUM

ClinetRequest.CASWrite.conditionNotMet

Page 36: Cassandra summit  LWTs

Propose and accept

Client

partition table promised accepted committed

A vouchers 1Propose 1

Page 37: Cassandra summit  LWTs

Propose and accept

Client

partition table promised accepted committed

A vouchers 1 1Propose 1

Page 38: Cassandra summit  LWTs

Propose and accept - rejection

Client

partition table promised accepted committed

A vouchers 2Propose 1

Rejected- ClientRequest.CASWrite.contentions

Page 39: Cassandra summit  LWTs

Propose and accept - trace

Sending PAXOS_PROPOSE message to /127.0.0.2 [MessagingService-Outgoing-/127.0.0.2] | 2016-08-22 12:38:44.196000 | 127.0.0.1 | 65606

Sending PAXOS_PROPOSE message to /127.0.0.1 [MessagingService-Outgoing-/127.0.0.1] | 2016-08-22 12:38:44.196000 | 127.0.0.1 | 65606

PAXOS_PROPOSE message received from /127.0.0.1 [MessagingService-Incoming-/127.0.0.1] | 2016-08-22 12:38:44.197000 | 127.0.0.1 | 65986

Sending PAXOS_PROPOSE message to /127.0.0.3 [MessagingService-Outgoing-/127.0.0.3] | 2016-08-22 12:38:44.197000 | 127.0.0.1 | 66139

Page 40: Cassandra summit  LWTs

Propose and accept - trace

Accepting proposal Commit(fb282190-685c-11e6-71a2-e0d2d098d5d6, [lwts.users] key=chbatey columns=[[] | [email password]]\n Row: EMPTY | [email protected], password=chrisrocks) [SharedPool-Worker-2] | 2016-08-22 12:38:44.199000 | 127.0.0.1 | 67804

Page 41: Cassandra summit  LWTs

Commit

● The normal consistency is now used for the commit

Page 42: Cassandra summit  LWTs

Consensus for a partition

Promised

I want the value to be 5, as long as it currently 4

prepare proposeAccepted

commitCommittedCondition

met

read

Page 43: Cassandra summit  LWTs

SERIAL reads

o.a.c.s.StorageProxy.readWithPaxos

● For a single partition● Runs a prepare and ensures all replicas have the latest commit● Then runs the read at either Q or LQ

Page 44: Cassandra summit  LWTs

Write timestamps

Client A

C

2

1

3

C

Page 45: Cassandra summit  LWTs

Write timestamps

Client A

C

2

1

3

C

Page 46: Cassandra summit  LWTs

Write timestamps

Client A

C

2

1

3

C

T = 1

T = 2

Page 47: Cassandra summit  LWTs

Some numbers

Page 48: Cassandra summit  LWTs

Setup

4 * i2xLarge

RF = 3

10 clients trying to buy 1000 vouchers each - 10k total operations

Contention: all clients buying the same voucher (same partitoin)

No contention: all clients after different vouchers (different partition)

Page 49: Cassandra summit  LWTs

Mutable field

CREATE TABLE vouchers_mutable (

name text PRIMARY KEY,

sold int

) UPDATE vouchers_mutable SET sold = 1

WHERE name = 'free tv' IF sold = 0;

UPDATE vouchers_mutable SET sold = 1

WHERE name = 'free tv'1)

2)

Page 50: Cassandra summit  LWTs

Histogram

Page 51: Cassandra summit  LWTs

Histogram

Page 52: Cassandra summit  LWTs

Batches

CREATE TABLE vouchers (

name text,

when timeuuid,

sold int static,

who text,

PRIMARY KEY (name, when)

);

BEGIN BATCH

UPDATE vouchers SET sold = 1 WHERE name = 'free tv' IF sold = 0

INSERT INTO vouchers (name, when, who) VALUES ( 'free tv', now(), 'charlie')

APPLY BATCH ;

Page 53: Cassandra summit  LWTs

Histogram

Page 54: Cassandra summit  LWTs

Histogram

Page 55: Cassandra summit  LWTs

Summary

LWT Batch Contention Incorrect results 99th %ile (milliseconds)

N N Y 87% Lost 48

Y N Y 0% Lost1% Unknown 81% CNM

191

Y N N 0% Lost0% Unknown0% CNM

52

Y Y Y 0% Lost<1% Unknown82% CNM

192

Page 56: Cassandra summit  LWTs

Summary

Page 57: Cassandra summit  LWTs

Summary

● LWTs are expensive● They are more complex and less mature than the regular read and write path● Might be a lot easier than bringing in a second technology

Page 58: Cassandra summit  LWTs

Questions?Christopher Batey

@chbatey

The Last Pickle