Mixing Causal Consistency and Asynchronous...

Post on 29-Apr-2018

Mixing Causal Consistency and Asynchronous Replication for Large Neo4j Clusters

Dr. Jim Webber

Chief Scientist, Neo4j

Leads to a social graph

Graphs underpin ML to defeat some cancers

NASA’s Orion mission to Mars: 2 years shaved from planning schedule

Motivation: Why do we need clusters of Neo4j?

Massive Throughput

Data Redundancy

High Availability

Error!

503: Service Unavailable

3.0: Massive Throughput, Data Redundancy, High Availability

3.1 Causal Clustering: Bigger Clusters, Consensus Commit, Built-in load balancing

Motivation: Can you reason about consistency?

[Slides: a shopper is told “You need to log in to continue your purchase!” and can choose Register or Login. The shopper registers with username jim_w via Create Account, then immediately tries to log in with the same credentials. Instead of “Login Successful” and the purchase, the result is “No account found! Try again”.]

Three critical issues: Fault Tolerance, Scale, and Correctness

Design Trade-off: Availability, Reliability

Roles for safety and scale: Divide and conquer complexity

Core

• Small group of Neo4j databases
• Fault-tolerant Consensus Commit
• Responsible for data safety

Writing to the Core Cluster

[Slides: the Neo4j Driver sends CREATE (:User {...}) to the Neo4j Cluster and, once the write has been handled by the Core, receives Success.]

Raft Protocol: Non-Blocking Consensus for Humans

Raft Protocol

https://github.com/ongardie/raftscope

Raft in a Nutshell

• Raft keeps logs tied together (geddit?)
• Logs contain entries for both the data and cluster membership
• Entries are appended and subsequently committed if a simple majority agrees
• Implication: the majority agrees with the log as proposed
• Anyone can call an election: highest term (logical clock) wins, followed by highest committed, followed by highest appended
• Appended but uncommitted entries can be truncated, but this is safe (the transaction is aborted)
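The commit rule in the bullets above can be sketched in a few lines. This is only an illustration under stated assumptions: each member reports the highest log index it has appended, and Raft's extra restriction about the leader's current term is ignored. The class and method names are hypothetical and are not taken from Neo4j.

```java
import java.util.Arrays;

/** Hypothetical illustration of committing on a simple majority; not Neo4j's implementation. */
final class RaftCommitSketch {

    /**
     * Returns the highest log index that a simple majority of members has appended.
     * appendedIndexes holds one entry per member, the leader included.
     */
    static long majorityCommitIndex(long[] appendedIndexes) {
        long[] sorted = appendedIndexes.clone();
        Arrays.sort(sorted); // ascending order
        int majority = sorted.length / 2 + 1;
        // At least `majority` members hold an index >= the value at this position.
        return sorted[sorted.length - majority];
    }

    public static void main(String[] args) {
        long[] appended = {14, 11, 13, 11, 11};
        System.out.println("commit index: " + majorityCommitIndex(appended));
    }
}
```

With five members whose logs end at 14, 11, 13, 11 and 11, every member holds entries up to 11 but only two hold anything beyond it, so 11 is the highest index a majority agrees on. Entries 12 to 14 remain uncommitted and could still be truncated.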

@heidiann360

Consensus Log → Committed Transactions → Updated Graph

[Figure: Neo4j Raft implementation, showing the log entries held by each member.]

• Consensus log: stores both committed and uncommitted transactions. Uncommitted entries may differ between members.
• Transaction log: the same transactions appear in the same order on all members. Transactions are only appended to the transaction log when committed according to Raft.
• Transactions are applied, updating the graph.
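The relationship between the two logs can be modelled with a toy example. It is a sketch under simplifying assumptions (a single member, transactions represented as strings, commit decisions supplied from outside), and every name in it is hypothetical rather than part of Neo4j.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical single-member model of the consensus log feeding the transaction log. */
final class LogPipelineSketch {
    final List<String> consensusLog = new ArrayList<>();   // committed and uncommitted entries
    final List<String> transactionLog = new ArrayList<>(); // committed entries only, in order
    private int committedCount = 0;                        // consensus entries known to be committed

    /** Appends an entry to the consensus log; it is not yet committed. */
    void append(String tx) {
        consensusLog.add(tx);
    }

    /** Marks the first `count` consensus entries as committed and copies the new ones across. */
    void commitUpTo(int count) {
        int limit = Math.min(count, consensusLog.size());
        for (int i = committedCount; i < limit; i++) {
            transactionLog.add(consensusLog.get(i));
        }
        committedCount = Math.max(committedCount, limit);
    }

    /** Drops the uncommitted tail; committed entries are never removed. */
    void truncateUncommitted() {
        while (consensusLog.size() > committedCount) {
            consensusLog.remove(consensusLog.size() - 1);
        }
    }

    public static void main(String[] args) {
        LogPipelineSketch member = new LogPipelineSketch();
        member.append("tx-0");
        member.append("tx-1");
        member.append("tx-2");
        member.commitUpTo(2);
        member.truncateUncommitted();
        System.out.println(member.consensusLog + " " + member.transactionLog);
    }
}
```

Only committed entries ever reach the transaction log, which is why truncating the uncommitted tail of the consensus log cannot undo anything that has been applied to the graph.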

Read Replicas

• For massive query throughput
• Read-only replicas
• Not involved in Consensus Commit
• Disposable, suitable for auto-scaling

Propagating updates to the Read Replicas

[Slides: the Neo4j Driver sends a Write to the Neo4j Cluster, and the update is then propagated to the Read Replicas.]

Reading from the Read Replicas

[Slide: the Neo4j Driver sends a Read to the Neo4j Cluster.]

Core: updating the graph
Read Replicas: querying the graph (queries, analysis, reporting)

ESTATE=$(neo-workbench estate add database -p Local -b core-block -s 3)
neo-workbench estate add database -p Local -b edge-block -s 10 $ESTATE

neo-workbench database install -m Core \
    --package-uri file:///Users/jim/Downloads/neo4j-enterprise-3.1.1-unix.tar.gz \
    -b core-block $ESTATE

neo-workbench database install -m Read_Replica \
    --package-uri file:///Users/jim/Downloads/neo4j-enterprise-3.1.1-unix.tar.gz \
    -b edge-block $ESTATE

neo-workbench database start $ESTATE

Building an App: Computer science meets software dev

[Diagram: App Server, Neo4j Driver, Bolt protocol]

Java

<dependency>
    <groupId>org.neo4j.driver</groupId>
    <artifactId>neo4j-java-driver</artifactId>
</dependency>

Python

pip install neo4j-driver

.NET

PM> Install-Package Neo4j.Driver

JavaScript

npm install neo4j-driver

https://neo4j.com/developer/language-guides

bolt://

GraphDatabase.driver( "bolt://aServer" )

bolt+routing://

GraphDatabase.driver( "bolt+routing://aCoreServer" )

Bootstrap: specify any core server to route load across the whole cluster

[Diagram: Application Server and Neo4j Driver using bolt+routing://, with a graph of people: Max, Jim, Jane, Mark.]

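Routed write statements

import org.neo4j.driver.v1.*;
import static org.neo4j.driver.v1.Values.parameters;

// Bootstrap from any core server.
Driver driver = GraphDatabase.driver( "bolt+routing://aCoreServer" );

// WRITE access mode: the driver routes these statements to the Core cluster.
try ( Session session = driver.session( AccessMode.WRITE ) )
{
    try ( Transaction tx = session.beginTransaction() )
    {
        tx.run( "MERGE (user:User {userId: {userId}})",
                parameters( "userId", userId ) );
        tx.success();
    }
}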

Routed read queries

// Bootstrap from any core server.
Driver driver = GraphDatabase.driver( "bolt+routing://aCoreServer" );

// READ access mode: the driver routes these queries to a server suitable for reads.
try ( Session session = driver.session( AccessMode.READ ) )
{
    try ( Transaction tx = session.beginTransaction() )
    {
        tx.run( "MATCH (user:User {userId: {userId}})-[*]-(:Product) RETURN *",
                parameters( "userId", userId ) );
        tx.success();
    }
}
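To make the idea of routing by access mode concrete, here is a toy router. It is a sketch only: it assumes a single core address that accepts writes and a fixed list of read replica addresses, whereas a real driver would obtain and refresh this information from the cluster. The class, enum, and server names are all hypothetical and are not the driver's internals.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical routing table: writes go to one core address, reads rotate over replicas. */
final class RoutingSketch {
    enum Mode { READ, WRITE }

    private final String writer;        // assumed: the core member currently accepting writes
    private final List<String> readers; // assumed: read replica addresses
    private final AtomicInteger next = new AtomicInteger();

    RoutingSketch(String writer, List<String> readers) {
        if (readers.isEmpty()) {
            throw new IllegalArgumentException("at least one reader is required");
        }
        this.writer = writer;
        this.readers = new ArrayList<>(readers);
    }

    /** Picks a server address for the requested access mode. */
    String route(Mode mode) {
        if (mode == Mode.WRITE) {
            return writer;
        }
        // Round-robin over the readers; floorMod keeps the index valid even after overflow.
        int i = Math.floorMod(next.getAndIncrement(), readers.size());
        return readers.get(i);
    }

    public static void main(String[] args) {
        List<String> replicas = new ArrayList<>();
        replicas.add("replica-1");
        replicas.add("replica-2");
        RoutingSketch router = new RoutingSketch("core-1", replicas);
        System.out.println(router.route(Mode.WRITE));
        System.out.println(router.route(Mode.READ));
    }
}
```

Round-robin is just one possible load-balancing policy, chosen here because it is easy to follow.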

Consistency models: Can you read what you write?

Cluster members are slightly “ahead” of or “behind” each other.

[Figure: the transaction logs of five members. One member has applied transactions 0 to 11; the other four have applied only 0 to 10.]

• If I query the member that has applied transaction 11, I’ll see all updates from all committed transactions.
• If I query a member that has only reached transaction 10, I won’t see the updates from transaction 11.

[Slides: the user registers as jim_w via Create Account and immediately tries to log in, but gets “No account found! Try again”. A few moments later, the same login succeeds and the purchase can proceed.]

Q: Why didn’t this work?
A: Eventual Consistency

[Slides: App Server A handles Create Account and its driver sends CREATE (:User), which commits as transaction 11 on one cluster member while the other members are still at transaction 9 or 10. App Server B then handles Login and its driver sends MATCH (:User) to a member that has not yet received transaction 11, so no account is found.]

Bookmark

• Session token
• String (for portability)
• Opaque to application
• Represents ultimate user’s most recent view of the graph
• More capabilities to come
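The role a bookmark plays can be illustrated with a small model. This is a sketch under loud assumptions: the bookmark is treated as a plain transaction number, although the real bookmark is an opaque string, and the slides do not say whether a lagging server waits or the read is sent elsewhere. All names are hypothetical and this is not Neo4j driver or server code.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of bookmark checks; not Neo4j driver or server code. */
final class BookmarkSketch {

    /** A cluster member that has applied every transaction up to lastAppliedTx. */
    static final class Member {
        final String name;
        final long lastAppliedTx;

        Member(String name, long lastAppliedTx) {
            this.name = name;
            this.lastAppliedTx = lastAppliedTx;
        }

        /** A bookmarked read is safe here only if this member has caught up to the bookmark. */
        boolean canServe(long bookmark) {
            return lastAppliedTx >= bookmark;
        }
    }

    /** Returns the first member that has caught up to the bookmark, or null if none has yet. */
    static Member firstCaughtUp(List<Member> members, long bookmark) {
        for (Member m : members) {
            if (m.canServe(bookmark)) {
                return m;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Member> members = Arrays.asList(
                new Member("replica-1", 10),
                new Member("replica-2", 11));
        Member chosen = firstCaughtUp(members, 11);
        System.out.println(chosen == null ? "nobody has caught up yet" : chosen.name);
    }
}
```

The point of the model is the comparison in canServe: a read that carries the bookmark from the user's last write is never answered from a view of the graph older than that write.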

Let’s try again, with Causal Consistency

[Slides: App Server A handles Create Account; its driver sends CREATE (:User), which commits as transaction 11, and bookmark 11 is handed on to App Server B. App Server B handles Login; its driver sends MATCH (:User) together with bookmark 11, and the query is served by a member that has applied transaction 11, so the account is found.]

Obtain bookmark

try ( Session session = driver.session( AccessMode.WRITE ) )
{
    try ( Transaction tx = session.beginTransaction() )
    {
        tx.run( "CREATE (user:User {userId: {userId}, passwordHash: {passwordHash}})",
                parameters( "userId", userId, "passwordHash", passwordHash ) );
        tx.success();
    }
    // Read the bookmark after the transaction has closed; pass it on with the user's next request.
    String bookmark = session.lastBookmark();
}

Use a bookmark

try ( Session session = driver.session( AccessMode.READ ) )
{
    // Starting the transaction with the bookmark ensures the read sees the earlier write.
    try ( Transaction tx = session.beginTransaction( bookmark ) )
    {
        tx.run( "MATCH (user:User {userId: {userId}}) RETURN *",
                parameters( "userId", userId ) );
        tx.success();
    }
}

Takeaways

• Different roles for safety and scale

• Supports large clusters, multi-DC

• Drivers work with cluster to route to best instances

• All wrapped in causal consistency: you always read (at least) your own writes

Thanks for listening

Dr. Jim Webber
Chief Scientist, Neo4j