Comparing NoSQL Databases for Operational Workloads

How to Compare NoSQL Databases

Tradeoffs between performance and reliability

Ben Engber, Thumbtack Technology


Who are we and what do we want?

Consulting company with focus on scalability
Long background of “no SQL”
Production deployments across many NoSQL vendors
Engineering staff of 50
Ongoing research teams
Advise people on which solutions to use

Advertised Features

MongoDB
Flexibility: JSON documents, dynamic schema
Power: secondary indexes, dynamic queries, rich updates, easy aggregation
Speed/Scaling
Ease of use

Cassandra
Elastic scalability
Linear performance
Flexible, dynamic schema
Multiple datacenter and cloud readiness
Tunable data consistency
Basic transaction support

Sources:
http://www.mongodb.org/about/introduction/
http://www.datastax.com/what-we-offer/products-services/datastax-enterprise/apache-cassandra


Why use NoSQL at all?

“Because I’ve heard of it”
“I want rapid application development”
“I want to do something with Big Data”

Operational Workload
High throughput
Multi-user (concurrency)
Integrity and consistency
Small, simple operations

Analytic Workload
Ad hoc analysis
Batch operation on sets
Map-Reduce
Machine learning, predictive analytics, etc.

Landscape of Operational NoSQL DBs

What’s the difference between an indexed “value” or “column” and a document?

Couchbase 1.x 2.x

Aerospike 2.x 3.x

What are we really asking?

“I want to support a large transaction volume”
“I want to distribute my data tier”
“I want simpler handling of failover”
“I want to scale my data tier horizontally”

Key-Value Stuff

What about other queries?

[Diagram: data partitioned across four groups of shards: A,B,C; D,E,F; G,H,I; J,K,L]
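One way to see why key-value access scales so easily while "other queries" are harder: a key lookup can be routed to a single shard, whereas a query on anything else has to be sent to every shard. The sketch below is only an illustration of that idea. The shard names come from the diagram; the hashing scheme and the function names are assumptions, not how any of the databases discussed here actually route requests.

```python
import hashlib

# Shard names taken from the diagram; everything else here is hypothetical.
SHARDS = list("ABCDEFGHIJKL")


def shard_for(key: str) -> str:
    """Map a key to exactly one shard using a stable hash (illustrative scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:4], "big") % len(SHARDS)]


def shards_to_query(key=None):
    """A key lookup touches one shard; any other query must ask all of them."""
    if key is not None:
        return [shard_for(key)]
    return list(SHARDS)


print(len(shards_to_query("user:42")))  # a key lookup contacts one shard
print(len(shards_to_query()))           # a non-key query contacts all twelve
```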

So, we’re focusing on scale
How should we measure operational data?

The Plan – Start Simple

Test a bunch of databases
Start with a nice simple workload (key value storage)
Use a standard client (YCSB)
Then move on to:
secondary indexes
even more databases
failover

Running a database is easy – running it correctly is hard
Memory sizing, problem sizing, etc.
Consistency tradeoffs
Eviction
Hardware utilization

These databases work in very different ways

CAP Theorem

Consistency / Availability is somewhat academic

Your application needs both
HTTP Caches

These databases are tunable

What to think about instead?

Consistency: immediate/eventual, convergence, isolation
Durability: data loss, failover
Latency
Availability (downtime)

There is a spectrum of choices

[Diagram: a spectrum running from Fast to Reliable]

Most NoSQL databases can sit in multiple places on this spectrum

Choose the databases we hear about most often

Create standard baseline scenarios

Measure raw performance for various scenarios

Examine how they fail over

How do databases achieve these guarantees?

6 nodes
6 “shards”: A, B, C, D, E, F
Replication factor of 3
2 scenarios: “Fast” and “Reliable”

Master-Slave (MySQL, MongoDB)

Node 1: Master A
Node 2: Slave A
Node 3: Slave A
Node 4: Master B
Node 5: Slave B
Node 6: Slave B

Write row A quickly
Client 1: write master, read master

Write row B durably
Client 2: write master and observe, read master

Shard Master (Couchbase)

Node 1: Master A, Slave B,C
Node 2: Master B, Slave C,D
Node 3: Master C, Slave D,E
Node 4: Master D, Slave E,F
Node 5: Master E, Slave F,A
Node 6: Master F, Slave A,B

Write row A quickly
Client: write master, read master

Write row D durably
Client: write master and observe, read master
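The node layout in this diagram follows a simple ring pattern: with 6 nodes, 6 shards and a replication factor of 3, each node is master for one shard and holds copies of the next two. A minimal sketch that reproduces the layout shown on the slide is below. The function name and the modular-arithmetic placement rule are assumptions made for illustration; real systems use their own placement logic.

```python
from string import ascii_uppercase


def ring_layout(num_nodes: int = 6, replication_factor: int = 3):
    """Return {node_number: [master_shard, replica_shard, ...]} for a ring layout."""
    shards = ascii_uppercase[:num_nodes]  # one shard per node: A..F
    layout = {}
    for i in range(num_nodes):
        # Node i+1 holds its own shard plus the next (replication_factor - 1) shards.
        layout[i + 1] = [shards[(i + k) % num_nodes] for k in range(replication_factor)]
    return layout


layout = ring_layout()
for node, (master, *slaves) in layout.items():
    print(f"Node {node}: Master {master}, Slave {','.join(slaves)}")
```

Every shard ends up on exactly three nodes, which is what lets any single node fail without losing the only copy of a shard.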

Tunable Quorum (Cassandra, Riak)

Node 1: A,B,C
Node 2: B,C,D
Node 3: C,D,E
Node 4: D,E,F
Node 5: E,F,A
Node 6: F,A,B

Client 2: write one, read one
Client 4: write all, read one
Client 5: write one, read all
Client 6: write quorum, read quorum

Read/Write row A quickly
Read/Write row D consistently
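The four client settings above differ in whether a read is guaranteed to overlap the most recent write. The usual rule of thumb is that reads see the latest write when the number of replicas read (R) plus the number written (W) exceeds the replication factor (N). The sketch below applies that rule to the four combinations on the slide with N = 3. The level names are lowercase labels chosen for this example, not a client API.

```python
N = 3  # replication factor used throughout the slides


def replicas_needed(level: str, n: int = N) -> int:
    """Number of replicas that must acknowledge an operation at a given level."""
    return {"one": 1, "quorum": n // 2 + 1, "all": n}[level]


def read_sees_latest_write(write_level: str, read_level: str, n: int = N) -> bool:
    """True when every read set must overlap every write set (R + W > N)."""
    return replicas_needed(write_level, n) + replicas_needed(read_level, n) > n


for write_level, read_level in [
    ("quorum", "quorum"),  # Client 6
    ("one", "all"),        # Client 5
    ("all", "one"),        # Client 4
    ("one", "one"),        # Client 2
]:
    overlap = read_sees_latest_write(write_level, read_level)
    print(f"write {write_level:6} read {read_level:6} -> overlap guaranteed: {overlap}")
```

Only the write-one/read-one combination gives up the overlap guarantee, which is why it is the fast setting and the other three are the consistent ones.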

Transactional Consensus (Aerospike, FoundationDB, Cassandra 2.0)

Node 1: A,B,C
Node 2: B,C,D
Node 3: C,D,E
Node 4: D,E,F
Node 5: E,F,A
Node 6: F,A,B

Read/Write row A quickly
Client 2: fire and forget

Read/Write row D ACIDly
Client 4: transactional
Client 5: transactional

Quick and Dirty Conclusion

Systems like MongoDB and Couchbase trade speed for Durability
Systems like Cassandra, Riak, and Aerospike trade speed for Consistency
Systems like Aerospike and FoundationDB trade speed for ACID (or parts of it)

Consistency

“In distributed data systems like Cassandra, [consistency] usually means that once a writer has written, all readers will see that write.”

ACIDity

Row-level (CAS)
Multi-key
Long running transactions

Old Value

New Value

Old Value
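Row-level CAS (compare-and-set) is the smallest of the three levels of ACIDity listed above: a write succeeds only if the row still holds the value the writer last read. The sketch below shows the idea with an in-memory stand-in for a row. The `Row` class and its methods are hypothetical and exist only for this example; they are not the client API of any database covered in this talk.

```python
import threading


class Row:
    """In-memory stand-in for a single database row (hypothetical, for illustration)."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def read(self):
        with self._lock:
            return self._value

    def compare_and_set(self, expected, new) -> bool:
        """Write `new` only if the row still holds `expected`; report success."""
        with self._lock:
            if self._value != expected:
                return False  # someone else changed the row first
            self._value = new
            return True


row = Row("old value")
first = row.compare_and_set("old value", "new value")     # succeeds
second = row.compare_and_set("old value", "other value")  # fails: row already changed
print(first, second, row.read())
```

A writer whose compare-and-set fails would normally re-read the row and retry, which is how row-level atomicity is obtained without multi-key transactions.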

Reliability Spectrum

Configuration          Replication   Consistency   Data loss on    Availability    Data loss on
                       model         model         node failure    on no quorum    replica set failure
Aerospike (fast)       async         eventual      yes             available       25%
Cassandra (fast)       async         eventual      yes             available       25%
Couchbase (fast)       async         immediate     yes             available       25%
MongoDB (fast)         async         immediate     yes             available       50%
Aerospike (reliable)   sync          immediate     no              available       25%
Cassandra (reliable)   sync          immediate     no              unavailable     25%
MongoDB (reliable)     sync          immediate     no              unavailable     50%

Create Baselines

Fast
Asynchronous replication
Asynchronous writes to disk
Data set fits in RAM
Immediate or Eventual Consistency

Reliable
Synchronous replication
Synchronous or asynchronous writes to disk
Data set larger than RAM
Immediate Consistency (+)

Performance Tests

1. Install a database on a 4-node cluster (replication factor of 2)
2. Load a sizable dataset (500M rows) to SSD (“reliable”)
3. Determine maximum load
4. Perform a stepwise load for latency
5. Repeat for read-heavy and balanced read-write
6. Repeat steps 3-5 for a dataset that fits into RAM (“fast”)
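Steps 3 to 5 can be sketched roughly as follows: once the maximum load is known, the test drives the cluster at a series of rising target rates and records latency at each one, using either a balanced or a read-heavy mix of operations. This is only an outline of the idea, not the benchmark harness itself (the tests used YCSB). The function names, the number of steps and the 50% and 95% read fractions are assumptions made for this example.

```python
import random


def load_steps(max_throughput: int, steps: int = 4):
    """Target rates (ops/sec) rising evenly up to the measured maximum."""
    return [max_throughput * i // steps for i in range(1, steps + 1)]


def operation_mix(count: int, read_fraction: float, seed: int = 0):
    """Generate a reproducible sequence of 'read' and 'update' operations."""
    rng = random.Random(seed)
    return ["read" if rng.random() < read_fraction else "update" for _ in range(count)]


targets = load_steps(200_000)          # e.g. a measured maximum of 200,000 ops/sec
balanced = operation_mix(1000, 0.50)   # assumed 50/50 read-update mix
read_heavy = operation_mix(1000, 0.95) # assumed 95/5 read-update mix

print(targets)
print(balanced.count("read"), read_heavy.count("read"))
```

Measuring latency at several fixed rates below the maximum matters because average latency at full saturation says little about how a database behaves at the load an application will normally run.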

Maximum Throughput

[Chart: SSD / Synchronous. Maximum throughput for Balanced and Read-Heavy workloads, axis 0 to 350,000: Aerospike, Cassandra, MongoDB]

[Chart: RAM / Asynchronous. Maximum throughput for Balanced and Read-Heavy workloads, axis 0 to 1,000,000: Aerospike, Cassandra, MongoDB, Couchbase 1.8, Couchbase 2.0]

Latency Scenarios

SSD / Synchronous

[Chart: Balanced Workload Read Latency (Full view). Average latency, ms (0 to 10), against throughput, ops/sec (0 to 200,000): Aerospike, Cassandra, MongoDB]

[Chart: Balanced Workload Update Latency (Full view). Average latency, ms (0 to 16), against throughput, ops/sec (0 to 200,000): Aerospike, Cassandra, MongoDB]

RAM / Asynchronous

[Chart: Balanced Workload Read Latency (Full view). Average latency, ms (0 to 20), against throughput, ops/sec (0 to 400,000): Aerospike, Couchbase 1.8, Couchbase 2.0, Cassandra, MongoDB]

[Chart: Balanced Workload Update Latency (Full view). Average latency, ms (0 to 8), against throughput, ops/sec (0 to 400,000): Aerospike, Couchbase 1.8, Couchbase 2.0, Cassandra, MongoDB]

Cluster Availability

Things to consider:
Replication delay
Cluster downtime
Data loss

Things to test:
Graceful shutdown
kill -9
Split brain

[Charts: cluster of 4 nodes, 100% load, RAM, asynchronous (“Fast”). Panels: MongoDB, Aerospike, Couchbase, Cassandra]

[Charts: cluster of 4 nodes, 75% load, RAM, asynchronous (“Fast”). Panels: Aerospike, Cassandra 6 node (QUORUM)]

[Charts: cluster of 4 nodes, 75% load, disk, synchronous (“Reliable”). Panel: Cassandra 4 node (ALL)]

[Chart: Downtimes on node down. Min/max downtime (ms), axis 0 to 14,000: Aerospike, Cassandra, Couchbase, MongoDB]

Do they fail over?

[Chart: Downtime on node restore. Median downtime (ms), axis 0 to 35,000, at 50%, 75% and 100% of max throughput: Aerospike, Cassandra, Couchbase, MongoDB]
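The downtime figures in these charts are in milliseconds. One plausible way to derive such a number from a client-side log, sketched below, is to take the longest gap between consecutive successful operations around the failure. This is an assumption about how downtime could be measured, not a description of the methodology behind the charts; the function name, the threshold and the sample timestamps are all made up for illustration.

```python
def downtime_ms(success_times_ms, normal_gap_ms=100):
    """Longest gap between consecutive successful operations.

    Gaps no longer than `normal_gap_ms` are treated as ordinary spacing
    between requests and reported as zero downtime.
    """
    gaps = [later - earlier for earlier, later in zip(success_times_ms, success_times_ms[1:])]
    longest = max(gaps, default=0)
    return longest if longest > normal_gap_ms else 0


# Made-up timestamps (ms) of successful operations; a node is killed after t=30.
timestamps = [0, 10, 20, 30, 2530, 2540, 2550]
print(downtime_ms(timestamps))  # 2500
```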

Takeaways

NoSQL databases are converging feature-wise
For operational workloads, key-value is king
The Speed – Reliability spectrum is complex
Consider your application’s likely needs
Pick the best usability features within this window

Questions or Advice?

Thumbtack Technology
Ben [email protected]
@bengber
http://www.thumbtack.net

Benchmarks and detailed discussion of methodology at:
http://thumbtack.net/whitepapers
