NoSQL Database Performance Testing
Denis Nelubin
Thumbtack Technology Inc.
Ultra-HighPerformanceNoSQLBenchmarking.pdf
A.K.A.: Citrusleaf
Creator: Aerospike, August 2012
License: Proprietary, Community edition
Category: Key-value, Complex data types + Secondary indexes (from v.3.0)
A.K.A.: CouchDB + Membase
Creator: Couchbase, Inc. (CouchOne + Membase), January 2012
License: Apache 2.0, Proprietary (Enterprise edition)
Category: Key-value, Document + Secondary indexes (from v.2.0)
A.K.A.: Apache Cassandra
Creator: Facebook, July 2008
License: Apache 2.0
Category: Key-value, BigTable, Column-oriented
A.K.A.: Mongo
Creator: 10gen (MongoDB, Inc.), March 2010
License: AGPL, Commercial license
Category: Document-oriented
A.K.A.: Yahoo! Cloud Serving Benchmark
Creator: Yahoo! Research, June 2010
License: Apache 2.0
Category: NoSQL benchmark
YCSB
Data set
● key: "user" + 64-bit Fowler-Noll-Vo hash
● value: 10 fields of random data
Load
● insert N records
Run
● update and read on N records by the key
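The key scheme on this slide can be sketched in a few lines. This is a minimal illustration, assuming the standard 64-bit FNV-1a variant applied to the eight little-endian bytes of the record number; YCSB's own hashing routine may differ in detail, and the names `fnv1a_64` and `make_key` are hypothetical.

```python
# Sketch of a YCSB-style key: "user" + 64-bit Fowler-Noll-Vo hash.
# Assumption: standard FNV-1a (xor, then multiply); not copied from YCSB.

FNV_OFFSET_BASIS_64 = 0xCBF29CE484222325
FNV_PRIME_64 = 0x100000001B3
MASK_64 = 0xFFFFFFFFFFFFFFFF


def fnv1a_64(data: bytes) -> int:
    """Return the 64-bit FNV-1a hash of the given bytes."""
    h = FNV_OFFSET_BASIS_64
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME_64) & MASK_64  # keep the value within 64 bits
    return h


def make_key(record_number: int) -> str:
    """Build a record key from a sequential record number (hypothetical helper)."""
    return "user" + str(fnv1a_64(record_number.to_bytes(8, "little")))


if __name__ == "__main__":
    for n in range(3):
        print(make_key(n))
```

Hashing the record number scatters sequentially generated records across the key space, so inserts do not all land on neighbouring keys.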
YCSB
Does NOT do, does NOT check:
● join
● secondary index
● where clause
● partial update
Why YCSB?
● applicable to any database
● popular
● de-facto standard
Hardware
Servers:
4 * (8 * Xeon + 32GB RAM + 4 * 120GB SSD)
Clients:
8 * (4 * i5 + 4GB RAM)
Single client is not enough
Hardware: CPU
8 cores Xeon ≈ 4 cores i5 *
*(unproved)
Hardware: Network
1 Gbps is not enough
1 Gbit/sec / 1 KB of data ≈ 100 000 ops/sec
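The estimate above is simple arithmetic and can be checked directly. The sketch below computes only the theoretical ceiling, assuming each operation moves exactly one payload over the link and ignoring protocol overhead; the function name `max_ops_per_sec` is hypothetical.

```python
# Upper bound on operations per second for a saturated network link.
# Assumption: one payload per operation, no TCP/IP or protocol overhead.

def max_ops_per_sec(link_bits_per_sec: float, payload_bytes: int) -> float:
    """Return how many payloads of the given size fit through the link per second."""
    bytes_per_sec = link_bits_per_sec / 8
    return bytes_per_sec / payload_bytes


if __name__ == "__main__":
    one_gbit = 1_000_000_000
    print(round(max_ops_per_sec(one_gbit, 1024)))  # 1 KB = 1024 bytes
```

The raw ceiling for 1 KB payloads comes out somewhat above 100 000 ops/sec; headers, acknowledgements and replication traffic push the practical figure down toward the slide's round number.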
Single IO queue on single CPU is not enough
# cat /proc/interrupts | grep eth
90: 0 0 0 0 IR-PCI-MSI-edge eth0
91: 275107859 0 0 0 IR-PCI-MSI-edge eth0-TxRx-0
92: 227858040 0 0 0 IR-PCI-MSI-edge eth0-TxRx-1
93: 242082684 0 0 0 IR-PCI-MSI-edge eth0-TxRx-2
94: 230651008 0 0 0 IR-PCI-MSI-edge eth0-TxRx-3
95: 217273950 0 0 0 IR-PCI-MSI-edge eth0-TxRx-4
96: 240149262 0 0 0 IR-PCI-MSI-edge eth0-TxRx-5
97: 194736879 0 0 0 IR-PCI-MSI-edge eth0-TxRx-6
98: 270089080 0 0 0 IR-PCI-MSI-edge eth0-TxRx-7
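The listing above shows every eth0 queue delivering its interrupts to the first CPU column. A small script can make that imbalance explicit. This is a sketch that parses text in the `/proc/interrupts` layout; `SAMPLE` is abbreviated from the listing above, and the function name `per_cpu_interrupts` is hypothetical.

```python
# Sum network interrupts per CPU column from /proc/interrupts-style text.

SAMPLE = """\
 90:          0          0          0          0  IR-PCI-MSI-edge  eth0
 91:  275107859          0          0          0  IR-PCI-MSI-edge  eth0-TxRx-0
 92:  227858040          0          0          0  IR-PCI-MSI-edge  eth0-TxRx-1
 98:  270089080          0          0          0  IR-PCI-MSI-edge  eth0-TxRx-7
"""


def per_cpu_interrupts(text: str, device: str = "eth0") -> list:
    """Return total interrupt counts per CPU for IRQ lines of the given device."""
    totals = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or not fields[-1].startswith(device):
            continue  # skip the header line and unrelated IRQs
        counts = []
        for field in fields[1:]:
            if not field.isdigit():
                break  # reached the interrupt type column
            counts.append(int(field))
        if len(counts) > len(totals):
            totals.extend([0] * (len(counts) - len(totals)))
        for cpu, count in enumerate(counts):
            totals[cpu] += count
    return totals


if __name__ == "__main__":
    for cpu, total in enumerate(per_cpu_interrupts(SAMPLE)):
        print(f"CPU{cpu}: {total}")
```

On a live machine the same function can be fed the contents of `/proc/interrupts`; a healthy multi-queue setup would show the counts spread across the CPU columns.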
Hardware: SSD
Overprovisioning
● hdparm
● fdisk
http://en.wikipedia.org/wiki/Write_amplification
OS (GNU/Linux)
ulimit
● nofile > 4k
RAID (RAID 0?)
● mdadm
● lvm
Read-ahead
● minimal
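Two of the settings above are easy to verify from a script before a run. The sketch below assumes a Linux host: it reads the open-file limit through Python's `resource` module and the block-device read-ahead from sysfs. The device name `sda` and the helper names are hypothetical.

```python
# Pre-flight check of OS settings mentioned on the slide (Linux only).
import resource
from pathlib import Path
from typing import Optional

MIN_NOFILE = 4096  # the slide asks for nofile > 4k


def nofile_is_enough(soft_limit: int, minimum: int = MIN_NOFILE) -> bool:
    """Return True if the soft open-file limit exceeds the minimum."""
    return soft_limit == resource.RLIM_INFINITY or soft_limit > minimum


def read_ahead_kb(device: str) -> Optional[int]:
    """Return the read-ahead size in KB for a block device, or None if unknown."""
    path = Path("/sys/block") / device / "queue" / "read_ahead_kb"
    return int(path.read_text()) if path.exists() else None


if __name__ == "__main__":
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("nofile soft limit:", soft, "ok" if nofile_is_enough(soft) else "too low")
    print("read-ahead (KB) for sda:", read_ahead_kb("sda"))  # "sda" is an assumption
```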
Test
Data sets:
● RAM: 50M * 100 byte ≈ 5GB
● SSD: 500M * 100 byte ≈ 50GB
● replication factor = 2
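The data set sizes translate into a per-node footprint once replication is taken into account. The sketch below is a rough estimate, assuming the four server nodes from the hardware slide, an even distribution of records, and no per-record overhead or indexes; the function name is hypothetical.

```python
# Rough per-node data volume for the two data sets.
# Assumptions: 4 server nodes, even distribution, raw record size only.

def data_per_node_gb(records: int, record_bytes: int,
                     replication_factor: int, nodes: int) -> float:
    """Return the raw data volume per node in gigabytes (10**9 bytes)."""
    total_bytes = records * record_bytes * replication_factor
    return total_bytes / nodes / 1e9


if __name__ == "__main__":
    print("RAM data set:", data_per_node_gb(50_000_000, 100, 2, 4), "GB per node")
    print("SSD data set:", data_per_node_gb(500_000_000, 100, 2, 4), "GB per node")
```

Under these assumptions the smaller set fits comfortably in 32GB of RAM per server, while the larger one has to be served from SSD.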
Workloads:
● Heavy Write: 50% update / 50% read
● Mostly Read: 5% update / 95% read
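The two workload mixes amount to a weighted coin flip per operation. The following is a minimal sketch of such an operation chooser, not YCSB's actual implementation; the names `WORKLOADS`, `next_operation` and `simulate` are hypothetical.

```python
# Choose "update" or "read" per operation according to a workload mix.
import random

# Fraction of operations that are updates; the rest are reads.
WORKLOADS = {"heavy_write": 0.50, "mostly_read": 0.05}


def next_operation(update_fraction: float, rng: random.Random) -> str:
    """Pick the next operation type for the given update fraction."""
    return "update" if rng.random() < update_fraction else "read"


def simulate(workload: str, operations: int, seed: int = 42) -> dict:
    """Count how many updates and reads a run of the given length produces."""
    rng = random.Random(seed)  # seeded for repeatable runs
    counts = {"update": 0, "read": 0}
    for _ in range(operations):
        counts[next_operation(WORKLOADS[workload], rng)] += 1
    return counts


if __name__ == "__main__":
    for name in WORKLOADS:
        print(name, simulate(name, 100_000))
```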
Consistency:
● Sync replication
● Async replication
Insert, RAM
Insert, SSD
Heavy Update, RAM
Heavy Update, SSD
Heavy Update, Latency
Mostly Read, RAM
Mostly Read, SSD
Speed
Insert
Couchbase*
Aerospike
Cassandra
MongoDB
Update
Couchbase*
Aerospike
Cassandra
MongoDB
Read
Aerospike
Couchbase*
MongoDB
Cassandra
* in memory or on smaller data set
Failover test
● 50%, 75%, 100% of max throughput
● Heavy Update
● 10min warmup
● kill -9
● 10min without one node
● service start
● 20min after restore
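The failover procedure is a fixed timeline with two events between its phases. The sketch below only lays out that schedule; it does not kill or start anything, and the phase labels and function name are hypothetical.

```python
# Timeline of one failover run, built from the phase durations on the slide.

# (phase name, duration in minutes); events happen at the phase boundaries.
PHASES = [
    ("warmup", 10),
    ("one node down (after kill -9)", 10),
    ("after restore (after service start)", 20),
]


def timeline(phases):
    """Return a list of (start_minute, end_minute, phase_name) tuples."""
    result = []
    start = 0
    for name, minutes in phases:
        result.append((start, start + minutes, name))
        start += minutes
    return result


if __name__ == "__main__":
    for start, end, name in timeline(PHASES):
        print(f"{start:>2}-{end:>2} min: {name}")
```

Each run therefore lasts 40 minutes, and it is repeated at 50%, 75% and 100% of the maximum throughput.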
Aerospike, Sync, SSD, 50%
Cassandra, Async, SSD, 50%
Couchbase, Async, RAM, 100%
MongoDB, Async, SSD, 50%
Replication
MongoDB Cassandra Couchbase/Aerospike
Data storing reliability
Cassandra: archive
MongoDB: live data
Aerospike: fast cache, eviction
Couchbase: cache (async only)
Capacity
Cassandra: packed archive
MongoDB: unpacked live data
Aerospike: indexes in RAM + SSD
Couchbase*: metadata and cache in RAM
* was able to take only 200M records
Deployment
Couchbase: four clicks
Aerospike: powerful config
Cassandra: config+config+calculator
MongoDB: shards of replica-sets
Managing
Couchbase: superduperwebconsole
MongoDB: commands and docs *
Cassandra: exists **
Aerospike: raw ***
* use MMS (MongoDB Management Service)
** use DataStax products
*** try AMC (Aerospike Monitoring Console)
Unique features
Aerospike
● SSD support, speed
Couchbase
● good web console, easy deployment
Cassandra
● writes faster than reads ;)
MongoDB
● documents
Troublesomes
Aerospike
● eviction
● secret config options
● long start
Couchbase
● big data
● strange client behaviour
● long start
● long shutdown
Troublesomes
Cassandra
● need to think about the config ;)
MongoDB
● mongos has to be restarted
● replica set survives too well ;)
When to use: Aerospike
Big Fast Cache
When to use: Couchbase
In-memory Cache with Persistence
When to use: Cassandra
Big-Data Archive
When to use: MongoDB
Universal DB for Web
Not only YCSB
● From scratch, inspired by YCSB
● More tests
○ Secondary indexes (cardinality, overhead)
○ Aggregation (average value)
○ Collection data types (stack, array, wide row)
Thanks
Denis Nelubin <[email protected]>
Alexey Remnev <[email protected]>