Let the Tiger Roar! - MongoDB 3.0 + WiredTiger

Let the Tiger Roar! MongoDB 3.0 Jon Rangel Consulting Engineer, MongoDB [email protected] @j0nrang3l

Agenda

• MongoDB 3.0

• Pluggable Storage Engine API

• Storage Engines

– MMAPv1

– WiredTiger

– WT vs MMAPv1

• Recap of Improvements

MongoDB 3.0

http://www.pixelresort.com/wp-content/uploads/2013/06/3point0.jpg

A lot of good things come with 3

• USB

• 3G

• Tricycle

MongoDB 3.0

• Pluggable Storage Engine API

• Storage Engines

• Large Replica Sets

• Big Polygon

• Security Enhancements – SCRAM

• Audit Trail

• Simplified Operations – Ops Manager

• Tools Rewrite

MongoDB 3.0 is a bag full of goodies!

Check out the 3.0 Release Notes: http://docs.mongodb.org/v3.0/release-notes/3.0/

Storage Engine

http://files.ecomagination.com/wp-content/uploads/2012/08/PowerHaul-Engine-Green_844x680.jpg

How does MongoDB persist data?

• <= MongoDB 2.6

– One unique mechanism using Memory Mapped Files

– "mmapv1" Storage Engine

• MongoDB 3.0 has a few more options

– mmapv1 – default

– wiredTiger

– (in_memory – experimental only)

Pluggable Storage Engine API

http://www.livingincebuforums.com/ipb/uploads/monthly_10_2011/post-198-0-67871200-1318223706.jpg

Storage Engine API

• Lets you plug in different storage engines

– Different workloads require different performance characteristics

– mmapv1 is not ideal for all workloads

– More flexibility

• Can mix storage engines on the same replica set/sharded cluster

• Opportunity to integrate further (HDFS, native encryption, hardware-optimized engines …)

MMAPv1

https://angrytechnician.files.wordpress.com/2009/05/memory.jpg

MMAPv1

• Improved concurrency control

• Great performance on read-heavy workloads

• Data & indexes memory-mapped into virtual address space

• Data access is paged into RAM

• OS evicts using LRU

• More frequently used pages stay in RAM
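The LRU behaviour above can be sketched with a toy page cache. This is only an illustration: the `PageCache` class and its capacity are hypothetical, and the real eviction is done by the operating system's virtual memory manager, not by code like this.

```python
from collections import OrderedDict


class PageCache:
    """Toy LRU cache standing in for the RAM that holds memory-mapped pages."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pages = OrderedDict()  # page id -> contents, least recently used first

    def access(self, page_id):
        """Touch a page; return True on a hit, False if it had to be paged in."""
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            return True
        if len(self.pages) >= self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        self.pages[page_id] = f"contents of page {page_id}"
        return False


cache = PageCache(capacity=2)
hits = [cache.access(page) for page in [1, 2, 1, 3, 1, 2]]
```

Page 1 is touched often enough to stay resident, while pages 2 and 3 keep pushing each other out.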

WiredTiger

What is WiredTiger?

• Storage engine company founded by BerkeleyDB alums

• Recently acquired by MongoDB

• Available as a storage engine option in MongoDB 3.0

Why is WiredTiger Awesome?

• Document-level concurrency

• Disk Compression

• Consistency without journaling

• Better performance on certain workloads

– write heavy

Improving Concurrency

• Before 2.2 – Global Lock

• 2.2 – Database-level Locking

• 3.0 MMAPv1 – Collection-level Locking

• 3.0 WT – Document-level

– Writes no longer block all other writes

– Higher level of concurrency leads to more CPU usage
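A rough way to picture the difference in granularity: two writes block each other only if they need the same locked resource. The sketch below is a simplification with hypothetical function names; it ignores intent locks and the fact that WiredTiger relies on optimistic concurrency control rather than a literal per-document lock.

```python
def lock_resource(granularity, db, collection, doc_id):
    """Return the resource a write must hold at the given lock granularity."""
    if granularity == "global":
        return ("global",)
    if granularity == "database":
        return (db,)
    if granularity == "collection":
        return (db, collection)
    if granularity == "document":
        return (db, collection, doc_id)
    raise ValueError(f"unknown granularity: {granularity}")


def writes_conflict(granularity, write_a, write_b):
    """Two writes conflict when they map to the same locked resource."""
    return lock_resource(granularity, *write_a) == lock_resource(granularity, *write_b)


# Hypothetical writes: (database, collection, document id)
order_1 = ("shop", "orders", 1)
order_2 = ("shop", "orders", 2)
user_1 = ("shop", "users", 1)
```

With these sample writes, `order_1` and `order_2` conflict at every level except "document", and `order_1` and `user_1` stop conflicting from "collection" downwards.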

Compression

• WT uses snappy compression by default

• Data is compressed on disk

• 2 supported compression algorithms

– snappy: default. Good compression, relatively low overhead

– zlib: better compression, higher CPU overhead

• Indexes are compressed using prefix compression

– Allows compression in memory
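The idea behind prefix compression can be sketched in a few lines: each key in sorted order stores only how many leading characters it shares with the previous key, plus the remaining suffix. This shows the general technique only, not WiredTiger's actual on-disk format; the function names and sample keys are made up for illustration.

```python
import os


def prefix_compress(sorted_keys):
    """Encode each key as (chars shared with the previous key, remaining suffix)."""
    encoded = []
    previous = ""
    for key in sorted_keys:
        shared = len(os.path.commonprefix([previous, key]))
        encoded.append((shared, key[shared:]))
        previous = key
    return encoded


def prefix_decompress(encoded):
    """Rebuild the original keys from (shared length, suffix) pairs."""
    keys = []
    previous = ""
    for shared, suffix in encoded:
        key = previous[:shared] + suffix
        keys.append(key)
        previous = key
    return keys


keys = ["user:1000", "user:1001", "user:1002", "user:2000"]
encoded = prefix_compress(keys)
```

Because neighbouring index keys tend to share long prefixes, the suffixes stay short, and the encoded form can be kept in memory as-is.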

Consistency without Journaling

• MMAPv1 uses write-ahead log (journal) to

guarantee consistency

• WT doesn't have this need: no in-place updates

– Write-ahead log committed at checkpoints

• 2GB or 60sec by default – configurable!

– No journal commit interval: writes are written to the journal as they come in

– Better for insert-heavy workloads

• Replication guarantees the durability
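The default checkpoint rule (2GB or 60sec, whichever comes first) can be written as a small predicate. This is a sketch of the rule as stated on the slide, not of WiredTiger internals; the function and constant names are hypothetical, and treating 2GB as 2 * 1024**3 bytes is an assumption.

```python
CHECKPOINT_WAIT_SECS = 60            # "60sec" default from the slide
CHECKPOINT_LOG_BYTES = 2 * 1024**3   # "2GB" default from the slide (assumed binary GB)


def checkpoint_due(secs_since_last, log_bytes_since_last,
                   wait_secs=CHECKPOINT_WAIT_SECS,
                   log_bytes=CHECKPOINT_LOG_BYTES):
    """Return True once either the time or the log-size threshold is reached."""
    return secs_since_last >= wait_secs or log_bytes_since_last >= log_bytes
```

For example, `checkpoint_due(10, 0)` is false, while `checkpoint_due(60, 0)` and `checkpoint_due(5, 2 * 1024**3)` are both true.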

MMAPv1 vs. WT

How to run MMAPv1

• MMAPv1 is the default

jrangel@rangel:~$ mongod --dbpath data

2015-02-16T23:41:19.758+0000 I CONTROL [initandlisten] MongoDB starting : pid=66913 port=27017 dbpath=data 64-bit host=rangel

2015-02-16T23:41:19.759+0000 I CONTROL [initandlisten] db version v3.0.0-rc8

...

2015-02-16T23:41:19.759+0000 I CONTROL [initandlisten] options: { storage: { dbPath: "data" } }

2015-02-16T23:41:19.770+0000 I JOURNAL [initandlisten] journal dir=data/journal

2015-02-16T23:41:19.770+0000 I JOURNAL [initandlisten] recover : no journal files present, no recovery needed

2015-02-16T23:41:19.787+0000 I JOURNAL [durability] Durability thread started

2015-02-16T23:41:19.788+0000 I JOURNAL [journal writer] Journal writer thread started

2015-02-16T23:41:19.789+0000 I INDEX [initandlisten] allocating new ns file data/local.ns, filling with zeroes...

2015-02-16T23:41:19.819+0000 I STORAGE [FileAllocator] allocating new datafile data/local.0, filling with zeroes...

2015-02-16T23:41:19.819+0000 I STORAGE [FileAllocator] creating directory data/_tmp

2015-02-16T23:41:19.895+0000 I STORAGE [FileAllocator] done allocating datafile data/local.0, size: 64MB, took 0.076 secs

2015-02-16T23:41:19.916+0000 I NETWORK [initandlisten] waiting for connections on port 27017

How to run WT

• mongod now has --storageEngine option

jrangel@rangel:~$ mongod --dbpath data --storageEngine wiredTiger

2015-02-16T23:49:07.342+0000 I CONTROL [initandlisten] MongoDB starting : pid=66958 port=27017 dbpath=data 64-bit host=rangel

2015-02-16T23:49:07.342+0000 I CONTROL [initandlisten] db version v3.0.0-rc8

...

2015-02-16T23:49:07.342+0000 I CONTROL [initandlisten] options: { storage: { dbPath: "data", engine: "wiredTiger" } }

2015-02-16T23:49:07.342+0000 I STORAGE [initandlisten] wiredtiger_open config: create,cache_size=4G,session_max=20000,eviction=(threads_max=4),statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),checkpoint=(wait=60,log_size=2GB),statistics_log=(wait=0),

2015-02-16T23:49:07.386+0000 I NETWORK [initandlisten] waiting for connections on port 27017
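The wiredtiger_open config string in the log is easier to read once split into nested key/value pairs. The parser below is a minimal sketch written for this illustration (the function names are hypothetical); it handles only the comma-separated, parenthesised form seen in the log line above, not the full WiredTiger configuration syntax.

```python
def split_top_level(text):
    """Split on commas that are not inside parentheses."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_config(text):
    """Turn 'a,b=1,c=(d=2)' into {'a': True, 'b': '1', 'c': {'d': '2'}}."""
    result = {}
    for item in split_top_level(text):
        if not item:
            continue  # skip the empty piece after a trailing comma
        key, sep, value = item.partition("=")
        if not sep:
            result[key] = True  # bare flag such as "create"
        elif value.startswith("(") and value.endswith(")"):
            result[key] = parse_config(value[1:-1])
        else:
            result[key] = value
    return result


config = parse_config(
    "create,cache_size=4G,log=(enabled=true,archive=true,path=journal,"
    "compressor=snappy),checkpoint=(wait=60,log_size=2GB),"
)
```

Here `config["checkpoint"]` holds the 60sec / 2GB checkpoint defaults, and `config["log"]["compressor"]` shows that the journal is snappy-compressed too.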

MMAPv1 Database Files

• MMAPv1 persists data in files per database

– Indexes

– Data

WT Database Files

• Each collection and each index is stored in its own file

Playing nice together

• Cannot

– Can't copy database files

– Can't just restart w/ same dbpath

• Yes we can!

– Initial sync from replica set works perfectly!

– mongodump/restore

• Rolling upgrade of replica set to WT:

– Shut down secondary

– Delete dbpath

– Relaunch with --storageEngine=wiredTiger

– Rotate primary
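The rolling upgrade steps can be laid out as an ordered plan: secondaries first, then the primary once it has been rotated out. The sketch below only builds a list of step descriptions and runs nothing; the function names and node names are hypothetical, and "wait for initial sync" and "step down" are one reading of the slide's "initial sync" and "rotate primary" bullets.

```python
def member_steps(member):
    """Steps to move a single replica set member to WiredTiger."""
    return [
        f"shut down {member}",
        f"delete dbpath on {member}",
        f"relaunch {member} with --storageEngine=wiredTiger",
        f"wait for {member} to complete initial sync",
    ]


def rolling_upgrade_plan(members, primary):
    """Upgrade every secondary first, then rotate and upgrade the old primary."""
    if primary not in members:
        raise ValueError("primary must be one of the members")
    plan = []
    for member in members:
        if member != primary:
            plan.extend(member_steps(member))
    plan.append(f"rotate primary: step down {primary}")
    plan.extend(member_steps(primary))
    return plan


plan = rolling_upgrade_plan(["node-a", "node-b", "node-c"], primary="node-a")
```

Doing one member at a time keeps a majority of the set available throughout, which is the point of a rolling upgrade.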

Other WT configuration options

• Compression: --wiredTigerCollectionBlockCompressor

• YAML format for configuration
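As a sketch of what the YAML form might look like, the snippet below renders a nested dictionary as YAML text. storage.dbPath and storage.engine match the options line in the startup log above; storage.wiredTiger.collectionConfig.blockCompressor is assumed to be the config-file counterpart of --wiredTigerCollectionBlockCompressor and should be checked against the documentation. The `to_yaml` helper is hypothetical and handles only nested mappings of plain values.

```python
def to_yaml(mapping, indent=0):
    """Render a nested dict as YAML lines (mappings of scalars only)."""
    lines = []
    pad = "  " * indent
    for key, value in mapping.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(to_yaml(value, indent + 1))
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


config = {
    "storage": {
        "dbPath": "data",
        "engine": "wiredTiger",
        "wiredTiger": {
            # Assumed config-file name for the block compressor option
            "collectionConfig": {"blockCompressor": "zlib"},
        },
    },
}
config_text = "\n".join(to_yaml(config))
```

Saved to a file, the resulting text would be passed to mongod with its --config option.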

Gotchas!

• No 32-bit Support

– WT is 64bit only

• system.indexes & system.namespaces deprecated

– Explicit commands: db.collection.getIndexes(), db.getCollectionNames()

https://tingbudongchine.files.wordpress.com/2012/08/lemonde1.jpeg

Small Demo

Recap

Benefits

Wider Range of Use Cases

How: Flexible Storage Architecture

• Fundamental rearchitecture, with new pluggable storage engine API

• Same data model, same query language, same ops

• But under the hood, many storage engines optimized for many use cases

• Single View

• Content Management

• Real-Time Analytics

• Catalog

• Internet of Things (IoT)

• Messaging

• Log Data

• Tick Data

Up to 95% Lower Operational Overhead

How: MongoDB Ops Manager

• The best way to run MongoDB

• Automates core management tasks

• Single-click provisioning, scaling, upgrades, administration

• Monitoring, with charts, dashboards & alerts on 100+ metrics

• Backup and restore, with point-in-time recovery

7x-10x Performance, 50%-80% Less Storage

How: WiredTiger Storage Engine

• Same data model, same query language, same ops

• Write performance gains driven by document-level concurrency control

• Storage savings driven by native compression

• 100% backwards compatible

• Non-disruptive upgrade

[Chart: MongoDB 3.0 vs. MongoDB 2.6]

Performance

http://www.humanandnatural.com/data/media/178/badan_jaran_desert_oasis_china.jpg

Please go and test it!

3.0.0-RC8: https://www.mongodb.org/downloads

https://jira.mongodb.org/

http://www.tinypm.com/blog/wp-content/uploads/2015/01/hammer.jpg

Questions?

[email protected]

@j0nrang3l

http://www.mandywalker.com.au/wp-content/uploads/2013/07/Wall-with-Tools.jpg

We are hiring!

http://www.mongodb.com/careers/positions/consulting-engineer-emea