MongoDB 3.0 and WiredTiger (Event: An Evening with MongoDB Dallas 3/10/15)
What’s New in MongoDB 3.0
Jake Angerman Sr. Solutions Architect, MongoDB
Agenda
• Pluggable Storage Engines
• WiredTiger Storage Engine
  – Document-Level Concurrency Control
  – Compression
  – Installation & Upgrade
• Other New Stuff in 3.0
• Public Service Announcement
• There will be a test at the end
Pluggable Storage Engines
How does MongoDB persist data?
• MongoDB <= 2.6
  – MMAPv1 Storage Engine
  – Uses Memory Mapped Files
• MongoDB 3.0
  – MMAPv1
    • still the default
    • now with collection-level locking!
  – WiredTiger
Storage Engine
[Diagram: example future state of the storage engine architecture]
Applications: Content Repo, IoT Sensor Backend, Ad Service, Customer Analytics, Archive
Layers: MongoDB Query Language (MQL) + Native Drivers; MongoDB Document Data Model; Management; Security
Storage engines: MMAP V1, WT, In-Memory, ?, ?
Legend: Supported in MongoDB 3.0; Future Possible Storage Engines; Experimental
Storage Engine API
• Lets you plug in different storage engines
  – Different working sets require different performance characteristics
  – MMAPv1 is not ideal for all workloads
  – More flexibility: you can mix storage engines on the same replica set/sharded cluster
• Opportunity to integrate further (HDFS, native encrypted, hardware optimized …)
WiredTiger
History
• Authors are former members of the Berkeley DB team
  – WT product and team acquired by MongoDB
  – Standalone engine already in use in large deployments, including Amazon
Why Is WiredTiger Awesome?
• Document-level concurrency
• Compression
• Consistency without journaling
• Better performance on certain workloads
  – write heavy
• Vertically scalable
  – Allows full hardware utilization
  – More tunable
Document-Level Concurrency
• Uses algorithms to minimize contention between threads
  – One thread yields on write contention to the same document
  – Atomic update replaces latching/locking
• Writes no longer block all other writers
• CPU utilization directly correlates with performance
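The idea that one writer yields and retries on contention for the same document, rather than waiting on a coarse lock, can be illustrated with a toy optimistic-concurrency store. This is a sketch for intuition only and not WiredTiger's implementation: the `VersionedStore` class, `WriteConflict`, and `run_writers` are hypothetical names, and a small internal lock stands in for the atomic compare-and-swap a real engine would use.

```python
import threading


class WriteConflict(Exception):
    """Raised when another writer changed the document first."""


class VersionedStore:
    """Toy document store with a version per document (hypothetical)."""

    def __init__(self):
        self._docs = {}               # doc_id -> (version, value)
        self._cas = threading.Lock()  # stand-in for an atomic compare-and-swap

    def read(self, doc_id):
        return self._docs.get(doc_id, (0, None))

    def compare_and_swap(self, doc_id, expected_version, new_value):
        # Only the version check and the swap are serialized, not the update logic.
        with self._cas:
            version, _ = self._docs.get(doc_id, (0, None))
            if version != expected_version:
                raise WriteConflict(doc_id)
            self._docs[doc_id] = (version + 1, new_value)

    def update(self, doc_id, fn):
        # Optimistic loop: on conflict this writer yields and tries again.
        while True:
            version, value = self.read(doc_id)
            try:
                self.compare_and_swap(doc_id, version, fn(value))
                return
            except WriteConflict:
                continue


def increment(value):
    return (value or 0) + 1


def run_writers(store, doc_id, writers=4, updates_each=250):
    """Run several threads that all update the same document."""
    def work():
        for _ in range(updates_each):
            store.update(doc_id, increment)

    threads = [threading.Thread(target=work) for _ in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store.read(doc_id)
```

Calling `run_writers(VersionedStore(), "doc-1")` returns the final `(version, value)` pair; no update is lost even though writers never hold a lock while computing the new value. Writers touching different documents never conflict with each other in this model.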
50%-80% Less Storage via Compression
• Better storage utilization
• Higher I/O scalability
• Multiple compression options
  – Snappy (default): good compression benefits with little CPU/performance impact
  – zlib: extremely good compression at a cost of additional CPU/degraded performance
  – None
• Data and journal compressed on disk
• Indexes compressed on disk and in memory
• No more cryptic field names in documents!
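The point about field names can be illustrated with Python's standard `zlib` module. This is only a rough sketch under several assumptions: it compresses JSON text rather than BSON, it compresses one large buffer rather than WiredTiger's on-disk blocks, and it uses zlib because Snappy is not in the standard library. The sample documents and field names are invented for illustration.

```python
import json
import zlib


def make_docs(n, field_names):
    """Build n sensor-style documents using the given three field names."""
    device, temperature, timestamp = field_names
    return [
        {device: f"sensor-{i % 10}", temperature: 20 + (i % 7), timestamp: 1425945600 + i}
        for i in range(n)
    ]


def sizes(docs):
    """Return (raw_bytes, zlib_bytes) for the JSON encoding of docs."""
    raw = json.dumps(docs).encode("utf-8")
    return len(raw), len(zlib.compress(raw))


# Hypothetical data: the same values stored under verbose and cryptic field names.
verbose = make_docs(1000, ("deviceIdentifier", "temperatureCelsius", "recordedAtEpochSeconds"))
cryptic = make_docs(1000, ("d", "t", "ts"))

for label, docs in (("verbose", verbose), ("cryptic", cryptic)):
    raw, packed = sizes(docs)
    print(f"{label}: raw={raw} zlib={packed} saved={1 - packed / raw:.0%}")
```

Repeated field names are highly compressible, so the size gap between the verbose and cryptic variants shrinks sharply once the data is compressed. Actual savings in a deployment depend on the data and the compressor chosen.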
WiredTiger Internals
Filesystem Layout
• Data stored as conventional B+ tree on disk
• Each collection and index stored in its own file
• WT fails to start if MMAPv1 files are found in dbpath
• No in-place updates
  – Rewrites document every time, reuses space
  – No more padding factor!
• Journal has its own folder under dbpath
• You can now store indexes on separate volumes!
Cache
• WT uses two caches
  – WiredTiger cache stores uncompressed data
    • ideally, working set fits in WT cache
  – File system cache stores compressed data
  – By default, the WT cache size is the larger of 50% of system memory or 1GB
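The default cache sizing rule stated above can be written out as a small calculation. This is a sketch of the rule as given on the slide; the function name is hypothetical, and treating "1GB" as 1 GiB (1024^3 bytes) is an assumption.

```python
GIB = 1024 ** 3  # assumption: "1GB" on the slide is treated as 1 GiB


def default_wt_cache_bytes(system_memory_bytes):
    """Larger of half of system memory or 1 GB, per the rule on the slide."""
    return max(system_memory_bytes // 2, 1 * GIB)


for ram_gib in (1, 4, 64):
    cache = default_wt_cache_bytes(ram_gib * GIB)
    print(f"{ram_gib} GiB RAM -> {cache / GIB:.1f} GiB WT cache")
```

On small machines the 1GB floor dominates, which is worth keeping in mind when the working set is expected to fit in the WT cache.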
Supported Platforms
• Supported Platforms
  – Linux
  – Windows
  – Mac OSX
• Non-Supported Platforms
  – NO Solaris (yet)
  – NO 32Bit (ever)
Gotchas
• MMAPv1-specific catalog metadata is deprecated
  – system.indexes & system.namespaces
  – System metadata should be accessed via explicit commands going forward
    db.collection.getIndexes()
    db.getCollectionNames()
• Cold start penalty
  – due to separate WiredTiger cache
How to Run WiredTiger
How Do I Install It?
• If starting from scratch, add one additional flag when launching mongod:
  --storageEngine=wiredTiger
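As a sketch of what that launch might look like when scripted, the helper below only builds and prints the command line; it does not start a server. The function name and the `/data/wt` path are hypothetical. The optional `--wiredTigerCollectionBlockCompressor` flag is included on the assumption that it matches your mongod build; check `mongod --help` before relying on it.

```python
import shlex


def mongod_command(dbpath, storage_engine="wiredTiger", block_compressor=None):
    """Build a mongod argument list for a fresh, empty dbpath (illustrative helper)."""
    argv = ["mongod", "--dbpath", dbpath, "--storageEngine", storage_engine]
    if block_compressor is not None:
        # Assumed values per the compression slide: snappy, zlib, or none.
        argv += ["--wiredTigerCollectionBlockCompressor", block_compressor]
    return argv


# Hypothetical path; the directory must not contain MMAPv1 data files.
print(shlex.join(mongod_command("/data/wt")))
print(shlex.join(mongod_command("/data/wt", block_compressor="zlib")))
```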
How Do I Upgrade to It?
• 2 ways:
  1. mongodump/mongorestore
  2. Initial sync a new replica member running WT
• Note: you can run replicas with mixed storage engines
• CANNOT copy raw data files!
  – WT will fail to start if wrong data format in dbpath
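The dump-and-restore route can be sketched as an ordered list of commands. The helper below only assembles the steps and prints them; nothing is executed. The function name, ports, and paths are hypothetical, and the sketch assumes the old and new mongod instances run side by side on one host, with the new instance pointing at an empty dbpath.

```python
import shlex


def dump_restore_plan(old_port, new_port, new_dbpath, dump_dir):
    """Return the commands, in order, for moving data into a WiredTiger mongod."""
    return [
        # 1. Export everything from the existing (MMAPv1) instance.
        ["mongodump", "--port", str(old_port), "--out", dump_dir],
        # 2. Start a new instance on an EMPTY dbpath with WiredTiger.
        ["mongod", "--port", str(new_port), "--dbpath", new_dbpath,
         "--storageEngine", "wiredTiger"],
        # 3. Load the dump into the new instance.
        ["mongorestore", "--port", str(new_port), dump_dir],
    ]


for step in dump_restore_plan(27017, 27018, "/data/wt", "/backup/dump"):
    print(shlex.join(step))
```

The new dbpath is kept separate from the old one because raw MMAPv1 files cannot be reused, as noted above.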
Other New Stuff in 3.0
Native Auditing for Any Operation
• Essential for many compliance standards (e.g., PCI DSS, HIPAA, NIST 800-53, European Union Data Protection Directive)
• MongoDB Native Auditing
  – Construct and filter audit trails for any operation against the database, whether DML, DCL or DDL
  – Can filter by user or action
  – Audit log can be written to multiple destinations
50 Node Replica Sets
Enhanced Query Language and Tools
• All Tools rewritten in Go
  – Smaller Package Size
  – More rapid iteration
  – Faster Loading and Export
• Easier Query Optimization
  – Explain 2.0
• Improved Logging System
  – Faster Debugging
• Aggregation Framework Improvements
• Geospatial Index Improvements
MMS & Ops Manager 1.6
The Best Way to Manage MongoDB
Up to 95% Reduction in Operational Overhead
• Single-click provisioning, scaling & upgrades, admin tasks
• Monitoring, with charts, dashboards and alerts on 100+ metrics
• Backup and restore, with point-in-time recovery, support for sharded clusters
A Public Service Announcement
Please Upgrade to the Latest Version
• 2.4.14
• 2.6.9
25% off discount code: JakeAngerman
Questions?