MongoDB 3.4: What's New?!
For Community Server and Percona Server for MongoDB
Agenda
● Who am I
● What are the types of MongoDB software in the ecosystem
● What did we get up to 3.2?
● Things new or improved in 3.4
● Understanding MongoDB Enterprise Server vs. Percona Server for MongoDB
● Questions
Who is David Murphy
MongoDB Practice Manager @ Percona
Past key roles
● Electronic Arts NoSQL / MySQL Architect
● ObjectRocket / Rackspace MongoDB Lead / Architect
MySQL since 3.22 (yes that was a very long time ago)
Mongo Master Alumni and Contributor
Using MongoDB since 1.6
Software in the Ecosystem
What server software exists?
• MongoDB Community Server
• MongoDB Enterprise Advanced Server
• Percona Server for MongoDB
• TokuMX (End of Life)
What Features We Built up to 3.2?
Recap on what was in MongoDB’s ecosystem prior to 3.4’s release
Replica Set elections
A history:
• Before 3.0
• New Protocol in 3.2
• Common Challenges
Document validation
• Able to validate documents during updates and insertions
• Specify on per-collection basis using the validator option, which takes a document that specifies the validation rules or expressions
• Can use any query expressions except $near, $nearSphere, $text and $where
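The validator option described above can be sketched as a plain command document. This is a minimal sketch, not a definitive setup: the collection name `contacts` and its fields are hypothetical, and the helper `operators_used` is only an illustration that checks the rules against the excluded operators listed on the slide. Sending the command to a server would need a driver, which is not shown here.

```python
import json

# Hypothetical collection and field names, for illustration only.
create_contacts = {
    "create": "contacts",
    "validator": {
        "$or": [
            {"phone": {"$type": "string"}},
            {"email": {"$regex": "@example\\.com$"}},
        ]
    },
    "validationLevel": "strict",   # apply the rules to all inserts and updates
    "validationAction": "error",   # reject documents that fail validation
}

# Operators the slide lists as not allowed inside a validator.
FORBIDDEN = {"$near", "$nearSphere", "$text", "$where"}


def operators_used(expr):
    """Collect every $-prefixed key that appears anywhere in the expression."""
    found = set()
    if isinstance(expr, dict):
        for key, value in expr.items():
            if key.startswith("$"):
                found.add(key)
            found |= operators_used(value)
    elif isinstance(expr, list):
        for item in expr:
            found |= operators_used(item)
    return found


illegal = operators_used(create_contacts["validator"]) & FORBIDDEN
print(json.dumps(create_contacts, indent=2))
print("forbidden operators used:", sorted(illegal))
```

With `validationAction` set to `"warn"` instead, failing documents would be logged rather than rejected.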
Sharding to this point
What sharding gives you:
• Parallel queries
• Easily scaling the number of nodes
• Automatic balancing for changes in load and data distribution*
• Zero single points of failure
• Automatic failover and recovery*
Partial indexes
Indexes on all data matching a filter
• Index can be much smaller
• Drives you to need more indexes, which can hurt performance
• Allows very compact state-type indexes that only incur index updates when a document's fields match the filter
• The size of the index entries to scan is reduced
• Challenge: can only have one index per query pattern
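As a rough illustration of a partial index, the sketch below builds a `createIndexes` command document with a `partialFilterExpression`. The collection `orders`, the fields, and the index name are assumptions made up for this example, and `in_partial_index` is a hypothetical equality-only helper that mimics which documents would get an index entry.

```python
import json

# Hypothetical collection, fields, and index name.
create_partial_index = {
    "createIndexes": "orders",
    "indexes": [
        {
            "key": {"customer_id": 1},
            "name": "open_orders_by_customer",
            # Only documents matching this filter get an index entry.
            "partialFilterExpression": {"status": "open"},
        }
    ],
}


def in_partial_index(doc, filter_expr):
    """Equality-only check of whether a document matches the index filter."""
    return all(doc.get(field) == value for field, value in filter_expr.items())


filter_expr = create_partial_index["indexes"][0]["partialFilterExpression"]
orders = [
    {"_id": 1, "customer_id": 7, "status": "open"},
    {"_id": 2, "customer_id": 7, "status": "shipped"},
    {"_id": 3, "customer_id": 9, "status": "open"},
]
indexed_ids = [o["_id"] for o in orders if in_partial_index(o, filter_expr)]

print(json.dumps(create_partial_index, indent=2))
print("documents with an index entry:", indexed_ids)
```

In this sketch a query would need to include `status: "open"` in its predicate for the index to be usable, which is where the one-index-per-query-pattern challenge comes from.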
Left outer join ($lookup)
Joins in MongoDB?
• Arrays could always be unwound, but the data needed to be stored in a single document
• Pipeline still has the 16MB limit, unless using $out
• Can join whole documents together, reducing main document size
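To make the left outer join behaviour concrete, here is a minimal sketch with two parts: a `$lookup` stage as a plain document, and a small pure-Python emulation of what it returns. The collections `orders` and `inventory` and their fields are hypothetical, and `left_outer_lookup` is an illustrative helper, not how the server implements the stage.

```python
import json

# Hypothetical collections and fields.
lookup_stage = {
    "$lookup": {
        "from": "inventory",
        "localField": "item",
        "foreignField": "sku",
        "as": "inventory_docs",
    }
}
aggregate_cmd = {"aggregate": "orders", "pipeline": [lookup_stage], "cursor": {}}


def left_outer_lookup(local_docs, foreign_docs, local_field, foreign_field, as_field):
    """Attach all matching foreign documents as an array; keep unmatched locals."""
    joined = []
    for doc in local_docs:
        matches = [f for f in foreign_docs if f.get(foreign_field) == doc.get(local_field)]
        joined.append({**doc, as_field: matches})
    return joined


orders = [{"_id": 1, "item": "almonds"}, {"_id": 2, "item": "pecans"}]
inventory = [{"_id": 10, "sku": "almonds", "instock": 120}]

result = left_outer_lookup(orders, inventory, "item", "sku", "inventory_docs")
print(json.dumps(aggregate_cmd, indent=2))
print(json.dumps(result, indent=2))
```

The order for `pecans` is kept with an empty `inventory_docs` array, which is what makes the join a left outer join.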
Mixing storage engines in Replica Sets
Different Workloads and Different needs:
• Sometimes you need fast reads
• Other times you need a balance
• Maybe you just need storage but not persistence
• Any node of a replica-set can be any single engine
New Things in 3.4
Building on 3.2, what was added or improved?
Improved initial sync
• Before 3.4: slow on large nodes with more than 200 GB of data
• Even worse with many or complex indexes
• Network issues would cause it to start over
• In 3.4: applies indexes and then the data, and uses multiple streams to copy the data
Collation - new feature!
Common reasons:
• Not all languages order characters the same way
• Sometimes with product sets, special ordering is wanted as well
• I want to spread my users out so big clients don’t share resources
• No specific need, but want to be future-ready
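The ordering point above can be sketched as follows. The `find` command document carries a collation with a locale and a strength; the collection `products` and the field `name` are hypothetical. The Python sort at the end is only a rough stand-in for language-aware comparison (it lowercases strings, it does not apply real locale rules), included to show how the result order can differ from plain binary order.

```python
import json

# Hypothetical collection and field; strength 2 ignores case differences.
find_products = {
    "find": "products",
    "filter": {},
    "sort": {"name": 1},
    "collation": {"locale": "en", "strength": 2},
}

names = ["banana", "Cherry", "apple"]

# Binary (code point) order: uppercase letters sort before lowercase ones.
binary_order = sorted(names)

# Rough stand-in for a case-insensitive collation, NOT real locale rules.
folded_order = sorted(names, key=str.lower)

print(json.dumps(find_products, indent=2))
print("binary order:", binary_order)
print("case-insensitive order:", folded_order)
```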
Shard/ReplSet tagging is now Zones
Common Reasons:
• Tags worked both with replica set and sharding
• With a replica set you could set some nodes to be better hardware for a use case than others
• With sharding based on shard key ranges you could make data stay in the EU or US for example
• In 3.4, these have been renamed to the simpler name of zones (but it’s the same effect)
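A minimal sketch of the EU/US example, written as the zone command documents introduced in 3.4 (`addShardToZone` and `updateZoneKeyRange`). The shard names, the namespace `app.users`, and the numeric shard key `region_code` are all assumptions for illustration. The helper `zone_for` is hypothetical and only shows how a shard key value maps to a zone, given that each range has an inclusive lower bound and an exclusive upper bound.

```python
import json

# Hypothetical shard names, namespace, and shard key.
add_shard_commands = [
    {"addShardToZone": "shard-eu-1", "zone": "EU"},
    {"addShardToZone": "shard-us-1", "zone": "US"},
]

# Lower bound is inclusive, upper bound is exclusive.
zone_range_commands = [
    {"updateZoneKeyRange": "app.users",
     "min": {"region_code": 0}, "max": {"region_code": 100}, "zone": "EU"},
    {"updateZoneKeyRange": "app.users",
     "min": {"region_code": 100}, "max": {"region_code": 200}, "zone": "US"},
]


def zone_for(region_code):
    """Return the zone whose range contains this shard key value, if any."""
    for cmd in zone_range_commands:
        if cmd["min"]["region_code"] <= region_code < cmd["max"]["region_code"]:
            return cmd["zone"]
    return None


for cmd in add_shard_commands + zone_range_commands:
    print(json.dumps(cmd))
print("region_code 42 ->", zone_for(42))
print("region_code 100 ->", zone_for(100))
```

Values outside every range, such as `zone_for(250)` here, belong to no zone and can live on any shard.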
Views - new feature!
● Views are the ability to store and run aggregations
• Appear as a collection
• Can apply security to the virtual collection vs raw data
• Not materialized, but still useful, use $out for materialized
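As a sketch of a view as a stored aggregation, the command document below defines a view over a source collection. The names `users` and `active_users` and the fields are hypothetical. The second command shows the materialized alternative mentioned above: running a similar pipeline with `$out` writes the results to a real collection instead.

```python
import json

# Hypothetical pipeline: keep active users and hide a sensitive field.
pipeline = [
    {"$match": {"active": True}},
    {"$project": {"password_hash": 0}},
]

# The view stores the pipeline and is queried like a collection.
create_view = {"create": "active_users", "viewOn": "users", "pipeline": pipeline}

# Materialized alternative: write the results into a collection with $out.
materialize = {
    "aggregate": "users",
    "pipeline": pipeline + [{"$out": "active_users_snapshot"}],
    "cursor": {},
}

print(json.dumps(create_view, indent=2))
print(json.dumps(materialize, indent=2))
```

Because the view appears as its own collection, read access could be granted on `active_users` without granting it on the raw `users` data.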
Aggregation - recursive $lookup
Logical extensions of $lookup ($graphLookup)
• Performs a recursive search on a collection with options for restricting the search by
- recursion depth
- query filter
• The from (source) collection cannot be sharded
• 100MB limit, ignores allowDiskUse
• No specific need but want to be future ready
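The recursive search can be sketched in two parts: a `$graphLookup` stage as a plain document, using the depth and filter restrictions named above, and a small pure-Python emulation of the traversal. The `employees` collection and its fields are hypothetical, and `graph_lookup` is an illustrative helper that ignores the query filter; it is not the server's implementation.

```python
import json

# Hypothetical collection and fields.
graph_lookup_stage = {
    "$graphLookup": {
        "from": "employees",
        "startWith": "$reports_to",
        "connectFromField": "reports_to",
        "connectToField": "name",
        "as": "management_chain",
        "maxDepth": 3,                               # restrict recursion depth
        "restrictSearchWithMatch": {"active": True}, # restrict by query filter
    }
}


def graph_lookup(docs, start_value, connect_from, connect_to, max_depth=None):
    """Follow connect_from -> connect_to links breadth first, up to max_depth."""
    results, seen_ids = [], set()
    frontier, depth = {start_value}, 0
    while frontier and (max_depth is None or depth <= max_depth):
        matched = [d for d in docs
                   if d.get(connect_to) in frontier and d["_id"] not in seen_ids]
        seen_ids.update(d["_id"] for d in matched)
        results.extend(matched)
        frontier = {d[connect_from] for d in matched if d.get(connect_from) is not None}
        depth += 1
    return results


employees = [
    {"_id": 1, "name": "Dev", "reports_to": None},
    {"_id": 2, "name": "Eliot", "reports_to": "Dev"},
    {"_id": 3, "name": "Andrew", "reports_to": "Eliot"},
]

# Management chain for Andrew, starting from his reports_to value.
chain = [d["name"] for d in graph_lookup(employees, "Eliot", "reports_to", "name")]
direct = [d["name"] for d in graph_lookup(employees, "Eliot", "reports_to", "name", max_depth=0)]

print(json.dumps(graph_lookup_stage, indent=2))
print("full chain:", chain)
print("maxDepth 0:", direct)
```

A depth of 0 in this sketch means a single lookup with no recursion, so only the direct manager is returned.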
Aggregation - faceted search
• Processes multiple aggregations inside the same stage
- Results are stored as one array per aggregation, with each result document being an element
- Practically, this means you could search by buckets of product ages, colors, and other such things, with each characterization being a new subdocument in the result, vs. multiple aggregations the application needs to pull together
• Restrictions: sub-pipelines cannot include
- $facet (nested)
- $out
- $geoNear
- $indexStats
- $collStats
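A minimal sketch of the product example: one `$facet` stage with a sub-pipeline per characterization. The collection `products` and the fields `color` and `age_days` are hypothetical. The helper `restricted_stages` is also hypothetical and simply checks each sub-pipeline against the restricted stages listed above.

```python
import json

# Hypothetical fields; each key becomes one array in the single result document.
facet_stage = {
    "$facet": {
        "by_color": [{"$sortByCount": "$color"}],
        "by_age": [
            {"$bucket": {
                "groupBy": "$age_days",
                "boundaries": [0, 30, 90, 365],
                "default": "older",
                "output": {"count": {"$sum": 1}},
            }}
        ],
    }
}
aggregate_cmd = {"aggregate": "products", "pipeline": [facet_stage], "cursor": {}}

# Stages the slide lists as not allowed inside a $facet sub-pipeline.
RESTRICTED = {"$facet", "$out", "$geoNear", "$indexStats", "$collStats"}


def restricted_stages(facet):
    """Map each facet name to the restricted stages its sub-pipeline uses."""
    problems = {}
    for name, sub_pipeline in facet["$facet"].items():
        used = {key for stage in sub_pipeline for key in stage}
        if used & RESTRICTED:
            problems[name] = sorted(used & RESTRICTED)
    return problems


print(json.dumps(aggregate_cmd, indent=2))
print("restricted stages found:", restricted_stages(facet_stage))
```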
Decimal type added
Adds a real decimal type for precision and maths; hopefully no more need be said.
You need a new driver to use this in your apps.
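For anyone who does want more said, the sketch below shows the kind of rounding problem a real decimal type avoids. It uses Python's standard `decimal` module as a stand-in; it does not touch MongoDB, and how a given driver exposes the new type is not shown here.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly.
float_total = 0.1 + 0.2
float_matches = float_total == 0.3

# Decimal arithmetic keeps the exact base-10 values.
decimal_total = Decimal("0.1") + Decimal("0.2")
decimal_matches = decimal_total == Decimal("0.3")

print("float:  ", float_total, float_matches)
print("decimal:", decimal_total, decimal_matches)
```

This matters most for money and other values where every digit has to be exact.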
General improvements
● Parallel balancing
● Read consistency
● Intra-cluster network compression
● Log redaction
MongoReplay - new tool
Replaces the need to use Facebook’s “Flashback” tool (written by Parse)
• Required root access to do tcpdumps of incoming traffic
- Means it must be in front of ALL mongos nodes if sharded, but also that SSL is not allowed
• Very useful for:
- Trying sharding with a new key
- Using a new index
- Checking for performance issues with a new version
- Verifying bug fixes with a new version
MongoDB Enterprise Advanced vs. Percona Server for MongoDB
How do they differ, and are they compatible?
MongoDB Enterprise Advanced Server
Features (not in Community Server or open source)
● LDAP, External Auth
● Audit Trail
● Encrypted WiredTiger Engine
● In Memory Engine
● BI Connector
● Compass
● Ops Manager / Atlas
Percona Server for MongoDB
Features (open source)
● LDAP, External Auth
● Audit Trail
● Hot binary backups for WiredTiger and MongoRocks
● MongoRocks Engine (LSM)
● In Memory Engine
● Percona Monitoring and Management (more metrics than Ops Manager)
● Improved profiling
● Percona Toolkit for MongoDB
○ pt-mongodb-summary
○ pt-mongodb-query-digest
Questions???
● Twitter - dmurphy_data
● Email - [email protected]
● GitHub - dbmurphy
Percona is looking for rockstars! Be sure to talk to Jennifer @ Percona’s booth
Do you have any areas or benchmarks you want Percona to talk about in our Performance Blog? Email me and let me know.
Rate My Session