Post on 01-Nov-2014
Senior Director of Performance Engineering, MongoDB
Alvin Richards
#MongoDBDays
5 Things you need to know about Administering MongoDB
Or…A Dev and Ops "Love Story"
Before we start…
• I'm assuming you have at least
– Installed MongoDB
– Done some basic admin
• I'm not assuming you have
– Been running in production for any length of time
Dev are from Mars and Ops are from Venus…
DEV OPS
5 Things you need to know
1. Know Thy Beast
2. Protecting data
3. Scaling Writes & Reads
4. Monitoring
5. Backing up & Restoring
#1 Know thy beast
Architecture
• Shipped as Binaries
– tarball, zip file
– Packages (rpm, yum)
– MSI (for Windows)
• Runs a background process mongod (or mongod.exe on Windows)
• Uses Memory Mapped Files
Memory Mapped files – dude whoa
• To MongoDB memory is an infinite-ish pool
– 64 TB on Linux / Windows 2012 R2 / Windows 8.1
– 4 TB on earlier Windows versions
• MongoDB accesses memory locations in virtual memory
• O/S may have to page in that data from storage
• O/S may have to page out data from RAM
• Physical RAM acts as a LRU cache
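The LRU behaviour in the bullets above can be illustrated with a toy model. This is only a sketch and not how the operating system's pager is implemented: the `PageCache` class, the capacity of three pages, and the access pattern are all hypothetical, chosen to show hits, faults, and eviction.

```javascript
// Toy model of physical RAM acting as an LRU cache over mapped pages.
// PageCache and its page IDs are illustrative names, not MongoDB or OS APIs.
class PageCache {
  constructor(capacityPages) {
    this.capacity = capacityPages;
    this.pages = new Map(); // Map keeps insertion order: oldest entry first
    this.hits = 0;
    this.faults = 0;
  }

  // Access a page; returns "hit" if resident, "fault" if it had to be paged in.
  touch(pageId) {
    if (this.pages.has(pageId)) {
      // Re-insert to mark the page as most recently used.
      this.pages.delete(pageId);
      this.pages.set(pageId, true);
      this.hits += 1;
      return "hit";
    }
    this.faults += 1;
    if (this.pages.size >= this.capacity) {
      // Page out the least recently used page to make room.
      const lru = this.pages.keys().next().value;
      this.pages.delete(lru);
    }
    this.pages.set(pageId, true);
    return "fault";
  }
}

const ram = new PageCache(3);
const results = [1, 2, 3, 1, 4, 2].map((page) => ram.touch(page));
console.log(results.join(" "), "| resident:", [...ram.pages.keys()].join(","));
```

Once the set of pages being touched is larger than the cache, pages that were recently evicted are faulted back in, which is the situation the working set discussion is about.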
[Diagram: Collection 1 and Index 1 are mapped into Virtual Address Space 1. This is your virtual memory size (mapped).]
[Diagram: The parts of Collection 1 and Index 1 held in Physical RAM are your resident memory size.]
[Diagram: Virtual Address Space 1 spans Physical RAM and Disk. Physical RAM = 100 ns, Disk = 10,000,000 ns.]
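To see why the gap between 100 ns and 10,000,000 ns matters, here is a small arithmetic sketch. The two latencies are the figures from the slide; the access and page fault counts are made-up inputs, and `averageAccessNs` is a hypothetical helper.

```javascript
// Average access time when some accesses have to be paged in from disk.
// 100 ns (RAM) and 10,000,000 ns (disk) are the figures from the slide;
// the access and fault counts below are invented for illustration.
const RAM_NS = 100;
const DISK_NS = 10000000;

function averageAccessNs(totalAccesses, pageFaults) {
  const servedFromRam = totalAccesses - pageFaults;
  return (servedFromRam * RAM_NS + pageFaults * DISK_NS) / totalAccesses;
}

console.log(averageAccessNs(1000, 0));  // every access served from RAM
console.log(averageAccessNs(1000, 10)); // 1% of accesses go to disk
```

With these inputs, faulting on just 1% of accesses raises the average from 100 ns to roughly 100,000 ns.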
What Memory is used for
• Working Set
– Active Data & Indexes in Memory
• Sorting
• Aggregation & Map/Reduce
• Connections
Working Set Exceeds Physical Memory
Where can I find this info?
> db.serverStatus( { workingSet: 1 } )
…
"workingSet" : {
    "note" : "thisIsAnEstimate",
    "pagesInMemory" : 378,
    "computationTimeMicros" : 4104,
    "overSeconds" : 4646
},
Where can I find this info?
> mongostat
…
insert  query  update  delete  getmore  command  flushes  mapped  vsize  res
    *0     *0      *0      *0        0      1|0        0    160m  2.75g   5m
    *0     *0      *0      *0        0      1|0        0    160m  2.75g   5m
> iostat
> sar
> vmstat
Windows: Perfmon
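The figures in the sample output above can be turned into comparable byte counts. This is a rough sketch under two assumptions that are not stated on the slides: pages are the default 4 KB, and mongostat sizes use binary units. The helper names are hypothetical.

```javascript
// Helpers for reading the memory figures shown above.
// Assumptions: 4 KB pages, and binary units in mongostat (1m = 1024 * 1024 bytes).
const PAGE_SIZE_BYTES = 4096;
const UNIT_BYTES = { k: 1024, m: 1024 ** 2, g: 1024 ** 3 };

// Estimated working set size from the serverStatus workingSet document.
function workingSetBytes(workingSet) {
  return workingSet.pagesInMemory * PAGE_SIZE_BYTES;
}

// Parse mongostat-style sizes such as "160m" or "2.75g" into bytes.
function parseSize(text) {
  const match = /^([\d.]+)([kmg])$/.exec(text.trim().toLowerCase());
  if (!match) throw new Error("unrecognised size: " + text);
  return parseFloat(match[1]) * UNIT_BYTES[match[2]];
}

// Values copied from the sample output above.
const workingSet = { pagesInMemory: 378, overSeconds: 4646 };
const stat = { mapped: "160m", vsize: "2.75g", res: "5m" };

const toMb = (bytes) => (bytes / UNIT_BYTES.m).toFixed(2) + " MB";
console.log("working set estimate:", toMb(workingSetBytes(workingSet)));
console.log("resident:", toMb(parseSize(stat.res)), "of mapped:", toMb(parseSize(stat.mapped)));
```

The two samples were not necessarily taken from the same server at the same moment, so treat the comparison as a way of reading the numbers rather than a diagnosis.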
#2 Protecting data
Deployment - Single Node
Good
• Write ahead journaling for crash protection
Bad
• Cannot survive loss of one node
Primary – A
Oops
http://i.ytimg.com/vi/PzWK9QCYgH8/hqdefault.jpg
Replica Set – Initialize
Replica Set – Failure
Replica Set – Failover
Replica Set – Recovery
Replica Set – Recovered
Oops… Multiple Failures
http://f1-grandprix.com/wp-content/uploads/wppa/565.jpg
Deployment - Single Data Center
Good
• Automated failover
• Tolerates server failures
• Tolerates rack failures
• Number of replicas defines failure tolerance
Bad
• Cannot survive loss of Data Center
Primary – A
Secondary – B
Secondary – C
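"Number of replicas defines failure tolerance" comes down to majority arithmetic. The sketch below assumes every member votes; `majority` and `faultTolerance` are hypothetical helper names.

```javascript
// Majority needed to elect a primary, and how many members can be lost
// while a majority is still reachable. Assumes every member has a vote.
function majority(members) {
  return Math.floor(members / 2) + 1;
}

function faultTolerance(members) {
  return members - majority(members);
}

for (const n of [1, 2, 3, 4, 5, 7]) {
  console.log(n + " members: majority " + majority(n) + ", tolerates " + faultTolerance(n) + " failure(s)");
}
```

Note that going from 3 to 4 members does not add any tolerance, which is one reason odd member counts are common.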
Deployment - Active/Active Data Center
Good
• Tolerates server, rack, data center failures, network partitions
Bad
• Complexity but "MMS Automation to the rescue" ™
Data Center - US
US-A
US-B
Data Center - Europe
EU-C
EU-D
Data Center - Asia
ASIA-E
Useful commands
> rs.status()
{
    "set" : "foo",
    "date" : ISODate("2014-09-08T01:29:58Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "vero.local:31000",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 28,
            …
Useful commands
> rs.printReplicationInfo()
configured oplog size: 192MB
log length start to end: 0secs (0hrs)
oplog first event time: Sun Sep 07 2014 18:17:52 GMT-0700 (PDT)
oplog last event time: Sun Sep 07 2014 18:29:52 GMT-0700 (PDT)
now: Sun Sep 07 2014 18:30:49 GMT-0700 (PDT)
Application Considerations
Dev Ops
Tunable Data Durability
Memory
Journal Secondary
Other Data Center
RDBMS
w=1
j=true
w="majority"
w=n
w="myTag"
Latency
Durability
async
sync
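The latency versus durability trade-off can be sketched as a check on how far a write has spread before it is acknowledged. This is a simplified model, not driver or server code: `isSatisfied` and the `progress` object are hypothetical, and tag-based concerns such as w="myTag" are left out because they depend on the replica set's tag configuration.

```javascript
// Decide whether a write concern is satisfied, given how far the write has spread.
// isSatisfied and the shape of `progress` are hypothetical names for this sketch.
function isSatisfied(concern, progress) {
  const needed = concern.w === "majority"
    ? Math.floor(progress.members / 2) + 1
    : concern.w;
  if (progress.appliedOn < needed) return false;               // not on enough members yet
  if (concern.j && !progress.journaledOnPrimary) return false; // not yet in the journal
  return true;
}

// A write that has reached only the primary's memory in a 3-member set.
const progress = { members: 3, appliedOn: 1, journaledOnPrimary: false };
console.log(isSatisfied({ w: 1 }, progress));          // memory on the primary is enough
console.log(isSatisfied({ w: 1, j: true }, progress)); // must wait for the journal
console.log(isSatisfied({ w: "majority" }, progress)); // must wait for a secondary
```

The stricter the concern, the longer the application waits, which is the Latency and Durability axis on the slide.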
#3 Scaling Writes & Reads
#3.1 Scaling Writes
Sharding - Range distribution
shard01 shard02 shard03
sh.shardCollection("test.tweets", {user: 1} , false)
Sharding - Range distribution
shard01 shard02 shard03
a-i j-r s-z
Shard Key Options
• Single or Compound Key
• Regular or Hashed key
• Tagged
Sharding – Auto Splits
shard01 shard02 shard03
a-i ja-jz s-z
k-r
Sharding – Auto Splits
shard01 shard02 shard03
a-i ja-ji s-z
ji-js
js-jw
jz-r
Sharding - Auto Balancing
shard01 shard02 shard03
a-i ja-ji s-z
ji-js
js-jw
jz-r
js-jw
jz-r
Sharding - Goal Equilibrium
shard01 shard02 shard03
a-i ja-ji s-z
ji-js
js-jw
jz-r
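The move from the unbalanced layout to equilibrium can be sketched with a naive balancing loop. This is not the real balancer: the threshold of 2 chunks, the `balance` function, and the choice of which shard receives each chunk are assumptions made for illustration.

```javascript
// Naive balancer: move one chunk at a time from the fullest shard to the
// emptiest until they differ by fewer than `threshold` chunks.
// The threshold of 2 is an assumption for this sketch.
function balance(chunkCounts, threshold = 2) {
  if (threshold < 2) throw new Error("threshold must be at least 2");
  const counts = { ...chunkCounts };
  const migrations = [];
  for (;;) {
    const shards = Object.keys(counts).sort((a, b) => counts[a] - counts[b]);
    const emptiest = shards[0];
    const fullest = shards[shards.length - 1];
    if (counts[fullest] - counts[emptiest] < threshold) break;
    counts[fullest] -= 1;
    counts[emptiest] += 1;
    migrations.push(fullest + " -> " + emptiest);
  }
  return { counts, migrations };
}

// Chunk counts from the slides after the splits: 1, 4 and 1.
const balanced = balance({ shard01: 1, shard02: 4, shard03: 1 });
console.log(balanced.counts, balanced.migrations);
```

With the counts from the slides, two migrations off shard02 leave every shard with two chunks.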
Sharding - Find by Shard Key
shard01 shard02 shard03
a-i ja-ji s-z
ji-js
js-jw
jz-r
find({user: "alvin"})
Sharding - Find by Attribute
shard01 shard02 shard03
a-i ja-ji s-z
ji-js
js-jw
jz-r
find({email: "alvin@10gen.com"})
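The difference between the two finds above is whether the router can pick a single shard. The sketch below uses the simpler a-i / j-r / s-z split rather than the post-split chunks, and the `ranges` table and `targetShards` function are hand-written stand-ins for cluster metadata and routing, not MongoDB APIs.

```javascript
// Decide which shards a query must visit. Only a predicate on the shard key
// ("user") lets the router target one shard; anything else goes to all of them.
const SHARD_KEY = "user";
const ranges = [
  { min: "a", max: "j", shard: "shard01" }, // a-i
  { min: "j", max: "s", shard: "shard02" }, // j-r
  { min: "s", max: "{", shard: "shard03" }, // s-z ("{" sorts just after "z")
];

function targetShards(query) {
  const value = query[SHARD_KEY];
  if (typeof value !== "string") {
    return ranges.map((r) => r.shard); // scatter-gather across every shard
  }
  const hit = ranges.find((r) => value >= r.min && value < r.max);
  return hit ? [hit.shard] : [];
}

console.log(targetShards({ user: "alvin" }));
console.log(targetShards({ email: "alvin@10gen.com" }));
```

The first query touches one shard; the second has to ask all three and merge the results.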
Application Considerations
Dev Ops
Shard Key
• What key to use?
• Plain or Hashed?
• What's the distribution & usage?
• Predicate & Sorts
– Use the Shard key?
– Scatter Gather + Sort?
Useful commands
> sh.status()
...
  shards:
    { "_id" : "shard0000", "host" : "localhost:29001", "tags" : [ "dc-emea" ] }
    { "_id" : "shard0001", "host" : "localhost:29002", "tags" : [ "dc-asia" ] }
    { "_id" : "shard0002", "host" : "localhost:29003", "tags" : [ "dc-na" ] }
  databases:
    { "_id" : "scaleout", "partitioned" : true, "primary" : "shard0000" }
      scaleout.blogs
        shard key: { "loc" : 1 }
        chunks:
          shard0000 1
        { "loc" : { "$minKey" : 1 } } -->> { "loc" : { "$maxKey" : 1 } } on : shard0000 Timestamp(1, 0)
Useful commands
// Commands to start/stop the balancer
> sh.startBalancer()
> sh.stopBalancer()
// Disable balancing for a specific collection
> sh.disableBalancing("students.grades")
// Run the balancer during a specific time window (settings live in the config database)
> use config
> db.settings.update({ _id : "balancer" },
    { $set : { activeWindow : { start : "<start-time>", stop : "<stop-time>" } } }, true )
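As a way of reasoning about the balancer time window, the sketch below checks whether a clock time falls inside a window given as "HH:MM" strings. It is not the balancer's own logic: `inActiveWindow` is a hypothetical helper, the overnight window is an invented example, and treating the stop time as exclusive is an assumption.

```javascript
// Check whether a clock time falls inside a balancer-style active window.
// Times are "HH:MM" strings; windows that cross midnight (start later than
// stop) are handled explicitly. The stop time is treated as exclusive.
function toMinutes(hhmm) {
  const [hours, minutes] = hhmm.split(":").map(Number);
  return hours * 60 + minutes;
}

function inActiveWindow(now, activeWindow) {
  const t = toMinutes(now);
  const start = toMinutes(activeWindow.start);
  const stop = toMinutes(activeWindow.stop);
  return start <= stop
    ? t >= start && t < stop  // window within one day
    : t >= start || t < stop; // window wraps past midnight
}

// Hypothetical overnight window.
const overnight = { start: "23:00", stop: "06:00" };
console.log(inActiveWindow("02:30", overnight));
console.log(inActiveWindow("12:00", overnight));
```

An overnight window like this keeps chunk migrations away from daytime traffic.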
#3.2 Scaling Reads
Immediate vs. Eventual Consistency
• Do you want to read your own writes?
– Use Sharding to Scale Reads
• Are you tolerant of reading older versions of data?
– May read from Secondaries
Application Considerations
Dev Ops
Primary
Thread #1
Insert
Update
Read
Read
v1
✔
✔
v2
Immediate Consistency
Eventual Consistency
Primary Secondary
Thread #1
Insert
Read
v1
Thread #2
v1
✖
v1 does not exist
reads v1
Update
Read
v2
v2
✖
Eventual Consistency with a Write Concern
Primary Secondary
Thread #1
Insert (w=2)
Read
v1
Thread #2
v1 reads v1
✔
✖
v1 does not exist
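The consistency diagrams can be replayed with a toy primary and secondary pair. This is an illustrative model only: `ToyReplicaSet` is a hypothetical class, replication is a manual step so the lag is visible, and w=2 is modelled as "replicate before returning".

```javascript
// Toy primary/secondary pair. Replication is a manual step so the lag is visible.
// ToyReplicaSet is an illustrative class, not a driver API.
class ToyReplicaSet {
  constructor() {
    this.primary = new Map();
    this.secondary = new Map();
    this.pending = []; // writes not yet applied on the secondary
  }

  // Apply all outstanding writes to the secondary, in order.
  replicate() {
    for (const [key, value] of this.pending) this.secondary.set(key, value);
    this.pending = [];
  }

  // w=1 returns once the primary has the write; w=2 waits for the secondary too.
  write(key, value, w = 1) {
    this.primary.set(key, value);
    this.pending.push([key, value]);
    if (w >= 2) this.replicate();
  }

  read(key, from = "primary") {
    return this[from].get(key);
  }
}

const replicaSet = new ToyReplicaSet();
replicaSet.write("doc", "v1");                       // w=1
console.log(replicaSet.read("doc", "primary"));      // the writer reads its own write
console.log(replicaSet.read("doc", "secondary"));    // not replicated yet
replicaSet.write("doc", "v2", 2);                    // w=2
console.log(replicaSet.read("doc", "secondary"));    // replicated before the write returned
```

Reads from the primary always see the latest write; reads from the secondary only do once replication has caught up or the write concern has forced it to.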
#4 Monitoring
Pick your tool du jour
• Plugins for
– Cacti, Zabbix, Munin, NewRelic…
• MongoDB Monitoring (MMS)
– mms.mongodb.com
– Focused on MongoDB cluster well-being
MMS
KPIs you care about
• MongoDB
– MongoDB ops/sec
– Read/Write queues
– Index Misses
– Locked %
– Flush times
– Replication Lag
KPIs you care about
• Storage
– IOPS
– Size
– Data & Loading Patterns
• Memory
– Working Set
• CPU
– Speed
– Cores
• Network
– Latency
– Throughput
Example
MongoDB ops/sec
Page Faults / Sec
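Charts such as ops/sec and page faults/sec are usually derived from cumulative counters sampled at intervals. The sketch below shows that arithmetic; the sample objects and field names are invented for illustration and are not the output of any MongoDB command.

```javascript
// Turn two samples of cumulative counters into per-second rates.
// The samples below are made up; the field names are illustrative only.
function ratesPerSecond(previous, current) {
  const seconds = (current.atMs - previous.atMs) / 1000;
  if (seconds <= 0) throw new Error("samples must be in time order");
  const rates = {};
  for (const name of Object.keys(current.counters)) {
    rates[name] = (current.counters[name] - previous.counters[name]) / seconds;
  }
  return rates;
}

const before = { atMs: 0, counters: { ops: 12000, pageFaults: 300 } };
const after = { atMs: 10000, counters: { ops: 17000, pageFaults: 450 } };
console.log(ratesPerSecond(before, after));
```

Plotting the two rates side by side is what makes a pattern such as "page faults climb while ops/sec drops" visible.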
#5 Backup & Restore
Options
• mongodump / mongorestore
• File system snapshot / rsync
• MMS Backup
– Automated Cluster Backup / Restore
– Snapshots
– Point in Time recovery
– OpLog based
MMS Backup & Restore
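The idea behind oplog-based point in time recovery can be sketched as "start from a snapshot, then replay logged operations up to the chosen moment". This is a conceptual model and not how MMS Backup is implemented: `restoreTo` is a hypothetical function and the entry format is heavily simplified compared with a real oplog.

```javascript
// Point-in-time restore: start from a snapshot and replay oplog entries whose
// timestamp is after the snapshot and at or before the target.
// The entry format is simplified and hypothetical.
function restoreTo(snapshot, oplog, targetTs) {
  const data = new Map(snapshot.docs);
  for (const entry of oplog) {
    if (entry.ts <= snapshot.ts || entry.ts > targetTs) continue;
    if (entry.op === "delete") data.delete(entry.id);
    else data.set(entry.id, entry.doc); // insert or update
  }
  return data;
}

const snapshot = { ts: 100, docs: [["a", { n: 1 }]] };
const oplog = [
  { ts: 110, op: "insert", id: "b", doc: { n: 2 } },
  { ts: 120, op: "update", id: "a", doc: { n: 9 } },
  { ts: 130, op: "delete", id: "b" },
];

// Restore to just before the delete at ts 130.
const restored = restoreTo(snapshot, oplog, 125);
console.log([...restored.entries()]);
```

Choosing a target just before an unwanted operation, here the delete, is the typical use of point in time recovery.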
Other talks today
• 10.25 – Running MongoDB with confidence with MMS
• 12.45 – MongoDB Sharding
• 1.05 – Securing Your MongoDB Deployment
• 4.10 – Diagnostics and Debugging
Get MongoDB Certified
• 20% off our October exam with the code
• MongoDBSeattle20
• TODAY ONLY
• university.mongodb.com
A happy ever after?
Dev Ops
alvin@mongodb.com / @jonnyeight
Thank You