Deployment Preparedness
Transcript of Deployment Preparedness
Alvin Richards, Technical Director, 10gen
#MongoDBTokyo
Deployment Preparedness
http://bit.ly/QlJULZ
Plan A because there is no Plan B
Part One
Before you deploy…
Prototype
Test
Monitor
Capacity Planning
Ops Playbook
Reinventing the wheel
Essentials
• Disable NUMA
• Pick an appropriate file system (xfs, ext4)
• Pick a 64-bit O/S
– Recent Linux kernel, Win2k8R2
• More RAM
– Spend on RAM, not cores
• Faster disks
– SSDs vs. SAN
– Separate journal and data files
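The NUMA and kernel settings above can be sketched for a typical Linux host as follows; the paths and log location are illustrative, not from the talk:

```shell
# Interleave memory allocation across NUMA nodes instead of pinning
# mongod's pages to one node
numactl --interleave=all mongod --dbpath /data/db --fork --logpath /var/log/mongod.log

# Discourage the kernel from aggressively reclaiming pages per NUMA zone
echo 0 > /proc/sys/vm/zone_reclaim_mode
```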
Key things to consider
• Profiling
– Baseline/blueprint: understand what should happen
– Ensure good index usage
• Monitoring
– SNMP, munin, Zabbix, cacti, nagios
– MongoDB Monitoring Service (MMS)
• Sizing
– Understand capability (RAM, IOPs)
– Understand use cases + schema
What is your SLA?
• High availability?
– 24x7x365 operation?
– Limited maintenance window?
• Data protection?
– Failure of a single node?
– Failure of a data center?
• Disaster recovery?
– Manual or automatic failover?
– Data center, region, continent?
Build & Test your Playbook
• Backups
• Restores (backups are not enough)
• Upgrades
• Replica Set Operations
• Sharding Operations
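The backup and restore entries in the playbook can be rehearsed with the stock tools; a sketch, with hostnames, replica set name, and paths all illustrative:

```shell
# Back up one database to a dated directory
mongodump --host rs0/db1.example.com:27017 --db blogs --out /backups/$(date +%F)

# Regularly prove the backup restores cleanly on a scratch host --
# backups alone are not enough
mongorestore --host restore-test:27017 --db blogs /backups/2012-12-03/blogs
```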
Part Two
Under the cover…
How to see metrics
• mongostat
• MongoDB plug-ins for
– munin, Zabbix, cacti, ganglia
• Hosted services
– MMS (10gen)
– Server Density, Cloudkick
• Profiling
Operation Counters
Metrics in detail: opcounters
• Counts: Insert, Update, Delete, Query, Commands
• Operation counters are mostly straightforward: more is better
• Some operations are accounted differently on a replica set primary than on a secondary
• getLastError(), serverStatus, etc. are also counted (as commands)
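One way to watch these counters outside of mongostat is to sample serverStatus() twice in the mongo shell and diff the results; a sketch that assumes a live mongod:

```javascript
// opcounters values are cumulative, so the rate is the difference
// between two samples
var a = db.serverStatus().opcounters;
sleep(1000); // mongo shell helper, milliseconds
var b = db.serverStatus().opcounters;
print("inserts/sec: " + (b.insert - a.insert));
print("queries/sec: " + (b.query - a.query));
print("updates/sec: " + (b.update - a.update));
```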
Resident Memory counter
Metrics in detail: resident memory
• Key metric: to a very high degree, the performance of a mongod is a measure of how much data fits in RAM.
• If this quantity is stably lower than available physical memory, the mongod is likely performing well.
• Correlated metrics: page faults, B-Tree misses
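The resident figure is visible directly in serverStatus(); a sketch assuming a live mongod (values are reported in MB):

```javascript
// If mem.resident stays stably below physical RAM (and below
// mem.mapped), the working set likely fits in memory
var mem = db.serverStatus().mem;
print("resident MB: " + mem.resident);
print("mapped MB:   " + mem.mapped);
print("virtual MB:  " + mem.virtual);
```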
Page Faults counter
[Diagram: a page in the virtual address space resolves either to physical RAM (~100 ns) or to the on-disk collection and index files (~10,000,000 ns).]
Metrics in detail: page faults
• This measures reads or writes to data-file pages that aren't resident in memory
• If this is persistently non-zero, your data doesn't fit in memory.
• Correlated metrics: resident memory, B-Tree misses, iostats
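On Linux, the cumulative fault count is also exposed through serverStatus(); a sketch assuming a live mongod:

```javascript
// extra_info.page_faults is cumulative since startup, so watch its
// rate of change rather than its absolute value
var s = db.serverStatus();
print("page faults since startup: " + s.extra_info.page_faults);
```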
Working Set
> db.blogs.stats()
{
  "ns" : "test.blogs",
  "count" : 1338330,
  "size" : 46915928,                  // size of data
  "avgObjSize" : 35.05557523181876,   // average document size
  "storageSize" : 86092032,           // size on disk (and in memory!)
  "numExtents" : 12,
  "nindexes" : 2,
  "lastExtentSize" : 20872960,
  "paddingFactor" : 1,
  "flags" : 0,
  "totalIndexSize" : 99860480,        // size of all indexes
  "indexSizes" : {                    // size of each index
    "_id_" : 55877632,
    "name_1" : 43982848
  },
  "ok" : 1
}
Lock % counter
Metrics in detail: lock percentage and queues
• By itself, lock % can be misleading: a high lock percentage just means that writing is happening.
• But when lock % is high and queued readers or writers are non-zero, the mongod is probably at its write capacity.
• Correlated metrics: iostats
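The queue depths sit next to the lock statistics in serverStatus(); a sketch assuming a live mongod:

```javascript
// Non-zero queues here, combined with a high lock percentage,
// suggest the node is saturated on writes
var gl = db.serverStatus().globalLock;
print("queued readers: " + gl.currentQueue.readers);
print("queued writers: " + gl.currentQueue.writers);
print("active clients: " + gl.activeClients.total);
```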
Mon Dec 3 15:05:37 [conn81] getmore scaleout.nodes query: { ts: { $lte: new Date(1354547123142) } } cursorid:8607875337747748011 ntoreturn:0 keyUpdates:0 numYields:216 locks(micros) r:615830 nreturned:27055 reslen:4194349 551ms
Log file
// explain() shows the plan used by the operation
> db.c.find(<query>).explain()
// hint() forces a query to use a specific index
// x_1 is the name of the index from db.c.getIndexes()
> db.c.find( {x:1} ).hint("x_1")
explain, hint
B-Tree Counter
Metrics in detail: B-Tree
• Indicates b-tree accesses including page fault service during an index lookup
• If misses are persistently non-zero, your indexes don't fit in RAM. (You might need to change or drop indexes, or shard your data.)
• Correlated metrics: resident memory, page faults, iostats
B-Trees' strengths
• B-Tree indexes are designed for range queries over a single dimension
• Think of a compound index on { A, B } as being an index on the concatenation of the A and B values in documents
• MongoDB can use its indexes for sorting as well
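A sketch of the compound-index point, with a hypothetical collection and field names (ensureIndex is the 2.x spelling; later releases call it createIndex):

```javascript
// Index on the concatenation of userId and ts values
db.events.ensureIndex({ userId: 1, ts: 1 });

// Equality on the leading field plus a range and sort on the second
// can be satisfied entirely from the { userId: 1, ts: 1 } index
var lastWeek = new Date(Date.now() - 7 * 24 * 3600 * 1000);
db.events.find({ userId: 42, ts: { $gte: lastWeek } }).sort({ ts: 1 });
```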
B-Trees' weaknesses
• Range queries on the first field of a compound index are suboptimal
• Range queries over multiple dimensions are suboptimal
• In both these cases a suboptimal index may still beat no index, but it's best to see whether you can restructure the query or schema to avoid the problem
Indexing dark corners
• Some functionality can't currently always use indexes:
– $where JavaScript clauses
– $mod, $not, $ne
– regex
• A negation may be transformed into a range query, in which case an index can be used
• Complicated regular expressions scan a whole index
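The negation-to-range point can be checked with explain(); a sketch with a hypothetical collection and field, assuming a live mongod:

```javascript
// Compare the plan explain() reports for a negation against the
// equivalent explicit ranges over the indexed field
db.users.ensureIndex({ age: 1 });
printjson(db.users.find({ age: { $ne: 30 } }).explain());
printjson(db.users.find({ $or: [ { age: { $lt: 30 } },
                                 { age: { $gt: 30 } } ] }).explain());
```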
Other tricks
> db.c.find( {unused_key: 1} ).explain()
> db.c.find( {unused_key: 1} ).hint( {random_index: 1} ).explain()
# cat /data/db/* > /dev/null

// New in 2.2
> db.runCommand( { touch: "blogs", data: true, index: true } )
Warming the Cache
Journal on another disk
• The journal's write load is very different from the data files'
– journal = append-only
– data files = randomly accessed
• Putting the journal on a separate disk or RAID (e.g., with a symlink) will minimize any seek-time-related journaling overhead
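The symlink approach can be sketched as follows; mount point and paths are illustrative, and mongod should be stopped first:

```shell
# Move the journal to a dedicated spindle or SSD, then leave a
# symlink where mongod expects to find it
mv /data/db/journal /journal-disk/journal
ln -s /journal-disk/journal /data/db/journal
```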
--directoryperdb
• Allows storage tiering
– Different access patterns
– Different disk types / speeds
• Use --directoryperdb
• Add a symlink into the database directory
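A sketch of the tiering trick, with an illustrative database name and mount points (stop mongod before moving files):

```shell
# With --directoryperdb, each database gets its own subdirectory
# under the dbpath, so one database can be relocated independently
mongod --dbpath /data/db --directoryperdb

# Move a hot database to faster storage and symlink it back
mv /data/db/hotdb /ssd/hotdb
ln -s /ssd/hotdb /data/db/hotdb
```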
// Change logging level to get more info
> db.adminCommand({ setParameter: 1, logLevel: 1 })
> db.adminCommand({ setParameter: 1, logLevel: 0 })
Dynamically change log level
http://bit.ly/QlJULZ
Because you now have a Plan B