Deployment Preparedness

Alvin Richards, Technical Director, 10gen (#MongoDBTokyo)

Transcript of Deployment Preparedness

Page 1: Deployment Preparedness

Technical Director, 10gen

Alvin Richards

#MongoDBTokyo

Deployment Preparedness

Page 2: Deployment Preparedness

http://bit.ly/QlJULZ

Plan A because there is no Plan B

Page 3: Deployment Preparedness

Part One

Before you deploy…

Page 4: Deployment Preparedness

Prototype

Test

Monitor

Capacity Planning

Ops Playbook

Reinventing the wheel

Page 5: Deployment Preparedness

Essentials

• Disable NUMA

• Pick appropriate file-system (xfs, ext4)

• Pick 64-bit O/S

– Recent Linux kernel, Win2k8R2

• More RAM

– Spend on RAM not Cores

• Faster Disks

– SSDs vs. SAN

– Separate Journal and Data Files

Page 6: Deployment Preparedness

Key things to consider

• Profiling

– Baseline/Blueprint: Understand what should happen

– Ensure good Index usage

• Monitoring

– SNMP, Munin, Zabbix, Cacti, Nagios

– MongoDB Monitoring Service (MMS)

• Sizing

– Understand Capability (RAM, IOPS)

– Understand Use Cases + Schema

Page 7: Deployment Preparedness

What is your SLA?

• High Availability?

– 24x7x365 operation?

– Limited maintenance window?

• Data Protection?

– Failure of a Single Node?

– Failure of a Data Center?

• Disaster Recovery?

– Manual or automatic failover?

– Data Center, Region, Continent?

Page 8: Deployment Preparedness

Build & Test your Playbook

• Backups

• Restores (backups are not enough)

• Upgrades

• Replica Set Operations

• Sharding Operations

Page 9: Deployment Preparedness

Part Two

Under the cover…

Page 10: Deployment Preparedness

How to see metrics

• mongostat

• MongoDB plug-ins for

– Munin, Zabbix, Cacti, Ganglia

• Hosted Services

– MMS - 10gen

– Server Density, Cloudkick

• Profiling

Page 11: Deployment Preparedness

Operation Counters

Page 12: Deployment Preparedness

Metrics in detail: opcounters

• Counts: Insert, Update, Delete, Query, Commands

• Operation counters are mostly straightforward: more is better

• Some operations in a replica set primary are accounted differently in a secondary

• getLastError(), system.status, etc. are also counted
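
Raw opcounters only ever increase, so they are easier to read as rates. The following is a minimal sketch of that conversion, assuming two snapshots shaped like the `opcounters` sub-document of `db.serverStatus()`. The helper name `opcounterRates` and the sample numbers are hypothetical and used only for illustration.

```javascript
// Hypothetical helper: turn two opcounters snapshots into per-second rates.
// `before` and `after` mimic the opcounters sub-document of serverStatus.
function opcounterRates(before, after, elapsedSeconds) {
  if (elapsedSeconds <= 0) {
    throw new Error("elapsedSeconds must be positive");
  }
  const rates = {};
  for (const op of Object.keys(after)) {
    const delta = after[op] - (before[op] || 0);
    // A negative delta means the counters were reset (for example, a restart).
    rates[op] = delta < 0 ? null : delta / elapsedSeconds;
  }
  return rates;
}

// Made-up sample values, taken 60 seconds apart.
const sampleBefore = { insert: 1000, query: 5000, update: 200, delete: 10, getmore: 50, command: 800 };
const sampleAfter = { insert: 1600, query: 8000, update: 260, delete: 10, getmore: 80, command: 1100 };

const rates = opcounterRates(sampleBefore, sampleAfter, 60);
console.log(rates);
```

In a mongo shell, the two snapshots could come from `db.serverStatus().opcounters` read at two points in time. Because commands such as getLastError() are counted too, the command rate usually runs higher than the application's own command traffic.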

Page 13: Deployment Preparedness

Resident Memory counter

Page 14: Deployment Preparedness

Metrics in detail: resident memory

• Key metric: to a very high degree, the performance of a mongod is a measure of how much data fits in RAM.

• If this quantity is stably lower than available physical memory, the mongod is likely performing well.

• Correlated metrics: page faults, B-Tree misses

Page 15: Deployment Preparedness

Page Faults counter

Page 16: Deployment Preparedness

[Figure: Collection 1 and Index 1 are stored on disk and mapped through Virtual Address Space 1 into physical RAM. Access from RAM = 100 ns; access from disk = 10,000,000 ns.]

Page 17: Deployment Preparedness

Metrics in detail: page faults

• This measures reads or writes to pages of the data files that aren't resident in memory

• If this is persistently non-zero, your data doesn't fit in memory.

• Correlated metrics: resident memory, B-Tree misses, iostats
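
To see why even a small page fault rate matters, the two latencies from the memory diagram (100 ns for RAM, 10,000,000 ns for disk) can be combined into an average access time. This is a rough sketch under a simplifying assumption: every access costs exactly one of those two figures. The function name and the 1-in-100 fault rate are made up for illustration.

```javascript
const RAM_ACCESS_NS = 100;        // RAM latency from the slide
const DISK_ACCESS_NS = 10000000;  // disk latency from the slide

// Average cost per access when `faults` out of `accesses` have to go to disk.
function averageAccessNs(faults, accesses) {
  if (accesses <= 0 || faults < 0 || faults > accesses) {
    throw new RangeError("expected 0 <= faults <= accesses and accesses > 0");
  }
  const hits = accesses - faults;
  return (hits * RAM_ACCESS_NS + faults * DISK_ACCESS_NS) / accesses;
}

const allInRam = averageAccessNs(0, 100);     // 100 ns
const onePercent = averageAccessNs(1, 100);   // 100099 ns
console.log(allInRam, onePercent, onePercent / allInRam);
```

Under this model, faulting on just 1 access in 100 makes the average access roughly 1,000 times slower than the all-in-RAM case.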

Page 18: Deployment Preparedness

Working Set

> db.blogs.stats()
{
    "ns" : "test.blogs",
    "count" : 1338330,
    "size" : 46915928,                  // Size of data
    "avgObjSize" : 35.05557523181876,   // Average document size
    "storageSize" : 86092032,           // Size on disk (and in memory!)
    "numExtents" : 12,
    "nindexes" : 2,
    "lastExtentSize" : 20872960,
    "paddingFactor" : 1,
    "flags" : 0,
    "totalIndexSize" : 99860480,        // Size of all indexes
    "indexSizes" : {                    // Size of each index
        "_id_" : 55877632,
        "name_1" : 43982848
    },
    "ok" : 1
}
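
The stats fields above can be added up for a first-pass sizing check. The sketch below assumes the worst case, where the whole collection and all of its indexes are hot. The real working set is usually a subset, so treat the result as an upper bound. The helper names and the 4 GiB RAM figure are hypothetical; the byte counts are copied from the db.blogs.stats() output.

```javascript
// Byte counts copied from the db.blogs.stats() output on this slide.
const blogsStats = { storageSize: 86092032, totalIndexSize: 99860480 };

// Upper bound on the memory a collection could need: data on disk plus all indexes.
function footprintBytes(stats) {
  return stats.storageSize + stats.totalIndexSize;
}

// Sum the footprint of several collections and compare it with available RAM.
function fitsInRam(statsList, ramBytes) {
  const totalBytes = statsList.reduce((sum, s) => sum + footprintBytes(s), 0);
  return { totalBytes: totalBytes, fits: totalBytes <= ramBytes };
}

const ASSUMED_RAM_BYTES = 4 * 1024 * 1024 * 1024; // hypothetical 4 GiB server
const sizing = fitsInRam([blogsStats], ASSUMED_RAM_BYTES);
console.log(sizing); // { totalBytes: 185952512, fits: true }
```

In practice the list would hold one stats document per collection, and the RAM figure should leave headroom for the operating system and connections.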

Page 19: Deployment Preparedness

Lock % counter

Page 20: Deployment Preparedness

Metrics in detail: lock percentage and queues

• By itself, lock % can be misleading: a high lock percentage just means that writing is happening.

• But when lock % is high and queued readers or writers are non-zero, the mongod is probably at its write capacity.

• Correlated metrics: iostats
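
The rule of thumb above combines two signals, which a small check can make explicit. This is only a sketch: the field names `lockPercent`, `queuedReaders` and `queuedWriters` are hypothetical (the values might be read off mongostat's lock and qr|qw columns), and the 50% threshold is an assumption, since the slides do not say what counts as "high".

```javascript
// Hypothetical threshold; the slides do not give a number for "high".
const HIGH_LOCK_PERCENT = 50;

// True only when lock % is high AND operations are waiting in the queue.
function looksWriteBound(sample) {
  const queued = sample.queuedReaders + sample.queuedWriters;
  return sample.lockPercent >= HIGH_LOCK_PERCENT && queued > 0;
}

// Made-up samples for illustration.
const busyButKeepingUp = { lockPercent: 80, queuedReaders: 0, queuedWriters: 0 };
const saturated = { lockPercent: 80, queuedReaders: 3, queuedWriters: 5 };

console.log(looksWriteBound(busyButKeepingUp)); // false
console.log(looksWriteBound(saturated));        // true
```

A single sample proves little, so a real alert would look for the condition holding across several consecutive samples and then cross-check iostats.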

Page 21: Deployment Preparedness

Mon Dec 3 15:05:37 [conn81] getmore scaleout.nodes query: { ts: { $lte: new Date(1354547123142) } } cursorid:8607875337747748011 ntoreturn:0 keyUpdates:0 numYields: 216 locks(micros) r:615830 nreturned:27055 reslen:4194349 551ms

Log file
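
Slow-operation lines like the one above are easier to sort and aggregate once the key fields are pulled out. The following is a minimal sketch, assuming the line layout shown on this slide. The function name `parseSlowOp` is hypothetical, and other MongoDB versions may format log lines differently.

```javascript
// The log line from this slide, as a single string.
const logLine =
  "Mon Dec 3 15:05:37 [conn81] getmore scaleout.nodes " +
  "query: { ts: { $lte: new Date(1354547123142) } } " +
  "cursorid:8607875337747748011 ntoreturn:0 keyUpdates:0 numYields: 216 " +
  "locks(micros) r:615830 nreturned:27055 reslen:4194349 551ms";

// Extract connection, operation, namespace, a few counters and the duration.
function parseSlowOp(line) {
  const m = line.match(/\[(conn\d+)\]\s+(\w+)\s+(\S+)\s.*?(\d+)ms\s*$/);
  if (!m) {
    return null; // not a line in the expected layout
  }
  // Look up a "name:value" or "name: value" numeric field anywhere in the line.
  const num = (name) => {
    const f = line.match(new RegExp("\\b" + name + ":\\s*(\\d+)"));
    return f ? Number(f[1]) : null;
  };
  return {
    connection: m[1],
    op: m[2],
    namespace: m[3],
    numYields: num("numYields"),
    nreturned: num("nreturned"),
    reslen: num("reslen"),
    millis: Number(m[4]),
  };
}

const parsed = parseSlowOp(logLine);
console.log(parsed);
```

Here the getmore returned 27055 documents (about 4 MB) in 551 ms and yielded 216 times. The cursor id is deliberately not parsed as a number because it exceeds JavaScript's safe integer range.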

Page 22: Deployment Preparedness

// explain() shows the plan used by the operation

> db.c.find(<query>).explain()

// hint() forces a query to use a specific index

// x_1 is the name of the index from db.c.getIndexes()

> db.c.find( {x:1} ).hint("x_1")

explain, hint

Page 23: Deployment Preparedness

B-Tree Counter

Page 24: Deployment Preparedness

Metrics in detail: B-Tree

• Indicates B-Tree accesses, including page faults serviced during an index lookup

• If misses are persistently non-zero, your indexes don't fit in RAM. (You might need to change or drop indexes, or shard your data.)

• Correlated metrics: resident memory, page faults, iostats

Page 25: Deployment Preparedness

B-Trees' strengths

• B-Tree indexes are designed for range queries over a single dimension

• Think of a compound index on { A, B } as being an index on the concatenation of the A and B values in documents

• MongoDB can use its indexes for sorting as well

Page 26: Deployment Preparedness

B-Trees' weaknesses

• Range queries on the first field of a compound index are suboptimal

• Range queries over multiple dimensions are suboptimal

• In both of these cases, a suboptimal index might be better than nothing, but it is best to see whether you can change the problem
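
The "concatenation" view of a compound index explains both the strength and the weakness. The sketch below models an index on { a, b } as a sorted array of [a, b] keys and measures how wide a contiguous scan must be to cover every match. It is a simplified model, not how MongoDB actually traverses a B-Tree, and the documents and helper names are made up for illustration.

```javascript
// Order compound keys [a, b] as if a and b were concatenated: by a, then by b.
function compareKeys(x, y) {
  return x[0] - y[0] || x[1] - y[1];
}

// A sorted list of keys, standing in for a B-Tree index on { a: 1, b: 1 }.
function buildIndex(docs) {
  return docs.map((d) => [d.a, d.b]).sort(compareKeys);
}

// Count matching keys and the width of the contiguous span that covers them.
function scanCost(index, matches) {
  const hits = [];
  index.forEach((key, i) => {
    if (matches(key)) hits.push(i);
  });
  const scanned = hits.length === 0 ? 0 : hits[hits.length - 1] - hits[0] + 1;
  return { matched: hits.length, scanned: scanned };
}

// Made-up documents: every combination of a in 1..3 and b in 1..3.
const docs = [];
for (let a = 1; a <= 3; a++) {
  for (let b = 1; b <= 3; b++) docs.push({ a: a, b: b });
}

const index = buildIndex(docs);
// Equality on the first field, range on the second: matches are adjacent.
const equalityThenRange = scanCost(index, ([a, b]) => a === 2 && b >= 1 && b <= 2);
// Range on the first field, equality on the second: matches are scattered.
const rangeThenEquality = scanCost(index, ([a, b]) => a >= 1 && a <= 3 && b === 2);

console.log(equalityThenRange); // { matched: 2, scanned: 2 }
console.log(rangeThenEquality); // { matched: 3, scanned: 7 }
```

With equality on the first field, the scan touches only the keys it returns. With a range on the first field, it walks 7 keys to return 3, and the gap widens as the range grows.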

Page 27: Deployment Preparedness

Indexing dark corners

• Some functionality can't currently always use indexes:

– $where JavaScript clauses

– $mod, $not, $ne

– regex

• Negation may be transformed into a range query

– Index can be used

• Complicated regular expressions scan a whole index

Page 28: Deployment Preparedness

Other tricks

Page 29: Deployment Preparedness

> db.c.find( {unused_key: 1} ).explain()

> db.c.find( {unused_key: 1} ).hint( {random_index:1} ).explain()

# cat /data/db/* > /dev/null

// New in 2.2

> db.runCommand( { touch: "blogs", data: true, index: true } )

Warming the Cache

Page 30: Deployment Preparedness

Journal on another disk

• The journal's write load is very different from that of the data files

– journal = append-only

– data files = randomly accessed

• Putting the journal on a separate disk or RAID (e.g., with a symlink) will minimize any seek-time related journaling overhead

Page 31: Deployment Preparedness

--directoryperdb

• Allows storage tiering

– Different access patterns

– Different Disk Types / Speeds

• Use --directoryperdb

• Add a symlink into the database directory

Page 32: Deployment Preparedness

// Change logging level to get more info

> db.adminCommand({ setParameter: 1, logLevel: 1 })

> db.adminCommand({ setParameter: 1, logLevel: 0 })

Dynamically change log level

Page 33: Deployment Preparedness

http://bit.ly/QlJULZ

Because you now have a Plan B