Ops Jumpstart: MongoDB Administration 101

Senior Director of Performance Engineering, MongoDB

Alvin Richards

#MongoDBDays

5 Things you need to know about Administering MongoDB

Or…A Dev and Ops "Love Story"

Before we start…

• I'm assuming you have at least
– Installed MongoDB
– Done some basic admin

• I'm not assuming you have
– Been running in production for any length of time

Dev are from Mars and Ops are from Venus…

DEV OPS

5 Things you need to know

1. Know Thy Beast

2. Protecting data

3. Scaling Writes & Reads

4. Monitoring

5. Backing up & Restoring

#1 Know thy beast

Architecture

• Shipped as Binaries

– tarball, zip file
– Packages (rpm, yum)
– MSI (for Windows)

• Runs a background process mongod (or mongod.exe)

• Uses Memory Mapped Files

Memory Mapped files – dude whoa

• To MongoDB memory is an infinite-ish pool

– 64 TB Linux / Windows 2012 R2 / Windows 8.1
– 4 TB earlier Windows versions

• MongoDB accesses memory locations in virtual memory

• O/S may have to page in that data from storage

• O/S may have to page out data from RAM

• Physical RAM acts as an LRU cache

Diagram: Collection 1 and Index 1 are mapped into Virtual Address Space 1. This is your virtual memory size (mapped).

Diagram: the parts of Collection 1 and Index 1 currently held in Physical RAM. This is your resident memory size.

Diagram: Collection 1 and Index 1 on Disk, with Physical RAM in front. Access times shown: 100 ns (RAM) versus 10,000,000 ns (disk).
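The gap between 100 ns and 10,000,000 ns is why the working set should fit in RAM. A minimal sketch of the arithmetic follows; the `hitRatio` parameter and the function name are illustrative assumptions, not something MongoDB reports directly.

```javascript
// Latencies taken from the slide: a RAM access vs. a page served from disk.
const RAM_NS = 100;
const DISK_NS = 10000000;

// Average access time when `hitRatio` (0..1) of accesses are served from RAM.
// hitRatio is a hypothetical input used only for this illustration.
function averageAccessNs(hitRatio) {
  if (hitRatio < 0 || hitRatio > 1) {
    throw new RangeError("hitRatio must be between 0 and 1");
  }
  return hitRatio * RAM_NS + (1 - hitRatio) * DISK_NS;
}

for (const hitRatio of [1, 0.99, 0.9]) {
  console.log(hitRatio, Math.round(averageAccessNs(hitRatio)), "ns");
}
```

Under these numbers, even a 1% miss rate makes the average access roughly a thousand times slower than a pure RAM hit.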

What Memory is used for

• Working Set
– Active Data & Indexes in Memory

• Sorting

• Aggregation & Map/Reduce

• Connections

Working Set Exceeds Physical Memory

> db.serverStatus( { workingSet: 1 } )
"workingSet" : {
    "note" : "thisIsAnEstimate",
    "pagesInMemory" : 378,
    "computationTimeMicros" : 4104,
    "overSeconds" : 4646
}

Where can I find this info?

> mongostat
insert  query  update  delete  getmore  command  flushes  mapped  vsize  res
    *0     *0      *0      *0        0      1|0        0    160m  2.75g   5m

> iostat

> vmstat

Windows: Perfmon

#2 Protecting data

Deployment - Single Node

• Write ahead journaling for crash protection

• Cannot survive loss of one node

Primary – A

http://i.ytimg.com/vi/PzWK9QCYgH8/hqdefault.jpg

Replica Set – Initialize

Replica Set – Failure

Replica Set – Failover

Replica Set – Recovery

Replica Set – Recovered

Oops… Multiple Failures

http://f1-grandprix.com/wp-content/uploads/wppa/565.jpg

Deployment - Single Data Center

• Automated failover

• Tolerates server failures

• Tolerates rack failures

• Number of replicas defines failure tolerance

• Cannot survive loss of Data Center
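"Number of replicas defines failure tolerance" comes down to majority arithmetic: a primary can only be elected while a majority of voting members can see each other. A minimal sketch, assuming every member has one vote (the function names are illustrative):

```javascript
// Votes needed to elect a primary in a replica set with `members` voting nodes.
function majority(members) {
  return Math.floor(members / 2) + 1;
}

// Number of members that can fail while a primary can still be elected.
function faultTolerance(members) {
  return members - majority(members);
}

for (const n of [1, 2, 3, 4, 5]) {
  console.log(`${n} members: majority ${majority(n)}, tolerates ${faultTolerance(n)} failure(s)`);
}
```

Note that going from 3 to 4 members does not add tolerance, which is why odd member counts are the usual choice.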

Primary – A

Secondary – B

Secondary – C

Deployment - Active/Active Data Center

• Tolerates server, rack, data center failures, network partitions

• Complexity but "MMS Automation to the rescue" ™

Data Center - US

Data Center - Asia

Data Center - Europe

ASIA-E

Useful commands

> rs.status()
{
    "set" : "foo",
    "date" : ISODate("2014-09-08T01:29:58Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "vero.local:31000",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 28,
            ...

> rs.printReplicationInfo()
configured oplog size: 192MB
log length start to end: 0secs (0hrs)
oplog first event time: Sun Sep 07 2014 18:17:52 GMT-0700 (PDT)
oplog last event time: Sun Sep 07 2014 18:29:52 GMT-0700 (PDT)
now: Sun Sep 07 2014 18:30:49 GMT-0700 (PDT)

Application Considerations

Dev Ops

Tunable Data Durability

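Where a write can be confirmed, from lowest latency to highest durability: Memory, Journal, Secondary, Other Data Center

Write concern options: j=true, w="majority", w=n, w="myTag"

Trade-off: Latency vs. Durability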
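One way to read the durability options is as a lookup from "how far must the write get" to a write concern document. The sketch below is a hypothetical helper, not a MongoDB API; the pairing of each level with an option is an assumption based on the slide labels, and "myTag" must already be defined as a custom write concern mode in the replica set configuration.

```javascript
// Hypothetical mapping from a durability target to a write concern document.
const WRITE_CONCERNS = {
  memory: { w: 1 },                 // acknowledged by the primary only
  journal: { w: 1, j: true },       // written to the primary's journal
  secondary: { w: 2 },              // primary plus one more member (w=n)
  majority: { w: "majority" },      // a majority of members
  otherDataCenter: { w: "myTag" },  // tag-based mode named on the slide
};

function writeConcernFor(level) {
  if (!Object.prototype.hasOwnProperty.call(WRITE_CONCERNS, level)) {
    throw new Error(`unknown durability level: ${level}`);
  }
  return Object.assign({}, WRITE_CONCERNS[level]); // copy so callers cannot mutate the table
}

console.log(JSON.stringify(writeConcernFor("journal")));

// In the mongo shell (2.6 or later) the document could be passed like this:
// db.tweets.insert({ user: "alvin" }, { writeConcern: writeConcernFor("majority") })
```

The further down the table a write has to travel, the longer the client waits for the acknowledgement.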

#3 Scaling Writes & Reads

#3.1 Scaling Writes

Sharding - Range distribution

shard01 shard02 shard03

sh.shardCollection("test.tweets", {user: 1} , false)

Sharding - Range distribution

a-i j-r s-z

Shard Key Options

• Single or Compound Key

• Regular or Hashed key

• Tagged

Sharding – Auto Splits

a-i ja-jz s-z

Sharding – Auto Splits

a-i ja-ji s-z

Sharding - Auto Balancing

a-i ja-ji s-z

Sharding - Goal Equilibrium

a-i ja-ji s-z

Sharding - Find by Shard Key

a-i ja-ji s-z

find({user: "alvin"})

Sharding - Find by Attribute

a-i ja-ji s-z

find({email: "alvin@10gen.com"})
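The difference between the two finds can be sketched as a toy router. This is a simplified model, not how mongos is implemented: it routes on the first letter of `user` only, and the pairing of shard01..shard03 with the ranges a-i, j-r, s-z is an assumption taken from the diagrams.

```javascript
// Chunk ranges from the slide, keyed on the first letter of `user`.
const chunks = [
  { min: "a", max: "i", shard: "shard01" },
  { min: "j", max: "r", shard: "shard02" },
  { min: "s", max: "z", shard: "shard03" },
];

// Return the shards a query must visit. Only a predicate on the shard key
// (`user`) can be targeted; anything else is sent to every shard.
// Keys outside a-z are not modeled and also fall back to all shards.
function targetShards(query) {
  if (typeof query.user === "string" && query.user.length > 0) {
    const first = query.user[0].toLowerCase();
    const hit = chunks.filter((c) => first >= c.min && first <= c.max);
    if (hit.length > 0) return hit.map((c) => c.shard);
  }
  return chunks.map((c) => c.shard); // scatter-gather
}

console.log(targetShards({ user: "alvin" }));
console.log(targetShards({ email: "alvin@10gen.com" }));
```

A find on the shard key touches one shard; a find on any other attribute has to ask all of them and merge the results.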

Dev Ops

Shard Key

• What key to use?

• Plain or Hashed?

• What's the distribution & usage?

• Predicate & Sorts
– Use the Shard key?
– Scatter Gather + Sort?

Useful commands

> sh.status()
...
  shards:
    { "_id" : "shard0000", "host" : "localhost:29001", "tags" : [ "dc-emea" ] }
    { "_id" : "shard0001", "host" : "localhost:29002", "tags" : [ "dc-asia" ] }
    { "_id" : "shard0002", "host" : "localhost:29003", "tags" : [ "dc-na" ] }
  databases:
    { "_id" : "scaleout", "partitioned" : true, "primary" : "shard0000" }
      scaleout.blogs
        shard key: { "loc" : 1 }
        chunks:
          shard0000 1
        { "loc" : { "$minKey" : 1 } } -->> { "loc" : { "$maxKey" : 1 } } on : shard0000 Timestamp(1, 0)

// Commands to start/stop the balancer

> sh.startBalancer()

> sh.stopBalancer()

// Disable balancing for a specific collection

> sh.disableBalancing("students.grades")

// Run the balancer during a specific time window
// (the settings collection lives in the config database)

> use config

> db.settings.update({ _id : "balancer" },
  { $set : { activeWindow : { start : "<start-time>", stop : "<stop-time>" } } }, true )

#3.2 Scaling Reads

Immediate vs. Eventual Consistency

• Do you want to read your own writes?
– Use Sharding to Scale Reads

• Can you tolerate reading older versions of data?
– May read from secondaries

Dev Ops

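Immediate Consistency

Diagram: Thread #1 sends its Insert and Update to the Primary.

Eventual Consistency

Diagram: Thread #1 sends an Insert and then an Update to the Primary. Thread #2 reads from the Secondary: at first v1 does not exist, later it reads v1.

Eventual Consistency with a Write Concern

Diagram: Thread #1 sends an Insert (w=2) to the Primary, which holds v1. Thread #2 reads from the Secondary; the labels shown are "reads v1" and "v1 does not exist".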
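The three consistency cases can be played through with a toy model of one primary and one secondary. This is an illustrative sketch only: the class and method names are made up, and replication is reduced to copying a queue, which is not MongoDB's actual protocol.

```javascript
// Toy model of one primary and one secondary.
class ReplicaPair {
  constructor() {
    this.primary = new Map();
    this.secondary = new Map();
    this.pending = []; // writes applied on the primary but not yet on the secondary
  }

  // w=1 returns once the primary has the write; w=2 also waits for the secondary.
  write(key, value, w = 1) {
    this.primary.set(key, value);
    this.pending.push([key, value]);
    if (w >= 2) this.replicate();
  }

  // Apply all pending writes to the secondary, in order.
  replicate() {
    for (const [key, value] of this.pending) this.secondary.set(key, value);
    this.pending = [];
  }

  // Read from the primary by default, or from the secondary when asked.
  read(key, fromSecondary = false) {
    const node = fromSecondary ? this.secondary : this.primary;
    return node.has(key) ? node.get(key) : null;
  }
}

const lagging = new ReplicaPair();
lagging.write("doc", "v1");             // Thread #1 inserts with w=1
console.log(lagging.read("doc"));       // primary read sees the write
console.log(lagging.read("doc", true)); // secondary read before replication
lagging.replicate();
console.log(lagging.read("doc", true)); // secondary read after replication

const acknowledged = new ReplicaPair();
acknowledged.write("doc", "v1", 2);     // Thread #1 inserts with w=2
console.log(acknowledged.read("doc", true));
```

Reading from the primary always sees the latest write. Reading from the secondary can miss it until replication catches up, unless the writer waited for the secondary with w=2.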

#4 Monitoring

Pick your tool du jour

• Plugins for
– Cacti, Zabbix, Munin, New Relic…

• MongoDB Monitoring (MMS)
– mms.mongodb.com
– Focused on MongoDB cluster well-being

KPIs you care about

• MongoDB
– MongoDB ops/sec
– Read/Write queues
– Index Misses
– Locked %
– Flush times
– Replication Lag

• Storage
– IOPS
– Size
– Data & Loading Patterns

• Memory
– Working Set

• CPU
– Speed
– Cores

• Network
– Latency
– Throughput
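Of the KPIs listed, replication lag is easy to derive by hand from the `members` array that rs.status() returns. A minimal sketch, assuming each member carries the `stateStr` and `optimeDate` fields; the host names and timestamps in the sample are made up for illustration.

```javascript
// Replication lag in seconds for each secondary, relative to the primary's
// last applied operation time.
function replicationLagSeconds(members) {
  const primary = members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) throw new Error("no primary in member list");
  return members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      lagSeconds: (primary.optimeDate - m.optimeDate) / 1000, // Date difference is in ms
    }));
}

// Made-up sample shaped like rs.status().members
const sample = [
  { name: "a.example:27017", stateStr: "PRIMARY", optimeDate: new Date("2014-09-08T01:29:58Z") },
  { name: "b.example:27017", stateStr: "SECONDARY", optimeDate: new Date("2014-09-08T01:29:55Z") },
  { name: "c.example:27017", stateStr: "SECONDARY", optimeDate: new Date("2014-09-08T01:29:58Z") },
];
console.log(replicationLagSeconds(sample));
```

In the mongo shell the same function could be fed live data with replicationLagSeconds(rs.status().members).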

Example

Charts: MongoDB ops/sec and page faults/sec

#5 Backup & Restore

Options

• mongodump / mongorestore

• File system snapshot / rsync

• MMS Backup
– Automated Cluster Backup / Restore
– Snapshots
– Point in Time recovery
– OpLog based

MMS Backup & Restore

Other talks today

• 10.25 – Running MongoDB with confidence with MMS

• 12.45 – MongoDB Sharding

• 1.05 – Securing Your MongoDB Deployment

• 4.10 – Diagnostics and Debugging

Get MongoDB Certified

• 20% off our October exam with the code

• MongoDBSeattle20

• TODAY ONLY

• university.mongodb.com

A happy ever after?

Dev Ops

alvin@mongodb.com / @jonnyeight

Thank You
