Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track
![Page 1: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/1.jpg)
Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track
Angshuman Bagchi, Technical Services Engineer
![Page 2: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/2.jpg)
Agenda
• What is MMS Monitoring?
• What are Alerts?
• How to pick an Alert?
• Five recommended Alerts
• Wrap up
![Page 3: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/3.jpg)
What is MMS Monitoring?
![Page 4: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/4.jpg)
![Page 5: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/5.jpg)
![Page 6: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/6.jpg)
Who uses MMS?
![Page 7: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/7.jpg)
What are MMS alerts?
![Page 8: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/8.jpg)
Source: http://www.cleanfunnypics.com/no-its-not-empty/#axzz2pqknJJbC
![Page 9: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/9.jpg)
![Page 10: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/10.jpg)
How to pick an Alert?
![Page 11: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/11.jpg)
• Is there an absolute limit to alert on?
• What is normal (baseline)?
• What is worrying (warning)?
• What is a definite problem (critical)?
• How likely are false positives?
... there is no magic formula
![Page 12: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/12.jpg)
Five recommended alerts
• Host Recovering (All, but by definition Secondary)
• Replication Lag (Secondary)
• Connections (All mongos, mongod)
• Lock % (Primary, Secondary)
• Replica (Primary, Secondary)
![Page 13: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/13.jpg)
Host Recovering
• General alert triggered if any instance enters RECOVERING mode
• Required for all use cases
• All Replica Sets should have this
• Sometimes, during maintenance, this may be expected
![Page 14: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/14.jpg)
Host Recovering
![Page 15: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/15.jpg)
Replication Lag
• No secondary should be behind
• Secondary reads are affected
• All Replica Sets should have this
• The only exception is a configured slaveDelay
![Page 16: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/16.jpg)
Replication Lag
Absolute limit? Yes, about 1 or 2s. To prevent false positives, alert on an absolute threshold of > 240s.
Normal: lag is ideally 0s
Worrying: < 60s, some false positives
Critical: > 240s
False positives: above 240s the likelihood is low
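
The thresholds on this slide can be sketched as a small classifier. This is a minimal Python sketch, not how MMS evaluates alerts. The function name `classify_replication_lag` and its parameters are hypothetical, and treating everything between the roughly 2s limit and 240s as "worrying" is an assumption, since the slide only gives "< 60s" for that band.

```python
def classify_replication_lag(lag_s, normal_limit_s=2.0, critical_s=240.0):
    """Return 'normal', 'worrying', or 'critical' for a lag in seconds.

    normal_limit_s and critical_s default to the slide's figures
    (about 1 or 2s, and 240s). The middle band is an assumption.
    """
    if lag_s < 0:
        raise ValueError("lag must be non-negative")
    if lag_s > critical_s:
        return "critical"
    if lag_s > normal_limit_s:
        return "worrying"
    return "normal"


if __name__ == "__main__":
    for lag in (0, 45, 150_000):
        print(lag, classify_replication_lag(lag))
```

A secondary with a configured slaveDelay would need its delay subtracted before classification, or a separate threshold.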
![Page 17: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/17.jpg)
Example: replication lag
150,000s of lag is almost 2 days of lag!
![Page 18: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/18.jpg)
Example: replication lag
• Secondaries under-specified vs. primaries
• Access patterns between primary / secondaries
• Insufficient bandwidth
• Foreground index builds on secondaries
“…when you have eliminated the impossible, whatever remains, however improbable, must be the truth…” -- Sherlock Holmes
Sir Arthur Conan Doyle, The Sign of the Four
![Page 19: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/19.jpg)
Example: replication lag
Example:
• ~1500 ops per minute (opcounters)
• 0.1 MB per object (average object size, local db)

~1500 ops/min ÷ 60 s/min × 0.1 MB/op × 8 b/B ≈ 20 Mbps required bandwidth
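
The arithmetic on this slide can be reproduced with a few lines of Python. This is a rough sketch for illustration; the function name `required_bandwidth_mbps` is hypothetical, and the estimate ignores protocol overhead and compression.

```python
def required_bandwidth_mbps(ops_per_minute, avg_op_size_mb):
    """Estimate replication bandwidth in megabits per second."""
    ops_per_second = ops_per_minute / 60
    megabytes_per_second = ops_per_second * avg_op_size_mb
    return megabytes_per_second * 8  # 8 bits per byte


if __name__ == "__main__":
    # Figures from the slide: ~1500 ops/min at 0.1 MB per operation.
    print(round(required_bandwidth_mbps(1500, 0.1), 1), "Mbps")
```

With the slide's figures this comes out at roughly 20 Mbps. Comparing that number with the link available between primary and secondary is a quick way to check the "insufficient bandwidth" hypothesis.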
![Page 20: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/20.jpg)
Connections
• Each connection consumes ~1 MB and a file descriptor
• 5000 connections => 5 GB of RAM
• Stability and predictability are key
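
The memory estimate above is simple multiplication. A minimal sketch, assuming the slide's approximate 1 MB per connection and 1000 MB per GB; the function name `connection_memory_gb` is hypothetical.

```python
def connection_memory_gb(connections, mb_per_connection=1.0):
    """Approximate RAM consumed by open connections, in GB."""
    if connections < 0:
        raise ValueError("connections must be non-negative")
    # The slide rounds 1 GB to 1000 MB, so the same is done here.
    return connections * mb_per_connection / 1000


if __name__ == "__main__":
    print(connection_memory_gb(5000), "GB")
```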
![Page 21: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/21.jpg)
Pro-Tip: know thyself
You have to recognize normal to know when it isn’t.
Source: http://www.flickr.com/photos/skippy/6853920/
![Page 22: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/22.jpg)
Connections
Absolute limit? Yes, but this is too high. We need to alert before that.
Normal: TBD based on deployment, number of nodes, connection pool settings, app servers, load, etc. Say, X during peak load
Worrying: 50% increase, so 1.5X
Critical: double, so 2X
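
Once the peak baseline X is known, the two thresholds follow directly. The sketch below is illustrative only; `connection_thresholds` and `classify_connections` are hypothetical names, and treating a value exactly at a threshold as having crossed it is an assumption.

```python
def connection_thresholds(baseline_peak):
    """Derive alert thresholds from the normal peak connection count X."""
    return {
        "worrying": 1.5 * baseline_peak,  # 50% increase
        "critical": 2 * baseline_peak,    # double
    }


def classify_connections(current, baseline_peak):
    """Return 'normal', 'worrying', or 'critical' for a connection count."""
    limits = connection_thresholds(baseline_peak)
    if current >= limits["critical"]:
        return "critical"
    if current >= limits["worrying"]:
        return "worrying"
    return "normal"


if __name__ == "__main__":
    # Hypothetical baseline of 1000 connections at peak.
    print(connection_thresholds(1000))
    print(classify_connections(1600, 1000))
```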
![Page 23: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/23.jpg)
Lock %
• Lock contention degrades performance
• High lock % starves replication and reads
• Bounds need to be determined
![Page 24: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/24.jpg)
Lock %
Absolute limit? Yes. > 80%: occasional degraded performance; 90%: regular major impact
Normal: TBD. Write-heavy loads see higher values. Say X% during peak load
Worrying: double, so approximately 2X%
Critical: TBD. For production, > 80%
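
For lock %, the worrying level is relative to the baseline while the critical level is absolute. A minimal sketch of that combination follows; the name `classify_lock_pct` is hypothetical, and capping the worrying level at the critical level (so that a high baseline cannot push "worrying" above "critical") is an assumption not stated on the slide.

```python
def classify_lock_pct(lock_pct, baseline_pct, critical_pct=80.0):
    """Return 'normal', 'worrying', or 'critical' for a lock percentage."""
    if not 0 <= lock_pct <= 100:
        raise ValueError("lock_pct must be between 0 and 100")
    # Worrying is about double the baseline, capped at the critical level.
    worrying_pct = min(2 * baseline_pct, critical_pct)
    if lock_pct > critical_pct:
        return "critical"
    if lock_pct >= worrying_pct:
        return "worrying"
    return "normal"


if __name__ == "__main__":
    # Hypothetical baseline of 15% lock during peak load.
    for value in (10, 30, 85):
        print(value, classify_lock_pct(value, baseline_pct=15))
```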
![Page 25: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/25.jpg)
Replica
• Represents the oplog window
• Depends on
  – Rate of operations inserted into the oplog
  – Size of operations
  – Size of the oplog capped collection
• Aim for 3 × the normal maintenance window
• Resizing the oplog is non-trivial
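
The three inputs listed above are enough for a back-of-the-envelope estimate of the oplog window. This is a rough sketch under the assumption of a steady write rate and a uniform operation size; the function names `oplog_window_hours` and `covers_maintenance` are hypothetical, and real oplog entries can differ in size from the documents they describe.

```python
def oplog_window_hours(oplog_size_mb, ops_per_second, avg_op_size_mb):
    """Estimate how many hours of operations fit in the oplog."""
    mb_per_second = ops_per_second * avg_op_size_mb
    if mb_per_second <= 0:
        raise ValueError("write rate must be positive")
    return oplog_size_mb / mb_per_second / 3600


def covers_maintenance(window_hours, maintenance_hours, factor=3):
    """Check the window against factor x the normal maintenance window."""
    return window_hours >= factor * maintenance_hours


if __name__ == "__main__":
    # Hypothetical deployment: 50 GB oplog, 25 ops/s, 0.1 MB per op.
    window = oplog_window_hours(50_000, 25, 0.1)
    print(round(window, 2), "hours")
    print(covers_maintenance(window, maintenance_hours=2))
```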
![Page 26: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/26.jpg)
Replica
Absolute limit? 50% below normal
Normal: TBD. Say X hours during peak
Worrying: 25% below normal
Critical: 50% below normal
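
Unlike the other metrics, this alert fires when the value drops. A minimal sketch of the "below normal" thresholds, with the hypothetical name `classify_oplog_window` and the assumption that a value exactly at a threshold counts as crossing it:

```python
def classify_oplog_window(window_hours, normal_hours):
    """Return 'normal', 'worrying', or 'critical' for an oplog window."""
    if normal_hours <= 0:
        raise ValueError("normal_hours must be positive")
    if window_hours <= 0.5 * normal_hours:   # 50% below normal
        return "critical"
    if window_hours <= 0.75 * normal_hours:  # 25% below normal
        return "worrying"
    return "normal"


if __name__ == "__main__":
    # Hypothetical baseline: a 24-hour window during peak.
    for hours in (24, 18, 12):
        print(hours, classify_oplog_window(hours, normal_hours=24))
```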
![Page 27: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/27.jpg)
Summary
• Use a similar approach for other metrics
• Different audiences for alerts
  – Worrying alerts go to the ops team
  – Critical alerts go out to a wider audience
• Get started with MMS Monitoring and alerts!
![Page 28: Webinar: Five MMS Monitoring Alerts to Keep Your MongoDB Deployment on Track](https://reader035.fdocuments.net/reader035/viewer/2022062616/54b7168f4a7959286f8b4621/html5/thumbnails/28.jpg)
I got alerted … now what?