MongoDB HA, what can go wrong? - Percona...Location: Skopje, Republic of Macedonia Education: MSc,...


Nov-7-2018

MongoDB HA, what can go wrong?

About me

● Location: Skopje, Republic of Macedonia

● Education: MSc, Software Engineering

● Experience:

○ Lead Database Consultant (since 2016)

○ Database Consultant (2012 - 2016)

○ Web Developer, DBA (2007 - 2012)

● Certifications: C100DBA - MongoDB certified DBA (since 2016)

● Percona speaker since 2016

https://mk.linkedin.com/in/igorle @igorle

© 2018 Pythian. Confidential

Overview

• What is a replica set, how replication works

• Replication concept

• Replica set features, deployment architectures

• Hidden nodes, Arbiter nodes, Priority 0 nodes

• Production failures

• Monitoring replica set

• QA


Replication


Replica set

• Group of mongod processes that maintain the same data set

• Redundancy and high availability

• Increased read capacity (scaling reads)

• Automatic failover

# Members    # Nodes Required to Elect New Primary    Fault Tolerance
3            2                                        1
4            3                                        1
5            3                                        2
6            4                                        2
7            4                                        3
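
The numbers in the table follow from simple majority arithmetic. A minimal sketch in plain JavaScript; the function names are illustrative and not part of any MongoDB API:

```javascript
// Votes needed to elect a Primary: a strict majority of voting members.
function majorityNeeded(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Members that can be lost while a majority can still be formed.
function faultTolerance(votingMembers) {
  return votingMembers - majorityNeeded(votingMembers);
}

for (const n of [3, 4, 5, 6, 7]) {
  console.log(`${n} members: majority ${majorityNeeded(n)}, fault tolerance ${faultTolerance(n)}`);
}
```

Going from 3 to 4 members raises the required majority without raising fault tolerance, which is why an odd number of voting members is usually preferred.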


Replication concept

1. Write operations go to the Primary node

2. All changes are recorded in the operations log (oplog)

3. Asynchronous replication to the Secondaries

4. Secondaries copy the Primary oplog

5. A Secondary can use another Secondary as its sync source*

*settings.chainingAllowed (true by default)
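
If chained replication is not wanted, settings.chainingAllowed can be switched off in the replica set configuration. A minimal sketch, assuming a config object shaped like the output of rs.conf(); the helper name and the example config are hypothetical:

```javascript
// Returns a copy of a replica set config with chained replication disabled.
// `cfg` is assumed to have the shape returned by rs.conf().
function withChainingDisabled(cfg) {
  return {
    ...cfg,
    settings: { ...(cfg.settings || {}), chainingAllowed: false },
  };
}

// Hypothetical config used only for illustration.
const exampleCfg = { _id: "rs0", version: 1, members: [], settings: { chainingAllowed: true } };
const updated = withChainingDisabled(exampleCfg);
console.log(updated.settings.chainingAllowed);

// In the mongo shell, connected to the Primary, the equivalent change is:
//   cfg = rs.conf(); cfg.settings.chainingAllowed = false; rs.reconfig(cfg)
```

With chaining disabled every Secondary syncs from the Primary, which trades extra load on the Primary for a shorter replication path.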

Replica set oplog

• Special capped collection that keeps a rolling record of all operations that modify the data stored in the databases

• Idempotent

• Default oplog size

For Unix and Windows systems:

Storage Engine    Default Oplog Size       Lower Bound    Upper Bound
In-memory         5% of physical memory    50MB           50GB
WiredTiger        5% of free disk space    990MB          50GB
MMAPv1            5% of free disk space    990MB          50GB
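
The default size rule for the disk-based engines can be sketched as a clamp. This is an illustration of the table only, assuming 1 GB = 1024 MB; the function name is hypothetical:

```javascript
const GB_IN_MB = 1024; // assumption: binary gigabytes

// Default oplog size for WiredTiger/MMAPv1 per the table:
// 5% of free disk space, clamped to the range [990 MB, 50 GB].
function defaultOplogSizeMB(freeDiskMB) {
  const fivePercent = freeDiskMB / 20; // 5% = 1/20
  return Math.min(Math.max(fivePercent, 990), 50 * GB_IN_MB);
}

console.log(defaultOplogSizeMB(10 * GB_IN_MB));   // small disk: lower bound applies
console.log(defaultOplogSizeMB(200 * GB_IN_MB));  // 5% of free space
console.log(defaultOplogSizeMB(2048 * GB_IN_MB)); // large disk: upper bound applies
```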


Configuration


Configuration options

• Up to 50 members per replica set (at most 7 voting members)

• Arbiter node

• Priority 0 node

• Hidden node

• Delayed node


Arbiter node

• Does not hold a copy of the data

• Votes in elections

Priority 0 node

Priority: floating point (i.e. decimal) number between 0 and 1000

• Cannot become primary, cannot trigger election

• Visible to application (can serve reads; like any Secondary, it does not accept writes)

• Votes in elections

Secondary: priority : 0

Hidden node

• Not visible to application

• Never becomes primary, but can vote in elections

• Use cases

○ reporting

○ backups

Secondary: hidden : true, priority : 0

Delayed node

• Must be a priority 0 member

• Should be a hidden member (not mandatory)

• Mainly used for backups (historical snapshot of data)

• Recovery in case of human error

Secondary: slaveDelay : 3600, priority : 0, hidden : true
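
The member types above differ only in a few fields of the replica set configuration. A minimal sketch with hypothetical hostnames; the field names match the MongoDB versions covered in this talk (slaveDelay was renamed secondaryDelaySecs in later releases), and the validation helper is illustrative only:

```javascript
// Hypothetical members, one per node type described on the slides.
const members = [
  { _id: 0, host: "db1.example.net:27017" },                                              // regular data node
  { _id: 1, host: "db2.example.net:27017", priority: 0 },                                 // priority 0 node
  { _id: 2, host: "db3.example.net:27017", priority: 0, hidden: true },                   // hidden node
  { _id: 3, host: "db4.example.net:27017", priority: 0, hidden: true, slaveDelay: 3600 }, // delayed by 1 hour
  { _id: 4, host: "arb1.example.net:27017", arbiterOnly: true },                          // arbiter
];

// Checks the rules stated on the slides: hidden and delayed members must be priority 0.
function memberProblems(m) {
  const problems = [];
  const priority = m.priority === undefined ? 1 : m.priority; // default priority is 1
  if (m.hidden && priority !== 0) problems.push("hidden member must have priority 0");
  if (m.slaveDelay > 0 && priority !== 0) problems.push("delayed member must have priority 0");
  return problems;
}

for (const m of members) {
  console.log(m.host, memberProblems(m));
}
```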


Failures


Small oplog size

1. Primary/Secondary node down

○ Node failure

○ Planned maintenance

2. Automatic failover

…… (several hours later)

3. New Primary's oplog rolls over and overwrites the entries the failed node still needs

4. Failed node cannot catch up and needs a full resync

MongoDB >= 3.6: db.adminCommand({replSetResizeOplog: 1, size: 32000}) (size in megabytes)
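
Whether a node can survive an outage without a resync depends on how many hours of writes the oplog holds. A rough sketch; the function names and the write rate are assumptions for illustration, and the real window on a running node is reported by rs.printReplicationInfo():

```javascript
// Rough oplog window: hours of writes the oplog can hold before the oldest
// entries are overwritten. Both inputs must be measured on your own system.
function oplogWindowHours(oplogSizeMB, oplogWriteMBPerHour) {
  return oplogSizeMB / oplogWriteMBPerHour;
}

// True if a node that is down for `downtimeHours` can still catch up from the oplog.
function canCatchUp(oplogSizeMB, oplogWriteMBPerHour, downtimeHours) {
  return downtimeHours < oplogWindowHours(oplogSizeMB, oplogWriteMBPerHour);
}

// Hypothetical workload: 330 MB of oplog entries per hour, 4 hours of downtime.
console.log(canCatchUp(990, 330, 4));   // default lower-bound oplog
console.log(canCatchUp(32000, 330, 4)); // oplog resized to 32000 MB
```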

Arbiter nodes

● Votes in elections

● Does not hold a copy of the data

● If 2 nodes are down, there is no majority to elect a new Primary

● Fault tolerance is still 1 node

● 4 data nodes + 1 Arbiter makes more sense


Priority 0 nodes

● Application driver sends writes to Primary

● Reads go to Primary by default

● Secondaries can serve reads

● Read preference

○ primary

○ primaryPreferred

○ secondary

○ secondaryPreferred

○ nearest
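
Read preference is typically set by the application, for example in the connection string. A minimal sketch that builds such a string; the hostnames, the replica set name and the helper itself are hypothetical:

```javascript
const READ_PREFERENCES = ["primary", "primaryPreferred", "secondary", "secondaryPreferred", "nearest"];

// Builds a replica set connection string with an explicit read preference.
function buildUri(hosts, replicaSet, readPreference) {
  if (!READ_PREFERENCES.includes(readPreference)) {
    throw new Error(`unknown read preference: ${readPreference}`);
  }
  return `mongodb://${hosts.join(",")}/?replicaSet=${replicaSet}&readPreference=${readPreference}`;
}

console.log(buildUri(["db1.example.net:27017", "db2.example.net:27017"], "rs0", "secondaryPreferred"));
```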


Priority 0 nodes

• Primary node fails

• Replica set starts election for new Primary

• Zero nodes eligible for Primary

• Application cannot send writes

• Database is read only*

*depends on read preference setting
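
This failure can be spotted in advance by counting the members that are actually electable. A minimal sketch, assuming member documents shaped like rs.conf().members; the hosts and the helper name are hypothetical:

```javascript
// Members that can be elected Primary: data-bearing with priority above 0.
function electableMembers(members) {
  return members.filter((m) => !m.arbiterOnly && (m.priority === undefined ? 1 : m.priority) > 0);
}

// Hypothetical config matching the failure above: one electable node, the rest priority 0.
const risky = [
  { host: "db1.example.net:27017", priority: 1 },
  { host: "db2.example.net:27017", priority: 0 },
  { host: "db3.example.net:27017", priority: 0 },
];

console.log(electableMembers(risky).length);          // candidates while db1 is healthy
console.log(electableMembers(risky.slice(1)).length); // candidates after db1 fails
```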


Hidden nodes


● Application driver sends writes to Primary

● Reads go to Primary by default

● Hidden Secondaries cannot serve reads

● Read preference

○ primary

Hidden nodes

• Primary node fails

• Replica set starts election for new Primary

• Zero nodes eligible for Primary (priority:0)

• Application cannot send writes or reads

• Downtime


Hardware

• Primary node fails

• Secondary elected as new Primary

• Working set does not fit in memory

• Performance degradation

• Application stalls

[Diagram: one node with 64GB RAM, 16 CPU; two nodes with 32GB RAM, 8 CPU]

Hardware

• Dataset grows

• No disk space left on the Secondary

• mongod process fails

• Replica set is down to 2 nodes

• Zero tolerance for further failures

[Diagram: two nodes with Disk: 300GB; one node with Disk: 200GB]

Cloud deployment

• All replica set members deployed in a single Availability Zone

• Availability Zone #1 goes down

• Downtime

[Diagram: AWS, Region #1, all members in Availability Zone #1]

Cloud deployment

● Availability Zone #1 goes down

○ New Primary elected from AZ #2

● Availability Zone #2 goes down

○ Database is read only

[Diagram: AWS, Region #1, members split across Availability Zone #1 and Availability Zone #2]

Cloud deployment

• Region #1 goes down

• Downtime

[Diagram: AWS, Region #1, members spread across AZ #1, AZ #2, AZ #3]
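
The cloud scenarios above reduce to one question: does a majority of voting members survive the loss of any single zone? A minimal sketch; the `{ host, zone }` records and the function name are hypothetical, and the same check applies to regions or physical hosts:

```javascript
// For each zone, reports whether a majority of voting members remains
// when that zone alone goes down. `members` lists one record per voting member.
function zoneFailureReport(members) {
  const majority = Math.floor(members.length / 2) + 1;
  const zones = [...new Set(members.map((m) => m.zone))];
  const report = {};
  for (const zone of zones) {
    const survivors = members.filter((m) => m.zone !== zone).length;
    report[zone] = survivors >= majority; // true: a Primary can still be elected
  }
  return report;
}

// Two-zone layout: losing the zone that holds two members leaves no majority.
console.log(zoneFailureReport([
  { host: "db1", zone: "az1" },
  { host: "db2", zone: "az2" },
  { host: "db3", zone: "az2" },
]));

// Three-zone layout: any single zone can fail.
console.log(zoneFailureReport([
  { host: "db1", zone: "az1" },
  { host: "db2", zone: "az2" },
  { host: "db3", zone: "az3" },
]));
```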

Virtualization

● VM2 goes down

○ Primary node has majority on VM1

● VM1 goes down

○ Database is read only

[Diagram: VMWARE, VM1 and VM2 on one Physical Server]

Version upgrades

● Replica set major version upgrade (3.4 to 3.6)

● Driver v3.4 not compatible with DB v3.6

● Application cannot send requests

● Downtime

● Rollback to previous DB version

[Diagram: replica set nodes labeled MongoDB: 3.6.4]

Version upgrades

● Replica set major version upgrade

● Promote new version as Primary

● Confirm application works

● Forget to upgrade Secondaries

● Start using new features

● New Primary elected

● Application errors

[Diagram: one node on MongoDB: 4.0, two nodes on MongoDB: 3.6]

Version upgrades

● Minor version upgrade

● Promote new version as Primary

● Confirm application works

● Forget to upgrade Secondaries

● Bug fixes in minor release

● New Primary elected

● Application errors

[Diagram: one node on MongoDB: 3.6.8, two nodes on MongoDB: 3.6.4]

Version upgrades

[Diagram: larger deployment in which most nodes are labeled MongoDB: 3.6.8 and three are still labeled MongoDB: 3.6.3]
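
Forgotten nodes are easy to detect once the versions are collected in one place. A minimal sketch; the `{ host, version }` inventory is hypothetical and could be gathered by running db.version() against each node:

```javascript
// Returns the hosts whose version differs from the most common one.
function versionOutliers(nodes) {
  const counts = new Map();
  for (const n of nodes) counts.set(n.version, (counts.get(n.version) || 0) + 1);
  let majorityVersion = null;
  let best = 0;
  for (const [version, count] of counts) {
    if (count > best) {
      best = count;
      majorityVersion = version;
    }
  }
  return nodes.filter((n) => n.version !== majorityVersion).map((n) => n.host);
}

// Hypothetical inventory with one node left behind.
console.log(versionOutliers([
  { host: "db1", version: "3.6.8" },
  { host: "db2", version: "3.6.8" },
  { host: "db3", version: "3.6.3" },
]));
```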

DDL operation

● Adding an index on a collection

● Connect to the Primary node

○ db.people.createIndex( { zipcode: 1 }, { background: true } )

DDL operation

● Stop one Secondary

● Restart it on a different port (as a standalone, temporarily out of the replica set)

Secondary: --port=27777

DDL operation

● Add the index

● Rejoin the replica set

● Promote the Secondary to Primary

● Forget the other nodes

Secondary: --port=27777, db.people.createIndex({zipcode:1})
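
After a rolling index build it helps to compare the index lists of all members. A minimal sketch; the input shape and node names are hypothetical, and the index names could be collected with db.people.getIndexes() on each member:

```javascript
// Given index names per node, lists which indexes each node is missing
// relative to the union of indexes seen across all nodes.
function missingIndexes(indexesByNode) {
  const all = new Set(Object.values(indexesByNode).flat());
  const missing = {};
  for (const [node, names] of Object.entries(indexesByNode)) {
    const gaps = [...all].filter((name) => !names.includes(name));
    if (gaps.length > 0) missing[node] = gaps;
  }
  return missing;
}

// Hypothetical state after the steps above: only one node got the new index.
console.log(missingIndexes({
  db1: ["_id_", "zipcode_1"],
  db2: ["_id_"],
  db3: ["_id_"],
}));
```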

Backups

● Pick one Secondary

● db.fsyncLock()

● Take snapshot

● db.fsyncUnlock()

● Unlock fails

● Secondary starts lagging

● Primary overwrites oplog

● Secondary needs initial sync
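
A backup script can at least make a failed unlock loud instead of silent. A minimal sketch, assuming a `db` object that exposes fsyncLock() and fsyncUnlock() like the mongo shell's db; the wrapper and the `takeSnapshot` callback are hypothetical:

```javascript
// Runs `takeSnapshot` while writes are locked and always attempts the unlock,
// even if the snapshot step throws.
function snapshotWithLock(db, takeSnapshot) {
  db.fsyncLock();
  try {
    takeSnapshot();
  } finally {
    const res = db.fsyncUnlock();
    if (!res || res.ok !== 1) {
      // Surface the failure instead of leaving the Secondary locked and lagging.
      throw new Error("fsyncUnlock failed: node is still locked, alert an operator");
    }
  }
}

// Stub standing in for a real connection, for illustration only.
const stubDb = {
  fsyncLock: () => ({ ok: 1 }),
  fsyncUnlock: () => ({ ok: 1 }),
};
snapshotWithLock(stubDb, () => console.log("snapshot taken"));
```

Pairing this with a replication lag alert covers the case where the unlock error itself goes unnoticed.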

Monitoring replica set

• Replica set has no Primary

• Number of unhealthy members is above threshold

• Replication lag is above threshold

• Replica set elected new Primary

• Host of any type has restarted

• Host of type Secondary is recovering

• Host of any type is down

• Host of any type has experienced Rollback

• Monitoring backup status
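
Several of these alerts can be derived from a single status document. A minimal sketch, assuming input shaped like the output of rs.status() (members with name, stateStr, health and optimeDate); the function name and the 60-second threshold are illustrative assumptions:

```javascript
// Evaluates a few of the alerts above against an rs.status()-like document.
function replicaSetAlerts(status, maxLagSeconds = 60) {
  const alerts = [];
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) alerts.push("replica set has no Primary");
  for (const m of status.members) {
    if (m.health !== 1) alerts.push(`${m.name} is down or unreachable`);
    if (m.stateStr === "RECOVERING") alerts.push(`${m.name} is recovering`);
    if (primary && m.stateStr === "SECONDARY") {
      // Replication lag: difference between the last applied operations.
      const lagSeconds = (primary.optimeDate - m.optimeDate) / 1000;
      if (lagSeconds > maxLagSeconds) alerts.push(`${m.name} lags by ${lagSeconds}s`);
    }
  }
  return alerts;
}

// Hypothetical status: one lagging Secondary and one unreachable member.
const sampleStatus = {
  members: [
    { name: "db1", stateStr: "PRIMARY", health: 1, optimeDate: new Date("2018-11-07T10:00:00Z") },
    { name: "db2", stateStr: "SECONDARY", health: 1, optimeDate: new Date("2018-11-07T09:58:00Z") },
    { name: "db3", stateStr: "(not reachable/healthy)", health: 0, optimeDate: new Date(0) },
  ],
};
console.log(replicaSetAlerts(sampleStatus));
```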


Summary

• Replica set with odd number of voting members

• Hidden or Delayed member for dedicated functions (reporting, backups …)

• Have more than one eligible Primary in the replica set

• Use multi-AZ for Cloud deployments

• Don’t deploy more than one mongod process per node/host

• Run replica set members with same hardware for all nodes

• Run replica set members with same MongoDB version

• Monitor your replica set status and nodes

• Monitor replication lag and Oplog size


Questions?


We’re Hiring! https://www.pythian.com/careers/
