Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014

Enabling High-Availability with multi-site, rack aware replication

Alex Ma, Principal Solutions Engineer

description

Join us to learn how customers are building globally available applications using Couchbase. This talk will go into detail on cross datacenter replication (XDCR), typical replication topologies, server groups, best practices, and things not to do.

Transcript of Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014

Page 1: Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014

Enabling High-Availability with multi-site, rack aware replication

Alex Ma, Principal Solutions Engineer

Page 2: Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and a cluster map, send read/write/update requests to a three-server cluster.]

Server 1: active Shards 5, 2, 9; replica Shards 4, 1, 8

Server 2: active Shards 4, 7, 8; replica Shards 6, 3, 2

Server 3: active Shards 1, 3, 6; replica Shards 7, 9, 5

Couchbase Basics

• Docs distributed evenly across servers

• Each server stores both active and replica docs (only one server is active for a given doc at a time)

• Client library provides app with simple interface to database

• Cluster map provides a map of which server each doc is on (the app never needs to know)

• App reads, writes, updates docs

• Multiple app servers can access same document at same time
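The lookup that the client library performs on the app's behalf can be sketched roughly as follows. This is a simplified illustration, not the client library's actual code: it uses the nine shards and three servers from the slide's diagram, a plain CRC32-modulo hash as a stand-in for the real key hashing, and hypothetical names (`shard_for`, `server_for`, `CLUSTER_MAP`).

```python
import zlib

NUM_SHARDS = 9  # the slide's diagram shows nine shards

# Cluster map from the slide: shard number -> server holding the active copy.
CLUSTER_MAP = {
    5: "SERVER 1", 2: "SERVER 1", 9: "SERVER 1",
    4: "SERVER 2", 7: "SERVER 2", 8: "SERVER 2",
    1: "SERVER 3", 3: "SERVER 3", 6: "SERVER 3",
}


def shard_for(key: str) -> int:
    """Hash a document key to a shard number in 1..NUM_SHARDS (assumed hash)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_SHARDS + 1


def server_for(key: str) -> str:
    """Return the server that holds the active copy of the document."""
    return CLUSTER_MAP[shard_for(key)]


if __name__ == "__main__":
    for key in ("user::1001", "order::42"):
        print(key, "-> shard", shard_for(key), "on", server_for(key))
```

Because every app server hashes a key the same way and consults the same map, two app servers asking for the same document reach the same server without coordinating with each other.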

Page 3: Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014

Auto-Failover

[Diagram: App Server 1 and App Server 2, each with a Couchbase client library and a cluster map, in front of Servers 1 to 5. Servers 4 and 5 are drawn with active and replica sections but no shard labels.]

Server 1: active Shards 5, 2, 9; replica Shards 4, 1, 8

Server 2: active Shards 4, 7, 8; replica Shards 6, 3, 2

Server 3: active Shards 1, 3, 6; replica Shards 7, 9, 5

• App servers accessing Shards

• Requests to Server 3 fail

• Cluster detects server failed

  o Promotes replicas of Shards to active

  o Updates cluster map

• Requests for docs now go to appropriate server

• Typically rebalance would follow
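The failover steps above can be sketched as a small simulation. It uses the shard layout from the slide's diagram; the function name `fail_over` and the dictionary representation of the cluster map are assumptions made for illustration, not the server's actual implementation.

```python
from typing import Dict

# Layout from the slide: shard -> server holding the active / replica copy.
ACTIVE: Dict[int, str] = {
    5: "SERVER 1", 2: "SERVER 1", 9: "SERVER 1",
    4: "SERVER 2", 7: "SERVER 2", 8: "SERVER 2",
    1: "SERVER 3", 3: "SERVER 3", 6: "SERVER 3",
}
REPLICA: Dict[int, str] = {
    4: "SERVER 1", 1: "SERVER 1", 8: "SERVER 1",
    6: "SERVER 2", 3: "SERVER 2", 2: "SERVER 2",
    7: "SERVER 3", 9: "SERVER 3", 5: "SERVER 3",
}


def fail_over(active: Dict[int, str], replica: Dict[int, str],
              failed: str) -> Dict[int, str]:
    """Return an updated cluster map with the failed server's shards promoted."""
    new_map = {}
    for shard, server in active.items():
        if server == failed:
            # Promote the replica copy: requests now go to the replica's server.
            new_map[shard] = replica[shard]
        else:
            new_map[shard] = server
    return new_map


if __name__ == "__main__":
    print(fail_over(ACTIVE, REPLICA, "SERVER 3"))
```

In this layout, failing Server 3 moves Shard 1 to Server 1 and Shards 3 and 6 to Server 2. Shards 7, 9 and 5 lose their replica copy, and Shards 1, 3 and 6 are left with only the promoted copy, which is one reason a rebalance typically follows.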


Page 4: Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014

Rack-Zone Awareness

©2014 Couchbase, Inc.

• Grouping of servers into server groups so that each group is on a physically separate rack

• Ensures that replica data partitions are not on the same rack as the primary partitions

Servers 1, 2, 3 on Rack 1

Servers 4, 5, 6 on Rack 2

Servers 7, 8, 9 on Rack 3

Cluster has 2 replicas (3 copies of data)

This is a balanced configuration
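A minimal sketch of rack-aware placement for the configuration on this slide (nine servers, three racks, two replicas). The round-robin scheme and the names `RACKS`, `SERVER_RACK` and `place` are assumptions chosen for illustration; the cluster's real placement logic is not described in the slides. The sketch only works when the number of copies does not exceed the number of racks.

```python
from typing import List

RACKS = {
    "Rack 1": [1, 2, 3],
    "Rack 2": [4, 5, 6],
    "Rack 3": [7, 8, 9],
}
NUM_REPLICAS = 2  # 3 copies of the data in total

# Reverse lookup: server number -> rack name.
SERVER_RACK = {s: rack for rack, servers in RACKS.items() for s in servers}


def place(shard: int) -> List[int]:
    """Return [active, replica 1, replica 2] servers, one per rack."""
    rack_names = list(RACKS)
    copies = []
    for i in range(NUM_REPLICAS + 1):
        # Each successive copy goes to the next rack, so no two copies share one.
        rack = rack_names[(shard + i) % len(rack_names)]
        servers = RACKS[rack]
        # Rotate through the servers inside a rack to spread the load.
        copies.append(servers[(shard // len(rack_names)) % len(servers)])
    return copies


if __name__ == "__main__":
    for shard in range(9):
        print("shard", shard, "->", place(shard))
```

With nine shards, each of the nine servers ends up holding exactly one active copy and two replica copies, which matches the "balanced configuration" noted on the slide.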

Page 5: Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014

Rack-Zone Awareness


• If an entire server rack fails, data is still available

• If an entire cloud zone or a region fails, data is still available
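The availability claim can be checked with a small sketch: remove every server in one group and see whether each shard still has a copy somewhere. The two example placements and the function names are hypothetical; only the server-to-rack assignment comes from the slides.

```python
from typing import Dict, List

# Server-to-rack assignment from the previous slide.
SERVER_GROUP = {
    1: "Rack 1", 2: "Rack 1", 3: "Rack 1",
    4: "Rack 2", 5: "Rack 2", 6: "Rack 2",
    7: "Rack 3", 8: "Rack 3", 9: "Rack 3",
}

# Hypothetical placements: shard -> servers holding a copy (active first).
RACK_AWARE = {"shard A": [1, 4, 7], "shard B": [5, 8, 2]}
NOT_RACK_AWARE = {"shard A": [1, 2, 3]}  # every copy sits on Rack 1


def surviving_copies(placement: Dict[str, List[int]],
                     failed_group: str) -> Dict[str, List[int]]:
    """Map each shard to the servers that still hold a copy after the failure."""
    return {
        shard: [s for s in servers if SERVER_GROUP[s] != failed_group]
        for shard, servers in placement.items()
    }


def data_still_available(placement: Dict[str, List[int]],
                         failed_group: str) -> bool:
    """True if every shard keeps at least one copy outside the failed group."""
    return all(surviving_copies(placement, failed_group).values())


if __name__ == "__main__":
    print(data_still_available(RACK_AWARE, "Rack 1"))
    print(data_still_available(NOT_RACK_AWARE, "Rack 1"))
```

The rack-aware placement survives the loss of any single rack, while the placement that keeps all copies on Rack 1 does not survive losing that rack.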

Page 6: Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014


Ping times between EC2 regions

Page 7: Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014


Cross Datacenter Replication (XDCR)

• Continuously replicates data from a source cluster to remote clusters, which may be spread across geographies

• Supports unidirectional and bidirectional operation

• Application can read from and write to both clusters (active-active replication)

• Automatically handles node addition and removal

• Replication throughput scales out linearly

• Simplified Administration via console, REST, and CLI
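As a rough illustration of the REST route, the sketch below builds (but does not send) the two requests typically involved in setting up a replication: registering a remote cluster and creating the replication itself. The endpoint paths and form fields are written from memory of the Couchbase REST documentation and should be checked against the documentation for your server version; the host names, credentials and helper names are placeholders.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def _post(base_url: str, path: str, form: dict, user: str, password: str) -> Request:
    """Build an authenticated, form-encoded POST request without sending it."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return Request(
        base_url + path,
        data=urlencode(form).encode(),
        headers={"Authorization": "Basic " + token},
        method="POST",
    )


def remote_cluster_request(base_url, user, password,
                           name, remote_host, remote_user, remote_password):
    """Request that registers a remote cluster reference on the source cluster."""
    form = {"name": name, "hostname": remote_host,
            "username": remote_user, "password": remote_password}
    return _post(base_url, "/pools/default/remoteClusters", form, user, password)


def replication_request(base_url, user, password,
                        from_bucket, to_cluster, to_bucket):
    """Request that starts a continuous replication to a registered cluster."""
    form = {"fromBucket": from_bucket, "toCluster": to_cluster,
            "toBucket": to_bucket, "replicationType": "continuous"}
    return _post(base_url, "/controller/createReplication", form, user, password)


if __name__ == "__main__":
    req = replication_request("http://west.example.com:8091",
                              "Administrator", "password",
                              "default", "east", "default")
    print(req.get_method(), req.full_url, req.data.decode())
```

Each request could be sent with `urllib.request.urlopen(req)`. For bidirectional operation, the same pair of requests would be built a second time against the other cluster, pointing back at the first.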


Page 8: Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014


Cross Datacenter Replication (XDCR) – Single Node Type

[Diagram: an App Server writes a Doc into a node's managed cache. From there the Doc flows to the disk queue (and on to disk), the replication queue (memory-to-memory replication to other node), and the XDCR queue (new in 3.0: memory-to-memory replication to remote cluster).]
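The single-node write path on this slide (managed cache, disk queue, replication queue, XDCR queue) can be modeled with a toy class. This is only a reading of the diagram, not the server's internals: the class and attribute names are hypothetical, and real queues are drained asynchronously by separate components.

```python
from collections import deque


class Node:
    """Toy model of one node's write path as drawn on the slide."""

    def __init__(self) -> None:
        self.managed_cache = {}           # in-memory copy of each document
        self.disk_queue = deque()         # keys waiting to be persisted to disk
        self.replication_queue = deque()  # keys waiting for intra-cluster replication
        self.xdcr_queue = deque()         # keys waiting for replication to a remote cluster
        self.disk = {}                    # stand-in for on-disk storage

    def write(self, key: str, doc: dict) -> None:
        """Accept a write in memory, then queue it for all three destinations."""
        self.managed_cache[key] = doc
        for queue in (self.disk_queue, self.replication_queue, self.xdcr_queue):
            queue.append(key)

    def flush_disk_queue(self) -> None:
        """Persist queued documents from the managed cache to disk."""
        while self.disk_queue:
            key = self.disk_queue.popleft()
            self.disk[key] = self.managed_cache[key]


if __name__ == "__main__":
    node = Node()
    node.write("user::1001", {"name": "Alex"})
    node.flush_disk_queue()
    print(node.disk, list(node.xdcr_queue))
```

The point of the model is that the XDCR queue is fed from memory alongside the other two queues, so a document does not have to reach disk before it is queued for the remote cluster.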


Page 9: Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014


Cross Datacenter Replication (XDCR)


Unidirectional Replication

• Hot spare / Disaster Recovery

• Development/Testing copies

• Integrate with a connector, e.g. Solr, Elasticsearch

• Integrate with a custom consumer

Page 10: Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014


Cross Datacenter Replication (XDCR)


Bidirectional Replication

• Multiple Active Masters

• Disaster Recovery

• Datacenter Locality
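With bidirectional replication, an app server can use whichever cluster is closest and switch to the other one if its local cluster becomes unavailable. A minimal sketch of that choice follows; the latency figures, cluster names and the `pick_cluster` helper are hypothetical, and how health is detected is left out.

```python
from typing import Dict, Set

# Hypothetical round-trip times in milliseconds from one app server to each cluster.
LATENCY_MS = {"west": 12.0, "east": 85.0}


def pick_cluster(latency_ms: Dict[str, float], healthy: Set[str]) -> str:
    """Choose the lowest-latency cluster among those currently healthy."""
    candidates = {name: ms for name, ms in latency_ms.items() if name in healthy}
    if not candidates:
        raise RuntimeError("no healthy cluster available")
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    print(pick_cluster(LATENCY_MS, {"west", "east"}))  # normal operation
    print(pick_cluster(LATENCY_MS, {"east"}))          # local cluster is down
```

In normal operation this app server stays on the nearby cluster (datacenter locality); when that cluster is marked unhealthy, the same call returns the remote one (disaster recovery).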

Page 11: Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014


Putting it all together

[Diagram: two clusters linked by XDCR, each split into three server groups.]

EC2 West cluster: Group 1 (West 1-a), Group 2 (West 1-b), Group 3 (West 1-c)

EC2 East cluster: Group 1 (East 1-a), Group 2 (East 1-b), Group 3 (East 1-c)

Page 12: Enabling High Availability with Multi-Site, Rack-Aware Replication: Couchbase Connect 2014

Q&A