re:Invent 2012 Optimizing Cassandra


AWS re:Invent 2012 presentation on Optimizing Cassandra usage at Netflix. Overview of Netflix Open Source projects. Gregg Ulrich, Ruslan Meshenberg.

Transcript of re:Invent 2012 Optimizing Cassandra

Optimizing Cassandra for AWS

Ruslan Meshenberg, Gregg Ulrich - Netflix

Agenda

Netflix

AWS Cassandra

Netflix Inc.

With more than 30 million streaming members in the United States, Canada, Latin America, the United Kingdom, Ireland, Sweden, Norway, Denmark and Finland, Netflix, Inc. is the world's leading internet subscription service for enjoying movies and TV series.

Why Cloud?

[Chart: Netflix API growth in requests, in billions per day, from January 2010 through January 2011, against flat data center capacity. Series: "Netflix API – Growth in requests" and "Data Center Capacity".]

Netflix.com is now ~100% Cloud

• Some small back end data sources still in progress
• USA-specific logistics remains in the datacenter
• Working on SOX, PCI as scope starts to include AWS
• All international product is cloud based

What is Cassandra?

• Persistent data store
• NoSQL
• Distributed key/value store
• Tunable eventual consistency
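"Tunable eventual consistency" means the client chooses, per operation, how many replicas must acknowledge a read or write. A toy Python sketch (not Cassandra code) of the rule behind it: with replication factor RF, a read level R and write level W are guaranteed to overlap on at least one replica whenever R + W > RF, which is why QUORUM reads after QUORUM writes behave strongly consistent.

```python
# Toy sketch of Cassandra-style tunable consistency (not Cassandra code).
def quorum(rf):
    # A quorum is a strict majority of the replicas.
    return rf // 2 + 1

def is_strongly_consistent(rf, read_level, write_level):
    # Reads and writes must overlap on at least one replica.
    return read_level + write_level > rf

rf = 3
print(quorum(rf))                                          # 2
print(is_strongly_consistent(rf, quorum(rf), quorum(rf)))  # True
print(is_strongly_consistent(rf, 1, 1))                    # False: eventual only
```

With RF=3, ONE/ONE is fastest but only eventually consistent; QUORUM/QUORUM (2/2) pays one extra replica per operation for read-your-writes behavior.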

Why did we choose Cassandra?

• Open sourced and written in Java
• Multi-region replication
• Data model supports a wide range of use cases
• Runs on commodity hardware
• Enhanced to understand AWS topology
• Durable

Durability

• No single point of failure or specialized instances
• Multiple copies of data across availability zones
• Bootstrapping and hints restore data quickly
• All writes appended to a commit log
• Asynchronous cross-regional replication

How we configure Cassandra in AWS

[Diagram: Cassandra rings in us-east-1, us-west-2, and eu-west-1, each cluster spread across three availability zones and backed up to S3, with asynchronous replication between regions.]

Durability (Quorum)

[Diagram: quorum durability illustrated at three scopes: one instance, an availability zone, and a replica set.]
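The replica-set picture above depends on spreading copies across availability zones, so losing one zone costs at most one replica of any row. A simplified Python sketch in the spirit of Cassandra's NetworkTopologyStrategy (the topology and node names below are hypothetical): walk the token ring clockwise from the key's position and prefer nodes in zones that do not yet hold a copy.

```python
# Simplified AZ-aware replica placement (in the spirit of Cassandra's
# NetworkTopologyStrategy; real placement handles racks, fallbacks, etc.).
def place_replicas(ring, key_token, rf=3):
    # ring: list of (token, node, az) tuples sorted by token
    n = len(ring)
    # First node whose token is >= the key's token; wrap to 0 past the end.
    start = next((i for i, (t, _, _) in enumerate(ring) if t >= key_token), 0)
    replicas, used_azs = [], set()
    for i in range(n):
        token, node, az = ring[(start + i) % n]
        if az not in used_azs:      # prefer a zone without a copy yet
            replicas.append(node)
            used_azs.add(az)
        if len(replicas) == rf:
            break
    return replicas

# Hypothetical six-node ring across three zones.
ring = sorted([
    (10, "node-a1", "us-east-1a"), (20, "node-b1", "us-east-1b"),
    (30, "node-c1", "us-east-1c"), (40, "node-a2", "us-east-1a"),
    (50, "node-b2", "us-east-1b"), (60, "node-c2", "us-east-1c"),
])
print(place_replicas(ring, 15))  # one replica in each of the three zones
```

Because each replica lands in a distinct zone, a quorum (2 of 3) survives the loss of any single availability zone.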

How we configure Cassandra in AWS

• Mostly m2.4xlarge, but migrating to SSDs
• Ephemeral storage for better performance
• Multiple ASGs per cluster, each with one AZ
• Single tenanted clusters
• Overprovisioned clusters

Optimizations

• Cassandra enhancements
• Client libraries
• Operations
• Schema and data management

Cassandra enhancements

• Bug fixes
• New features
• Performance
• Security
• AWS environment

Making a better Java client

• Multi-region and zone aware
• Latency-aware load balancer
• Fluent API on top of Thrift
• Best-practice recipes
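A zone-aware, latency-aware client routes each request to a nearby, fast coordinator. A toy Python sketch of that selection policy (illustrative only, not the Java client's actual algorithm): prefer hosts in the client's own zone to avoid cross-AZ hops, then pick the one with the lowest recent average latency.

```python
# Toy zone-aware + latency-aware host selection (illustrative only).
from statistics import mean

def pick_host(hosts, client_zone):
    # hosts: dicts with "name", "zone", and recent "latencies_ms" samples.
    # Prefer same-zone hosts; fall back to all hosts if none are local.
    local = [h for h in hosts if h["zone"] == client_zone] or hosts
    # Among candidates, pick the lowest recent average latency.
    return min(local, key=lambda h: mean(h["latencies_ms"]))["name"]

hosts = [
    {"name": "cass-1", "zone": "us-east-1a", "latencies_ms": [4.0, 5.0]},
    {"name": "cass-2", "zone": "us-east-1a", "latencies_ms": [2.0, 2.5]},
    {"name": "cass-3", "zone": "us-east-1b", "latencies_ms": [1.0, 1.2]},
]
print(pick_host(hosts, "us-east-1a"))  # cass-2: local zone beats raw latency
```

Note the design choice: zone locality outranks raw latency, because cross-AZ traffic adds both latency and transfer cost; the latency samples only break ties within the zone.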

Filling the operational void

• Tomcat webapp for Cassandra administration
• AWS-style instance provisioning
• Full and incremental backups
• JMX metrics collection
• Consistent configuration across clusters
• REST API for most administrative operations
• Security Groups configuration

Managing your data and schema

• Missing UI for Cassandra client users
• View and edit schema
• Point queries and data updates
• High-level cluster status and metrics
• Manages multiple Cassandra clusters
• Integrated access control
• Schema auditing

High level cluster status

Data query tool

Schema management tool

Operations

• June 29th AWS partial outage
• Observations
• Monitoring
• Maintenance

From the Netflix tech blog:

“Cassandra, our distributed cloud persistence store which is distributed across all zones and regions, dealt with the loss of one third of its regional nodes without any loss of data or availability.”

June 29th AWS partial outage

• During outage
  - All Cassandra instances in us-east-1a were inaccessible
  - nodetool ring showed all nodes as DOWN
  - Monitored other AZs to ensure availability
  - Waited for AWS to resolve the issue

• Recovery – power restored to us-east-1a
  - Majority of instances rejoined the cluster without issue
  - Most of the remainder required a reboot to fix
  - The others needed to be replaced, one at a time

Observations: AWS

• Ephemeral drive performance is better than EBS
• Instances seldom die on their own
• Use as many availability zones as possible
• Understand how AWS launches instances
• I/O is constrained in most AWS instance types
  - Repairs are very I/O intensive
  - Large size-tiered compactions can impact latency

• SSDs are game changers


Observations: Cassandra

• A slow node is worse than a down node
• Cold cache increases load and kills latency
• Use whatever dials you can find in an emergency
  - Remove node from coordinator list
  - Compaction throttling
  - Min/max compaction thresholds
  - Enable/disable gossip

• Leveled compaction performance is very promising
• 1.1.x and 1.2.x should address some big issues
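Several of the emergency dials above are nodetool operations. A small hypothetical Python helper that just assembles the command lines, for illustration; `setcompactionthroughput`, `disablegossip`, and `enablegossip` are real nodetool subcommands, but exercise them against your own cluster with care.

```python
# Hypothetical helper that builds (but does not run) nodetool command
# lines for the emergency "dials" mentioned above. Port 7199 is the
# conventional Cassandra JMX port.
def nodetool_cmd(host, *args, port=7199):
    return ["nodetool", "-h", host, "-p", str(port)] + list(args)

# Throttle compaction to 8 MB/s to shed I/O load during an incident.
print(nodetool_cmd("10.0.0.5", "setcompactionthroughput", "8"))
# Take a misbehaving node out of gossip so coordinators stop routing to it.
print(nodetool_cmd("10.0.0.5", "disablegossip"))
```

To actually execute one of these, the list could be passed to `subprocess.run`; keeping command construction separate makes the dials easy to log and audit during an incident.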


Monitoring

• Actionable
  - Hardware and network issues
  - Cluster consistency
• Cumulative Cassandra trends
  - Throughput and latency
  - Key Cassandra metrics (queues, dropped ops, table reads)
• Informational
  - Schema changes
  - Log file errors/exceptions
  - Recent restarts


Maintenance

• Repair clusters regularly
• Run off-line major compactions to avoid latency
  - SSDs will make this unnecessary
• Always replace nodes when they fail
• Periodically replace all nodes in the cluster
• Upgrade to new versions
  - Binary (rpm) for major upgrades or emergencies
  - Rolling AMI push over time
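Node replacement and rolling upgrades share one invariant: proceed serially, so at most one replica of any row is offline at a time. A minimal Python sketch of that loop (the `replace_fn` and `ring_healthy_fn` hooks are hypothetical stand-ins for real provisioning and health checks):

```python
# Minimal sketch of a serial, health-gated rolling replacement.
# replace_fn and ring_healthy_fn are hypothetical hooks: in practice they
# would provision a new instance and check nodetool ring status.
def rolling_replace(nodes, replace_fn, ring_healthy_fn):
    for node in nodes:
        replace_fn(node)
        # Never move on until the ring has fully absorbed the replacement.
        assert ring_healthy_fn(), f"ring unhealthy after replacing {node}"

replaced = []
rolling_replace(["n1", "n2", "n3"], replaced.append, lambda: True)
print(replaced)  # ['n1', 'n2', 'n3']
```

The health gate is the important part: with replication factor 3 and quorum operations, the cluster tolerates exactly one unavailable replica per range, so the loop must block rather than race ahead.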


Scaling Cassandra

http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

Client writes/s by node count (replication factor = 3):

   48 nodes – 174,373 writes/s
   96 nodes – 366,828 writes/s
  144 nodes – 537,172 writes/s
  288 nodes – 1,099,837 writes/s

800K writes per second in production
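The point of the benchmark above is linear scaling: throughput per node stays roughly constant as the cluster grows. A quick check in Python (node counts taken from the linked techblog post):

```python
# Writes/s from the Netflix scalability benchmark, keyed by node count.
data = {48: 174_373, 96: 366_828, 144: 537_172, 288: 1_099_837}

# Per-node throughput stays in a narrow ~3,600-3,800 writes/s band,
# which is what "linear scaling" means here: 6x the nodes, ~6.3x the writes.
for nodes, writes in data.items():
    print(nodes, round(writes / nodes))
```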

Disk vs. SSD Benchmark

Same Throughput, Lower Latency, Half Cost

[Diagram: load test driver → REST service → application backed by EVcache (memcached) and Cassandra. Disk configuration: 36x m2.xlarge EVcache plus 48x m2.4xlarge Cassandra. SSD configuration: 15x hi1.4xlarge Cassandra.]

Netflix is “all in” with Cassandra

• 50 production clusters
• 15 multi-region clusters
• 4 regions max for a single cluster
• 101 total TB of data across all clusters
• 780 Cassandra nodes
• 72 nodes / 32 TB of data in the largest Cassandra cluster
• 250k reads/s / 800k writes/s max on a single cluster

Future optimizations

• Cassandra as a Service
• Fewer clusters, more data
• Autoscaling Cassandra
• Priam on PEDs
• Self-maintaining Cassandra clusters

All optimizations are open sourced

• Enhancements committed to the open source project
• Netflix@github
  - Astyanax
  - Priam
  - Cassandra Explorers (coming soon)
• Motivations
  - Give back to the Apache-licensed OSS community
  - Help define best practices

Netflix Open Source Center

Conclusion

• Cassandra is high performing and durable in AWS
• Cassandra is flexible enough to handle most use cases
• AWS offerings help provide a complete solution
• Cassandra performs well in AWS, especially on SSDs
• "Just because Netflix does it doesn't make it right for you"

Follow us

• http://techblog.netflix.com
• http://netflix.github.com
• Twitter
  - @Netflix
  - @NetflixJobs
  - @rusmeshenberg (Ruslan)
  - @eatupmartha (Gregg)

We are sincerely eager to hear your feedback on this presentation and on re:Invent. Please fill out an evaluation form when you have a chance.