High Throughput Analytics with Cassandra & Azure


Charles Lamanna, Principal Dev Lead (@clamanna)

MetricsHub: keep cloud services up and running for the lowest possible cost

• Live Status
• Cost Awareness
• Alerts and Notifications
• Actions and Scaling

2000+ customers in 6 months

[Chart: customer count growing from 0 to 2000+ between 10/18/2012 and 6/25/2013]

Storing data: 200M data points per hour; 80,000 data points per second (peak)

Planning for huge data ingestion rates: requires high-scale, real-time data

• 1,000 data points per minute per VM
• 12 data points per endpoint per minute

Aggregate, analyze and take actions based on this data stream (in near real-time)

Must be cheap, scalable and reliable
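Sanity-checking those rates (a back-of-envelope sketch; the arithmetic is mine, not from the deck):

# Rough check on the ingestion figures quoted above.
points_per_hour = 200_000_000
avg_per_sec = points_per_hour / 3600          # ~55,556 points/sec on average
peak_per_sec = 80_000                         # quoted peak
print(f"avg {avg_per_sec:,.0f}/sec; peak is {peak_per_sec / avg_per_sec:.2f}x avg")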

Evaluated several technologies:

• Aggregation in memory: good performance, bad COGS
• Rolling tables for aggregates: good tooling/support, hard to scale
• Aggregation on write: easy to scale and good COGS

Cassandra upside

• Scales fluidly: grows horizontally (double the nodes, double the capacity); add/remove nodes with no downtime
• Highly available: no single point of failure; replication factor (i.e. hot copies) is just a config switch (see the sketch below)

… and by the way:

• Little-to-no operations cost: new nodes take minutes to set up; nodes just keep running for months on end
• “Aggregate on write” – no jobs required! Distributed counters make it easy to do aggregates on write
• …and a nice kicker: *great* perf / COGS in Azure
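That config switch really is a single statement at keyspace creation. A minimal sketch using the DataStax Python driver; the keyspace name (metrics), contact address, and RF of 3 are illustrative assumptions, not details from the deck:

from cassandra.cluster import Cluster

# Connect to any node; the driver discovers the rest of the ring.
session = Cluster(["10.0.0.4"]).connect()  # placeholder address

# Replication factor ("hot copies") is just part of the keyspace definition.
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS metrics
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3}
""")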

Architecture: 68 virtual machines (PaaS and IaaS)

[Diagram: end-user web browsers, monitored customer resources (e.g. websites, SQL databases) and monitored virtual machines / endpoints connect to the service tier (Portal Web Role, 3 instances; Web API Web Role, 8 instances; Jobs Worker Role, 24 instances; all PaaS), backed by Table Storage, Blob storage, SQL Database and a Cassandra VM Cluster (32 XL instances, IaaS), with data replicated in multiple datacenters.]

Avoiding state

• Application logic / code all lives on stateless machines

• Keeps it simple: decreases human operations cost

• Use Azure PaaS offerings (Web and Worker roles)


Azure Cloud Services (PaaS)

• Scale horizontally (grew from 1 to 30+ instances)

• Managed by the platform (patched; coordinated recycling; failover; etc.)

• 1 click deployment from Visual Studio (with automatic load balancer swaps)


Cassandra Cluster (IaaS)

• Maintains all state for metrics / time-series data
• 32 XL Linux virtual machines: 8 “pods” of 4 nodes
• Each pod exposed via a single endpoint

Exposing the pods

• Each pod of 4 nodes has a single load-balanced endpoint
• Clients (on our stateless roles) treat the endpoints as a pool (see the sketch below)
• Blacklist and skip an endpoint if it starts producing a lot of errors
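A minimal sketch of such a pool (all names and thresholds here are illustrative, not MetricsHub's actual code): pick a healthy endpoint at random, and temporarily blacklist one once its error count crosses a threshold.

import random
import time

class EndpointPool:
    """Pool of pod endpoints with simple error-based blacklisting."""

    def __init__(self, endpoints, max_errors=10, blacklist_seconds=300):
        self.endpoints = list(endpoints)
        self.max_errors = max_errors
        self.blacklist_seconds = blacklist_seconds
        self.errors = {ep: 0 for ep in self.endpoints}
        self.blacklisted_until = {ep: 0.0 for ep in self.endpoints}

    def pick(self):
        now = time.time()
        healthy = [ep for ep in self.endpoints if self.blacklisted_until[ep] <= now]
        # If everything is blacklisted, fall back to the full list rather than fail.
        return random.choice(healthy or self.endpoints)

    def report_error(self, endpoint):
        self.errors[endpoint] += 1
        if self.errors[endpoint] >= self.max_errors:
            # Skip this pod for a while, then give it another chance.
            self.blacklisted_until[endpoint] = time.time() + self.blacklist_seconds
            self.errors[endpoint] = 0

    def report_success(self, endpoint):
        self.errors[endpoint] = 0

# One load-balanced endpoint per pod (addresses are placeholders).
pool = EndpointPool([f"pod{i}.example.internal" for i in range(8)])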

Where does the data go?

• Data files are on 16 mounted network-backed disks (*not* ephemeral disks)

• Data disks are geo-replicated (3 copies local; 1 remote) for “free” DR

• Azure data disks offer great throughput (VMs end up CPU bound)

Our Column Families (CQL 3)

CREATE TABLE oneminute (
  rk text,
  ck text,
  cnt counter,
  sum counter,
  PRIMARY KEY (rk, ck)
);

Updating values… Real-time “average” values at any granularity, for any time window:

UPDATE oneminute   -- likewise tenminute / oneday
SET sum = sum + {sample_value},
    cnt = cnt + 1
WHERE rk = '{customer+metric}'
  AND ck = '{tags_and_timestamp}';
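From a client, the “aggregate on write” path is one counter update per granularity table. A sketch assuming the DataStax Python driver and a keyspace named metrics (both assumptions); table and key shapes follow the slides:

from cassandra.cluster import Cluster

session = Cluster(["10.0.0.4"]).connect("metrics")  # placeholder address / keyspace

def record_sample(customer_metric, tags_and_timestamp, value):
    # Bump sum and cnt in every granularity table; in a real system the
    # timestamp component of ck would be truncated to each table's bucket size.
    for table in ("oneminute", "tenminute", "oneday"):
        session.execute(
            f"UPDATE {table} SET sum = sum + %s, cnt = cnt + 1 "
            "WHERE rk = %s AND ck = %s",
            (value, customer_metric, tags_and_timestamp),
        )

record_sample("customer1|cpu_percent", "web01|2013-06-25T12:01", 42)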

Reading values…

*ONE* round trip to fetch a metric over time (e.g. CPU over past week):

SELECT * FROM oneminute
WHERE rk = '{customer_name}'
  AND ck < '{metric_path_start}'
  AND ck >= '{metric_path_end}'
ORDER BY ck DESC;
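The per-bucket average then falls out of sum / cnt on the client side (continuing the hypothetical sketch above; key values are made up):

# One round trip, then divide the counters to get averages per time bucket.
rows = session.execute(
    "SELECT ck, sum, cnt FROM oneminute "
    "WHERE rk = %s AND ck >= %s AND ck < %s ORDER BY ck DESC",
    ("customer1|cpu_percent", "web01|2013-06-18", "web01|2013-06-25"),
)
averages = {row.ck: row.sum / row.cnt for row in rows if row.cnt}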

Some hard lessons…

• Static private IPs are a must; otherwise, reboots / outages can confuse the cluster when nodes come back up

• Monitor performance carefully; once you tip over, it is hard to rebalance the cluster and add new nodes

• Fit the cluster to the platform: in Azure, match the Upgrade Domains / Fault Domains to preserve uptime during service maintenance / hardware failure
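One common way to fit the cluster to the platform (a sketch of the idea, not necessarily MetricsHub's approach): surface each VM's Azure Fault Domain as a Cassandra rack, so the snitch keeps replicas off machines that share a failure unit.

# Hypothetical bootstrap step run on each Cassandra VM.
# GossipingPropertyFileSnitch reads dc= and rack= from cassandra-rackdc.properties.
fault_domain = 1  # on a real VM this would be discovered from the Azure runtime

with open("/etc/cassandra/cassandra-rackdc.properties", "w") as f:
    f.write("dc=azure-us-east\n")         # datacenter name is an assumption
    f.write(f"rack=FD{fault_domain}\n")   # one rack per fault domain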

Single node tests…

4 disks, RAID 0, no read cache:

Workload (% write)    Ops/sec    Latency median    Latency 95th    Latency 99th
100%                  20018      1.5               3.7             7.9
75%                   8361       85.9              376.6           584.8
25%                   5412       459.9             759.1           940.1

4 disks, RAID 0, read cache (three runs per workload):

Workload (% write)    Ops/sec    Latency median    Latency 95th    Latency 99th
100%                  19208      1.5               3.8             7.9
                      18543      1.5               3.6             7.9
                      18563      1.4               3.6             8.2
75%                   7112       195.9             595.8           1099.6
                      7581       168.9             589.5           985.2
                      5149       256.5             774.0           1402.9
25%                   15358      23.0              110.2           309.1
                      3742       279.2             563.0           789.7
                      15376      22.1              98.8            293.3

[Chart: JBOD vs RAID 0 ops/sec for a read-heavy workload; y-axis 0–7000 ops/sec]

Workload (% write)    Ops/sec    Latency median    Latency 95th    Latency 99th
100%                  13638      1.9               4.9             24.0
75%                   3239       11.2              687.0           1099.3
25%                   1825       243.6             687.0           808.7

Multi-node load tests…

• 4 Nodes; RF = 3 (Quorum)
• 8 Disks, RAID 0

QUESTIONS & ANSWERS

Charles Lamanna
Charles.Lamanna@Microsoft.com
@clamanna