Migrating to Riak at Shareaholic
Page 1: Migrating to Riak at Shareaholic

Riak @ Shareaholic

Robby Grossman

[email protected]

@freerobby

Page 2: Migrating to Riak at Shareaholic

Agenda

Shareaholic: Product & Tech

Why Riak: The Search for a Big Data Store

Transitioning to Riak

Riak Use Cases

Deploying to EC2

Page 3: Migrating to Riak at Shareaholic

What’s Shareaholic?

Page 4: Migrating to Riak at Shareaholic

Browser Tools

Page 5: Migrating to Riak at Shareaholic

Sharing Buttons

Page 6: Migrating to Riak at Shareaholic

Recommendations

Page 7: Migrating to Riak at Shareaholic

Social Analytics

Page 8: Migrating to Riak at Shareaholic

Monthly @ Shareaholic

Thousands of developers hitting API

Hundreds of thousands of publishers

Tens of millions of shares & clicks

Hundreds of millions of pageviews & events

Page 9: Migrating to Riak at Shareaholic

Tech @ Shareaholic

JRuby on Rails (via Torquebox)

MySQL (Master, Read Slave)

Elastic MapReduce (Amazon's hosted Hadoop)

Redis

Formerly Mongo, Now Riak

Page 10: Migrating to Riak at Shareaholic

Why Not Mongo?

Working set needs to fit in memory

Global write lock blocks all queries, despite not having transactions/joins

Standbys not “hot”

Page 11: Migrating to Riak at Shareaholic

Why Riak?

Page 12: Migrating to Riak at Shareaholic

Next @ Shareaholic

Options:

HBase

Cassandra

Riak

Goals:

Linear scalability

Full-text search

Flexible indexing

Easier Devops

Page 13: Migrating to Riak at Shareaholic

HBase

Pros

Battle tested

High performance

Cons

Complex Architecture

SPOFs

Requires Hive for Indexing/Querying

Expensive to deploy at small scale

Page 14: Migrating to Riak at Shareaholic

Cassandra

Pros

Native secondary indices

Linear scalability

Tunable CAP

Cons

Known users all domain experts

Search requires Lucene

Heavyweight MapReduce

Page 15: Migrating to Riak at Shareaholic

Riak

Pros

Operationally simpler

Linear scalability

Integrated search

Secondary indices

Tunable CAP

Vector clocks solve time-sync problems

Cons

Multi-data center replication requires Enterprise product

leveldb puts high strain on CPU
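The vector-clock point in the list above can be illustrated with a small sketch. This is not Riak's implementation: the dictionary representation, the function names, and the actor names are all hypothetical. It only shows how per-actor counters let two versions of a record be ordered without trusting any server's wall clock.

```python
from typing import Dict

# Hypothetical representation: actor id -> number of updates made by that actor.
VectorClock = Dict[str, int]


def increment(clock: VectorClock, actor: str) -> VectorClock:
    """Return a copy of the clock with the actor's counter bumped by one."""
    updated = dict(clock)
    updated[actor] = updated.get(actor, 0) + 1
    return updated


def descends(a: VectorClock, b: VectorClock) -> bool:
    """True if version a has seen every update recorded in version b."""
    return all(a.get(actor, 0) >= count for actor, count in b.items())


def compare(a: VectorClock, b: VectorClock) -> str:
    """Classify two versions using only their counters, never timestamps."""
    if a == b:
        return "equal"
    if descends(a, b):
        return "a is newer"
    if descends(b, a):
        return "b is newer"
    # Neither version has seen the other: the writes were concurrent
    # and the application has to resolve the conflict.
    return "concurrent"
```

Two clients that update the same base version through different nodes produce clocks that compare as "concurrent", which is the case a timestamp-based scheme can silently get wrong when server clocks drift.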

Page 16: Migrating to Riak at Shareaholic

From Mongo to Riak

Page 17: Migrating to Riak at Shareaholic

Migration Goals

No time where database goes “offline”

Product parity throughout migration

Page 18: Migrating to Riak at Shareaholic

Migration Process

1. App writes to Mongo and Riak

2. Verify data integrity

3. Import historical data

4. App reads from Riak

5. Decommission Mongo
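The five steps above can be sketched as a small routing layer. This is a minimal illustration, not Shareaholic's code: the class and method names are hypothetical, and plain dictionaries stand in for the Mongo and Riak clients.

```python
class DualWriteStore:
    """Routes reads and writes across an old and a new store during a migration."""

    def __init__(self, old: dict, new: dict):
        self.old = old                # stands in for Mongo (assumption)
        self.new = new                # stands in for Riak (assumption)
        self.read_from_new = False    # flipped at step 4

    def put(self, key, value):
        # Step 1: every write goes to both stores.
        self.old[key] = value
        self.new[key] = value

    def get(self, key):
        store = self.new if self.read_from_new else self.old
        return store.get(key)

    def mismatched_keys(self):
        # Step 2: keys in the new store whose value differs from the old store.
        return [k for k, v in self.new.items() if self.old.get(k) != v]

    def backfill(self):
        # Step 3: copy historical records that predate dual writes,
        # without overwriting anything already written to the new store.
        copied = 0
        for key, value in self.old.items():
            if key not in self.new:
                self.new[key] = value
                copied += 1
        return copied

    def cut_over(self):
        # Step 4: switch reads only once both stores hold the same data.
        if self.mismatched_keys() or len(self.old) != len(self.new):
            raise RuntimeError("stores disagree; not safe to read from the new store")
        self.read_from_new = True
```

Because reads stay on the old store until `cut_over()` succeeds, the application never sees a partially populated database, which is what keeps the product at parity during the migration. Step 5 is then just dropping the `old` handle.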

Page 19: Migrating to Riak at Shareaholic

Use Cases

Page 20: Migrating to Riak at Shareaholic

Share API

Save shared content

Uses MapReduce to populate user dashboard

Page 21: Migrating to Riak at Shareaholic

Recommendations

Sets of related pages

Generated on-demand

Page 22: Migrating to Riak at Shareaholic

Publisher Analytics

Generated nightly via Hadoop

Typical stored “document” (JSON)

80 KB to 1 MB

Page 23: Migrating to Riak at Shareaholic

Riak Successes

Page 25: Migrating to Riak at Shareaholic

Tunable CAP @ Shareaholic

Replication: primary/secondary authority

Read failure tolerance: speed/consistency

Write failure tolerance
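The trade-off on this slide can be sketched with the textbook quorum arithmetic. Here `n`, `r`, and `w` are meant to correspond to Riak's `n_val`, `r`, and `w` settings; the function name and the returned keys are hypothetical, and the overlap rule is the idealized model rather than a guarantee about any particular cluster.

```python
def quorum_profile(n: int, r: int, w: int) -> dict:
    """Summarize what a replication setting tolerates in the textbook quorum model.

    n: replicas kept per key
    r: replicas that must answer a read
    w: replicas that must acknowledge a write
    """
    if not (1 <= r <= n and 1 <= w <= n):
        raise ValueError("r and w must each be between 1 and n")
    return {
        # Replicas that can be down while reads still succeed.
        "read_failures_tolerated": n - r,
        # Replicas that can be down while writes still succeed.
        "write_failures_tolerated": n - w,
        # If r + w > n, every read quorum shares at least one replica
        # with the most recent write quorum.
        "reads_overlap_latest_write": r + w > n,
    }
```

For example, `quorum_profile(3, 2, 2)` tolerates one failed replica on either path and keeps read/write overlap, while `quorum_profile(3, 1, 1)` tolerates two failures and answers faster but gives up the overlap.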

Page 26: Migrating to Riak at Shareaholic

Full Text Search

Built on Lucene

Make user content searchable

Make arbitrary keys queryable

“Just turn it on”

Hiccup: corrupt merge indexes

Page 27: Migrating to Riak at Shareaholic

Query Example

# Search input: "links" entries with a timestamp inside a 60-second period.
# Map: emit each match's user_id. Reduce: keep the minimum, e.g. [[2],[5],[9],[13]] => [[2]]
curl -XPOST http://localhost:8098/mapred -H 'Content-Type: application/json' -d '{
  "inputs": {
    "bucket": "links",
    "query": "timestamp:[1346350877 TO 1346350937}"
  },
  "query": [
    {"map": {"language": "javascript", "source": "function(riakObject) { return [[Riak.mapValuesJson(riakObject)[0].user_id]]; }"}},
    {"reduce": {"language": "javascript", "name": "Riak.reduceMin"}}
  ]
}'

Who’s our oldest user who’s shared something in the last minute?

[[2197]]
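To make the data flow of the query above easier to follow, here is a sketch that builds the same job body and runs the equivalent filter, map, and reduce steps over an in-memory list. The function names and the sample record shape are hypothetical, nothing here talks to a Riak node, and "oldest user" is taken to mean the lowest `user_id`, on the assumption that IDs are assigned in signup order.

```python
import json


def build_job(bucket: str, start: int, end: int) -> dict:
    """Build the MapReduce job body from the slide as a plain dict (valid JSON)."""
    return {
        "inputs": {
            "bucket": bucket,
            # Same range syntax as the slide's query.
            "query": "timestamp:[%d TO %d}" % (start, end),
        },
        "query": [
            {"map": {
                "language": "javascript",
                "source": "function(riakObject) { return "
                          "[[Riak.mapValuesJson(riakObject)[0].user_id]]; }",
            }},
            {"reduce": {"language": "javascript", "name": "Riak.reduceMin"}},
        ],
    }


def oldest_recent_sharer(records, start: int, end: int):
    """Local stand-in for the job: filter by time window, map to user_id, reduce to min."""
    matched = [r for r in records if start <= r["timestamp"] < end]
    mapped = [[r["user_id"]] for r in matched]   # map phase: one [user_id] per match
    return [min(mapped)] if mapped else []       # reduce phase: keep the smallest


if __name__ == "__main__":
    # Serializing through json guarantees the body has no stray comments or quoting errors.
    print(json.dumps(build_job("links", 1346350877, 1346350937), indent=2))
```

Building the body as a dictionary and serializing it sidesteps the quoting and inline-comment pitfalls of hand-written JSON inside a shell string.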

Page 28: Migrating to Riak at Shareaholic

Riak on EC2

Page 29: Migrating to Riak at Shareaholic

In a Nutshell

EC2 specs poorly proportioned for leveldb

Multiple AZs in one location works well

Scale vertically for better latency & consistency

Scale horizontally for more throughput/$

Page 30: Migrating to Riak at Shareaholic

Benchmarks

Top Graph: c1.medium (1.7 GB RAM, 5 ECU)

Middle: m1.large (7.5 GB RAM, 4 ECU)

Bottom: cc1.4xlarge (23 GB RAM, 33.5 ECU)

Page 31: Migrating to Riak at Shareaholic

Throughput

Page 32: Migrating to Riak at Shareaholic

Latency (Typical)

Page 33: Migrating to Riak at Shareaholic

Latency (Worst Case)

Page 34: Migrating to Riak at Shareaholic

Calculations

c1.medium (1.7 GB RAM, 5 ECU)

1758 IOPS/$-hr

Worst 1% of queries: 300ms/800ms

m1.large (7.5 GB RAM, 4 ECU)

1167 IOPS/$-hr

Worst 1% of queries: 110ms/200ms

cc1.4xlarge (23 GB RAM, 33.5 ECU)

872 IOPS/$-hr

Worst 1% of queries: 47ms/139ms
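The figures above can be put side by side with a few lines of arithmetic. The numbers are copied from the slide; the dictionary layout and function name are hypothetical, and since the slide does not label the two latency values in each pair, the sketch compares only the first one.

```python
# Figures as listed on the slide: throughput per dollar-hour and
# the two worst-1% latency values (ms), in the order given.
BENCHMARKS = {
    "c1.medium":   {"iops_per_dollar_hour": 1758, "worst_1pct_ms": (300, 800)},
    "m1.large":    {"iops_per_dollar_hour": 1167, "worst_1pct_ms": (110, 200)},
    "cc1.4xlarge": {"iops_per_dollar_hour": 872,  "worst_1pct_ms": (47, 139)},
}


def ratios_vs(baseline: str) -> dict:
    """Express each instance type's numbers as a multiple of the baseline's."""
    base = BENCHMARKS[baseline]
    result = {}
    for name, row in BENCHMARKS.items():
        result[name] = {
            "throughput_per_dollar": round(
                row["iops_per_dollar_hour"] / base["iops_per_dollar_hour"], 2),
            "tail_latency": round(
                row["worst_1pct_ms"][0] / base["worst_1pct_ms"][0], 2),
        }
    return result


if __name__ == "__main__":
    for name, row in ratios_vs("cc1.4xlarge").items():
        print(name, row)
```

Against cc1.4xlarge, the c1.medium delivers roughly twice the throughput per dollar while its first tail-latency figure is more than six times higher, which is the vertical-versus-horizontal trade-off stated earlier in the deck.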

Page 35: Migrating to Riak at Shareaholic

Benchmark Takeaways

You can’t go “by spec”

IO is limiting factor

RAM never limiting factor for 1% of keyspace to be in memory

Page 36: Migrating to Riak at Shareaholic

Fin. Questions?

Thanks:

Tom Santero

Justin Sheehy

Ryan Zezeski

Reid Draper

#freenode riak crew

We’re Hiring!

Robby Grossman

[email protected]

@freerobby
