Paris Day Cassandra: Use case

©2013 DataStax Confidential. Do not distribute without consent.

@chbatey Christopher Batey

Use case & Lessons learned: Internet Television

Who am I?
• Built a lot of systems with Apache Cassandra at Sky
• Work on a testing library for Cassandra
• Help out Cassandra users
• Twitter: @chbatey

Agenda
• Migrating a system designed for 5k users to 5 million users
  - Many different uses of Cassandra
  - One example data model
• Deployment
• Lessons learned
• A live migration (MySQL to Cassandra)

Internet Television
• Functionality
  - Manage users: millions of them
  - Manage devices: even more
  - Entitlements: who can watch what?
  - Product catalogs
  - Event logging for billing and analytics

Internet Television
• Non-functional requirements
  - Don’t kill other systems
  - Very spiky: Game of Thrones, Champions League
  - Increasing user base: doubling per year
  - Multi DC an absolute must

5k Users

100k+ Users

Champions League match?

Uh oh :(

A pile of cats

The solution

Micro(ish)services +

Strangulation

The new

Audit log
• Very heavy write: many events per user interaction with the service

Modelling the traditional way

CREATE TABLE customer_events(
    customer_id text,
    staff_name text,
    time timeuuid,
    event_type text,
    store_name text,
    PRIMARY KEY (customer_id));

CREATE TABLE store(
    name text,
    location text,
    store_type text,
    PRIMARY KEY (name));

CREATE TABLE staff(
    name text,
    favourite_colour text,
    job_title text,
    PRIMARY KEY (name));

Your model should look like your queries

Modelling in Cassandra

CREATE TABLE customer_events(
    customer_id text,
    staff_id text,
    time timeuuid,
    store_type text,
    event_type text,
    tags map<text, text>,
    PRIMARY KEY ((customer_id), time));

Partition Key: customer_id

Clustering Column(s): time
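
To make "your model should look like your queries" concrete, here is a sketch of reads that the denormalised customer_events table is shaped for. The customer id and the dates are made-up illustration values, and the actual queries used at Sky are not given in the deck.

```cql
-- Latest events for one customer: a single-partition read,
-- returned in reverse clustering order (newest first).
SELECT * FROM customer_events
WHERE customer_id = 'customer-1'   -- hypothetical id
ORDER BY time DESC
LIMIT 10;

-- Events for one customer in a half-open time window [start, end).
-- minTimeuuid() gives the smallest possible timeuuid for a timestamp.
SELECT event_type, store_type, tags FROM customer_events
WHERE customer_id = 'customer-1'
  AND time >= minTimeuuid('2015-01-01 00:00+0000')
  AND time <  minTimeuuid('2015-02-01 00:00+0000');
```

Both queries restrict the partition key with an equality and only range over the clustering column, so each is served from one partition with no joins against store or staff tables.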

Lesson #1: State kills scalability
• There are only two places to put state in a scalable system:
  - The client
  - Cassandra

Lesson #2: Test & measure
• Only you know your workloads
  - Try different schemas, record the performance
• cassandra-stress is your friend

Lesson #3: Use the DataStax docs

Lesson #4: Say no to Thrift

http://exponential.io/blog/2015/01/08/cassandra-terminology/

Lesson #5: DevOps
• Know the hardware you are deploying Cassandra on
• No network storage
• AWS => Ephemeral storage

Lesson #6: Run multi node clusters

Live migration (MySQL -> Cassandra)
• Double write: make the writes idempotent
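
A minimal sketch of what an idempotent write can look like in CQL. It reuses the customer_events table from the data model earlier in the deck as a stand-in, because the tables involved in the real migration are not given. The id, the timeuuid literal and the column values are hypothetical.

```cql
-- Idempotent: the caller supplies the full primary key (customer_id, time),
-- so replaying this statement rewrites the same row with the same values.
INSERT INTO customer_events (customer_id, time, event_type, store_type)
VALUES ('customer-1', 50554d6e-29bb-11e5-b345-feff819cdc9f, 'login', 'online');

-- Not idempotent: now() generates a fresh timeuuid on every execution,
-- so a retried or replayed write creates an additional row.
INSERT INTO customer_events (customer_id, time, event_type, store_type)
VALUES ('customer-1', now(), 'login', 'online');
```

A CQL INSERT is an upsert, so a double write or a retry is safe as long as the key and values are generated once by the application and reused for every attempt.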

Summary
• Cassandra allows you to run at a scale not possible with other DBs
• The cost? Denormalising + duplicating

Questions?
• Think of any questions later? @chbatey