Paris Day Cassandra: Use case
©2013 DataStax Confidential. Do not distribute without consent.
Christopher Batey (@chbatey)
Use case & Lessons learned: Internet Television
Who am I?
• Built a lot of systems with Apache Cassandra at Sky
• Work on a testing library for Cassandra
• Help out Cassandra users
• Twitter: @chbatey
Agenda
• Migrating a system designed for 5k users to 5 million users
  - Many different uses of Cassandra
  - One example data model
• Deployment
• Lessons learned
• A live migration (MySQL to Cassandra)
Internet Television
• Functionality
  - Manage users: millions of them
  - Manage devices: even more
  - Entitlements: who can watch what?
  - Product catalogs
  - Event logging for billing and analytics
Internet Television
• Non-functional requirements
  - Don’t kill other systems
  - Very spiky load: Game of Thrones, Champions League
  - Increasing user base: doubling per year
  - Multi-DC an absolute must
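In Cassandra, the multi-DC requirement is expressed per keyspace through the replication settings. A minimal sketch, assuming two data centres; the keyspace name, the data centre names and the replication factors are hypothetical placeholders, and the data centre names would have to match whatever the cluster's snitch reports:

```cql
-- Hypothetical keyspace: three replicas of every row in each data centre.
-- 'dc_one' and 'dc_two' are placeholder names, not taken from the talk.
CREATE KEYSPACE internet_tv
WITH replication = {
    'class': 'NetworkTopologyStrategy',
    'dc_one': 3,
    'dc_two': 3
};
```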
Modelling the traditional way
CREATE TABLE customer_events(
    customer_id text,
    staff_name text,
    time timeuuid,
    event_type text,
    store_name text,
    PRIMARY KEY (customer_id));

CREATE TABLE store(
    name text,
    location text,
    store_type text,
    PRIMARY KEY (name));

CREATE TABLE staff(
    name text,
    favourite_colour text,
    job_title text,
    PRIMARY KEY (name));
Modelling in Cassandra
CREATE TABLE customer_events(
    customer_id text,
    staff_id text,
    time timeuuid,
    store_type text,
    event_type text,
    tags map<text, text>,
    PRIMARY KEY ((customer_id), time));
Partition key: customer_id
Clustering column(s): time
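As a rough illustration of what this primary key buys, the queries below read a single partition and use the clustering column for ordering and range slices. This is a sketch against the customer_events table above; every value is made up for the example.

```cql
-- Hypothetical values; now() generates a timeuuid for the current time.
INSERT INTO customer_events (customer_id, time, staff_id, store_type, event_type, tags)
VALUES ('chbatey', now(), 'charlie', 'ONLINE', 'BUY_MOVIE', {'device': 'tablet'});

-- All events for one customer: a single-partition read,
-- with rows returned in clustering (time) order.
SELECT * FROM customer_events WHERE customer_id = 'chbatey';

-- A time slice within that partition, bounded on the clustering column.
SELECT * FROM customer_events
WHERE customer_id = 'chbatey'
  AND time >= minTimeuuid('2015-01-01 00:00+0000')
  AND time <  maxTimeuuid('2015-02-01 00:00+0000');
```

Note that store_type is stored on each event rather than looked up in a separate store table, since Cassandra has no joins.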
Lesson #1: State kills scalability
• There are only two places to put state in a scalable system:
  - The client
  - Cassandra
Lesson #2: Test & measure
• Only you know your workloads
  - Try different schemas, record the performance
• cassandra-stress is your friend
Lesson #5: DevOps
• Know the hardware you are deploying Cassandra on
• No network storage
• AWS => Ephemeral storage
Summary
• Cassandra allows you to run at a scale not possible with other DBs
• The cost? Denormalising + duplicating
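To make the cost concrete, here is one way the duplication might look. The customer_events_by_staff table is a hypothetical addition, not something shown in the talk: it holds the same events as customer_events but is partitioned by staff member, so a "what did this staff member do?" query also hits a single partition. All values are invented for the example.

```cql
-- Hypothetical second copy of the events, keyed by staff member.
CREATE TABLE customer_events_by_staff(
    staff_id text,
    time timeuuid,
    customer_id text,
    store_type text,
    event_type text,
    tags map<text, text>,
    PRIMARY KEY ((staff_id), time));

-- The application writes every event to both tables. A logged batch
-- ensures that if one of the writes is applied, the other eventually is too.
BEGIN BATCH
    INSERT INTO customer_events (customer_id, time, staff_id, store_type, event_type)
    VALUES ('chbatey', 50554d6e-29bb-11e5-b345-feff819cdc9f, 'charlie', 'ONLINE', 'BUY_MOVIE');
    INSERT INTO customer_events_by_staff (staff_id, time, customer_id, store_type, event_type)
    VALUES ('charlie', 50554d6e-29bb-11e5-b345-feff819cdc9f, 'chbatey', 'ONLINE', 'BUY_MOVIE');
APPLY BATCH;
```

The trade-off is more writes and more disk in exchange for reads that never need a join or a scan.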