Introduction to Apache Cassandra

Luke Tillman (@LukeTillman), Language Evangelist at DataStax

description

Apache Cassandra is a scalable, fault-tolerant database that has found its way into more than 25% of the Fortune 100 and continues to enjoy significant adoption in the marketplace. In this talk we'll introduce you to Cassandra, explore some of its internals, and discuss CQL (the SQL-like query language for Cassandra). We'll finish by talking about how some companies are using it for services you probably interact with in your daily life. You'll leave with all the tools you need to start exploring Cassandra on your own.

Transcript of Introduction to Apache Cassandra

Page 1: Introduction to Apache Cassandra

Introduction to Apache Cassandra

Luke Tillman (@LukeTillman)

Language Evangelist at DataStax

Page 2: Introduction to Apache Cassandra

Who are you?!

• Evangelist with a focus on the .NET Community

• Long-time Developer

• Recently presented at Cassandra Summit 2014 with Microsoft

• Very Recent Denver Transplant

Page 3: Introduction to Apache Cassandra

DataStax and Cassandra

• DataStax Enterprise – Apache Cassandra, now with more QA!

– Easy integrations with Solr, Apache Spark, Hadoop

• Dev and Ops Tooling – DevCenter IDE, OpsCenter

• Open source drivers – Java, C#, Python, C++, Ruby, Node.js

Page 4: Introduction to Apache Cassandra

• Unlimited, free use of DataStax Enterprise

• No limit on number of nodes or other hidden restrictions

• If you’re a startup, it’s free.

• Requirements:

– < $2M annual revenue, < $20M capital raised

www.datastax.com/startups

Page 5: Introduction to Apache Cassandra

1 What is Cassandra?

2 How does it work?

3 Cassandra Query Language (CQL)

4 Who’s using it?

5 Questions

Page 6: Introduction to Apache Cassandra

What is Cassandra?

Page 7: Introduction to Apache Cassandra

What is Cassandra?

• A Linearly Scaling and Fault Tolerant Distributed Database

• Fully Distributed

– Data spread over many nodes

– All nodes participate in a cluster

– All nodes are equal

– No SPOF (shared nothing)

Page 8: Introduction to Apache Cassandra

What is Cassandra?

• Linearly Scaling

– Have More Data? Add more nodes.

– Need More Throughput? Add more nodes.

http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

Page 9: Introduction to Apache Cassandra

What is Cassandra?

• Fault Tolerant

– Nodes Down != Database Down

– Datacenter Down != Database Down

Page 10: Introduction to Apache Cassandra

What is Cassandra?

• Fully Replicated

• Clients write local

• Data syncs across WAN

• Replication Factor per DC

[Diagram: a client and two datacenters, US and Europe]

Page 11: Introduction to Apache Cassandra

Cassandra and the CAP Theorem

• The CAP Theorem limits what distributed systems can do

• Consistency

• Availability

• Partition Tolerance

• Limits? “Pick 2 out of 3”

Page 12: Introduction to Apache Cassandra

Cassandra and the CAP Theorem

Consistency

• When I ask the same question to any part of the system, I should get the same answer

[Illustration, Consistent: "Is he guilty yet?" "No." "No." "No."]

Page 13: Introduction to Apache Cassandra

Cassandra and the CAP Theorem

Consistency

• When I ask the same question to any part of the system, I should get the same answer

[Illustration, Not Consistent: "Is he guilty yet?" "No." "Yes." "Yes."]

Page 14: Introduction to Apache Cassandra

Cassandra and the CAP Theorem

Availability

• When I ask a question, I will get an answer

[Illustration, Available: "Is he guilty yet?" "Yes."]

Page 15: Introduction to Apache Cassandra

Cassandra and the CAP Theorem

Availability

• When I ask a question, I will get an answer

[Illustration, Not Available: "Is he guilty yet?" "I don’t know, we have to wait for Dreamy to wake up."]

Page 16: Introduction to Apache Cassandra

Cassandra and the CAP Theorem

Partition Tolerance

• I can ask questions even when the system is having intra-system communication problems.

[Illustration, Tolerant (Team Tyrion and Team Cersei): "Is he guilty yet?" "No."]

Page 17: Introduction to Apache Cassandra

Cassandra and the CAP Theorem

Partition Tolerance

• I can ask questions even when the system is having intra-system communication problems.

[Illustration, Not Tolerant (Team Tyrion and Team Cersei): "Is he guilty yet?" "I’m not sure without asking them and we’re not speaking (I’m pretty sure that one helped kill my sister)."]

Page 18: Introduction to Apache Cassandra

Cassandra and the CAP Theorem

• Cassandra is an AP system that is Eventually Consistent

[Illustration, Eventually Consistent: "Is he guilty yet?" "No." "Wait, he’s going to take the black." "Yes." "No."]

Page 19: Introduction to Apache Cassandra

Cassandra and the CAP Theorem

• Cassandra is an AP system that is Eventually Consistent

[Illustration, Eventually Consistent: "Is he guilty yet?" "Yes." "Yes." "Yes."]

Page 20: Introduction to Apache Cassandra

How does it work?

Page 21: Introduction to Apache Cassandra

Two knobs control Cassandra fault tolerance

• Replication Factor (server side)

– How many copies of the data should exist?

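[Diagram: a client sends "Write A" to a four-node ring (A, B, C, D) with RF=3; each node also holds replicas of two other ranges, so A is stored on nodes A, B, and C]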

Page 22: Introduction to Apache Cassandra

Two knobs control Cassandra fault tolerance

• Consistency Level (client side)

– How many replicas do we need to hear from before we acknowledge?

[Diagram: the same "Write A" on the four-node ring, shown once with CL=QUORUM and once with CL=ONE]

Page 23: Introduction to Apache Cassandra

Consistency Levels

• Applies to both Reads and Writes (i.e. is set on each query)

• ONE – one replica from any DC

• LOCAL_ONE – one replica from local DC

• QUORUM – a majority (more than 50%) of replicas from any DC

• LOCAL_QUORUM – a majority (more than 50%) of replicas from local DC

• ALL – all replicas

• TWO
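As a rough sketch of how these levels translate into replica counts, the snippet below assumes a single datacenter with replication factor `rf` and treats a quorum as a strict majority, `rf // 2 + 1`. The function name `replicas_required` is made up for illustration and is not part of any driver API.

```python
def replicas_required(level: str, rf: int) -> int:
    """Replicas that must respond, assuming one datacenter with replication factor rf."""
    quorum = rf // 2 + 1  # strict majority, i.e. more than half of the replicas
    table = {
        "ONE": 1,
        "LOCAL_ONE": 1,
        "TWO": 2,
        "QUORUM": quorum,
        "LOCAL_QUORUM": quorum,
        "ALL": rf,
    }
    return table[level]


for rf in (3, 5, 6):
    print(f"RF={rf}: QUORUM needs {replicas_required('QUORUM', rf)} replicas")
```

With RF=3 a quorum is 2 replicas; with RF=6 it is 4, not 3, because exactly half is not a majority.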

Page 24: Introduction to Apache Cassandra

Consistency Level and Speed

• How many replicas we need to hear from can affect how quickly we can read and write data in Cassandra

[Diagram: a client issues "Read A" at CL=QUORUM against the four-node ring; the acknowledgements are labeled 5 µs, 300 µs, 12 µs, and 12 µs]

Page 25: Introduction to Apache Cassandra

Consistency Level and Availability

• Consistency Level choice affects availability

• For example, QUORUM can tolerate one replica being down and still be available (in RF=3)

[Diagram: "Read A" at CL=QUORUM on the four-node ring; the replicas shown hold A=2]
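The availability trade-off can be worked out with simple arithmetic. This is a minimal sketch, assuming a single datacenter and a hypothetical helper named `tolerated_failures`; it only counts replicas and ignores timeouts, hints, and retries.

```python
def tolerated_failures(rf: int, required_acks: int) -> int:
    """Replicas that can be down while a request needing required_acks still succeeds."""
    return rf - required_acks


rf = 3
quorum = rf // 2 + 1  # 2 when RF=3
for name, acks in (("ONE", 1), ("QUORUM", quorum), ("ALL", rf)):
    down = tolerated_failures(rf, acks)
    print(f"RF={rf} CL={name}: needs {acks} replicas, tolerates {down} down")
```

For RF=3 this gives two tolerated failures at ONE, one at QUORUM, and none at ALL.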

Page 26: Introduction to Apache Cassandra

Consistency Level and Eventual Consistency

• Cassandra is an AP system that is Eventually Consistent so replicas may disagree

• Column values are timestamped

• In Cassandra, Last Write Wins (LWW)
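Last Write Wins can be illustrated with a small sketch: when replicas return different values for the same column, the value with the highest timestamp is kept. The `Cell` type, the `resolve` function, and the timestamp values are invented for this example, and tie-breaking between equal timestamps is ignored.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    value: int
    timestamp: int  # hypothetical write time in microseconds


def resolve(responses):
    """Return the replica response with the highest timestamp."""
    return max(responses, key=lambda cell: cell.timestamp)


# Two replicas answered a QUORUM read and disagree about A.
responses = [
    Cell(value=2, timestamp=1_700_000_000_000_200),  # newer
    Cell(value=1, timestamp=1_700_000_000_000_100),  # older
]
winner = resolve(responses)
print(winner.value)
```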

[Diagram: "Read A" at CL=QUORUM on the four-node ring; the replicas hold A=2 (newer), A=1 (older), and A=2]

Christos from Netflix: “Eventual Consistency != Hopeful Consistency”

https://www.youtube.com/watch?v=lwIA8tsDXXE

Page 27: Introduction to Apache Cassandra

Writes in the cluster

• Fully distributed, no SPOF

• Node that receives a request is the Coordinator for request

• Any node can act as Coordinator

[Diagram: a client sends "Write A" (CL=ONE) to one node of the four-node ring, which acts as the Coordinator Node]

Page 28: Introduction to Apache Cassandra

Writes in the cluster – Data Distribution

• Partition Key determines node placement

Partition Key

id='pmcfadin' lastname='McFadin'

id='jhaddad' firstname='Jon' lastname='Haddad'

id='ltillman' firstname='Luke' lastname='Tillman'

CREATE TABLE users ( id text, firstname text, lastname text, PRIMARY KEY (id) );

Page 29: Introduction to Apache Cassandra

Writes in the cluster – Data Distribution

• The Partition Key is hashed using a consistent hashing function (Murmur3) and the output is used to place the data on a node

• The data is also replicated to RF-1 other nodes

[Diagram: the Partition Key id='ltillman' is hashed with Murmur3 to "A" and the row is stored on the four-node ring with RF=3]

Page 30: Introduction to Apache Cassandra

Hashing – Back to Reality

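• Back in reality, Partition Keys hash to very large numbers called tokens (drawn here as 128 bit values; with the Murmur3 partitioner a token is a 64 bit signed integer)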

• Nodes in Cassandra own token ranges (i.e. hash ranges)
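A minimal sketch of a token-range lookup, using the slide's 128 bit example values: each node owns the tokens from just after the previous node's range end up to its own range end. The `replicas_for` helper is hypothetical, and replica placement is assumed to be the simplest scheme, where the remaining RF-1 copies go to the next nodes around the ring.

```python
import bisect

# (range end token, node), sorted by token. Only the top byte of each
# 128-bit range end is spelled out; the rest is zero, as in the table.
RING = [
    (0x00 << 120, "A"),  # A's range wraps around: (0xC0.., 0x00..]
    (0x40 << 120, "B"),
    (0x80 << 120, "C"),
    (0xC0 << 120, "D"),
]
ENDS = [end for end, _ in RING]
NODES = [node for _, node in RING]


def replicas_for(token: int, rf: int = 3) -> list:
    """Find the node whose range contains token, then walk the ring for rf nodes."""
    # First range end >= token; past the last end we wrap around to node A.
    first = bisect.bisect_left(ENDS, token) % len(RING)
    return [NODES[(first + i) % len(RING)] for i in range(rf)]


token = 0xADB95E99DA887A8A4CB474DB86EB5769  # the example hash for id='ltillman'
print(replicas_for(token))
```

The example token starts with 0xAD, which falls between 0x80.. and 0xC0.., so the lookup lands on node D first.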

[Diagram: the four-node ring (A, B, C, D)]

Range Start End

A 0xC000000..1 0x0000000..0

B 0x0000000..1 0x4000000..0

C 0x4000000..1 0x8000000..0

D 0x8000000..1 0xC000000..0

Partition Key

id='ltillman' Murmur3 0xadb95e99da887a8a4cb474db86eb5769

Page 31: Introduction to Apache Cassandra

Writes on a single node

• Client makes a write request

[Diagram: the client, plus the node's Memory and Disk]

UPDATE users SET firstname = 'Luke' WHERE id = 'ltillman'

Page 32: Introduction to Apache Cassandra

Writes on a single node

• Data is appended to the Commit Log

• Cassandra writes are FAST because storage is append-only (log-structured)

[Diagram: the write id='ltillman', firstname='Luke' is appended to the Commit Log on Disk]

Page 33: Introduction to Apache Cassandra

Writes on a single node

• Data is written to Memtable

[Diagram: the write is in the Commit Log on Disk and in the Memtable for Users in Memory (id='ltillman' firstname='Luke' lastname='Tillman'), next to a Memtable for some other table]

Page 34: Introduction to Apache Cassandra

Writes on a single node

• Server acknowledges to client

[Diagram: same Commit Log and Memtable state; the server sends an acknowledgement back to the client]

Page 35: Introduction to Apache Cassandra

Writes on a single node

• Once Memtable is full, data is flushed to disk as SSTable (Sorted String Table)

[Diagram: Memtable data for Users is flushed from Memory to the Data Directory on Disk, which holds SSTable #1 and SSTable #2 for Users alongside an SSTable for some other table]
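The single-node write path described over the last few slides can be sketched as a toy model: append to the commit log, update the memtable, acknowledge, and flush to an immutable SSTable once the memtable is "full". Everything here is a simplification for illustration: the `Node` class and `memtable_limit` are invented, the flush happens synchronously inside `write`, and the commit log is simply cleared after a flush.

```python
class Node:
    def __init__(self, memtable_limit=2):
        self.commit_log = []  # stands in for the append-only file on disk
        self.memtable = {}    # in memory: partition key -> {column: (value, timestamp)}
        self.sstables = []    # immutable, sorted snapshots on "disk", oldest first
        self.memtable_limit = memtable_limit  # hypothetical "full" threshold, in partitions

    def write(self, key, columns, timestamp):
        # 1. Append to the commit log (sequential write, no read first).
        self.commit_log.append((key, columns, timestamp))
        # 2. Apply the write to the memtable.
        row = self.memtable.setdefault(key, {})
        for name, value in columns.items():
            row[name] = (value, timestamp)
        # 3. Flush once the memtable is full (done inline here for simplicity).
        if len(self.memtable) >= self.memtable_limit:
            self.flush()
        # 4. Acknowledge to the client.
        return "ack"

    def flush(self):
        # Write the memtable out sorted by partition key, then start fresh.
        snapshot = dict(sorted(self.memtable.items(), key=lambda kv: kv[0]))
        self.sstables.append(snapshot)
        self.memtable = {}
        self.commit_log.clear()


node = Node(memtable_limit=2)
node.write("ltillman", {"firstname": "Luke", "lastname": "Tillman"}, timestamp=1)
node.write("jhaddad", {"firstname": "Jon", "lastname": "Haddad"}, timestamp=2)
print(len(node.sstables), list(node.sstables[0]))
```

Note that nothing on the write path reads existing data, which is the main reason writes stay cheap.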

Page 36: Introduction to Apache Cassandra

Compaction

• Compactions merge and unify data in our SSTables

• SSTables are immutable, so this is when we consolidate rows

[Diagram: SSTable #1 for Users (id='ltillman' firstname='Lucas' (timestamp=Older) lastname='Tillman') and SSTable #2 for Users (id='ltillman' firstname='Luke' (timestamp=Newer)) are compacted into SSTable #3 for Users (id='ltillman' firstname='Luke' lastname='Tillman')]
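Compaction can be sketched as a merge that keeps, for every column of every row, the value with the newest timestamp. This is a simplified model: SSTables are plain dictionaries, the `compact` function is hypothetical, the integer timestamps are placeholders, and deletes (tombstones) are left out entirely.

```python
def compact(*sstables):
    """Merge SSTables into one, keeping the newest (value, timestamp) per column."""
    merged = {}
    for table in sstables:
        for key, columns in table.items():
            row = merged.setdefault(key, {})
            for name, cell in columns.items():
                # cell is (value, timestamp); the higher timestamp wins
                if name not in row or cell[1] > row[name][1]:
                    row[name] = cell
    # SSTables are sorted by key; the inputs are never modified.
    return dict(sorted(merged.items(), key=lambda kv: kv[0]))


sstable1 = {"ltillman": {"firstname": ("Lucas", 1), "lastname": ("Tillman", 1)}}
sstable2 = {"ltillman": {"firstname": ("Luke", 2)}}
sstable3 = compact(sstable1, sstable2)
print(sstable3)
```

The same per-column merge is what a read has to do on the fly when a row is spread over several SSTables, which is why fewer SSTables means less work per read.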

Page 37: Introduction to Apache Cassandra

Reads in the cluster

• Same as writes in the cluster, reads are coordinated

• Any node can be the Coordinator Node

[Diagram: a client sends "Read A" (CL=QUORUM) to one node of the four-node ring, which acts as the Coordinator Node]

Page 38: Introduction to Apache Cassandra

Reads on a single node

• Client makes a read request

[Diagram: the client, plus the node's Memory and Disk]

SELECT firstname, lastname FROM users WHERE id = 'ltillman'

Page 39: Introduction to Apache Cassandra

Reads on a single node

• Data is read from (possibly multiple) SSTables and merged

• Reads in Cassandra are also FAST but are limited by Disk IO

[Diagram: on Disk, SSTable #1 for Users holds id='ltillman' firstname='Lucas' (timestamp=Older) lastname='Tillman' and SSTable #2 for Users holds id='ltillman' firstname='Luke' (timestamp=Newer); the merged result is firstname='Luke' lastname='Tillman']

Page 40: Introduction to Apache Cassandra

Reads on a single node

• Any unflushed Memtable data is also merged

[Diagram: the Memtable for Users in Memory is merged in as well; the result is firstname='Luke' lastname='Tillman']

Page 41: Introduction to Apache Cassandra

Reads on a single node

• Client gets acknowledgement with the data

[Diagram: the client receives firstname='Luke' lastname='Tillman']

Page 42: Introduction to Apache Cassandra

Compaction - Revisited

• Compactions merge and unify data in our SSTables, making them important to reads (fewer SSTables = less to read/merge)

[Diagram: as before, SSTable #1 and SSTable #2 for Users are compacted into SSTable #3 for Users (id='ltillman' firstname='Luke' lastname='Tillman')]

Page 43: Introduction to Apache Cassandra

Cassandra Query Language (CQL)

Page 44: Introduction to Apache Cassandra

Data Structures

• Keyspace is like RDBMS Database or Schema

• Like RDBMS, Cassandra uses Tables to store data

• Partitions can have one row (narrow) or multiple rows (wide)

[Diagram: nesting hierarchy: Keyspace > Tables > Partitions > Rows]

Page 45: Introduction to Apache Cassandra

Schema Definition (DDL)

• Easy to define tables for storing data

• First part of Primary Key is the Partition Key

CREATE TABLE videos ( videoid uuid, userid uuid, name text, description text, tags set<text>, added_date timestamp, PRIMARY KEY (videoid) );

Page 46: Introduction to Apache Cassandra

Schema Definition (DDL)

• One row per partition (familiar)

CREATE TABLE videos ( videoid uuid, userid uuid, name text, description text, tags set<text>, added_date timestamp, PRIMARY KEY (videoid) );

videoid       name                   ...

689d56e5-…    Keyboard Cat           ...

93357d73-…    Nyan Cat               ...

d978b136-…    Original Grumpy Cat    ...

Page 47: Introduction to Apache Cassandra

Clustering Columns

• Second part of Primary Key is Clustering Columns

• Clustering columns affect ordering of data (on disk)

• Multiple rows per partition

CREATE TABLE comments_by_video ( videoid uuid, commentid timeuuid, userid uuid, comment text, PRIMARY KEY (videoid, commentid) ) WITH CLUSTERING ORDER BY (commentid DESC);

Page 48: Introduction to Apache Cassandra

Clustering Columns – Wide Rows (Partitions)

• Use of Clustering Columns is where the term “Wide Rows” comes from

videoid='0fe6a...'

commentid='82be1...' (10/1/2014 9:36AM) userid='ac346...' comment='Awesome!'

commentid='765ac...' (9/17/2014 7:55AM) userid='f89d3...' comment='Garbage!'

CREATE TABLE comments_by_video ( videoid uuid, commentid timeuuid, userid uuid, comment text, PRIMARY KEY (videoid, commentid) ) WITH CLUSTERING ORDER BY (commentid DESC);

Page 49: Introduction to Apache Cassandra

Inserts and Updates

• Use INSERT or UPDATE to add and modify data

• Both will overwrite data (no constraints like RDBMS)

• INSERT and UPDATE are functionally equivalent

INSERT INTO comments_by_video ( videoid, commentid, userid, comment) VALUES ( '0fe6a...', '82be1...', 'ac346...', 'Awesome!');

UPDATE comments_by_video SET userid = 'ac346...', comment = 'Awesome!' WHERE videoid = '0fe6a...' AND commentid = '82be1...';

Page 50: Introduction to Apache Cassandra

TTL and Deletes

• Can specify a Time to Live (TTL) in seconds when doing an INSERT or UPDATE

• Use DELETE statement to remove data

• Can optionally specify columns to remove part of a row

INSERT INTO comments_by_video ( ... ) VALUES ( ... ) USING TTL 86400;

DELETE FROM comments_by_video WHERE videoid = '0fe6a...' AND commentid = '82be1...';
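To make the TTL semantics concrete, here is a small model of a value that expires a fixed number of seconds after it is written (86400 seconds is one day). The `Cell` class and its `now` parameters are invented for the example so the clock can be controlled; this is not how Cassandra stores expiry internally.

```python
import time


class Cell:
    """A value with an optional time to live, in seconds."""

    def __init__(self, value, ttl_seconds=None, now=None):
        now = time.time() if now is None else now
        self.value = value
        # No TTL means the value never expires on its own.
        self.expires_at = None if ttl_seconds is None else now + ttl_seconds

    def is_live(self, now=None):
        now = time.time() if now is None else now
        return self.expires_at is None or now < self.expires_at


comment = Cell("Awesome!", ttl_seconds=86400, now=0)
print(comment.is_live(now=3600))    # one hour after the write
print(comment.is_live(now=86400))   # exactly one day after the write
```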

Page 51: Introduction to Apache Cassandra

Querying

• Use SELECT to get data from your tables

• Always include Partition Key and optionally Clustering Columns

• Can use ORDER BY and LIMIT

• Use range queries (for example, by date) to slice partitions

SELECT * FROM comments_by_video WHERE videoid = 'a67cd...' LIMIT 10;
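The effect of clustering order on queries can be sketched with an in-memory stand-in for one partition of comments_by_video. Rows are kept newest first, mirroring CLUSTERING ORDER BY (commentid DESC), so a LIMIT or a date-range slice reads a contiguous run of rows. A `datetime` stands in for the timeuuid here, and the `select` helper is hypothetical.

```python
from datetime import datetime

# One partition (a single videoid), sorted by the clustering column, newest first.
partition = sorted(
    [
        (datetime(2014, 9, 17, 7, 55), "Garbage!"),
        (datetime(2014, 10, 1, 9, 36), "Awesome!"),
    ],
    key=lambda row: row[0],
    reverse=True,
)


def select(rows, since=None, limit=None):
    """Slice a partition: optional lower bound on the clustering column, then LIMIT."""
    matched = [row for row in rows if since is None or row[0] >= since]
    return matched[:limit]  # limit=None keeps every matched row


print(select(partition, limit=1))                       # newest comment only
print(select(partition, since=datetime(2014, 10, 1)))   # comments since Oct 1
```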

Page 52: Introduction to Apache Cassandra

Cassandra Data Modeling

• Requires a different mindset than RDBMS modeling

• Know your data and your queries up front

• Queries drive a lot of the modeling decisions (i.e. “table per query” pattern)

• Denormalize/Duplicate data at write time to do as few queries as possible come read time

• Remember, disk is cheap and writes in Cassandra are FAST

Page 53: Introduction to Apache Cassandra

Cassandra Data Modeling – A Quick Example

• Users need to be looked up by a unique Id, but when logging in, need to look them up by email address

• Some data is duplicated (email, userid) but that’s OK

CREATE TABLE users ( userid uuid, firstname text, lastname text, email text, PRIMARY KEY (userid) );

CREATE TABLE users_by_email ( email text, password text, userid uuid, PRIMARY KEY (email) );
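The "table per query" idea behind these two tables can be sketched with two dictionaries, each keyed by the column its query filters on. The application writes to both at creation time, so each read is a single key lookup. The function names and the `password_hash` parameter are invented for illustration.

```python
import uuid

users = {}           # models the users table, keyed by userid
users_by_email = {}  # models the users_by_email table, keyed by email


def create_user(firstname, lastname, email, password_hash):
    """Write the same user to both tables, duplicating email and userid."""
    userid = uuid.uuid4()
    users[userid] = {"firstname": firstname, "lastname": lastname, "email": email}
    users_by_email[email] = {"password": password_hash, "userid": userid}
    return userid


def find_user_by_email(email):
    """Login path: look up by email first, then fetch the profile by userid."""
    credentials = users_by_email[email]
    return users[credentials["userid"]]


new_id = create_user("Luke", "Tillman", "luke@example.com", "not-a-real-hash")
print(find_user_by_email("luke@example.com")["firstname"])
```

The cost is a second write per user, which the deck's point about cheap disk and fast writes is meant to justify.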

Page 54: Introduction to Apache Cassandra

Who’s using it?

Page 55: Introduction to Apache Cassandra

Cassandra Adoption

Page 56: Introduction to Apache Cassandra

Some Common Use Case Categories

• Product Catalogs and Playlists

• Internet of Things (IoT) and Sensor Data

• Messaging (emails, IMs, alerts, comments)

• Recommendation and Personalization

• Fraud Detection

• Time series and temporal ordered data

http://planetcassandra.org/apache-cassandra-use-cases/

Page 57: Introduction to Apache Cassandra

The “Slide Heard Round the World”

• From Cassandra Summit 2014, got a lot of attention

• 75,000+ nodes

• 10s of PBs of data

• Millions ops/s

• One of the largest known Cassandra deployments

Page 58: Introduction to Apache Cassandra

Spotify

• Streaming music web service

• > 24,000,000 music tracks

• > 50TB of data in Cassandra

Why Cassandra?

• Was PostgreSQL, but hit scaling problems

• Multi Datacenter Availability

• Integration with Spark for data processing and analytics

Usage

• Catalog

• User playlists

• Artists following

• Radio Stations

• Event notifications

http://planetcassandra.org/blog/interview/spotify-scales-to-the-top-of-the-charts-with-apache-cassandra-at-40k-requestssecond/

Page 59: Introduction to Apache Cassandra

eBay

• Online auction site

• > 250TB of data, dozens of nodes, multiple data centers

• > 6 billion writes, > 5 billion reads per day

Why Cassandra?

• Low latency, high scale, multiple data centers

• Suited for graph structures using wide rows

Usage

• Building next generation of recommendation engine

• Storing user activity data

• Updating models of user interests in real time

http://planetcassandra.org/blog/5-minute-c-interview-ebay/

Page 60: Introduction to Apache Cassandra

FullContact

• Contact management: from multiple sources, sync, de-dupe, APIs available

• 2 clusters, dozens of nodes, running in AWS

• Based here in Denver

Why Cassandra?

• Migrated from MongoDB after running into scaling issues

• Operational simplicity

• Resilience and Availability

Usage

• Person API (search by email, Twitter handle, Facebook, or phone)

• Searched data from multiple sources (ingested by Hadoop M/R jobs)

• Resolved profiles

http://planetcassandra.org/blog/fullcontact-readies-their-search-platform-to-scale-moves-from-mongodb-to-apache-cassandra/

Page 61: Introduction to Apache Cassandra

Instagram

• Photo-sharing, video-sharing and social networking service

• Originally AWS (Now Facebook data centers?)

• > 20k writes/second, > 15k reads/second

Why Cassandra?

• Migrated from Redis (problems keeping everything in memory)

• No painful “sharding” process

• 75% reduction in costs

Usage

• Auditing information – security, integrity, spam detection

• News feed (“inboxes” or activity feed)

– Likes, Follows, etc.

http://planetcassandra.org/blog/instagram-making-the-switch-to-cassandra-from-redis-75-instasavings/

Summit 2014 Presentation: https://www.youtube.com/watch?v=_gc94ITUitY

Page 62: Introduction to Apache Cassandra

Netflix

• TV and Movie streaming service

• 2,700+ nodes on over 90 clusters

• 4 Datacenters

• > 1 Trillion operations per day

Why Cassandra?

• Migrated from Oracle

• Massive amounts of data

• Multi datacenter, No SPOF

• No downtime for schema changes

Usage

• Everything! (Almost – 95% of DB use)

• Example: Personalization

– What titles do you play?

– What do you play before/after?

– Where did you pause?

– What did you abandon watching after 5 minutes?

http://planetcassandra.org/blog/case-study-netflix/

Summit 2014 Presentation: https://www.youtube.com/watch?v=RMSNLP_ORg8&index=43&list=UUvP-AXuCr-naAeEccCfKwUA

Page 63: Introduction to Apache Cassandra

Questions?

Follow me for updates or to ask questions later: @LukeTillman
