HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Cassandra Pretty Cool


HR5 alum Stephen Portanova will be presenting on the highly scalable database Cassandra, which is used by Reddit, Netflix, CERN, and The Weather Channel. 'nuff said.

Transcript of HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Page 1: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Cassandra

Pretty Cool

Page 2: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

History

Google Bigtable

Amazon Dynamo

Page 3: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Today

Page 4: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Why Should You Care

● Horizontal Scaling (basically auto sharding)

● Multiple Nodes - Highly Available

● Really Fast Writes

● Not too shabby at reads either - SLICES!!

● Bright Future

Page 5: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

The Cluster

● replication factor (rf)

● read consistency (r)

● write consistency (w)

● clustering - shard on partition key
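A common rule of thumb for how these three settings interact (not stated on the slide): a read is guaranteed to see the latest acknowledged write when r + w > rf, because the replicas read and the replicas written must then overlap. A minimal Python sketch of that check; the function names are illustrative only and are not part of any Cassandra API.

```python
def read_sees_latest_write(rf: int, r: int, w: int) -> bool:
    """True when any r replicas read must overlap any w replicas written."""
    if not (1 <= r <= rf and 1 <= w <= rf):
        raise ValueError("r and w must be between 1 and rf")
    return r + w > rf


def quorum(rf: int) -> int:
    """Size of a majority of rf replicas."""
    return rf // 2 + 1


rf = 3
print(read_sees_latest_write(rf, r=1, w=1))                    # one replica each
print(read_sees_latest_write(rf, r=quorum(rf), w=quorum(rf)))  # majority each
```

With rf = 3, reading and writing a single replica can miss each other, while majority reads and writes always share at least one replica.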

Page 6: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

The One Ring

Page 7: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Storage - Vnodes
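A rough Python sketch of the ring idea, under stated assumptions: the partition key is hashed to a token, each node owns several tokens (its vnodes), and replicas are the next distinct nodes clockwise. The `Ring` class, the node names, and the use of MD5 are stand-ins chosen for illustration; Cassandra's default partitioner uses Murmur3, and real replica placement depends on the configured replication strategy.

```python
import bisect
import hashlib


def token(value: str) -> int:
    # Stand-in hash for illustration; Cassandra's default partitioner is Murmur3.
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")


class Ring:
    def __init__(self, nodes, vnodes_per_node=8):
        # Each physical node claims several tokens (its vnodes) on the ring.
        self.entries = sorted(
            (token(f"{node}#{i}"), node)
            for node in nodes
            for i in range(vnodes_per_node)
        )
        self.tokens = [t for t, _ in self.entries]
        self.node_count = len(set(nodes))

    def replicas(self, partition_key: str, rf: int = 3):
        """Walk clockwise from the key's token, collecting rf distinct nodes."""
        start = bisect.bisect_left(self.tokens, token(partition_key))
        found = []
        for step in range(len(self.entries)):
            _, node = self.entries[(start + step) % len(self.entries)]
            if node not in found:
                found.append(node)
            if len(found) == min(rf, self.node_count):
                break
        return found


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("111"))  # three distinct nodes holding user_id '111'
```

Spreading many small token ranges per node is what lets a new node take an even share of data from every existing node instead of splitting one neighbour's range.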

Page 8: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model

● Wide rows

● Slice Queries

● Denormalization

● Index tables

Page 9: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Simple Key

CREATE TABLE email_app.emails (
    user_id text,
    subject text,
    to_add text,
    cc text,
    body text,
    PRIMARY KEY (user_id)
);

ROW KEY: user_id

Page 10: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Simple Inserts

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'at my place');

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('999', 'wat', '[email protected]', '[email protected]', 'is going on?');

Page 11: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Simple Inserts Result

SELECT * FROM email_app.emails;

Row key    subject    to_add    cc          body
111        party      cat@      hippo@      at my place
999        wat        horse@    giraffe@    is going on?

Page 12: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Mental Model - Nested Hash

Row Key: 111
    Column Values: subject, to, cc, body
Row Key: 999
    Column Values: subject, to, cc, body

Page 13: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Simple Insert - Again

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'at my place');

Row key    subject    to_add    cc          body
111        party      cat@      hippo@      at my place
999        wat        horse@    giraffe@    is going on?

IDEMPOTENT - repeating the same insert leaves the table unchanged.
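The nested-hash picture also explains the IDEMPOTENT label: an insert behaves like setting keys in a hash, so doing it twice gives the same result. A minimal Python sketch of that mental model (a plain dict of dicts, not how Cassandra stores data on disk); the `insert` helper is hypothetical.

```python
table = {}  # row key -> {column name: value}


def insert(table, user_id, **columns):
    """Upsert: create the row if missing, otherwise overwrite the given columns."""
    table.setdefault(user_id, {}).update(columns)


row = dict(subject="party", to_add="cat@", cc="hippo@", body="at my place")
insert(table, "111", **row)
snapshot = {key: dict(cols) for key, cols in table.items()}

insert(table, "111", **row)  # the same insert again
print(table == snapshot)     # True: no duplicate row, nothing changed
```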

Page 14: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Composite Key 1

CREATE TABLE email_app.emails (
    user_id text,
    subject text,
    to_add text,
    cc text,
    body text,
    PRIMARY KEY (user_id, subject)
);

ROW KEY: user_id
CLUSTERING KEY: subject

Page 15: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Composite Insert 1

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'at my place');

Same as Before. Right???

Page 16: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Composite Insert Result

SELECT * FROM emails WHERE user_id = '111';

Row key: 111
    party|to_add    party|cc    party|body
    cat@            hippo@      at my place

(The "party" prefix on each column name is the subject.)

Page 17: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Mental Model - Nested Hash

Row Key (user_id): 111
    Clustering Column (subject): party
        Column Values: to_add, cc, body

Page 18: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Composite Insert 2

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'swim', '[email protected]', '[email protected]', 'in the pool');

Page 19: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Composite Insert 2 Result

SELECT * FROM emails WHERE user_id = '111';

Row key: 111
    party|to_add    party|cc    party|body     swim|to_add    swim|cc    swim|body
    cat@            hippo@      at my place    cat@           hippo@b    in the pool

(The "party" and "swim" prefixes are the subject.)

Sorted by clustering column - "subject"

Page 20: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Mental Model - Nested Sorted Hash

Row Key (user_id): 111
    Clustering Column (subject): party
        Column Values: to, cc, body
    Clustering Column (subject): swim
        Column Values: to, cc, body
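The nested sorted hash can be written out as a small Python sketch: the outer key is the row key (user_id), the inner key is the clustering column (subject), and reads return the inner entries in sorted order. This only models the mental picture; the `insert` and `select` helpers are hypothetical, and sorting at read time is a simplification (the slides show the data itself kept in order).

```python
from collections import defaultdict

emails = defaultdict(dict)  # user_id -> {subject -> {column: value}}


def insert(user_id, subject, **columns):
    emails[user_id].setdefault(subject, {}).update(columns)


def select(user_id):
    """Return the partition's rows ordered by the clustering column."""
    partition = emails.get(user_id, {})
    return [(subject, partition[subject]) for subject in sorted(partition)]


# Inserted out of order on purpose.
insert("111", "swim", to_add="cat@", cc="hippo@b", body="in the pool")
insert("111", "party", to_add="cat@", cc="hippo@", body="at my place")

for subject, columns in select("111"):
    print(subject, columns)  # 'party' comes back before 'swim'
```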

Page 21: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Why sorted?

SELECT * FROM emails WHERE user_id = '111' AND subject >= 's' AND subject < 't';

Row key: 111
    party|to_add    party|cc    party|body     swim|to_add    swim|cc    swim|body
    cat@            giraffe@    at my place    cat@           hippo@b    in the pool

SLICE QUERIES!! Only the swim columns fall inside the slice.
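Because the clustering values are kept in order, a range like the one above can be answered by locating two boundaries instead of scanning every entry. A minimal Python sketch using the standard `bisect` module; the in-memory lists and the `slice_rows` helper are assumptions made for illustration, not Cassandra's storage format.

```python
import bisect

# One partition (user_id = '111'); clustering values are kept sorted.
subjects = ["party", "swim"]
rows = {
    "party": {"to_add": "cat@", "body": "at my place"},
    "swim": {"to_add": "cat@", "body": "in the pool"},
}


def slice_rows(start, end):
    """Rows with start <= subject < end, found by two binary searches."""
    lo = bisect.bisect_left(subjects, start)
    hi = bisect.bisect_left(subjects, end)
    return [(subject, rows[subject]) for subject in subjects[lo:hi]]


print(slice_rows("s", "t"))  # only the 'swim' row
```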

Page 22: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

DM - Compound Composite Key

CREATE TABLE email_app.emails (
    user_id text,
    subject text,
    to_add text,
    cc text,
    body text,
    PRIMARY KEY ((user_id, subject), to_add)
);

ROW KEY: (user_id, subject)
CLUSTERING KEY: to_add

Page 23: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Composite / Compound Inserts

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'wat', '[email protected]', '[email protected]', 'is going on?');

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'at my place');

Page 24: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

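Composite Insert 2 Result

SELECT * FROM emails WHERE user_id = '111' AND subject = 'party';

Row key: 111:party
    cat@|cc    cat@|body
    hippo@     at my place

(The "cat@" prefix on each column name is to_add.)

SELECT * FROM emails WHERE user_id = '111';

(This query is now rejected: subject is part of the row key, so both user_id and subject must be supplied.)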
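With a compound row key, both parts together identify the partition, so a lookup needs both values. A small Python sketch of that constraint, using a tuple as the outer dictionary key; the helpers and the sample addresses are illustrative assumptions, and the error is a stand-in for the server refusing the query.

```python
partitions = {}  # (user_id, subject) -> {to_add: {column: value}}


def insert(user_id, subject, to_add, **columns):
    partition = partitions.setdefault((user_id, subject), {})
    partition.setdefault(to_add, {}).update(columns)


def select(user_id, subject=None):
    """Look up one partition; both parts of the key are needed to find it."""
    if subject is None:
        raise ValueError("need the full partition key: user_id and subject")
    partition = partitions.get((user_id, subject), {})
    return [(to_add, partition[to_add]) for to_add in sorted(partition)]


insert("111", "wat", "horse@", cc="giraffe@", body="is going on?")
insert("111", "party", "cat@", cc="hippo@", body="at my place")

print(select("111", "party"))
try:
    select("111")
except ValueError as error:
    print("rejected:", error)
```

The trade-off is that a user's emails are now spread over one partition per subject, which keeps partitions small but gives up the "all emails for user 111" query.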

Page 25: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Data Model - Composite Insert 1

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'all the time');

SELECT * FROM emails WHERE user_id = '111' AND subject = 'party';

Row key: 111:party
    cat@|cc     cat@|body      dog@|cc    dog@|body
    giraffe@    at my place    hippo@b    all the time

(The "cat@" and "dog@" prefixes are to_add values.)

Sorting / slice on - "to_add"

Page 26: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

DM - Compound Composite Key 2

CREATE TABLE email_app.emails (
    user_id text,
    subject text,
    to_add text,
    cc text,
    body text,
    PRIMARY KEY ((user_id, subject), to_add, cc)
);

ROW KEY: (user_id, subject)
CLUSTERING KEYS: to_add, cc

Page 27: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Composite / Clustered Inserts

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'all the time');

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'At my place');

INSERT INTO email_app.emails (user_id, subject, to_add, cc, body) VALUES ('111', 'party', '[email protected]', '[email protected]', 'At my place');

Page 28: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

DM - Composite / Clustered Inserts

SELECT * FROM emails WHERE user_id = '111' AND subject = 'party';

Row key: 111|party
    cat@|hippo@|body    cat@|mouse@|body    dog@|hippo@|body
    At my place         At my place         all the time

Slice on (to_add) OR (to_add, cc)

Page 29: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

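Mental Model - Nested Sorted Hash

Row Key (user_id + subject): 111|party
    Clustering Column (to_add): cat
        Clustering Column (cc): hippo
            Column Values: body
        Clustering Column (cc): mouse
            Column Values: body
    Clustering Column (to_add): dog
        Clustering Column (cc): hippo
            Column Values: body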
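With two clustering columns the inner keys behave like sorted tuples, which is why a slice can use (to_add) alone or (to_add, cc) together, but not cc alone. A minimal Python sketch of one partition under that model; the `slice_by_prefix` helper is hypothetical, and it filters a sorted listing rather than seeking into it, purely to keep the example short.

```python
# One partition, keyed by (user_id, subject) = ('111', 'party').
# Clustering keys are (to_add, cc) tuples; tuple ordering gives the nested sort.
partition = {
    ("dog@", "hippo@"): "all the time",
    ("cat@", "mouse@"): "At my place",
    ("cat@", "hippo@"): "At my place",
}


def slice_by_prefix(partition, *prefix):
    """Rows whose clustering key starts with prefix: (to_add,) or (to_add, cc)."""
    return [
        (key, body)
        for key, body in sorted(partition.items())
        if key[: len(prefix)] == prefix
    ]


print(slice_by_prefix(partition, "cat@"))            # both cat@ rows, hippo@ first
print(slice_by_prefix(partition, "cat@", "mouse@"))  # the single matching row
```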

Page 30: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Part 2 / 8 of this 7 hour talk

● Denormalization

● Index Column Families

● Cassandra Internals (memtables, SSTables, compaction, repair)

Page 31: HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

Part 8 / 8: The Future

● Continually improving

● More and more adoption

● Awesome projects

● http://www.datastax.com/documentation/cassandra/2.0/pdf/cassandra20.pdf

● http://planetcassandra.org/