2015 03-31 dom-code_cassandra

Storing Data the Cassandra Way
Freek van Polen

“Simplicity is the ultimate sophistication.”
-Leonardo Da Vinci

Transcript of 2015 03-31 dom-code_cassandra

Page 2: 2015 03-31 dom-code_cassandra

About Me

● Cognitive Artificial Intelligence @ UU
● Almende:
  – Researcher, self-organization, wireless sensor networks
● Sense:
  – First employee, developer, backend, team leader

Page 5: 2015 03-31 dom-code_cassandra

Goal of this Talk

● Not to tell you to use Cassandra for everything
● Not to go into all gory details of Cassandra
● Not to show off my drawing skills

Page 8: 2015 03-31 dom-code_cassandra

Goal of this Talk

● Introduce some database theory
● Make you enthusiastic about Cassandra

Use the right tool for the job!

Page 9: 2015 03-31 dom-code_cassandra

SQL vs NoSQL

Page 10: 2015 03-31 dom-code_cassandra

Non relational data

● documents
● graphs
● key-value
● wide-row

{
  "meetup": "DomCode",
  "name": "Monthly Meetup March",
  "date": "31-03-2015",
  "organizers": ["Ross", "Lucas"],
  "location": "Infi",
  "talks": [
    {"speaker": "Arian", "topic": "functional js"},
    {"speaker": "Freek", "topic": "cassandra", "length": 45}
  ]
}

Page 12: 2015 03-31 dom-code_cassandra

Non relational data

● documents
● graphs
● key-value
● wide-row

"6573672017456": "tom"
"9810919283853": "mary"
"1129823948673": "jane"
"9827712003003": "dani"

Page 13: 2015 03-31 dom-code_cassandra

Non relational data

● documents
● graphs
● key-value
● wide-row

Stock 1: [+3.23, +5.02, -2.45, -34.12, -12.65, +0.43]
Stock 2: [+5.34, +6.34, +9.12, +0.53]
Stock 3: [-3.21, -5.32, -8.12, -0.53, -0.12, +0.43, +0.76, +2.55]

Page 14: 2015 03-31 dom-code_cassandra

Scalability

Database [A-Z]

Page 16: 2015 03-31 dom-code_cassandra

Scalability: replication

Database [A-Z]

Copy of Database [A-Z]

Page 17: 2015 03-31 dom-code_cassandra

Scalability: sharding

Database [A-M]

Database [N-Z]

Page 19: 2015 03-31 dom-code_cassandra

Scalability: combined (oh the horror)

Coordinator

Database [A-M] Copy of Database [A-M]

Database [N-Z] Copy of Database [N-Z]
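
The combined setup on this slide can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular database implements it: the `Shard` and `Coordinator` classes, the routing by first letter, and the synchronous copy are all assumptions made for the example.

```python
# Hypothetical sketch: two range shards, each with one copy, behind a coordinator.
class Shard:
    def __init__(self, first, last):
        self.first, self.last = first, last  # inclusive letter range, e.g. "A".."M"
        self.primary = {}                    # stands in for "Database [A-M]"
        self.copy = {}                       # stands in for "Copy of Database [A-M]"

    def owns(self, key):
        return self.first <= key[0].upper() <= self.last


class Coordinator:
    """Knows which shard owns which key range and forwards requests."""

    def __init__(self, shards):
        self.shards = shards

    def _route(self, key):
        for shard in self.shards:
            if shard.owns(key):
                return shard
        raise KeyError(f"no shard owns {key!r}")

    def write(self, key, value):
        shard = self._route(key)
        shard.primary[key] = value
        shard.copy[key] = value  # assumed synchronous; failure handling is omitted

    def read(self, key, from_copy=False):
        shard = self._route(key)
        store = shard.copy if from_copy else shard.primary
        return store[key]


coordinator = Coordinator([Shard("A", "M"), Shard("N", "Z")])
coordinator.write("mary", "row for mary")
coordinator.write("tom", "row for tom")
print(coordinator.read("tom", from_copy=True))
```

Even in this toy version the coordinator is a single component that every request passes through, and the copy step has no answer for a failed or slow replica, which is roughly where "the horror" comes from.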

Page 21: 2015 03-31 dom-code_cassandra

CAP theorem

RDBMS

Page 22: 2015 03-31 dom-code_cassandra

ACID & BASE

ACID:
● Atomicity
● Consistency
● Isolation
● Durability

BASE:
● Basically Available
● Soft state
● Eventual consistency

Page 23: 2015 03-31 dom-code_cassandra

Cassandra \o/

Page 24: 2015 03-31 dom-code_cassandra

Cassandra Cluster

● All nodes are equal
● All nodes can process any request
● Throughput and storage space scale linearly
● Graceful degradation

Page 25: 2015 03-31 dom-code_cassandra

Cassandra: Gossip

[Figure: ring of eight nodes, numbered 1 to 8]

Page 27: 2015 03-31 dom-code_cassandra

Cassandra: Replication

[Figure: eight nodes, numbered 1 to 8, arranged in a token ring]

Page 29: 2015 03-31 dom-code_cassandra

Cassandra: Replication

[Figure: token ring of eight nodes, numbered 1 to 8, each owning one key range]

[A - D] [E - H] [I - K] [L - N] [O - Q] [R - T] [U - W] [X - Z]

replication = 3
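
As a rough sketch of what "replication = 3" means on such a ring, the Python below looks up the node that owns a key's range and then adds the next two nodes clockwise. The assignment of ranges to node numbers is an assumption for illustration (the figure does not survive in this transcript), and real Cassandra hashes the key to a token instead of using its first letter.

```python
import bisect

# Assumed for illustration: nodes 1..8 own the slide's ranges in ring order.
RING = [
    ("A", "D", 1), ("E", "H", 2), ("I", "K", 3), ("L", "N", 4),
    ("O", "Q", 5), ("R", "T", 6), ("U", "W", 7), ("X", "Z", 8),
]
RANGE_STARTS = [first for first, _last, _node in RING]


def replicas_for(key, replication=3):
    """Return the nodes holding `key`: the range owner plus the next nodes clockwise."""
    letter = key[0].upper()
    if not "A" <= letter <= "Z":
        raise ValueError(f"key {key!r} falls outside the [A - Z] ring")
    owner_index = bisect.bisect_right(RANGE_STARTS, letter) - 1
    return [RING[(owner_index + i) % len(RING)][2] for i in range(replication)]


print(replicas_for("mary"))  # owner of [L - N] plus two successors
print(replicas_for("zoe"))   # wraps around the end of the ring
```

The modulo on the ring index is what makes the last range wrap around to the first nodes, so every key ends up on exactly three nodes regardless of where its range sits.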

Page 30: 2015 03-31 dom-code_cassandra

Cassandra: Request Processing

[Figure: ring of eight nodes, numbered 1 to 8]

K: 12-24?

Page 31: 2015 03-31 dom-code_cassandra

Cassandra: Request Processing

[Figure: ring of eight nodes, numbered 1 to 8]

K: 12-24?
K: 12-24?

Page 32: 2015 03-31 dom-code_cassandra

Cassandra: Request Processing

[Figure: ring of eight nodes, numbered 1 to 8]

K: 12-24!
K: 12-24!

Page 33: 2015 03-31 dom-code_cassandra

Cassandra: Request Processing

[Figure: ring of eight nodes, numbered 1 to 8]

K: 12-24!

Page 34: 2015 03-31 dom-code_cassandra

Cassandra: Request Processing

[Figure: ring of eight nodes, numbered 1 to 8]

K:22 = “tom”

Page 35: 2015 03-31 dom-code_cassandra

Cassandra: Request Processing

[Figure: ring of eight nodes, numbered 1 to 8]

K:22 = “tom”
K:22 = “tom”

Page 36: 2015 03-31 dom-code_cassandra

Cassandra: Request Processing

[Figure: ring of eight nodes, numbered 1 to 8]

Got it!
Great success!

Page 37: 2015 03-31 dom-code_cassandra

Cassandra: Request Processing

[Figure: ring of eight nodes, numbered 1 to 8]

K:22 = “tom”
K:22 = “tom”
K:22 = “jerry”

Page 38: 2015 03-31 dom-code_cassandra

Cassandra: Request Processing

[Figure: ring of eight nodes, numbered 1 to 8]

K:22 = “tom”
K:22 = “tom”
K:22 = “jerry”
K:22 = “tom”

Page 39: 2015 03-31 dom-code_cassandra

Cassandra: Request Processing

[Figure: ring of eight nodes, numbered 1 to 8]

K:22 = “tom”
K:22 = “tom”
K:22 = “tom”
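
The last three slides appear to show one replica answering with a stale value ("jerry") and then being brought back in line with the others. A minimal Python sketch of that idea follows. The `Replica` class, the integer timestamps, and the `read_with_repair` helper are assumptions for illustration, not Cassandra's actual implementation.

```python
class Replica:
    """Toy replica that keeps the newest (timestamp, value) per key."""

    def __init__(self):
        self.data = {}

    def write(self, key, timestamp, value):
        current = self.data.get(key)
        if current is None or timestamp > current[0]:
            self.data[key] = (timestamp, value)

    def read(self, key):
        return self.data.get(key)


def read_with_repair(replicas, key):
    """Ask every replica, return the newest value, and overwrite stale copies."""
    answers = [(replica, replica.read(key)) for replica in replicas]
    newest = max((a for _r, a in answers if a is not None), default=None)
    if newest is None:
        return None
    for replica, answer in answers:
        if answer != newest:
            replica.write(key, *newest)  # repair the stale or missing copy
    return newest[1]


replicas = [Replica(), Replica(), Replica()]
replicas[2].write(22, 1, "jerry")  # old write that reached only one replica
replicas[0].write(22, 2, "tom")    # newer write that missed the third replica
replicas[1].write(22, 2, "tom")

print(read_with_repair(replicas, 22))
print(replicas[2].read(22))
```

After the read, all three replicas hold the same value for key 22, matching the final slide.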

Page 40: 2015 03-31 dom-code_cassandra

Cassandra: Tunable Consistency

● Consistency.ANY

● Consistency.ONE

● Consistency.QUORUM

● Consistency.ALL
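
The arithmetic behind these levels can be sketched as follows, assuming the replication factor of 3 used earlier in the talk. The function names are made up for this example. `ANY` is left out because it applies to writes only and can be satisfied without any replica acknowledging.

```python
def required_acks(level, replication):
    """Number of replicas that must respond for a given consistency level."""
    levels = {
        "ONE": 1,
        "QUORUM": replication // 2 + 1,  # strict majority of the replicas
        "ALL": replication,
    }
    return levels[level]


def read_sees_latest_write(read_level, write_level, replication):
    """True when every read set must overlap every write set (R + W > N)."""
    reads = required_acks(read_level, replication)
    writes = required_acks(write_level, replication)
    return reads + writes > replication


for read_level, write_level in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
    print(read_level, write_level, read_sees_latest_write(read_level, write_level, 3))
```

Lower levels trade that overlap guarantee for latency and availability, which is the "tunable" part of the slide title.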

Page 41: 2015 03-31 dom-code_cassandra

Cassandra: Data Model

CREATE TABLE transactions (
  account_id    text,
  date          timestamp,
  amount        float,
  other_account text,
  PRIMARY KEY (account_id, date)
);

Page 42: 2015 03-31 dom-code_cassandra

Cassandra: Data Model

NLRABOxxx:
  29-03-2015 15:23:45   amount: -143.90    other: NLINGxxx
  29-03-2015 16:56:21   amount: -29.30     other: NLABNxxx
  30-03-2015 04:00:00   amount: +2000.00   other: NLASNxxx

Page 43: 2015 03-31 dom-code_cassandra

Cassandra Data Model

● Primary Key #1:
  – Row key
  – Unit of partitioning
● Primary Key #2:
  – Column key
  – Order within row
  – Order is preserved on disk!
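
To make the two roles concrete, here is a small in-memory Python sketch: the first key picks the row, and the second key keeps entries sorted inside it so that a time range becomes one contiguous slice. `WideRowTable` and its methods are hypothetical names, and the data comes from the transactions example above.

```python
import bisect
from collections import defaultdict
from datetime import datetime


class WideRowTable:
    """Toy wide-row store: row key -> list of (column key, value), kept sorted."""

    def __init__(self):
        self._rows = defaultdict(list)

    def insert(self, row_key, column_key, value):
        row = self._rows[row_key]
        keys = [k for k, _v in row]
        i = bisect.bisect_left(keys, column_key)
        if i < len(row) and row[i][0] == column_key:
            row[i] = (column_key, value)       # same column key: overwrite
        else:
            row.insert(i, (column_key, value)) # keep the row ordered

    def slice(self, row_key, start, end):
        """Return all entries of one row with start <= column key <= end."""
        row = self._rows.get(row_key, [])
        keys = [k for k, _v in row]
        lo = bisect.bisect_left(keys, start)
        hi = bisect.bisect_right(keys, end)
        return row[lo:hi]


table = WideRowTable()
# Inserted out of order on purpose; the row stays sorted by timestamp.
table.insert("NLRABOxxx", datetime(2015, 3, 30, 4, 0, 0), 2000.00)
table.insert("NLRABOxxx", datetime(2015, 3, 29, 15, 23, 45), -143.90)
table.insert("NLRABOxxx", datetime(2015, 3, 29, 16, 56, 21), -29.30)

march_29 = table.slice(
    "NLRABOxxx", datetime(2015, 3, 29), datetime(2015, 3, 29, 23, 59, 59)
)
print(march_29)
```

Because the entries are stored in order, the range query touches only adjacent entries of a single row, which is the property the "order is preserved on disk" bullet points at.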

Page 44: 2015 03-31 dom-code_cassandra

Distributed Concurrent Writes

from: https://aphyr.com/posts/294-call-me-maybe-cassandra

Page 45: 2015 03-31 dom-code_cassandra

Distributed Concurrent Writes

● Last write wins
● Paxos
● Treat data as immutable

from: https://aphyr.com/posts/294-call-me-maybe-cassandra
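
A small Python sketch of why "last write wins" can lose updates, and how treating data as immutable sidesteps the problem. The account balance scenario, the timestamps, and the client names are invented for illustration. Paxos is not sketched here.

```python
def last_write_wins(writes):
    """writes: list of (timestamp, value); keep only the value with the highest timestamp."""
    return max(writes, key=lambda write: write[0])[1]


# Two clients both read a balance of 100 and concurrently write back their own total.
lww_balance = last_write_wins([(1, 100 + 20), (2, 100 + 50)])
print(lww_balance)  # the deposit of 20 has silently disappeared

# Immutable alternative: each client appends its own deposit under a unique key,
# nothing is ever overwritten, and the balance is derived when reading.
deposits = {("client-a", 1): 20, ("client-b", 2): 50}
immutable_balance = 100 + sum(deposits.values())
print(immutable_balance)
```

With append-only entries there is no conflicting write for the timestamp comparison to discard, so both deposits survive.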

Page 46: 2015 03-31 dom-code_cassandra

Distributed Concurrent Writes

● Cassandra trades consistency for availability, partition tolerance, and super easy administration...

● This is not for all use cases

from: https://aphyr.com/posts/294-call-me-maybe-cassandra

Page 47: 2015 03-31 dom-code_cassandra

THE use case: time series data

CREATE TABLE sensor_data (
  sensor_id  text,
  time       timestamp,
  value      text,
  PRIMARY KEY (sensor_id, time)
);

Page 48: 2015 03-31 dom-code_cassandra

Thanks!

Page 49: 2015 03-31 dom-code_cassandra

Cassandra?