2015 03-31 dom-code_cassandra
Storing Data the Cassandra Way
Freek van Polen
“Simplicity is the ultimate sophistication.”
-Leonardo Da Vinci
About Me
● Cognitive Artificial Intelligence @ UU
● Almende:
– Researcher, self-organization, wireless sensor networks
● Sense:
– First employee, developer, backend, team leader
Goal of this Talk
● Not to tell you to use Cassandra for everything
● Not to go into all gory details of Cassandra
● Not to show off my drawing skills
Goal of this Talk
● Introduce some database theory
● Make you enthusiastic about Cassandra
Use the right tool for the job!
SQL vs NoSQL
Non-relational data
● documents
● graphs
● key-value
● wide-row
Non-relational data: documents
{
  "meetup": "DomCode",
  "name": "Monthly Meetup March",
  "date": "31032015",
  "organizers": ["Ross", "Lucas"],
  "location": "Infi",
  "talks": [
    {"speaker": "Arian", "topic": "functional js"},
    {"speaker": "Freek", "topic": "cassandra", "length": 45}
  ]
}
Non-relational data: key-value
“6573672017456”: “tom”
“9810919283853”: “mary”
“1129823948673”: “jane”
“9827712003003”: “dani”
Non-relational data: wide-row
Stock 1: [+3.23, +5.02, 2.45, 34.12, 12.65, +0.43]
Stock 2: [+5.34, +6.34, +9.12, +0.53]
Stock 3: [3.21, 5.32, 8.12, 0.53, 0.12, +0.43, +0.76, +2.55]
Scalability
Database [A-Z]
Scalability: replication
Database [A-Z]
Copy of Database [A-Z]
Scalability: sharding
Database [A-M]
Database [N-Z]
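A minimal sketch of the range sharding drawn on this slide, assuming keys are routed by their first letter. The `SHARDS` map and the `shard_for` helper are hypothetical names used only for illustration.

```python
# Hypothetical shard map mirroring the slide: one database per letter range.
SHARDS = {
    ("A", "M"): "Database [A-M]",
    ("N", "Z"): "Database [N-Z]",
}


def shard_for(key: str) -> str:
    """Return the shard whose letter range contains the key's first letter."""
    first = key[:1].upper()
    for (low, high), name in SHARDS.items():
        if low <= first <= high:
            return name
    raise KeyError(f"no shard covers key {key!r}")


print(shard_for("freek"))  # Database [A-M]
print(shard_for("ross"))   # Database [N-Z]
```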
Scalability: combined (oh the horror)
Database [A-M] | Copy of Database [A-M]
Database [N-Z] | Copy of Database [N-Z]
Coordinator
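To make the combined setup concrete, here is a rough sketch of what the coordinator has to do: pick the shard for a key, then write to every copy of that shard. The `Coordinator` class and its in-memory dictionaries are assumptions for illustration, not a real database API.

```python
class Coordinator:
    """Routes each key to its shard and writes to every copy of that shard."""

    def __init__(self):
        # Two shards, each with a database and one copy, as drawn on the slide.
        self.shards = {
            ("A", "M"): [{}, {}],
            ("N", "Z"): [{}, {}],
        }

    def _copies(self, key):
        first = key[:1].upper()
        for (low, high), copies in self.shards.items():
            if low <= first <= high:
                return copies
        raise KeyError(f"no shard covers key {key!r}")

    def write(self, key, value):
        # Every copy of the shard must receive the write.
        for copy in self._copies(key):
            copy[key] = value

    def read(self, key, copy_index=0):
        # Any copy of the shard can serve the read.
        return self._copies(key)[copy_index][key]


coordinator = Coordinator()
coordinator.write("tom", "6573672017456")
print(coordinator.read("tom", copy_index=1))  # 6573672017456
```

Every request passes through the coordinator in this sketch, which is part of what makes the combined setup painful to operate.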
CAP theorem
RDBMS
ACID & BASE
● ACID:
– Atomicity
– Consistency
– Isolation
– Durability
● BASE:
– Basically Available
– Soft-state
– Eventual consistency
Cassandra \o/
Cassandra Cluster
● All nodes are equal
● All nodes can process any request
● Throughput and storage space scale linearly
● Graceful degradation
Cassandra: Gossip
[Diagram: ring of eight nodes, numbered 1 to 8]
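A toy simulation of the gossip idea, assuming each node simply swaps its list of known nodes with one random peer per round. Real Cassandra gossip exchanges versioned state about each node; the functions below are hypothetical and only show how knowledge spreads without a central registry.

```python
import random


def gossip_round(known, rng):
    """Each node merges what it knows with one randomly chosen peer."""
    nodes = list(known)
    for node in nodes:
        peer = rng.choice([n for n in nodes if n != node])
        merged = known[node] | known[peer]
        known[node] = merged
        known[peer] = set(merged)


def rounds_until_converged(node_count=8, seed=0, max_rounds=1000):
    """Count gossip rounds until every node knows about every other node."""
    rng = random.Random(seed)
    # Initially every node only knows about itself.
    known = {n: {n} for n in range(1, node_count + 1)}
    everyone = set(known)
    for round_number in range(1, max_rounds + 1):
        gossip_round(known, rng)
        if all(view == everyone for view in known.values()):
            return round_number
    raise RuntimeError("gossip did not converge")


print(rounds_until_converged())
```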
Cassandra: Replication
Token Ring
[Diagram: ring of eight nodes, numbered 1 to 8, each owning one key range]
[A - D]
[E - H]
[I - K]
[L - N]
[O - Q]
[R - T]
[U - W]
[X - Z]
replication = 3
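The ring with replication = 3 can be sketched as follows: a key goes to the node owning its range, plus the next two nodes clockwise. Which node owns which range is an assumption here (the diagram does not survive in the transcript), and real Cassandra places keys by a hash of the key rather than by first letter.

```python
import bisect

# Assumed assignment of the slide's key ranges to nodes 1..8, in ring order.
# Each entry is (last letter of the range, node number).
RING = [
    ("D", 1), ("H", 2), ("K", 3), ("N", 4),
    ("Q", 5), ("T", 6), ("W", 7), ("Z", 8),
]
UPPER_BOUNDS = [upper for upper, _ in RING]


def replicas_for(key, replication=3):
    """Return the owning node plus the next nodes clockwise on the ring."""
    first = key[:1].upper()
    if not "A" <= first <= "Z":
        raise KeyError(f"no range covers key {key!r}")
    owner_index = bisect.bisect_left(UPPER_BOUNDS, first)
    return [RING[(owner_index + i) % len(RING)][1] for i in range(replication)]


print(replicas_for("tom"))  # [6, 7, 8]
print(replicas_for("zoe"))  # [8, 1, 2]
```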
Cassandra: Request Processing
[Diagram sequence on a ring of eight nodes, numbered 1 to 8. Labels shown, slide by slide:]
– K: 12-24?
– K: 12-24? K: 12-24?
– K: 12-24! K: 12-24!
– K: 12-24!
– K:22 = “tom”
– K:22 = “tom” K:22 = “tom”
– Got it! Great success!
– K:22 = “tom” K:22 = “tom” K:22 = “jerry”
– K:22 = “tom” K:22 = “tom” K:22 = “jerry” K:22 = “tom”
– K:22 = “tom” K:22 = “tom” K:22 = “tom”
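The last slides show one replica holding “jerry” while the others hold “tom”, and then all three agreeing again. A rough sketch of such a repair on read, assuming each value carries a write timestamp and the newest one wins; the timestamps and the `read_with_repair` helper are made up for illustration.

```python
def read_with_repair(replicas, key):
    """Read a key from every replica, return the newest value, fix stale copies."""
    # Each replica stores key -> (timestamp, value); the highest timestamp wins.
    newest = max(replica[key] for replica in replicas if key in replica)
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # repair the stale or missing copy
    return newest[1]


replicas = [
    {22: (2, "tom")},
    {22: (2, "tom")},
    {22: (1, "jerry")},  # assumed to have missed the latest write
]
print(read_with_repair(replicas, 22))  # tom
print(replicas[2][22])                 # (2, 'tom')
```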
Cassandra: Tunable Consistency
● Consistency.ANY
● Consistency.ONE
● Consistency.QUORUM
● Consistency.ALL
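A small sketch of what these levels mean in terms of how many replicas must respond, and of the usual rule that a read is guaranteed to see the latest write when read and write replica counts together exceed the replication factor. The helper names are hypothetical. ANY is left out because a write at that level can succeed without reaching any replica that owns the key.

```python
def required_acks(level, replication):
    """Number of replicas that must respond for the given consistency level."""
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication // 2 + 1
    if level == "ALL":
        return replication
    raise ValueError(f"unsupported level {level!r}")


def overlaps(read_level, write_level, replication):
    """True when every read must touch at least one replica that saw the write."""
    reads = required_acks(read_level, replication)
    writes = required_acks(write_level, replication)
    return reads + writes > replication


print(required_acks("QUORUM", 3))        # 2
print(overlaps("QUORUM", "QUORUM", 3))   # True
print(overlaps("ONE", "ONE", 3))         # False
```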
Cassandra: Data Model
CREATE TABLE transactions (
    account_id text,       -- row key: unit of partitioning
    date timestamp,        -- column key: order within the row
    amount float,
    other_account text,
    PRIMARY KEY (account_id, date)
);
Row NLRABOxxx:
29-03-2015 15:23:45 | amount: -143.90 | other: NLINGxxx
29-03-2015 16:56:21 | amount: -29.30 | other: NLABNxxx
30-03-2015 04:00:00 | amount: +2000.00 | other: NLASNxxx
Cassandra: Data Model
● Primary Key #1:
– Row key
– Unit of partitioning
● Primary Key #2:
– Column key
– Order within row
– Order is preserved on disk!
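The two parts of the primary key can be pictured with a toy in-memory model: the row key selects a partition, and the column key decides where a value sits inside it. `WideRowTable` is a hypothetical class for illustration, not how Cassandra stores data internally.

```python
import bisect
from collections import defaultdict
from datetime import datetime


class WideRowTable:
    """Toy model: row key -> list of (column key, values), kept sorted."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, row_key, column_key, values):
        row = self.partitions[row_key]
        keys = [k for k, _ in row]
        index = bisect.bisect_left(keys, column_key)
        if index < len(row) and row[index][0] == column_key:
            row[index] = (column_key, values)  # same primary key: overwrite
        else:
            row.insert(index, (column_key, values))  # keep column order

    def row(self, row_key):
        return list(self.partitions.get(row_key, []))


table = WideRowTable()
# Inserted out of order on purpose; the row stays sorted by date.
table.insert("NLRABOxxx", datetime(2015, 3, 30, 4, 0, 0),
             {"amount": 2000.00, "other": "NLASNxxx"})
table.insert("NLRABOxxx", datetime(2015, 3, 29, 15, 23, 45),
             {"amount": -143.90, "other": "NLINGxxx"})
table.insert("NLRABOxxx", datetime(2015, 3, 29, 16, 56, 21),
             {"amount": -29.30, "other": "NLABNxxx"})
for when, values in table.row("NLRABOxxx"):
    print(when, values)
```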
Distributed Concurrent Writes
● Last write wins
● Paxos
● Treat data as immutable
from: https://aphyr.com/posts/294-call-me-maybe-cassandra
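A small sketch of why "last write wins" can lose data and how treating data as immutable avoids it, assuming two clients update the same balance concurrently. The timestamps, amounts, and helper names are invented for illustration.

```python
def last_write_wins(writes):
    """Keep only the value carrying the highest timestamp."""
    return max(writes)[1]


def balance_from_events(start, events):
    """Derive the balance from immutable events instead of overwriting it."""
    return start + sum(amount for _, amount in events)


# Two clients both read balance 100 and concurrently write back their result.
concurrent_writes = [(1001, 100 + 50), (1002, 100 - 30)]
print(last_write_wins(concurrent_writes))  # 70, the +50 update is lost

# Immutable alternative: each client appends its own event, nothing is overwritten.
events = [(1001, +50), (1002, -30)]
print(balance_from_events(100, events))  # 120
```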
Distributed Concurrent Writes
● Cassandra trades consistency for availability, partition tolerance, and super easy administration...
● This is not for all use cases
THE use case: time series data
CREATE TABLE sensor_data (
    sensor_id text,
    time timestamp,
    value text,
    PRIMARY KEY (sensor_id, time)
);
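Because readings for one sensor sit together and stay ordered by time, a time-range query becomes one contiguous slice. A minimal sketch of that access pattern, using an in-memory stand-in for the table; the sensor name and readings are hypothetical sample data.

```python
import bisect
from datetime import datetime

# One partition per sensor_id; readings are kept sorted by time.
sensor_data = {
    "sensor-1": [
        (datetime(2015, 3, 31, 12, 0), "20.1"),
        (datetime(2015, 3, 31, 12, 5), "20.4"),
        (datetime(2015, 3, 31, 12, 10), "20.9"),
        (datetime(2015, 3, 31, 12, 15), "21.3"),
    ],
}


def readings_between(sensor_id, start, end):
    """Return readings with start <= time < end as one contiguous slice."""
    row = sensor_data.get(sensor_id, [])
    times = [t for t, _ in row]
    low = bisect.bisect_left(times, start)
    high = bisect.bisect_left(times, end)
    return row[low:high]


selected = readings_between(
    "sensor-1", datetime(2015, 3, 31, 12, 5), datetime(2015, 3, 31, 12, 15)
)
print([value for _, value in selected])  # ['20.4', '20.9']
```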
Thanks!
Cassandra?