DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - NoSQl matters...
-
Upload
nosqlmatters -
Category
Data & Analytics
-
view
695 -
download
2
Transcript of DOAN DuyHai – Cassandra: real world best use-cases and worst anti-patterns - NoSQl matters...
@doanduyhai
Cassandra nice use-cases and worst anti-patterns DuyHai DOAN, Technical Advocate
@doanduyhai
Shameless self-promotion!
2
Duy Hai DOAN Cassandra technical advocate • talks, meetups, confs • open-source devs (Achilles, …) • technical point of contact
☞ [email protected] • production troubleshooting
@doanduyhai
Agenda!
3
Anti-patterns • Queue-like designs • CQL null values • Intensive updates on same column • Design around dynamic schema
@doanduyhai
Agenda!
4
Nice use-cases • Rate-limiting • Anti Fraud • Account validation • Sensor data timeseries
Data Model Crash Course!
@doanduyhai
Last Write Win (LWW)!
6
jdoe age name
33 John DOE
INSERT INTO users(login, name, age) VALUES(‘jdoe’, ‘John DOE’, 33);
#partition
@doanduyhai
Last Write Win (LWW)!
jdoe age (t1) name (t1)
33 John DOE
7
INSERT INTO users(login, name, age) VALUES(‘jdoe’, ‘John DOE’, 33);
auto-generated timestamp (μs)
.
@doanduyhai
Last Write Win (LWW)!
8
UPDATE users SET age = 34 WHERE login = jdoe;
jdoe age (t1) name (t1)
33 John DOE jdoe
age (t2)
34
SSTable1 SSTable2
@doanduyhai
Last Write Win (LWW)!
9
DELETE age FROM users WHERE login = jdoe;
jdoe age (t3)
ý
tombstone
jdoe age (t1) name (t1)
33 John DOE jdoe
age (t2)
34
SSTable1 SSTable2 SSTable3
@doanduyhai
Last Write Win (LWW)!
10
SELECT age FROM users WHERE login = jdoe;
? ? ?
SSTable1 SSTable2 SSTable3
jdoe age (t3)
ý jdoe
age (t1) name (t1)
33 John DOE jdoe
age (t2)
34
@doanduyhai
Last Write Win (LWW)!
11
SELECT age FROM users WHERE login = jdoe;
✓ ✕ ✕
SSTable1 SSTable2 SSTable3
jdoe age (t3)
ý jdoe
age (t1) name (t1)
33 John DOE jdoe
age (t2)
34
@doanduyhai
Compaction!
12
SSTable1 SSTable2 SSTable3
jdoe age (t3)
ý jdoe
age (t1) name (t1)
33 John DOE jdoe
age (t2)
34
New SSTable
jdoe age (t3) name (t1)
ý John DOE
@doanduyhai
Simple Table!
13
CREATE TABLE users ( login text, name text, age int, … PRIMARY KEY(login));
partition key (#partition)
@doanduyhai
Clustered table (1 – N)!
14
CREATE TABLE sensor_data ( sensor_id text, date timestamp, raw_data blob, PRIMARY KEY((sensor_id), date));
partition key clustering column (sorted)
unicity
@doanduyhai
Sorted on disk!
sensor_id date1(t1) date2(t2) date3(t3) date4(t4) date5(t5) …
… … … … …
SSTable1
Worst anti-patterns!
Queue-like designs!CQL null!
Intensive updates on same column!Design around dynamic schema!
!
@doanduyhai
Failure level!
17
☠
☠☠
☠☠☠
☠☠☠☠
@doanduyhai
Queue-like designs!
18
Adding new message ☞ 1 physical insert
@doanduyhai
Queue-like designs!
19
Adding new message ☞ 1 physical insert Consuming message = deleting it ☞ 1 physical insert (tombstone)
@doanduyhai
Queue-like designs!
20
Adding new message ☞ 1 physical insert Consuming message = deleting it ☞ 1 physical insert (tombstone) Transactional queue = re-inserting messages ☞ physical insert * <many>
@doanduyhai
Queue-like designs!
21
A
FIFO queue
{ A }
@doanduyhai
Queue-like designs!
22
A B
FIFO queue
{ A, B }
@doanduyhai
Queue-like designs!
23
A B C
FIFO queue
{ A, B, C }
@doanduyhai
Queue-like designs!
24
A B C A
FIFO queue
{ B, C }
@doanduyhai
Queue-like designs!
25
A B C A D
FIFO queue
{ B, C, D }
@doanduyhai
Queue-like designs!
26
A B C A D B
FIFO queue
{ C, D }
@doanduyhai
Queue-like designs!
27
A B C A D B C
FIFO queue
{ D }
@doanduyhai
Queue-like designs!
28
A A A A A A A A A A
FIFO queue, worst case
{ }
@doanduyhai
Failure level!
29
☠☠☠
@doanduyhai
Queue-like designs!
30
Solution: event-sourcing • write ahead, never delete • read = move a cursor forward (or backward in time for history)
A B C D A E
Write cursor
Read cursor. Next read will give {A, E}
@doanduyhai
CQL null semantics!
31
Reading null value means • value does not exist (has never bean created) • value deleted (tombstone)
SELECT age FROM users WHERE login = jdoe; à NULL
@doanduyhai
CQL null semantics!
32
Writing null means • delete value (creating tombstone) • even though it does not exist
UPDATE users SET age = NULL WHERE login = jdoe;
@doanduyhai
CQL null semantics!
33
Seen in production: prepared statement
UPDATE users SET age = ?, … geo_location = ?, mood = ?, … WHERE login = ?;
@doanduyhai
CQL null semantics!
34
Seen in production: bound statement
preparedStatement.bind(33, …, null, null, null, …);
null ☞ tombstone creation on each update …
jdoe age name geo_loc mood status
33 John DOE ý ý ý
@doanduyhai
Failure level!
35
☠
@doanduyhai
Intensive update!
36
Context • small start-up • cloud-based video recording & alarm • internet of things (sensor) • 10 updates/sec for some sensors
@doanduyhai
Intensive update on same column!
37
Data model
sensor_id value
45.0034
CREATE TABLE sensor_data ( sensor_id long, value double, PRIMARY KEY(sensor_id));
@doanduyhai
Intensive update on same column!
38
Updates
sensor_id value (t1)
45.0034
UPDATE sensor_data SET value = 45.0034 WHERE sensor_id = …; UPDATE sensor_data SET value = 47.4182 WHERE sensor_id = …; UPDATE sensor_data SET value = 48.0300 WHERE sensor_id = …;
sensor_id value (t13)
47.4182 sensor_id
value (t36)
48.0300
@doanduyhai
Intensive update on same column!
39
Read
SELECT sensor_value from sensor_data WHERE sensor_id = …;
read N physical columns, only 1 useful … (until compaction)
sensor_id value (t1)
45.0034 sensor_id
value (t13)
47.4182 sensor_id
value (t36)
48.0300
@doanduyhai
Failure level!
40
☠☠
@doanduyhai
Intensive update on same column!
41
Solution 1: leveled compaction! (if your I/O can keep up)
sensor_id value (t1)
45.0034 sensor_id
value (t13)
47.4182 sensor_id
value (t36)
48.0300
sensor_id value (t36)
48.0300
@doanduyhai
Intensive update on same column!
42
Solution 2: reversed timeseries & DateTiered compaction strategy
CREATE TABLE sensor_data ( sensor_id long, date timestamp, value double, PRIMARY KEY((sensor_id), date)) WITH CLUSTERING ORDER (date DESC);
@doanduyhai
Intensive update on same column!
43
Data cleaning by configuration the strategy (base_time_seconds)
SELECT sensor_value FROM sensor_data WHERE sensor_id = … LIMIT 1;
sensor_id date3(t3) date2(t2) date1(t1) ...
48.0300 47.4182 45.0034 …
@doanduyhai
Design around dynamic schema!
44
Customer emergency call • 3 nodes cluster almost full • impossible to scale out • 4th node in JOINING state for 1 week • disk space is filling up, production at risk!
@doanduyhai
Design around dynamic schema!
45
After investigation • 4th node in JOINING state because streaming is stalled • NPE in logs
@doanduyhai
Design around dynamic schema!
46
After investigation • 4th node in JOINING state because streaming is stalled • NPE in logs Cassandra source-code to the rescue
@doanduyhai
Design around dynamic schema!
47
public class CompressedStreamReader extends StreamReader { … @Override public SSTableWriter read(ReadableByteChannel channel) throws IOException { … Pair<String, String> kscf = Schema.instance.getCF(cfId); ColumnFamilyStore cfs = Keyspace.open(kscf.left).getColumnFamilyStore(kscf.right);
NPE here
@doanduyhai
Design around dynamic schema!
48
The truth is • the devs dynamically drop & recreate table every day • dynamic schema is in the core of their design Example:
DROP TABLE catalog_127_20140613; CREATE TABLE catalog_127_20140614( … );
@doanduyhai
Design around dynamic schema!
49
Failure sequence
n1
n2
n4
n3
catalog_x_y
catalog_x_y
catalog_x_y
catalog_x_y
1 4
2
3
5
6
@doanduyhai
Design around dynamic schema!
50
Failure sequence
n1
n2
n4
n3
catalog_x_y
catalog_x_y
catalog_x_y
catalog_x_y
1 4
2
3
5
6
catalog_x_z
catalog_x_z
catalog_x_z
catalog_x_z
@doanduyhai
Design around dynamic schema!
51
Failure sequence
n1
n2
n4
n3
1 4
2
3
5
6
catalog_x_z
catalog_x_z
catalog_x_z
catalog_x_z
catalog_x_y ????
@doanduyhai
Design around dynamic schema!
52
Nutshell • dynamic schema change as normal prod operation is not
recommended • schema AND topology change at the same time is an anti-pattern
@doanduyhai
Failure level!
53
☠☠☠☠
Q & R
! " !
Nice Examples!
Rate limiting!Anti Fraud!
Account Validation!
@doanduyhai
Rate limiting!
56
Start-up company, reset password feature
1) /password/reset
2) SMS with token A0F83E63DB935465CE73DFE….
Phone number Random token
3) /password/new/<token>/<password>
@doanduyhai
Rate limiting!
57
Problem 1 • account created with premium phone number
@doanduyhai
Rate limiting!
58
Problem 1 • account created with premium phone number • /password/reset x 100
@doanduyhai
Rate limiting!
59
« money, money, money, give money, in the richman’s world » $$$
@doanduyhai
Rate limiting!
60
Problem 2 • massive hack
@doanduyhai
Rate limiting!
61
Problem 2 • massive hack • 106 /password/reset calls from few accounts
@doanduyhai
Rate limiting!
62
Problem 2 • massive hack • 106 /password/reset calls from few accounts • SMS messages are cheap
@doanduyhai
Rate limiting!
63
Problem 2 • ☞ but not at the 106/per user/per day scale
@doanduyhai
Rate limiting!
64
Solution • premium phone number ☞ Google libphonenumber
@doanduyhai
Rate limiting!
65
Solution • premium phone number ☞ Google libphonenumber • massive hack ☞ rate limiting with Cassandra
@doanduyhai
Cassandra Time To Live!
66
Time to live • built-in feature • insert data with a TTL in sec • expires server-side automatically • ☞ use as sliding-window
@doanduyhai
Rate limiting in action!
67
Implementation • threshold = max 3 reset password per sliding 24h per
user
@doanduyhai
Rate limiting in action!
68
Implementation • when /password/reset called • check threshold • reached ☞ error message/ignore
• not reached ☞ log the attempt with TTL = 86400
Rate Limiting Demo
@doanduyhai
Anti Fraud!
70
Real story • many special offers available • 30 mins international calls (50 countries)
• unlimited land-line calls to 5 countries • …
@doanduyhai
Anti Fraud!
71
Real story • each offer has a duration (week/month/year) • only one offer active at a time
@doanduyhai
Anti Fraud!
72
Cassandra TTL • when granting new offer
INSERT INTO user_special_offer(login, offer_code, …) VALUES(‘jdoe’, ’30_mins_international’,…) IF NOT EXISTS USING TTL <offer_duration>;
Anti Fraud Demo
@doanduyhai
Account Validation!
74
Requirement • user creates new account • sends sms/email link with token to validate account • 10 days to validate
@doanduyhai
Account Validation!
75
How to ? • create account with 10 days TTL
INSERT INTO users(login, name, age) VALUES(‘jdoe’, ‘John DOE’, 33) USING TTL 864000;
@doanduyhai
Account Validation!
76
How to ? • create random token for validation with 10 days TTL
INSERT INTO account_validation(token, login, name, age) VALUES(‘A0F83E63DB935465CE73DFE…’, ‘jdoe’, ‘John DOE’, 33) USING TTL 864000;
@doanduyhai
Account Validation!
77
On token validation • check token exist & retrieve user details
SELECT login, name, age FROM account_validation WHERE token = ‘A0F83E63DB935465CE73DFE…’;
• re-insert durably user details without TTL INSERT INTO users(login, name, age) VALUES(‘jdoe’, ‘John DOE’, 33);
Q & R
! " !