Page 1: Cassandra at Zalando

Cassandra at Zalando

Early experiences, mistakes and the future

2 March 2016

Luis Mineiro

Outline

1. Early experiences

2. Mistakes and learnings

3. Data modeling

4. Future of Cassandra at Zalando

Early Experiences

The persistent sessions (aka PESSIONs)

Replaced Memcached + MySQL

Recommendations

Replaced MongoDB

ZMON (https://demo.zmon.io)

KairosDB on top of Cassandra

All those are still alive and kicking

Mistakes

Obviously, we made "some" mistakes:

Bad cluster planning

Poor choices on compaction

Wrong data models

etc...

But we learned a lot from them!

FU#01 - Over-sized nodes

We started with a relatively small number of super-sized nodes. Each node had lots of CPU and lots of storage.

The first cluster had 10 nodes, evenly distributed across 2 data centers

Recovering or repairing a node required a lot of time

Losing a single node had a significant performance impact on the rest of the cluster
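
A rough back-of-the-envelope sketch in Python shows why a few large nodes make a single failure hurt. It assumes the load is spread evenly and that the surviving nodes absorb the failed node's share equally. The function name and the 40-node comparison cluster are illustrative, not taken from our setup. The 5-node row corresponds to one of our two data centers only under the further assumption that traffic stays local to a data center.

```python
def extra_load_per_survivor(nodes: int, failed: int = 1) -> float:
    """Fractional load increase on each surviving node after `failed` nodes drop out.

    Assumes an evenly balanced cluster whose total load stays constant.
    """
    if not 0 <= failed < nodes:
        raise ValueError("failed must be >= 0 and smaller than nodes")
    survivors = nodes - failed
    return nodes / survivors - 1.0


if __name__ == "__main__":
    # 5 = one data center of the 10-node cluster, 40 = hypothetical cluster of smaller nodes
    for nodes in (5, 10, 40):
        pct = extra_load_per_survivor(nodes) * 100
        print(f"{nodes} nodes: each survivor takes about {pct:.1f}% more load")
```

The same total capacity split over more, smaller nodes spreads a failure much thinner, and each node also holds less data to stream during a repair.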

FU#02 - Bad storage planning (1/2)

We estimated the required storage and provisioned each node accordingly. We used RF=3 and thought we would be totally safe up to 80% usage

FU#02 - Bad storage planning (2/2)

Depending on the compaction strategy and number of SSTables, you may need up to double the amount of disk space for a full compaction.
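
The arithmetic behind this can be sketched in a few lines of Python. This is a simplified worst-case model, assuming a single compaction rewrites all the data on a node and temporarily needs extra space proportional to it. The function names and the 1000 GB disk are hypothetical.

```python
def max_safe_usage(compaction_overhead: float) -> float:
    """Largest fraction of the disk that live data may occupy.

    `compaction_overhead` is the temporary extra space a compaction needs,
    as a fraction of the data being compacted (1.0 means "double the space").
    """
    if compaction_overhead < 0:
        raise ValueError("compaction_overhead must not be negative")
    return 1.0 / (1.0 + compaction_overhead)


def usable_capacity_gb(disk_gb: float, compaction_overhead: float) -> float:
    """Disk space that can hold live data while leaving room for compaction."""
    return disk_gb * max_safe_usage(compaction_overhead)


if __name__ == "__main__":
    disk_gb = 1000  # hypothetical disk size per node
    for overhead in (0.25, 0.5, 1.0):
        print(
            f"overhead {overhead:.0%}: keep usage below "
            f"{max_safe_usage(overhead):.0%} "
            f"({usable_capacity_gb(disk_gb, overhead):.0f} GB of {disk_gb} GB)"
        )
```

Under this model, an 80% usage target only leaves room for 25% overhead. If a compaction can need double the space, the safe limit drops to 50%.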

FU#03 - Wrong compaction strategies

The default SizeTieredCompactionStrategy seemed fine for all cases

But for data that is continuously expiring, it's NOT a good fit. LeveledCompactionStrategy is more appropriate (http://www.datastax.com/dev/blog/when-to-use-leveled-compaction).

DateTieredCompactionStrategy is tricky (CASSANDRA-9666, https://issues.apache.org/jira/browse/CASSANDRA-9666). The original author created an alternative, TimeWindowCompactionStrategy (https://github.com/jeffjirsa/twcs/).

FU#04 - Cassandra configuration

We tried to be smart and tuned the values of concurrent_reads and concurrent_writes.

The defaults were carefully chosen and they will do the Right Thing™. Only if, and when, you know each and every corner of the system should you venture into changing them.

Data modeling and random stuff

RDBMS

In the relational world it's common practice:

think about the data

model it

build the application

3rd: Build application

2nd: Normalization

1st: Data

This works fine because you can join and aggregate the data however you need.

Cassandra

With Cassandra it's the other way around. You MUST start with 'How will I access the data?'

3rd: Data

2nd: De-normalization

1st: Identify your queries

Knowing your queries in advance is NOT optional

This is different from RDBMS because you can't just JOIN or create new indexes to support new queries

What Does it Mean?

An example

Let's consider an application where videos are published and users can comment on them.

create table video (
    id uuid,
    description text,
    tags set<text>,
    primary key (id)
);

create table user (
    id text,
    password text,
    first_name text,
    last_name text,
    primary key (id)
);

This looks like a reasonable data model for videos and users

Commenting on videos

create table comment (
    video_id uuid,
    user_id text,
    comment_date timestamp,
    content text,
    primary key (video_id, user_id, comment_date)
);

How do we get the comments for a given video?

select * from comment where video_id = 7ede2c5e-8814-4516-a20d-bf01d4da381c;

What about for a given user?

select * from comment where user_id = 'lmineiro';

You wish!

You should get the infamous error:

Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING

Getting comments for a given user

Let's try the suggestion

select * from comment where user_id = 'lmineiro' allow filtering;

 video_id                             | user_id  | comment_date             | content
--------------------------------------+----------+--------------------------+--------------------
 7ede2c5e-8814-4516-a20d-bf01d4da381c | lmineiro | 2016-03-02 13:18:05+0000 | Some dummy comment

Seems to work. But this will query all the nodes and won't be efficient.

We could still add an index:

create index comment_user_id on comment(user_id);

It would optimize the previous query slightly, but it's still far from ideal.
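
To illustrate why filtering on user_id is expensive, here is a toy in-memory model in Python. It is only a sketch under strong assumptions: rows are grouped by partition key in a dictionary, and the cost of a query is the number of partitions it has to look at. ToyTable and its methods are made up for this illustration and are not part of any Cassandra driver.

```python
from collections import defaultdict


class ToyTable:
    """Rows grouped by partition key, roughly like a Cassandra table."""

    def __init__(self, partition_key: str):
        self.partition_key = partition_key
        self.partitions = defaultdict(list)
        self.partitions_read = 0  # counts partitions touched by queries

    def insert(self, row: dict) -> None:
        self.partitions[row[self.partition_key]].append(row)

    def select_by_partition(self, key) -> list:
        """Cheap: go straight to the one partition that holds the key."""
        self.partitions_read += 1
        return list(self.partitions.get(key, []))

    def select_with_filtering(self, column: str, value) -> list:
        """Expensive: inspect every partition, similar to ALLOW FILTERING."""
        matches = []
        for rows in self.partitions.values():
            self.partitions_read += 1
            matches.extend(r for r in rows if r[column] == value)
        return matches


if __name__ == "__main__":
    comment = ToyTable(partition_key="video_id")
    for n in range(1000):
        comment.insert({"video_id": f"video-{n}", "user_id": f"user-{n % 50}"})
    comment.insert({"video_id": "video-7", "user_id": "lmineiro"})

    comment.select_by_partition("video-7")
    print("partitions read when querying by video_id:", comment.partitions_read)

    comment.partitions_read = 0
    comment.select_with_filtering("user_id", "lmineiro")
    print("partitions read when filtering by user_id:", comment.partitions_read)
```

In a real cluster the partitions are spread over the nodes, so touching every partition means involving every node.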

Denormalization

Create multiple tables to support different queries

create table comment_by_user (
    user_id text,
    video_id uuid,
    comment_date timestamp,
    content text,
    primary key (user_id, video_id, comment_date)
);

create table comment_by_video (
    video_id uuid,
    user_id text,
    comment_date timestamp,
    content text,
    primary key (video_id, user_id, comment_date)
);

Inserting comments

Every time a user comments on a video, you need to insert a row into each table.
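
On the application side, the double write could look roughly like the following Python sketch. It uses two in-memory dictionaries as stand-ins for the two tables, so it only illustrates the pattern. add_comment and the dictionary names are hypothetical and no real driver is involved.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Stand-ins for the two tables, keyed by their partition keys.
comment_by_user = defaultdict(list)   # partition key: user_id
comment_by_video = defaultdict(list)  # partition key: video_id


def add_comment(user_id, video_id, content, comment_date=None):
    """Write the same comment once per query table."""
    if comment_date is None:
        comment_date = datetime.now(timezone.utc)
    row = {
        "user_id": user_id,
        "video_id": video_id,
        "comment_date": comment_date,
        "content": content,
    }
    # Each table gets its own copy of the row: the data is duplicated on purpose.
    comment_by_user[user_id].append(dict(row))
    comment_by_video[video_id].append(dict(row))
    return row


if __name__ == "__main__":
    add_comment("lmineiro", "7ede2c5e-8814-4516-a20d-bf01d4da381c", "Dummy comment")
    print(comment_by_user["lmineiro"])
    print(comment_by_video["7ede2c5e-8814-4516-a20d-bf01d4da381c"])
```

Computing comment_date once in the application keeps both copies identical. In this toy version the two appends cannot fail independently, which is not true for two writes against a real cluster.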

Think again

We DON'T have transactions. At least not as we're used to.

We can batch (https://docs.datastax.com/en/developer/java-driver/2.1/java-driver/reference/batch-statements.html) the insert statements though.

begin batch using timestamp 123456789
    insert into comment_by_user (user_id, video_id, comment_date, content)
    values ('lmineiro', 7ede2c5e-8814-4516-a20d-bf01d4da381c,
            dateof(now()), 'Dummy comment');
    insert into comment_by_video (video_id, user_id, comment_date, content)
    values (7ede2c5e-8814-4516-a20d-bf01d4da381c, 'lmineiro',
            dateof(now()), 'Dummy comment');
apply batch;

Finally

Let's try to repeat the query for comments from a given user:

select * from comment_by_user where user_id='lmineiro';

 user_id  | video_id                             | comment_date             | content
----------+--------------------------------------+--------------------------+----------------
 lmineiro | 7ede2c5e-8814-4516-a20d-bf01d4da381c | 2016-03-02 13:43:49+0000 | Dummy comment

No error anymore. If we need to query the comments for a given video:

select * from comment_by_video where video_id = 7ede2c5e-8814-4516-a20d-bf01d4da381c;

 video_id                             | user_id  | comment_date             | content
--------------------------------------+----------+--------------------------+----------------
 7ede2c5e-8814-4516-a20d-bf01d4da381c | lmineiro | 2016-03-02 13:43:49+0000 | Dummy comment

Future

We continued to invest in Cassandra and we have a lot more teams and applications using it.

Some of them, in no particular order:

Cart and Checkout

IAM PlanB (JSON Web Token Provider)

The Platform

Many others...

Thank you

Luis Mineiro

@voidmaze (http://twitter.com/voidmaze)