
Cassandra at Zalando

Early experiences, mistakes and the future

2 March 2016

Luis Mineiro

Outline

1. Early experiences

2. Mistakes and learnings

3. Data modeling

4. Future of Cassandra at Zalando

Early Experiences

The persistent sessions (aka PESSIONs)

Replaced Memcached + MySQL

Recommendations

Replaced MongoDB

ZMON (https://demo.zmon.io)

KairosDB on top of Cassandra

All those are still alive and kicking


Mistakes

Obviously, we made "some" mistakes:

Bad cluster planning

Poor choices on compaction

Wrong data models

etc...

But we learned a lot from them!

FU#01 - Over-sized nodes

We started with a relatively small number of oversized nodes. Each node had lots of CPU and lots of storage.

The first cluster had 10 nodes, evenly distributed across 2 data centers

Recovering or repairing a node required a lot of time

Loss of a single node had a relevant performance impact on the rest of the cluster
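
A rough back-of-the-envelope sketch of why a few big nodes hurt. It assumes load is spread evenly within a data center and that a replacement node is rebuilt at a constant streaming rate. The function names and the 4 TB / 50 MB/s figures are made up for illustration; they are not measurements from the clusters described here.

```python
def extra_load_after_node_loss(nodes_in_dc: int) -> float:
    """Fractional load increase on each surviving node when one node is lost.

    Assumes the lost node's share is spread evenly over the survivors.
    """
    if nodes_in_dc < 2:
        raise ValueError("need at least two nodes to absorb a failure")
    return 1 / (nodes_in_dc - 1)


def rebuild_hours(data_per_node_gb: float, stream_mb_per_s: float) -> float:
    """Hours needed to stream a node's data at a constant rate (hypothetical model)."""
    seconds = data_per_node_gb * 1024 / stream_mb_per_s
    return seconds / 3600


# 10 nodes over 2 data centers means 5 nodes per data center.
print(extra_load_after_node_loss(5))        # 0.25
print(round(rebuild_hours(4096, 50), 1))    # 23.3
```

With the same total capacity split over more, smaller nodes, both numbers shrink: each survivor absorbs a smaller share and each node holds less data to stream.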

FU#02 - Bad storage planning (1/2)

We estimated the required storage and provisioned each node accordingly. We used RF=3 and thought we would be totally safe up to 80% usage

Bad storage planning (2/2)

Depending on the compaction strategy and number of SSTables, you may need up to double the amount of disk space for a full compaction.
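
The arithmetic behind this can be sketched as follows. It is a simplified model, assuming data is spread evenly over the nodes and that compaction needs temporary space proportional to the live data; the function names and the 1 TB example are hypothetical.

```python
def safe_fill_ratio(compaction_overhead: float) -> float:
    """Largest fraction of a disk that live data may occupy.

    compaction_overhead is the temporary extra space a compaction may need,
    as a multiple of the live data (1.0 means "double the space").
    """
    return 1 / (1 + compaction_overhead)


def disk_per_node_gb(raw_gb: float, replication_factor: int,
                     nodes: int, compaction_overhead: float) -> float:
    """Disk to provision per node so that a full compaction still fits."""
    live_per_node = raw_gb * replication_factor / nodes
    return live_per_node * (1 + compaction_overhead)


print(safe_fill_ratio(1.0))                  # 0.5
print(disk_per_node_gb(1000, 3, 10, 1.0))    # 600.0
```

Under the worst case named above, the safe fill level is 50%, not 80%.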

FU#03 - Wrong compaction strategies

The default SizeTieredCompactionStrategy seemed fine for all cases

But for data that is continuously expiring, it's NOT a good fit. LeveledCompactionStrategy is more appropriate (http://www.datastax.com/dev/blog/when-to-use-leveled-compaction).

DateTieredCompactionStrategy is tricky (CASSANDRA-9666, https://issues.apache.org/jira/browse/CASSANDRA-9666). The original author created an alternative, TimeWindowCompactionStrategy (https://github.com/jeffjirsa/twcs/).


FU#04 - Cassandra configuration

We tried to be smart and tuned the values of concurrent_reads and concurrent_writes.

The defaults were carefully chosen and they will do the Right Thing™. Only once you know each and every corner should you venture into changing them.

Data modeling and random stuff

RDBMS

In the relational world it's common practice:

think about the data

model it

build the application

3rd: Build application

2nd: Normalization

1st: Data

This works fine because it's possible to join and aggregate the data however needed.

Cassandra

With Cassandra it's the other way around. You MUST start with 'How will I access the data?'

3rd: Data

2nd: De-normalization

1st: Identify your queries

Knowing your queries in advance is NOT optional

This is different from RDBMS because you can't just JOIN or create new indexes to support new queries

What Does it Mean?

An example

Let's consider an application where videos are published and users can comment on them.

create table video (
    id uuid,
    description text,
    tags set<text>,
    primary key (id)
);

create table user (
    id text,
    password text,
    first_name text,
    last_name text,
    primary key (id)
);

This looks like a reasonable data model for videos and users

Commenting on videos

create table comment (
    video_id uuid,
    user_id text,
    comment_date timestamp,
    content text,
    primary key (video_id, user_id, comment_date)
);

How do we get the comments for a given video?

select * from comment where video_id = 7ede2c5e-8814-4516-a20d-bf01d4da381c;

What about for a given user?

select * from comment where user_id = 'lmineiro';

You wish!

You should get the infamous error:

Cannot execute this query as it might involve data filtering and thus may have unpredictable performance. If you want to execute this query despite the performance unpredictability, use ALLOW FILTERING

Getting comments for a given user

Let's try the suggestion

select * from comment where user_id = 'lmineiro' allow filtering;

 video_id                             | user_id  | comment_date             | content
--------------------------------------+----------+--------------------------+--------------------
 7ede2c5e-8814-4516-a20d-bf01d4da381c | lmineiro | 2016-03-02 13:18:05+0000 | Some dummy comment

Seems to work. But this will query all the nodes and won't be efficient.

We could still add an index:

create index comment_user_id on comment(user_id);

It would optimize the previous query slightly, but it's still not ideal.
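
To make the cost difference concrete, here is a toy in-memory model of the comment table, partitioned by video_id. It only counts how many partitions a query has to look at; the class and method names are hypothetical, and it is not how Cassandra is implemented.

```python
from collections import defaultdict


class ToyCommentTable:
    """Rows grouped by partition key (video_id), nothing more."""

    def __init__(self):
        self.partitions = defaultdict(list)

    def insert(self, video_id, user_id, comment_date, content):
        self.partitions[video_id].append((user_id, comment_date, content))

    def select_by_video(self, video_id):
        # The partition key is known, so exactly one partition is read.
        return list(self.partitions.get(video_id, [])), 1

    def select_by_user_filtering(self, user_id):
        # No partition key: every partition has to be scanned and filtered.
        matches, touched = [], 0
        for rows in self.partitions.values():
            touched += 1
            matches.extend(row for row in rows if row[0] == user_id)
        return matches, touched


table = ToyCommentTable()
table.insert("video-1", "lmineiro", "2016-03-02", "Some dummy comment")
table.insert("video-2", "someone", "2016-03-02", "Another comment")
table.insert("video-3", "lmineiro", "2016-03-02", "Yet another comment")

print(table.select_by_video("video-1")[1])            # 1
print(table.select_by_user_filtering("lmineiro")[1])  # 3
```

The number of partitions touched by the filtering query grows with the number of videos, while the query by video_id always reads one.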

Denormalization

Create multiple tables to support different queries

create table comment_by_user (
    user_id text,
    video_id uuid,
    comment_date timestamp,
    content text,
    primary key (user_id, video_id, comment_date)
);

create table comment_by_video (
    video_id uuid,
    user_id text,
    comment_date timestamp,
    content text,
    primary key (video_id, user_id, comment_date)
);

Inserting comments

Every time a user comments on a video, you need to insert a row into each table.

Think again

We DON'T have transactions. At least not as we're used to.

We can batch (https://docs.datastax.com/en/developer/java-driver/2.1/java-driver/reference/batch-statements.html) the insert statements though.

begin batch using timestamp 123456789
    insert into comment_by_user (user_id, video_id, comment_date, content)
    values ('lmineiro', 7ede2c5e-8814-4516-a20d-bf01d4da381c,
            dateof(now()), 'Dummy comment');
    insert into comment_by_video (video_id, user_id, comment_date, content)
    values (7ede2c5e-8814-4516-a20d-bf01d4da381c, 'lmineiro',
            dateof(now()), 'Dummy comment');
apply batch;
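
On the application side, the two writes can be kept together in one small helper. This is a minimal sketch, not tied to any driver: it only builds the statements with `?` bind markers and their parameters, and uses one client-side comment_date for both rows. The helper name `comment_writes` and the constants are hypothetical; how the pairs are handed to a driver's batch API is left out.

```python
from datetime import datetime, timezone
from typing import Optional
from uuid import UUID

INSERT_BY_USER = (
    "insert into comment_by_user (user_id, video_id, comment_date, content) "
    "values (?, ?, ?, ?)"
)
INSERT_BY_VIDEO = (
    "insert into comment_by_video (video_id, user_id, comment_date, content) "
    "values (?, ?, ?, ?)"
)


def comment_writes(user_id: str, video_id: UUID, content: str,
                   comment_date: Optional[datetime] = None):
    """Return one (statement, parameters) pair per denormalized table."""
    # One timestamp for both rows, so the two tables agree on comment_date.
    when = comment_date or datetime.now(timezone.utc)
    return [
        (INSERT_BY_USER, (user_id, video_id, when, content)),
        (INSERT_BY_VIDEO, (video_id, user_id, when, content)),
    ]


writes = comment_writes(
    "lmineiro",
    UUID("7ede2c5e-8814-4516-a20d-bf01d4da381c"),
    "Dummy comment",
)
for statement, parameters in writes:
    print(statement, parameters)
```

Keeping both statements in one place makes it harder to add a third query table later and forget to write to it.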


Finally

Let's try to repeat the query for comments from a given user:

select * from comment_by_user where user_id='lmineiro';

 user_id  | video_id                             | comment_date             | content
----------+--------------------------------------+--------------------------+---------------
 lmineiro | 7ede2c5e-8814-4516-a20d-bf01d4da381c | 2016-03-02 13:43:49+0000 | Dummy comment

No error anymore. If we need to query the comments for a given video:

select * from comment_by_video where video_id = 7ede2c5e-8814-4516-a20d-bf01d4da381c;

 video_id                             | user_id  | comment_date             | content
--------------------------------------+----------+--------------------------+---------------
 7ede2c5e-8814-4516-a20d-bf01d4da381c | lmineiro | 2016-03-02 13:43:49+0000 | Dummy comment

Future

We continued to invest in Cassandra and we have a lot more teams and applications using it.

Some of them, in no particular order:

Cart and Checkout

IAM PlanB (JSON Web Token Provider)

The Platform

Many others...

Thank you

Luis Mineiro

@voidmaze (http://twitter.com/voidmaze)
