C* Summit 2013: The State of CQL by Sylvain Lebresne

The State of CQL

Sylvain Lebresne (@pcmanus)June 12, 2013

Why CQL?(Rational and goals behind CQL)

What is CQL?(How do you model application with CQL)

The native protocol(Transporting CQL queries)

What's next?(Cassandra 2.0 and beyond)

2/26

Disclaimer

This presentation focuses exclusively on CQL version 3. Many things do not apply to CQL version 1 and 2.

Unless explicitly state otherwise, the terms rows and columns means CQL3 rows and CQL3 columns, which doesnot map directly to the notion of rows and columns in thrift (or the internal C* implementation).

··

3/26

Why?Rational and goals behind CQL

The thrift API is:

Cassandra has often been regarded as hard to develop against.

It doesn't have to be that way!

Not user friendly, hard to use.

Low level.

Very little abstraction.

Hard to evolve (in a backward compatible way).

····

5/26

Why the hell a SQL look-alike query language?!

So why not?

Very easy to read.

Programming Language independent.

Ubiquitous, widely known.

Copy/paste friendly.

Easy to evolve.

Does not imply slow.

Doesn't force you to work with string.

·······

6/26

Hence, CQL

"Denormalized SQL"

Strictly real-time oriented

··

No joins

No sub-queries

No aggregation

Limited ORDER BY

----

7/26

CQL: the 'C' stands for Cassandra

Goals:

Not goals:

Provide a user friendly, productive API for C*.

Make it easy to do the right thing, hard to do the wrong one.

Provide higher level constructs for useful modeling patterns.

Be a complete alternative to the Thrift API.

····

Be SQL.

Abstract C* (useful) specificities away (distribution awareness, C* storage engine, ...).

Be slow.

···

8/26

What is CQL?How do you model application with CQL

Cassandra modeling 101

Efficient queries in Cassandra boils down to:

And denormalization is the technique that allows to achieve this in practice.

But this imply the API should:

The Thrift API allows that. So does CQL.

1. Data Locality at the cluster level: a query should only hit one node.

2. Data Locality at the node level: C* storage engine allows data collocation on disk.

expose how to collocate data in the same replica set.

expose how to collocate data on disk (for a given replica).

to query data that is collocated.

···

10/26

A naive e-mailing application

We want to model:

Users

Emails

Users inboxes (all emails received by a user in chronological order)

···

11/26

Storing user profiles

CREATE TABLE users ( user_id uuid, name text, password text, email text, picture_profile blob, PRIMARY KEY (user_id))

-- This is really an UPSERTINSERT INTO users (user_id, name, password, email, picture_profile) VALUES (51b-23-ab8, 'Sylvain Lebresne', 'Hd3!ba', '[email protected]', 0xf8ac...);

-- This too is an UPSERTUPDATE users SET email='[email protected]', password='B9a1̂' WHERE user_id=51b-23-ab8;

CQL

The first component of the PRIMARY KEY is called the partition key.

All the data sharing the same partition key is stored on the same replica set.

··

12/26

Allowing user defined properties

Say we want the user to be able to add to this own profile a set of custom properties:

user_id email name password picture_profile user_props

51b-23-ab8 [email protected] Sylvain Lebresne B9a1^ 0xf8ac... { 'myProperty' : 'Whatever I want' }

ALTER TABLE users ADD user_props map<text, text>;

UPDATE users SET user_props['myProperty'] = 'Whatever I want' WHERE user_id=51b-23-ab8;

SELECT * FROM users;

CQL

13/26

Storing emails

Only “indexed” queried are allowed. You cannot do:

That is, unless you explicitely index from using:

CREATE TABLE emails ( email_id timeuuid PRIMARY KEY, -- Embeds the email creation date subject text, sender uuid, recipients set<uuid>, body text)

-- Inserts emails...

CQL

SELECT * FROM emails WHERE sender=51b-23-ab8; CQL

CREATE INDEX ON emails(sender); CQL

14/26

Inboxes

For each user, it's inbox is the list of it's emails chronologically sorted.

To display the inbox, we need for each email the subject, the sender and recipients names and emails.

In a traditional RDBMS, we could join the users and emails table.

In Cassandra, we denormalize. That is, we store the pre-computed result of queries we care about (always up todate materialized view).

···

Good luck to scale that!-·

Collocate all the data for an inbox on the same node.

Collocate all inbox emails on disk, in the order queried.

This is typically the time-series kind of model for which Cassandra shines.

---

15/26

Storing inboxes

CQL distinguishes 2 sub-parts in the PRIMARY KEY:

In practice, we are interested by having emails stored in reverse chronological order.

CREATE TABLE inboxes ( user_id uuid, email_id timeuuid, sender_email text, recipients_emails set<text>, subject text, is_read boolean, PRIMARY KEY (user_id, email_id)) WITH CLUSTERING ORDER BY (email_id DESC)

CQL

partition key: decides the node on which the data is stored

clustering columns: within the same partition key, (CQL3) rows are physically ordered following the clusteringcolumns

··

16/26

Storing inboxes cont'd

In this example, this allows efficient queries of time range of emails for a given inbox.

email_id dateOf(email_id) sender_email recipients_emails subject

d20-32-012 2013-06-24 00:42+0000 Yuki Morishita <[email protected]> { 'Sylvain Lebresne' } あなたに幸せな誕生日 false

17a-bf-65f 2013-03-01 17:03+0000 Aleksey Yeschenko <[email protected]> { 'Sylvain Lebresne' } RE: What do you think? true

a9c-13-9da 2013-02-10 04:12+0000 Brandon Williams <[email protected]> { 'Jonathan Ellis', 'Sylvain Lebresne' } dtests are broken!?@# true

241-b4-ca0 2013-01-04 12:45+0000 Jonathan Ellis <[email protected]> { 'Sylvain Lebresne' } Whatzz up? true

-- Get all emails for user 51b-23-ab8 since Jan 01, 2013 in reverse chronological order.SELECT email_id, dateOf(email_id), sender_email, recipients_emails, subject, is_read FROM inboxes WHERE user_id=51b-23-ab8 AND email_id > minTimeuuid('2013-01-01 00:00+0000') ORDER BY email_id DESC;

CQL

17/26

Handling huge inboxes

What if inboxes can become too big? The traditional solution consists in sharding inboxes in adapted time shards(say a year), to avoid storing it all on one node.

This can be easily done using a composite partition key:

CREATE TABLE inboxes ( user_id uuid, year int, email_id timeuuid, sender_email text, recipients_names text, subject text, PRIMARY KEY ((user_id, year), email_id)) WITH CLUSTERING ORDER BY (email_id DESC)

CQL

18/26

Upgrading from thrift

For more details on the relationship between thrift and CQL:

CQL uses the same internal storage engine than Thrift

CQL can read your existing Thrift column families (no data migration needed):

You can read CQL3 tables from thrift, but this is not easy in practice because some CQL3 metadata are notexposed through thrift for compatibility reasons.

CQL is meant to be an alternative to Thrift, not a complement to it.

··

cqlsh> USE "<keyspace_name>";cqlsh> DESCRIBE "<column_family_name>";cqlsh> SELECT * FROM "<column_family_name>" LIMIT 20;

CQL

·

·

http://www.datastax.com/dev/blog/thrift-to-cql3

http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows

··

19/26

http://www.datastax.com/dev/blog/thrift-to-cql3

http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows

The native protocolTransporting CQL queries

The native protocol

A binary transport for CQL3:

Want to know more about drivers using this native protocol? Stay in the room for Michaël and Patrick's talk.

Asynchronous (allows multiple concurrent queries per connection)

Server notifications (Only for generic cluster events currently)

Made for CQL3

···

21/26

What's next?Cassandra 2.0 and beyond

Cassandra 2.0: CQL3

Compare-and-swap support

Triggers

Allow preparation of TIMESTAMP, TTL and LIMIT.

Primary key columns 2ndary indexing

ALTER ... DROP

·UPDATE login SET password='fs3!c' WHERE username='pcmanus' IF NOT EXISTS;UPDATE users SET email='[email protected]' WHERE user_id=51b-23-ab8 IF email='[email protected]';

CQL

····

23/26

Cassandra 2.0: Native protocol

One-short prepare-and-execute message

Batching of prepared statement

SASL authentication

Automatic query paging

····

24/26

After C* 2.0

Continue to improve the user experience by facilitating good data modeling, while respecting Cassandra inherentspecificities.

Storage engine optimizations

Collections 2ndary indexing

Aggregations within a partition

User defined 'struct' types

...

·····

25/26

Thank You!(Questions?)

C* Summit 2013: The State of CQL by Sylvain Lebresne

Technology

Transcript of C* Summit 2013: The State of CQL by Sylvain Lebresne