Cassandra Summit 2015

Jonathan Ellis: CTO, DataStax; Project Chair, Apache Cassandra™

Transcript of Cassandra Summit 2015

Page 1: Cassandra Summit 2015

Jonathan Ellis

CTO, DataStax

Project Chair, Apache Cassandra™

Page 2: Cassandra Summit 2015

“After two years of working with Cassandra’s CQL driver, [JDBC] feels like a step backwards.”

Jonah Grimes

Java, Python, C#, C++, Ruby, Go, PHP, node.js

Page 3: Cassandra Summit 2015
Page 4: Cassandra Summit 2015

Operational Database Magic Quadrant

Page 5: Cassandra Summit 2015
Page 6: Cassandra Summit 2015

[Chart axes: Legacy / Post Relational; Operational / Analytical]

Page 7: Cassandra Summit 2015
Page 8: Cassandra Summit 2015
Page 9: Cassandra Summit 2015
Page 10: Cassandra Summit 2015

client

server

Page 11: Cassandra Summit 2015
Page 12: Cassandra Summit 2015

client

router

2002

Page 13: Cassandra Summit 2015

X

Page 14: Cassandra Summit 2015
Page 15: Cassandra Summit 2015
Page 16: Cassandra Summit 2015

Armin Ronacher

Page 17: Cassandra Summit 2015
Page 18: Cassandra Summit 2015

client

p1

p1

p1

Page 19: Cassandra Summit 2015
Page 20: Cassandra Summit 2015

Apr 2011 Oct 2012 Dec 2012 Sep 2013 Sep 2014 Sep 2015

outages

Page 21: Cassandra Summit 2015

“AWS RE:Boot 2014. 218 Cassandra nodes rebooted. 22 nodes didn’t come back and got replaced. Zero Netflix downtime.”

Christos Kalantzis, Director of Engineering

Page 22: Cassandra Summit 2015
Page 23: Cassandra Summit 2015
Page 24: Cassandra Summit 2015

2013: 3 nodes

2015: 300 nodes

Page 25: Cassandra Summit 2015
Page 26: Cassandra Summit 2015

Geolocated

Active-Active — Stable State

Geolocated

Regional Load Balancers Regional Load Balancers

Page 27: Cassandra Summit 2015

100%

Active-Active — Failover

Geolocated

Regional Load Balancers Regional Load Balancers

Page 28: Cassandra Summit 2015
Page 29: Cassandra Summit 2015

HDFS

64MB Blocks

Page 30: Cassandra Summit 2015
Page 31: Cassandra Summit 2015

“[I]t seems like HBase is very buggy where 50% of sampled failures are catastrophic, and Cassandra is well engineered.”

Murat Demirbas, SUNY Buffalo

Page 32: Cassandra Summit 2015

mmap, fsync, fadvise

commitlog, sstables

compaction, streaming

Page 33: Cassandra Summit 2015

write-mostly read-mostly

balanced op/analytic

Page 34: Cassandra Summit 2015
Page 35: Cassandra Summit 2015

Cassandra 2.2 and 3.0

Page 36: Cassandra Summit 2015

Windows Support

JSON

UDF

Role-based authz

Commitlog Compression

New Storage Engine

New Hints Architecture

Materialized Views

DTCS

Message Coalescing

3.0

Page 37: Cassandra Summit 2015

Windows Support

DTCS

UDF

Role-based authz

Commitlog Compression

New Storage Engine

New Hints Architecture

Materialized Views

JSON

Message Coalescing

2.2 (July)

3.0 (Expected in October)

Page 38: Cassandra Summit 2015

JSON

CREATE TABLE users (
    id uuid PRIMARY KEY,
    name text,
    state text,
    birth_date int
);

INSERT INTO users (id, name, state, birth_date)
VALUES (now(), 'Joe User', 'TX', 1982);

INSERT INTO users JSON
'{"id": "1a4f88e2-6dc8-4edd-9e16-a7ba9c941f8d", "name": "Joe User", "state": "TX", "birth_date": 1982}';

Page 39: Cassandra Summit 2015

Collections

CREATE TABLE example (
    id int PRIMARY KEY,
    tupleval tuple<int, text>,
    numbers set<int>,
    words list<text>
);

INSERT INTO example (id, tupleval, numbers, words)
VALUES (0, (1, 'foo'), {1, 2, 3, 6}, ['the', 'quick', 'brown', 'fox']);

INSERT INTO example JSON
'{"id": 0, "tupleval": [1, "foo"], "numbers": [1, 2, 3, 6], "words": ["the", "quick", "brown", "fox"]}';
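A client would usually build the JSON document programmatically rather than by hand. The following is a minimal Python sketch, assuming the `example` table from this slide; `to_insert_json` and `encode_extra` are hypothetical helpers, not part of any driver. It shows that tuples, sets, and lists all end up as JSON arrays, as in the slide's JSON form.

```python
import json

def encode_extra(value):
    # json.dumps already writes tuples and lists as arrays; sets need help.
    if isinstance(value, (set, frozenset)):
        return sorted(value)
    raise TypeError(f"cannot encode {type(value).__name__}")

def to_insert_json(table, row):
    """Build an INSERT ... JSON statement (hypothetical helper, for illustration)."""
    payload = json.dumps(row, default=encode_extra)
    # A single quote inside a CQL string literal is escaped by doubling it.
    escaped = payload.replace("'", "''")
    return f"INSERT INTO {table} JSON '{escaped}';"

row = {
    "id": 0,
    "tupleval": (1, "foo"),        # Python tuple -> JSON array
    "numbers": {1, 2, 3, 6},       # Python set   -> JSON array
    "words": ["the", "quick", "brown", "fox"],
}
statement = to_insert_json("example", row)
print(statement)
```

In real code it is generally safer to bind the JSON string as a parameter than to splice it into the statement text; the string building here is only to make the mapping visible.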

Page 40: Cassandra Summit 2015

User-defined types

CREATE TYPE address (number int, street text);

CREATE TABLE users (
    id int PRIMARY KEY,
    street_address frozen<address>
);

INSERT INTO users (id, street_address)
VALUES (1, {number: 123, street: 'Cassandra Ave'});

INSERT INTO users JSON
'{"id": 1, "street_address": {"number": 1, "street": "Cassandra Ave"}}';

Page 41: Cassandra Summit 2015

Deeper Nesting

CREATE TYPE address (
    street text,
    city text,
    zip_code int,
    phones set<text>
);

CREATE TABLE users (
    id uuid PRIMARY KEY,
    name text,
    addresses map<text, frozen<address>>
);

Page 42: Cassandra Summit 2015

Deeper Nesting

INSERT INTO users JSON
'{"id": "0514e410-2a9f-11e5-a2cb-0800200c9a66", "name": "jellis", "addresses": {"home": {"street": "9920 Cassandra Ave", "city": "Austin", "zip_code": 78700, "phones": ["1238614789"]}}}';

Page 43: Cassandra Summit 2015
Page 44: Cassandra Summit 2015
Page 45: Cassandra Summit 2015

Role-based Authorization

CREATE ROLE accounting;

GRANT all ON invoices TO accounting;
GRANT select ON expenses TO accounting;
GRANT select ON payroll TO accounting;

GRANT accounting TO josie;
GRANT accounting TO jay;
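The effect of these grants can be pictured as a lookup from user to roles to permissions. The Python sketch below is a toy model only: the dictionaries and `is_allowed` are hypothetical names, and it leaves out parts of Cassandra's actual model such as roles granted to other roles and resource hierarchies.

```python
# Toy model of role-based authorization; the data mirrors the GRANT statements above.
role_permissions = {
    "accounting": {
        ("invoices", "ALL"),
        ("expenses", "SELECT"),
        ("payroll", "SELECT"),
    },
}
user_roles = {"josie": {"accounting"}, "jay": {"accounting"}}

def is_allowed(user, table, permission):
    """Return True if any role held by the user grants the permission on the table."""
    for role in user_roles.get(user, ()):
        granted = role_permissions.get(role, set())
        # ALL on a table implies every individual permission on it.
        if (table, permission) in granted or (table, "ALL") in granted:
            return True
    return False

print(is_allowed("josie", "invoices", "MODIFY"))  # allowed via ALL on invoices
print(is_allowed("jay", "payroll", "MODIFY"))     # only SELECT was granted
```

Granting the role once and then assigning it to users keeps the permission list in a single place, which is the point of the slide.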

Page 46: Cassandra Summit 2015

User-defined Functions (UDF)

CREATE FUNCTION my_sin (input double)
CALLED ON NULL INPUT
RETURNS double
LANGUAGE java
AS 'return input == null ? null : Double.valueOf(Math.sin(input.doubleValue()));';

SELECT key, my_sin(value) FROM my_table WHERE key IN (1, 2, 3);

Page 47: Cassandra Summit 2015

UDF Aggregation

CREATE FUNCTION avgState (state tuple<bigint, int>, i int)
CALLED ON NULL INPUT
RETURNS tuple<bigint, int>
LANGUAGE java
AS '
    // (state[0] + i, state[1] + 1); null inputs are skipped
    if (i != null) {
        state.setLong(0, state.getLong(0) + i.intValue());
        state.setInt(1, state.getInt(1) + 1);
    }
    return state;';

CREATE FUNCTION avgFinal (state tuple<bigint, int>)
CALLED ON NULL INPUT
RETURNS double
LANGUAGE java
AS '
    // Cast before dividing to avoid integer division
    if (state.getInt(1) == 0) return null;
    return Double.valueOf((double) state.getLong(0) / state.getInt(1));';

CREATE AGGREGATE avg (int)
SFUNC avgState
STYPE tuple<bigint, int>
FINALFUNC avgFinal
INITCOND (0, 0);
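A user-defined aggregate is essentially a fold: the state function (SFUNC) is applied once per row to a running state of type STYPE, and the final function (FINALFUNC) turns the last state into the result. The Python sketch below mirrors that shape; `aggregate` is a hypothetical stand-in for the server's evaluation loop, and the initial state `(0, 0)` is an assumption.

```python
from functools import reduce

def avg_state(state, i):
    # state is (running total, row count), like tuple<bigint, int> on the slide.
    total, count = state
    return (total + i, count + 1)

def avg_final(state):
    total, count = state
    # True division, so the result is not truncated to an integer.
    return total / count if count else None

def aggregate(values, sfunc, finalfunc, initcond):
    """Fold sfunc over the values, then apply finalfunc to the final state."""
    return finalfunc(reduce(sfunc, values, initcond))

result = aggregate([1, 2, 4], avg_state, avg_final, (0, 0))
print(result)
```

Keeping the count in the state is what lets the average be computed in one pass without holding all the values in memory.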

Page 48: Cassandra Summit 2015

Commitlog Compression

[Chart: Operations/s over Time; series: 2.2, 2.1]

Page 49: Cassandra Summit 2015

DateTieredCompactionStrategy

[Chart: Reads/s; series: DTCS, STCS, LCS; 2.5TB]

Page 50: Cassandra Summit 2015

DateTieredCompactionStrategy

[Chart: Writes/s; series: DTCS, STCS, LCS]

Page 51: Cassandra Summit 2015

3.0 Preview

Page 52: Cassandra Summit 2015

New Storage Engine

[Chart: Bytes by Workloads]

Page 53: Cassandra Summit 2015

Hinted Handoff Improvements

client

p1

p1

p1

X

Page 54: Cassandra Summit 2015

Hinted Handoff Improvements

client

p1

p1

p1

Hint

Page 55: Cassandra Summit 2015

Hinted Handoff Improvements

client

p1

p1

p1

Handoff

Page 56: Cassandra Summit 2015

SSTable-based Hints

Hint

Compacted

Tombstone

Memtable

Commitlog SSTable Commitlog SSTable

Memtable

Page 57: Cassandra Summit 2015

File-based Hints

[Diagram: hints grouped by target node: .168.101, .168.104, .168.112]

Page 58: Cassandra Summit 2015

File-based Hints

[Diagram: hints grouped by target node: .168.104, .168.112]
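One way to read the two File-based Hints diagrams: hints are appended to a separate file per target node, and once a node's hints have been delivered, its file disappears as a unit (the .168.101 group is gone in the second diagram). The Python sketch below is a toy model under that reading; `HintStore` and its methods are hypothetical names and do not reflect Cassandra's on-disk format. The node labels are kept exactly as truncated on the slides.

```python
from collections import defaultdict

class HintStore:
    """Toy per-target hint buffer, one list standing in for each hint file."""

    def __init__(self):
        self.files = defaultdict(list)  # target label -> hints in arrival order

    def write_hint(self, target, mutation):
        # Appending is cheap: no per-hint index entry or tombstone is created.
        self.files[target].append(mutation)

    def handoff(self, target, send):
        # Replay the hints in order, then drop the whole "file" at once.
        hints = self.files.pop(target, [])
        for mutation in hints:
            send(target, mutation)
        return len(hints)

store = HintStore()
for target in (".168.101", ".168.104", ".168.112"):
    for n in range(8):
        store.write_hint(target, f"mutation-{n}")

delivered = []
count = store.handoff(".168.101", lambda t, m: delivered.append((t, m)))
print(count, sorted(store.files))
```

Dropping a whole file after delivery avoids the per-hint deletes, and therefore the tombstones and compaction work, that the SSTable-based Hints slide depicts.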

Page 59: Cassandra Summit 2015

2.2 Hints vs 3.0

[Chart: Operations/s over Time; series: 3.0, 2.2]

Page 60: Cassandra Summit 2015

Materialized Views

CREATE TABLE songs (
    id uuid PRIMARY KEY,
    title text,
    album text,
    artist text
);

CREATE MATERIALIZED VIEW songs_by_album AS
SELECT * FROM songs
WHERE album IS NOT NULL AND id IS NOT NULL
PRIMARY KEY (album, id);

SELECT * FROM songs_by_album
WHERE album = 'Tres Hombres';

Page 61: Cassandra Summit 2015

Indexes

CREATE TABLE songs (
    id uuid PRIMARY KEY,
    title text,
    album text,
    artist text
);

CREATE INDEX songs_by_album ON songs (album);

SELECT * FROM songs
WHERE album = 'Tres Hombres';

Page 62: Cassandra Summit 2015

Local Indexes

client

title        artist          album
La Grange    ZZ Top          Tres Hombres

title        artist          album
Outside...   Back Door Slam  Roll Away

title        artist          album
Waitin...    ZZ Top          Tres Hombres

Page 63: Cassandra Summit 2015

Materialized Views

client

album         id
Tres Hombres  a3e64f8f
Tres Hombres  8a172618

album         id
Roll Away     2b09185b
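The difference between the Local Indexes and Materialized Views diagrams comes down to how many nodes a query by album has to touch. The Python sketch below is a toy model: the three-node layout follows the Local Indexes diagram, the pairing of ids with titles is an assumption (the slides do not show it), and replication is ignored.

```python
# Toy cluster: songs are spread across nodes by id, not by album.
nodes = [
    [{"id": "a3e64f8f", "title": "La Grange", "artist": "ZZ Top", "album": "Tres Hombres"}],
    [{"id": "2b09185b", "title": "Outside...", "artist": "Back Door Slam", "album": "Roll Away"}],
    [{"id": "8a172618", "title": "Waitin...", "artist": "ZZ Top", "album": "Tres Hombres"}],
]

def query_local_index(nodes, album):
    """Each node indexes only its own rows, so every node must be asked."""
    contacted, rows = 0, []
    for node in nodes:
        contacted += 1
        rows += [song for song in node if song["album"] == album]
    return rows, contacted

def build_view(nodes):
    """Regroup the rows by album, as a view keyed on (album, id) would."""
    view = {}
    for node in nodes:
        for song in node:
            view.setdefault(song["album"], []).append(song)
    return view

def query_view(view, album):
    """All rows for an album live in one view partition: one lookup."""
    return view.get(album, []), 1

index_rows, index_nodes = query_local_index(nodes, "Tres Hombres")
view_rows, view_nodes = query_view(build_view(nodes), "Tres Hombres")
print(index_nodes, view_nodes)
```

The view pays for this at write time, since every change to `songs` must also update the view, which is the cost the stress-test slides go on to measure.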

Page 64: Cassandra Summit 2015

Stress: raw vs 1 MV vs 5 MV

[Chart: Operations/s over Time; series: raw, 1 MV, 5 MV]

Page 65: Cassandra Summit 2015

mvbench, 4 denormalizations

[Chart: Operations/s over Time; series: MV, Manual]

Page 66: Cassandra Summit 2015

Beyond 3.0

Page 67: Cassandra Summit 2015

3.x Development Process

Page 68: Cassandra Summit 2015

3.0 releases

3.0 rc1: Out now
October: 3.0 GA
December: 3.1
January: 3.2
February: 3.3

Page 69: Cassandra Summit 2015

3.0 releases

3.0 rc1: Out now
October: 3.0 GA
December: 3.1, 3.0.1
January: 3.2, 3.0.2
February: 3.3, 3.0.3

Page 70: Cassandra Summit 2015

Compatibility

3.0 3.1 3.2

Page 71: Cassandra Summit 2015

DataStax Training for Apache Cassandra™