2 Dundee - Cassandra-3
Transcript of 2 Dundee - Cassandra-3
©2013 DataStax Confidential. Do not distribute without consent.
Christopher Batey (@chbatey)
Cassandra 2.2 and 3.0

First comes a blog
• Each new feature has a vastly more detailed blog post: http://christopher-batey.blogspot.co.uk/

New features
• 2.2
  - JSON
  - User defined functions
  - User defined aggregates
  - Role based authentication
  - The small print
• 3.0
  - New storage engine
  - Materialised views

Hello JSON
• CREATE TABLE user (username text PRIMARY KEY, first_name text, last_name text, emails set<text>, country text);
• INSERT INTO user JSON '{"username": "chbatey", "first_name": "Christopher", "last_name": "Batey", "emails": ["[email protected]"]}';

JSON + User Defined Types
• CREATE TYPE movie (title text, time timestamp, description text);
• ALTER TABLE user ADD movies set<frozen<movie>>;
• UPDATE user SET movies = { { title: 'Batman', time: '2011-02-03T04:05:00+0000', description: 'This film rocks' } } WHERE username = 'chbatey';

Cassandra HTTP Wrapper?
    @RequestMapping(method = {RequestMethod.POST}, value = "/{keyspace}/{table}", consumes = "application/json")
    public ResponseEntity<String> store(@PathVariable String keyspace, @PathVariable String table, @RequestBody String body) {
        session.execute(String.format("insert into %s.%s JSON '%s'", keyspace, table, body));
        return ResponseEntity.ok("OK");
    }
Callouts: the two path variables are the keyspace and the table; the request body is the raw JSON.
    curl --header "Content-Type: application/json" -X POST -v "localhost:8080/twotwo/user" --data '{"username": "trev2", "country": null, "emails": ["[email protected]", "[email protected]"], "first_name": "trevor", "last_name": "bunting", "movies": null}'
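The wrapper pastes the keyspace, table and request body straight into the CQL string, so a body containing a single quote would break the statement. The sketch below shows one possible way to harden that string building. `JsonInsertBuilder` and its methods are hypothetical names, not part of the talk or of any driver, and the identifier check only accepts plain unquoted CQL identifiers. Where the driver supports it, binding the JSON payload as a statement parameter is probably the cleaner route.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not from the talk: builds the INSERT ... JSON statement defensively.
final class JsonInsertBuilder {
    // Plain unquoted CQL identifiers: a letter followed by letters, digits or underscores.
    private static final Pattern IDENTIFIER = Pattern.compile("[a-zA-Z][a-zA-Z0-9_]*");

    private JsonInsertBuilder() {}

    // Rejects anything that is not a plain identifier, so it cannot smuggle in extra CQL.
    static String requireIdentifier(String name) {
        if (name == null || !IDENTIFIER.matcher(name).matches()) {
            throw new IllegalArgumentException("Not a plain CQL identifier: " + name);
        }
        return name;
    }

    // Inside a single-quoted CQL string, a single quote is escaped by doubling it.
    static String escapeCqlString(String raw) {
        return raw.replace("'", "''");
    }

    static String buildInsert(String keyspace, String table, String jsonBody) {
        return String.format("INSERT INTO %s.%s JSON '%s'",
                requireIdentifier(keyspace),
                requireIdentifier(table),
                escapeCqlString(jsonBody));
    }
}
```

In the controller, `session.execute(JsonInsertBuilder.buildInsert(keyspace, table, body))` would replace the raw `String.format` call.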

User defined functions
• Run code on the server (dangerous!)
• Java + JavaScript supported out of the box

UDF example
    CREATE TABLE user (
        username text PRIMARY KEY,
        first_name text,
        last_name text,
        emails set<text>,
        country text
    );

Custom name
    CREATE FUNCTION name (first_name text, last_name text)
        CALLED ON NULL INPUT
        RETURNS text
        LANGUAGE java
        AS 'return first_name + " " + last_name;';

    cqlsh:twotwo> select name(first_name, last_name) FROM user;

     twotwo.name(first_name, last_name)
    ------------------------------------
                      Christopher Batey

User defined aggregates
    CREATE AGGREGATE average (int)
        SFUNC averageState
        STYPE tuple<int,bigint>
        FINALFUNC averageFinal
        INITCOND (0, 0);
Callouts:
• SFUNC: called for every row
• STYPE: state passed between calls
• INITCOND: initial state
• FINALFUNC: optional function called on final state
• Return type (CQL)

State function
    CREATE FUNCTION averageState (state tuple<int,bigint>, val int)
        CALLED ON NULL INPUT
        RETURNS tuple<int,bigint>
        LANGUAGE java
        AS '
            if (val != null) {
                state.setInt(0, state.getInt(0) + 1);
                state.setLong(1, state.getLong(1) + val.intValue());
            }
            return state;
        ';
Callouts: Type, Columns

Final function
    CREATE FUNCTION averageFinal (state tuple<int,bigint>)
        CALLED ON NULL INPUT
        RETURNS double
        LANGUAGE java
        AS '
            if (state.getInt(0) == 0) return null;
            double r = state.getLong(1);
            r /= state.getInt(0);
            return Double.valueOf(r);
        ';
Callout: Type
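To see how the state function, the initial condition and the final function fit together, here is a plain Java sketch that mimics what the server does for one `average(...)` call. It does not use Cassandra at all: `AverageAggregateSketch` and its `State` class are hypothetical stand-ins for the aggregate and for the `tuple<int,bigint>` state.

```java
import java.util.List;

// Hypothetical stand-in for the aggregate, not Cassandra code.
final class AverageAggregateSketch {
    // Mirrors STYPE tuple<int,bigint>: element 0 is the row count, element 1 the running sum.
    static final class State {
        int count;
        long sum;
    }

    private AverageAggregateSketch() {}

    // Plays the role of SFUNC: called once per row, null values are skipped.
    static State averageState(State state, Integer val) {
        if (val != null) {
            state.count += 1;
            state.sum += val.intValue();
        }
        return state;
    }

    // Plays the role of FINALFUNC: returns null when no row contributed a value.
    static Double averageFinal(State state) {
        if (state.count == 0) {
            return null;
        }
        // Cast before dividing so the result is not truncated by integer division.
        return (double) state.sum / state.count;
    }

    // Drives the whole aggregate over one column of values.
    static Double average(List<Integer> column) {
        State state = new State(); // INITCOND (0, 0)
        for (Integer val : column) {
            state = averageState(state, val);
        }
        return averageFinal(state);
    }
}
```

For the values 1 and 2 this yields 1.5, which is the result an integer division in the final function would lose.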

Customer events
    CREATE AGGREGATE count_by_type (text)
        SFUNC countEventTypes
        STYPE map<text, int>
        INITCOND {};

    CREATE FUNCTION countEventTypes (state map<text, int>, type text)
        CALLED ON NULL INPUT
        RETURNS map<text, int>
        LANGUAGE java
        AS '
            Integer count = (Integer) state.get(type);
            if (count == null) count = 1;
            else count = count + 1;
            state.put(type, count);
            return state;
        ';
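The same fold can be sketched in plain Java to show what `count_by_type` returns for a handful of rows. `CountByTypeSketch` is a hypothetical name, and the event type strings used with it are made up for illustration; the map stands in for the `map<text, int>` state.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the count_by_type aggregate, not Cassandra code.
final class CountByTypeSketch {
    private CountByTypeSketch() {}

    // Plays the role of countEventTypes: bump the counter for this row's event type.
    static Map<String, Integer> countEventTypes(Map<String, Integer> state, String type) {
        Integer count = state.get(type);
        state.put(type, count == null ? 1 : count + 1);
        return state;
    }

    // Drives the aggregate: start from an empty map and feed every row through the state function.
    static Map<String, Integer> countByType(List<String> eventTypes) {
        Map<String, Integer> state = new HashMap<>(); // INITCOND {}
        for (String type : eventTypes) {
            state = countEventTypes(state, type);
        }
        return state;
    }
}
```

There is no final function here, so the last state map is what the aggregate returns.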

Built in aggregates
• count
• max
• min
• avg
• sum
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/AggregateFcts.java
Built in time functions
https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/functions/TimeFcts.java

Small print
• New types
  - smallint - short
  - tinyint - byte
  - date
  - time
• Warnings now sent back to client
  - batch too large

Materialised views
• Designed to stop *you* having to duplicate
• Do we need a secondary index primer?

Customer events table
    CREATE TABLE IF NOT EXISTS customer_events (
        customer_id text,
        staff_id text,
        store_type text,
        time timeuuid,
        event_type text,
        PRIMARY KEY (customer_id, time)
    );

    CREATE INDEX ON customer_events (staff_id);

Indexes to the rescue?

    customer_id  time                 staff_id
    chbatey      2015-03-03 08:52:45  trevor
    chbatey      2015-03-03 08:52:54  trevor
    chbatey      2015-03-03 08:53:11  bill
    chbatey      2015-03-03 08:53:18  bill
    rusty        2015-03-03 08:56:57  bill
    rusty        2015-03-03 08:57:02  bill
    rusty        2015-03-03 08:57:20  trevor

    staff_id  customer_id
    trevor    chbatey
    trevor    chbatey
    bill      chbatey
    bill      chbatey
    bill      rusty
    bill      rusty
    trevor    rusty

Secondary indexes are local
• The staff_id partition in the secondary index is not distributed like a normal table
• The secondary index entries are only stored on the node that contains the customer_id partition
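A small in-memory model can make the consequence of local indexes concrete: because each node indexes only its own rows, a lookup by staff_id has to ask every node. This is a toy sketch, not how Cassandra is implemented; `LocalIndexSketch`, `Node` and `customersServedBy` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model of node-local secondary indexes, not Cassandra code.
final class LocalIndexSketch {
    // One node: it indexes only the rows of the customer_id partitions it owns.
    static final class Node {
        final String name;
        final Map<String, List<String>> localIndex = new HashMap<>(); // staff_id -> customer_ids

        Node(String name) {
            this.name = name;
        }

        void store(String customerId, String staffId) {
            localIndex.computeIfAbsent(staffId, k -> new ArrayList<>()).add(customerId);
        }
    }

    private LocalIndexSketch() {}

    // A query by staff_id cannot be routed to one node, so every node is contacted.
    static List<String> customersServedBy(List<Node> cluster, String staffId, List<String> nodesContacted) {
        List<String> result = new ArrayList<>();
        for (Node node : cluster) {
            nodesContacted.add(node.name);
            result.addAll(node.localIndex.getOrDefault(staffId, Collections.<String>emptyList()));
        }
        return result;
    }
}
```

With the chbatey rows stored on one node and the rusty rows on another, a query for bill returns entries from both nodes, which is the fan-out cost the slide is pointing at.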

Indexes to the rescue?

Node A (holds the chbatey partition), local staff_id index:
    staff_id  customer_id
    trevor    chbatey
    trevor    chbatey
    bill      chbatey
    bill      chbatey

Node B (holds the rusty partition), local staff_id index:
    staff_id  customer_id
    bill      rusty
    bill      rusty
    trevor    rusty

customer_events table:
    customer_id  time                 staff_id
    chbatey      2015-03-03 08:52:45  trevor
    chbatey      2015-03-03 08:52:54  trevor
    chbatey      2015-03-03 08:53:11  bill
    chbatey      2015-03-03 08:53:18  bill
    rusty        2015-03-03 08:56:57  bill
    rusty        2015-03-03 08:57:02  bill
    rusty        2015-03-03 08:57:20  trevor

staff_id index:
    staff_id  customer_id
    trevor    chbatey
    trevor    chbatey
    bill      chbatey
    bill      chbatey
    bill      rusty
    bill      rusty
    trevor    rusty

Do it yourself index
    CREATE TABLE IF NOT EXISTS customer_events (
        customer_id text,
        staff_id text,
        store_type text,
        time timeuuid,
        event_type text,
        PRIMARY KEY (customer_id, time)
    );

    CREATE TABLE IF NOT EXISTS customer_events_by_staff (
        customer_id text,
        staff_id text,
        store_type text,
        time timeuuid,
        event_type text,
        PRIMARY KEY (staff_id, time)
    );

KillrWeather data model
INSERT INTO raw_weather_data(wsid, year, month, day, hour, country_code, state_code, temperature, one_hour_precip ) values ('station1', 2012, 12, 25, 1, 'GB', 'Cumbria', 14.0, 20) ;
INSERT INTO raw_weather_data(wsid, year, month, day, hour, country_code, state_code, temperature, one_hour_precip ) values ('station2', 2012, 12, 25, 1, 'GB', 'Cumbria', 4.0, 2) ;
INSERT INTO raw_weather_data(wsid, year, month, day, hour, country_code, state_code, temperature, one_hour_precip ) values ('station3', 2012, 12, 25, 1, 'GB', 'Greater London', 16.0, 10) ;

Fine print
• Primary key columns + one other in your MV primary key
• Unused primary key columns are added to the end of your MV PK
• If part of your MV primary key is NULL then the row won't appear in the materialised view
• This is not free!
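Two of these rules can be illustrated with a toy model in plain Java, using the customer_events columns from earlier in the deck. It only mimics the behaviour as the slide describes it and is not Cassandra code; `ViewKeySketch`, `Event`, `viewPrimaryKey` and `viewRows` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the view rules described on the slide, not Cassandra code.
final class ViewKeySketch {
    // A base table row; staffId is the non-key column promoted into the view key and may be null.
    static final class Event {
        final String customerId;
        final String time;
        final String staffId;

        Event(String customerId, String time, String staffId) {
            this.customerId = customerId;
            this.time = time;
            this.staffId = staffId;
        }
    }

    private ViewKeySketch() {}

    // View key: the columns chosen for the view first, then any base key columns not yet used.
    static List<String> viewPrimaryKey(List<String> basePrimaryKey, List<String> chosenViewKey) {
        List<String> key = new ArrayList<>(chosenViewKey);
        for (String column : basePrimaryKey) {
            if (!key.contains(column)) {
                key.add(column);
            }
        }
        return key;
    }

    // Rows whose promoted key column is null are left out of the view.
    static List<Event> viewRows(List<Event> baseRows) {
        List<Event> rows = new ArrayList<>();
        for (Event event : baseRows) {
            if (event.staffId != null) {
                rows.add(event);
            }
        }
        return rows;
    }
}
```

For a base key of (customer_id, time) and a view keyed on (staff_id, time), the resulting view key is (staff_id, time, customer_id), and an event with no staff_id never shows up in the view.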