A Shortcut to Awesome: Cassandra Data Modeling By Example (Jon Haddad, The Last Pickle) | C* Summit...

Transcript of A Shortcut to Awesome: Cassandra Data Modeling By Example (Jon Haddad, The Last Pickle) | C* Summit...

Page 1: A Shortcut to Awesome: Cassandra Data Modeling By Example (Jon Haddad, The Last Pickle) | C* Summit 2016

JON HADDAD THE LAST PICKLE

LEARN DATA MODELING BY EXAMPLE

THIS IS AWESOME!!!

Page 2:

WHAT’S THE LAST PICKLE DO?

Page 3:

WE HELP MAKE YOU A TEAM OF EXPERTS

Page 4:

> 50 YEARS COMBINED EXPERIENCE

Page 5:
Page 6:

WHO IS THIS GUY?

Page 7:

15 YEARS EXPERIENCE

Page 8:

4 YEARS WITH CASSANDRA

Page 9:

LEARNING HOW TO CASSANDRA

Page 10:

WHAT’S YOUR BACKGROUND?

Page 11:

ORACLE! MYSQL!

POSTGRES!

Page 12:

CQL LOOKS LIKE SQL

Page 13:

BAD ASSUMPTIONS

Page 14:

3RD NORMAL FORM?

Page 15:

WHERE’S MY JOINS?

Page 16:

SECONDARY INDEX?

Page 17:

DO IT WRONG

TRY TO DATA MODEL

GET ANGRY

WATCH VIDEOS & READ

Page 18:

EVERYTHING I KNOW IS WRONG

Page 19:
Page 20:
Page 21:

LEARN BY EXAMPLE

Page 22:
Page 23:

CASSANDRA DATASET MANAGER

Page 24:

CDM

Page 25:

APT FOR CASSANDRA DATA

Page 26:

INSTALL DATA TO YOUR CASSANDRA CLUSTER

Page 27:

cdm install <dataset>

Page 28:

jhaddad@rustyrazorblade ~$ cdm list
Starting CDM
Datasets:
movielens
killrvideo
killrweather
Finished.

Page 29:

jhaddad@rustyrazorblade ~$ cdm install movielens
Starting CDM
Installing movielens
Checking for repo at /Users/jhaddad/.cdm/movielens
Pulling latest
CDM is using dataset path: /Users/jhaddad/.cdm/movielens
cqlsh -e "DROP KEYSPACE IF EXISTS movielens; CREATE KEYSPACE movielens WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}"
Schema: /Users/jhaddad/.cdm/movielens/schema.cql
Loading data
cqlsh -k movielens -e "COPY movies FROM '/Users/jhaddad/.cdm/movielens/data/movies.csv'"
cqlsh -k movielens -e "COPY users FROM '/Users/jhaddad/.cdm/movielens/data/users.csv'"
cqlsh -k movielens -e "COPY ratings_by_user FROM '/Users/jhaddad/.cdm/movielens/data/ratings_by_user.csv'"
cqlsh -k movielens -e "COPY original_movie_map FROM '/Users/jhaddad/.cdm/movielens/data/original_movie_map.csv'"
cqlsh -k movielens -e "COPY ratings_by_movie FROM '/Users/jhaddad/.cdm/
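
The install log above follows a simple pattern: recreate the keyspace, then run one COPY per table. As a rough illustration only, the Python sketch below rebuilds those command lines from a dataset name and a table list. The function name `install_commands` and its parameters are hypothetical and are not part of CDM; the sketch only formats strings and runs nothing. How CDM applies schema.cql is not shown in the log, so that step is left out.

```python
from pathlib import PurePosixPath


def install_commands(dataset, tables, root):
    """Rebuild the cqlsh command lines seen in the install log (hypothetical helper)."""
    base = PurePosixPath(root) / dataset

    # Step 1: drop and recreate the keyspace, as in the first cqlsh line of the log.
    keyspace_cql = (
        f"DROP KEYSPACE IF EXISTS {dataset}; "
        f"CREATE KEYSPACE {dataset} WITH replication = "
        "{'class': 'SimpleStrategy', 'replication_factor': 1}"
    )
    commands = [f'cqlsh -e "{keyspace_cql}"']

    # Step 2: one COPY per table, reading <root>/<dataset>/data/<table>.csv.
    for table in tables:
        csv_path = base / "data" / f"{table}.csv"
        copy_cql = f"COPY {table} FROM '{csv_path}'"
        commands.append(f'cqlsh -k {dataset} -e "{copy_cql}"')
    return commands


commands = install_commands("movielens", ["movies", "users"], "/Users/jhaddad/.cdm")
for line in commands:
    print(line)
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without Cassandra.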

Page 30:

jhaddad@rustyrazorblade ~/dev/cassandra$ cqlsh
Connected to Test Cluster at 127.0.0.1:9042.
[cqlsh 5.0.1 | Cassandra 3.10-SNAPSHOT | CQL spec 3.4.3 | Native protocol v4]
Use HELP for help.
cqlsh> use movielens ;
cqlsh:movielens> desc tables;

movies users ratings_by_user original_movie_map ratings_by_movie
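
The table names ratings_by_user and ratings_by_movie hint at the usual Cassandra answer to "where's my joins?": store the same rating once per query pattern instead of joining normalized tables. The slides do not show the schema, so the following is only an in-memory Python analogy. The dictionaries stand in for tables, the dictionary key plays the role of a partition key, and the identifiers `u1`, `m1`, and `add_rating` are made up for illustration.

```python
from collections import defaultdict

# In-memory stand-ins for the two query tables (assumed layout, not the real schema).
ratings_by_user = defaultdict(dict)   # user_id  -> {movie_id: rating}
ratings_by_movie = defaultdict(dict)  # movie_id -> {user_id: rating}


def add_rating(user_id, movie_id, rating):
    """Write the same fact twice, once per query pattern (denormalization)."""
    ratings_by_user[user_id][movie_id] = rating
    ratings_by_movie[movie_id][user_id] = rating


add_rating("u1", "m1", 4)
add_rating("u1", "m2", 5)
add_rating("u2", "m1", 3)

# Each question is answered by a single keyed lookup, with no join.
print(ratings_by_user["u1"])   # everything user u1 rated
print(ratings_by_movie["m1"])  # everyone who rated movie m1
```

The cost is a second write per rating; the benefit is that each read touches exactly one key.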

Page 31:

WHAT CAN WE DO WITH IT?

▸ Learn by example

▸ Blog posts / Tutorials

▸ Jupyter notebooks

▸ Reference applications

▸ Data Models for presentations

Page 32:

MANAGING REFERENCE / TEST DATA

Page 33:

DATASETS

Page 34:

MOVIELENS

Page 35:

DETAILS

▸ GroupLens Research Project

▸ University of Minnesota

▸ 100K ratings

▸ 1K users

▸ 1700 movies

Page 36:

cqlsh:movielens> select id, avg_rating, genres, name
             ... from movies limit 1;

@ Row 1
------------+--------------------------------------
 id         | 76a38f64-94d8-4b8f-b830-a40af96f8d20
 avg_rating | 3.16667
 genres     | {'Drama'}
 name       | Little Lord Fauntleroy (1936)

(1 rows)

Page 37:

cqlsh:movielens> select * from users limit 1;

@ Row 1
------------+--------------------------------------
 id         | b52fcdfc-0eaf-4432-9896-aa22db56edb2
 address    | 0322 Mattie Ramp Apt. 177
 age        | 37
 city       | South Fremont
 gender     | M
 name       | Harrold Hills
 occupation | administrator
 zip        | 06513

(1 rows)

Page 38:

BLOG: WORKING RELATIONALLY WITH CASSANDRA

Page 39:

CONNECTING CASSANDRA DATA WITH

GRAPHFRAMES

Page 40:

cdm install killrweather

Page 41:

Helena Edelson Patrick McFadin

Page 42:
Page 43:

cdm install killrvideo

Page 44:

Luke Tillman Patrick McFadin

Page 45:
Page 46:
Page 47:
Page 48:

UPCOMING DATA SETS

Page 49:

openflights.org

‣ airports

‣ flight data

Page 50:

HEALTH CARE

▸ Cancer Genome Atlas Project

▸ Ebola cases

▸ Healthcare financial data

▸ Dani Traphagen

Page 51:

NYC TAXI DATA

▸ pick up / drop off times & locations

▸ trip distances

▸ itemized fares

▸ rate types

▸ payment types

Page 52:

SOCIAL DATA

▸ Higgs Twitter Data

▸ Foursquare

▸ Enron executive emails

Page 53:

HOW TO CONTRIBUTE

Page 54:

https://github.com/riptano/cdm-java

Page 55:

ADD FEATURES

Page 56:

SUGGEST DATASETS

Page 57:

CREATE A DATASET

▸ create a git repo

▸ datasets.yaml

▸ schema.cql

▸ insert data

▸ “cdm dump”

▸ cdm install .

▸ create a PR on cdm-java

OMG BEST DATASET EVER
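
Before opening a PR, it may help to confirm that a dataset repo has the pieces the movielens install log refers to: a schema.cql file and CSV files under data/. The Python sketch below is a hypothetical pre-flight check and not a CDM feature. It assumes new datasets follow the same layout as the paths in that log, and it skips datasets.yaml because the slides do not say where that file lives or what it contains.

```python
import tempfile
from pathlib import Path


def missing_dataset_files(repo_dir):
    """List expected dataset entries that are absent (layout assumed from the install log)."""
    repo = Path(repo_dir)
    missing = []
    if not (repo / "schema.cql").is_file():
        missing.append("schema.cql")
    data_dir = repo / "data"
    # At least one CSV under data/ is expected, matching the COPY ... FROM paths.
    if not (data_dir.is_dir() and any(data_dir.glob("*.csv"))):
        missing.append("data/*.csv")
    return missing


# Usage: check an empty directory, add the files, then check again.
with tempfile.TemporaryDirectory() as tmp:
    before = missing_dataset_files(tmp)
    Path(tmp, "schema.cql").write_text("-- CREATE TABLE statements go here\n")
    Path(tmp, "data").mkdir()
    Path(tmp, "data", "movies.csv").write_text("")
    after = missing_dataset_files(tmp)

print(before)
print(after)
```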

Page 58:

@RUSTYRAZORBLADE

THANK YOU, KIND HUMANS