LA Cassandra Day 2015 - Cassandra for developers

Click here to load reader

  • date post

    16-Jul-2015
  • Category

    Software

  • view

    683
  • download

    3

Embed Size (px)

Transcript of LA Cassandra Day 2015 - Cassandra for developers

  • @chbatey

    Christopher BateyTechnical Evangelist for Apache Cassandra

    LA 2015: Cassandra for Java developers

  • @chbatey

    Who am I? Based in London Technical Evangelist for Apache Cassandra

    Work on Stubbed CassandraHelp out Apache Cassandra users

    Previous: Cassandra backed apps at BSkyB

  • @chbatey

    Overview Why distributed databases? Cassandra fundamentals Java driver Example: Customer events

  • @chbatey

    Why Cassandra?

  • @chbatey

    Building a web app

  • @chbatey

    Running multiple copies of your app

  • @chbatey

    Still in one DC?

  • @chbatey

    Handling hardware failure

  • @chbatey

    Handling hardware failure

  • @chbatey

    Cassandra fundamentals

  • @chbatey

    Cassandra

    Cassandra

    Distributed masterless database (Dynamo)

    Column family data model (Google BigTable)

  • @chbatey

    Datacenter and rack aware

    Europe

    Distributed master less database (Dynamo)

    Column family data model (Google BigTable)

    Multi data centre replication built in from the start

    USA

  • @chbatey

    Replication

    DC1 DC2

    client

    RF3 RF3

    C

    RC

    WRITECL = 1 We have replication!

    client

  • @chbatey

    Tunable ConsistencyData is replicated N timesEvery query that you execute you give a consistency-ALL-QUORUM-LOCAL_QUORUM-ONE

    Christos Kalantzis Eventual Consistency != Hopeful Consistency: http://youtu.be/A6qzx_HE3EU?list=PLqcm6qE9lgKJzVvwHprow9h7KMpb5hcUU

  • @chbatey

    Async Replication

  • @chbatey

    Async Replication

  • @chbatey

    Async Replication

  • @chbatey

    Java Driver

  • @chbatey

    DataStax Java Driver Open source

  • @chbatey

    Java driver 101 Cluster - singleton per app Session - singleton per keyspaceCluster cluster = Cluster.builder() .addContactPoint(host1, host2) .build();

    Session session = cluster.connect("customer_events");

  • @chbatey

    Load balancingData centre aware policyToken aware policyLatency aware policyWhitelist policy

    APP APP

    Async Replication

    DC1 DC2

  • @chbatey

    Multi Data Center Load Balancing Local nodes are queried first, if none are available the

    request will be sent to a remote data center

    Cluster cluster = Cluster.builder().addContactPoints("10.158.02.40", "10.158.02.44").withLoadBalancingPolicy(

    new DCAwareRoundRobinPolicy("DC1")).build();

    Name of the local DC

  • @chbatey

    Token Aware Load Balancing Nodes that own a replica of the data being read or

    written by the query will be contacted first

    Cluster cluster = Cluster.builder().addContactPoints("10.158.02.40", "10.158.02.44").withLoadBalancingPolicy(

    new TokenAwarePolicy(new DCAwareRoundRobinPolicy("DC1")))

    .build();

  • @chbatey

    Retry Policies The behaviour to adopt when a request returns a

    timeout or is unavailable.

    Cluster cluster = Cluster.builder().addContactPoints("10.158.02.40", "10.158.02.44").withRetryPolicy(FallthroughRetryPolicy.INSTANCE).withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy("DC1"))).build();

    DefaultRetryPolicy DowngradingConsistencyRetryPolicy FallthroughRetryPolicy LoggingRetryPolicy

  • @chbatey

    Reconnection Policies Policy that decides how often the reconnection to a

    dead node is attempted.

    Cluster cluster = Cluster.builder().addContactPoints("10.158.02.40", "10.158.02.44) .withRetryPolicy(DowngradingConsistencyRetryPolicy.INSTANCE).withReconnectionPolicy(new ConstantReconnectionPolicy(1000)).withLoadBalancingPolicy(new TokenAwarePolicy(new DCAwareRoundRobinPolicy("DC1"))).build();

    ConstantReconnectionPolicy ExponentialReconnectionPolicy (Default)

  • @chbatey

    Session interfaceResultSet result = session.execute("select * from customer_events"); ResultSetFuture resultSetFuture = session.executeAsync("select * from customer_events);

    // prepare PreparedStatement preparedStatement = session.prepare("select * from where customer_id = ?); // execute BoundStatement boundStatement = preparedStatement.bind(chbatey"); ResultSet execute = session.execute(boundStatement);

    Guava listenable future!

    Statement simpleStatement = new SimpleStatement("select * from customer_events"); simpleStatement.setConsistencyLevel(ConsistencyLevel.QUORUM);ResultSetFuture resultSetFuture = session.executeAsync(simpleStatement);

  • @chbatey

    RxJava?public Observable getCustomerEventsObservable(String customerId) { BoundStatement boundStatement = getEventsForCustomer.bind(customerId); ListenableFuture resultSetFuture = session.executeAsync(boundStatement); // map to an obserable Observable observable = Observable.from(resultSetFuture, Schedulers.io());

    // ResultSet is an Iterator Observable rowObservable = observable.flatMapIterable(eachRow -> eachRow); return rowObservable.map(row -> new CustomerEvent( row.getString("customer_id"), row.getUUID("time"), row.getString("staff_id"), row.getString("store_type"), row.getString("event_type"), row.getMap("tags", String.class, String.class)));

    http://christopher-batey.blogspot.com/2014/12/streaming-large-payloads-over-http-from.htmlhttp://www.datastax.com/dev/blog/java-driver-async-queries

  • @chbatey

    Example Time: Customer event store

  • @chbatey

    An example: Customer event store Customer event

    - customer_id e.g ChrisBatey- event_type e.g login, logout, add_to_basket, remove_from_basket, buy_item

    Staff- name e.g Charlie- favourite_colour e.g red

    Store- name- type e.g Website, PhoneApp, Phone, Retail

  • @chbatey

    Requirements Get all events for a particular customer As above for a time slice Future requirement: Get all events by staff member?

  • @chbatey

    Modelling in a relational databaseCREATE TABLE customer_events( customer_id text, staff_name text, time timeuuid, event_type text, store_name text, PRIMARY KEY (customer_id)); CREATE TABLE store( name text, location text, store_type text, PRIMARY KEY (name)); CREATE TABLE staff( name text, favourite_colour text, job_title text, PRIMARY KEY (name));

  • @chbatey

    Reading thisclient

    RF3

    C

    Read latest customer eventCL = QUORUM

    Read store or staff information

    C

  • @chbatey

    You must denormalise

  • @chbatey

    Your model should look like your queries

  • Modelling in CassandraCREATE TABLE customer_events(

    customer_id text, staff_id text, time timeuuid, store_type text, event_type text, tags map, PRIMARY KEY ((customer_id), time));

    Partition Key

    Clustering Column(s)

  • How it is stored on diskcustomer

    _idtime event_type store_type tags

    charles 2014-11-18 16:52:04 basket_add online {'item': 'coffee'}charles 2014-11-18 16:53:00 basket_add online {'item': wine'}charles 2014-11-18 16:53:09 logout online {}chbatey 2014-11-18 16:52:21 login online {}chbatey 2014-11-18 16:53:21 basket_add online {'item': 'coffee'}chbatey 2014-11-18 16:54:00 basket_add online {'item': 'cheese'}

    charles event_typebasket_addstaff_idn/a

    store_typeonline

    tags:itemcoffee

    event_typebasket_add

    staff_idn/a

    store_typeonline

    tags:itemwine

    event_typelogout

    staff_idn/a

    store_typeonline

    chbatey event_typeloginstaff_idn/a

    store_typeonline

    event_typebasket_add

    staff_idn/a

    store_typeonline

    tags:itemcoffee

    event_typebasket_add

    staff_idn/a

    store_typeonline

    tags:itemcheese

  • @chbatey

    Get all the eventspublic List getAllCustomerEvents() { return session.execute("select * from customers.customer_events") .all().stream() .map(mapCustomerEvent()) .collect(Collectors.toList()); }

    private Function mapCustomerEvent() { return row -> new CustomerEvent( row.getString("customer_id"), row.getUUID("time"), row.getString("staff_id"), row.getString("store_type"), row.getString("event_type"), row.getMap("tags", String.class, String.class)); }

  • @chbatey

    All events for a particular customerprivate PreparedStatement getEventsForCustomer; @PostConstructpublic void prepareSatements() { getEventsForCustomer = session.prepare("select * from customers.customer_events where customer_id = ?"); } public List getCustomerEvents(String customerId) { BoundStatement boundStatement = getEventsForCustomer.bind(customerId); return session.execute(boundStatement) .all().stream() .map(mapCustomerEvent()) .collect(Collectors.toList()); }

  • @chbatey

    Customer events for a time slicepublic List getCustomerEventsForTime(String customerId, long startTime, long endTime) { Select.Where getCustomers = QueryBuilder.select() .all() .from("customers", "customer_events") .where(eq("customer_id", customerId)) .and(gt("time", UUIDs.startOf(startTime))) .and(lt("time", UUIDs.endOf(endTime))); return session.execute(getCustomers).all().stream() .map(mapCustomerEvent()) .collect(Collectors.toList()); }

  • @chbatey

  • @chbatey

    Mapping API@Table(keyspace = "customers", name = "customer_events") pu