Cassandra Day Atlanta 2015: Software Development with Apache Cassandra: A Walkthrough

Date posted: 15-Jul-2015

  • CASSANDRA DAY ATLANTA 2015

    SOFTWARE DEVELOPMENT WITH CASSANDRA: A WALKTHROUGH

    Nate McCall, @zznate

    #CassandraDays

    Co-Founder & Sr. Technical Consultant

    Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License

  • About The Last Pickle.

    Work with clients to deliver and improve Apache Cassandra based solutions.

    Based in New Zealand & USA.

  • OVERVIEW

  • Overview:

    What makes a software development project successful?

  • Overview: Successful Software Development

    - it ships
    - maintainable
    - good test coverage
    - check out and build

  • Overview:

    Impedance mismatch: distributed systems development on a laptop.

  • DATA MODELING

  • Data Modeling:

    a topic unto itself. But quickly:

  • Data Modeling - Quickly

    - It's hard
    - Do research
    - #1 performance problem
    - Tip: don't port your schema

  • Data Modeling - Using CQL:

    - tools support
    - easy tracing (and trace discovery)
    - documentation*

    *Maintained in-tree:
    https://github.com/apache/cassandra/blob/cassandra-1.2/doc/cql3/CQL.textile
    https://github.com/apache/cassandra/blob/cassandra-2.0/doc/cql3/CQL.textile
    https://github.com/apache/cassandra/blob/cassandra-2.1/doc/cql3/CQL.textile

  • Data Modeling - DevCenter :

    Tools: DataStax DevCenter

    http://www.datastax.com/what-we-offer/products-services/devcenter

  • WRITING CODE

  • Writing Code:

    ORM? Maybe - only if it's very simple

    more later

    http://www.datastax.com/documentation/developer/java-driver/2.1/java-driver/reference/crudOperations.html

  • Writing Code:

    use CQL

  • Writing Code:

    Use the Java Driver

  • Writing Code - Java Driver :

    - Reference implementation
    - Well written, extensive coverage
    - open source
    - dedicated resources

    https://github.com/datastax/java-driver/

  • Writing Code - Java Driver :

    Existing Spring Users: Spring Data Integration

    http://projects.spring.io/spring-data-cassandra/

  • Writing Code - Java Driver :

    Guice Users: GuicyFig:

    Archaius + Guice

    https://stash.safehaus.org/projects/GFIG/repos/main/browse

  • Writing Code - Java Driver :

    Four rules for Writing Code:
    - one Cluster per physical cluster
    - one Session per app per keyspace
    - use PreparedStatements
    - use Batches to reduce network IO

  • Writing Code - Java Driver :

    Configuration is Similar to Other DB Drivers (with caveats**)

    http://www.datastax.com/documentation/developer/java-driver/2.1/common/drivers/reference/clusterConfiguration_c.html

  • Writing Code - Java Driver - Configuration:

    Major Difference: it's a Cluster!

  • Writing Code - Java Driver - Configuration:

    Two groups of configurations

    - policies
    - connections

  • Writing Code - Java Driver - Configuration:

    Three Policy Types:
    - load balancing
    - reconnection
    - retry

  • Writing Code - Java Driver - Configuration:

    Connection Options:
    - protocol*
    - pooling
    - socket

    *https://github.com/apache/cassandra/blob/cassandra-2.1/doc/native_protocol_v3.spec

  • Writing Code - Java Driver :

    Embrace Asynchronicity (but use RxJava)

    https://github.com/ReactiveX/RxJava
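
The asynchronous style this slide recommends can be sketched without any driver dependency. The example below swaps in the JDK's `CompletableFuture` for RxJava purely to stay self-contained; the `AsyncFanOut` class and the `query` function are hypothetical stand-ins for real asynchronous driver calls, not part of the Java driver. It issues one request per key without blocking and gathers the results in key order once all of them have completed.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

// Hypothetical helper: fans out one asynchronous request per key.
class AsyncFanOut {
    static <K, V> CompletableFuture<List<V>> fetchAll(
            List<K> keys, Function<K, CompletableFuture<V>> query) {
        // Start every request up front; nothing blocks here.
        List<CompletableFuture<V>> inFlight =
                keys.stream().map(query).collect(Collectors.toList());
        // Complete once all requests are done, then collect results in key order.
        return CompletableFuture
                .allOf(inFlight.toArray(new CompletableFuture[0]))
                .thenApply(done -> inFlight.stream()
                        .map(CompletableFuture::join)
                        .collect(Collectors.toList()));
    }
}
```

The caller decides where to block, if at all: `fetchAll(keys, query).join()` waits for the whole batch, while `thenAccept(...)` keeps the pipeline non-blocking.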

  • Writing Code - Java Driver :

    A note about User Defined Types (UDTs)

  • Writing Code - Java Driver - Using UDTs:

    Wait.
    - serialized as blobs !!?!
    - new version already being discussed*
    - will be a painful migration path

    * https://issues.apache.org/jira/browse/CASSANDRA-7423

  • Writing Code:

    Metrics API for your own code

    https://github.com/apache/cassandra/blob/cassandra-2.1/src/java/org/apache/cassandra/metrics/ColumnFamilyMetrics.java
    https://dropwizard.github.io/metrics/3.1.0/

  • Writing Code - Instrumentation via Metrics API:

    Run Riemann locally

    http://riemann.io/

  • Writing Code:

    Using Trace (and doing so frequently)

  • Writing Code - Tracing:

    Trace per query via DevCenter

    http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/tracing_r.html

  • Writing Code - Tracing:

    Trace per query via cqlsh

    http://www.datastax.com/documentation/cql/3.0/cql/cql_reference/tracing_r.html

  • cqlsh> tracing on;
    Now tracing requests.
    cqlsh> SELECT doc_version FROM data.documents_by_version
       ... WHERE application_id = 'myapp'
       ... AND document_id = 'foo'
       ... AND chunk_index = 0
       ... ORDER BY doc_version ASC
       ... LIMIT 1;

     doc_version
    -------------
           65856

    Tracing session: 46211ab0-2702-11e4-9bcf-8d157d448e6b

  • Preparing statement | 18:05:44,845 | 192.168.1.197 | 22337
    Enqueuing data request to /192.168.1.204 | 18:05:44,845 | 192.168.1.197 | 22504
    Sending message to /192.168.1.204 | 18:05:44,847 | 192.168.1.197 | 24498
    Message received from /192.168.1.197 | 18:05:44,854 | 192.168.1.204 | 872
    Executing single-partition query on documents_by_version | 18:05:44,888 | 192.168.1.204 | 35183
    Acquiring sstable references | 18:05:44,888 | 192.168.1.204 | 35459
    Merging memtable tombstones | 18:05:44,889 | 192.168.1.204 | 35675
    Key cache hit for sstable 2867 | 18:05:44,889 | 192.168.1.204 | 35792
    Seeking to partition beginning in data file | 18:05:44,889 | 192.168.1.204 | 35817

  • Merging data from memtables and 8 sstables | 18:05:44,892 | 192.168.1.204 | 38605
    Read 1 live and 2667 tombstoned cells | 18:05:54,135 | 192.168.1.204 | 9282428
    Enqueuing response to /192.168.1.197 | 18:05:54,136 | 192.168.1.204 | 9283423
    Sending message to /192.168.1.197 | 18:05:54,138 | 192.168.1.204 | 9284753
    Message received from /192.168.1.204 | 18:05:54,155 | 192.168.1.197 | 9332505
    Processing response from /192.168.1.204 | 18:05:54,158 | 192.168.1.197 | 9335372
    Request complete | 18:05:54,158 | 192.168.1.197 | 9335592

    !!?!
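
One way to read a trace like the one above is to look for the largest jump in the last column (source_elapsed, in microseconds) on a single node. The sketch below is a hypothetical helper, not part of cqlsh or the Java driver; it assumes rows in the `activity | timestamp | source | source_elapsed` layout shown in the cqlsh output.

```java
import java.util.List;

// Hypothetical helper: finds the trace step preceded by the longest pause on one node.
class TraceGaps {
    static String slowestStep(List<String> rows, String source) {
        long previous = 0;       // source_elapsed of the previous event on this node
        long worstGap = -1;
        String worstActivity = null;
        for (String row : rows) {
            String[] cols = row.trim().split("\\s*\\|\\s*");
            // Skip malformed rows and events recorded by other nodes.
            if (cols.length < 4 || !cols[2].equals(source)) {
                continue;
            }
            long elapsed = Long.parseLong(cols[3]);
            long gap = elapsed - previous;
            if (gap > worstGap) {
                worstGap = gap;
                worstActivity = cols[0];
            }
            previous = elapsed;
        }
        // null means the node recorded no events in these rows.
        return worstActivity == null ? null : worstActivity + " (+" + worstGap + " us)";
    }
}
```

Fed the rows recorded by 192.168.1.204 in the trace above, `slowestStep` returns `Read 1 live and 2667 tombstoned cells (+9243823 us)`: roughly nine seconds spent scanning past tombstones to return a single live cell.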

  • Writing Code - Tracing:

    Enable traces in the driver

    http://www.datastax.com/documentation/developer/java-driver/2.0/java-driver/tracing_t.html

  • Writing Code - Tracing:

    `nodetool settraceprobability`

  • Writing Code - Tracing:

    then make sure you try it again with a node down!

  • Writing Code - Tracing:

    Final note on tracing: do it sparingly
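
Tracing sparingly usually means sampling. `nodetool settraceprobability` does this on the server; a client-side analogue might look like the sketch below. `TraceSampler` is a hypothetical class, not a driver API: the caller would ask it before each request and only turn tracing on for that statement (for instance through the driver's `enableTracing()`) when it answers true.

```java
import java.util.Random;

// Hypothetical client-side sampler: decides per request whether to trace.
class TraceSampler {
    private final double probability;
    private final Random random;

    TraceSampler(double probability, Random random) {
        if (probability < 0.0 || probability > 1.0) {
            throw new IllegalArgumentException("probability must be in [0, 1]");
        }
        this.probability = probability;
        this.random = random;
    }

    // True for roughly `probability` of all calls; never true at 0.0, always true at 1.0.
    boolean shouldTrace() {
        return random.nextDouble() < probability;
    }
}
```

Passing the `Random` in keeps the sampler deterministic under test; a probability such as 0.001 traces about one request in a thousand.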

  • Writing Code - Tracing:

    Coming Soon: slow query log (client side)

    https://github.com/datastax/java-driver/compare/java646
    https://datastax-oss.atlassian.net/browse/JAVA-646
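
Until a driver-level feature is available, the idea of a client-side slow query log can be approximated in application code. The sketch below is an illustration only and does not reflect how the driver work linked above is implemented; `SlowQueryLog`, its threshold, and the injected clock are all hypothetical names. It times any call and records those that take at least the threshold.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical wrapper: records calls that take at least thresholdMillis.
class SlowQueryLog {
    private final long thresholdMillis;
    private final LongSupplier clockMillis;   // e.g. System::currentTimeMillis
    private final List<String> entries = new ArrayList<>();

    SlowQueryLog(long thresholdMillis, LongSupplier clockMillis) {
        this.thresholdMillis = thresholdMillis;
        this.clockMillis = clockMillis;
    }

    // Runs the query and logs it when it is slow, even if it throws.
    <T> T timed(String label, Supplier<T> query) {
        long start = clockMillis.getAsLong();
        try {
            return query.get();
        } finally {
            long took = clockMillis.getAsLong() - start;
            if (took >= thresholdMillis) {
                entries.add(label + " took " + took + " ms");
            }
        }
    }

    List<String> entries() {
        return entries;
    }
}
```

In an application the supplier would wrap a synchronous query call, and `entries()` would be replaced by a real logger. The class is not thread-safe as written.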

  • Writing Code:

    Logging Verbosity can be changed dynamically**

    ** since 0.4rc1

    http://www.datastax.com/documentation/cassandra/2.0/cassandra/configuration/configLoggingLevels_r.html

  • Writing Code:

    nodetool for developers:
    - cfstats
    - cfhistograms
    - proxyhistograms

  • Writing Code - nodetool - cfstats:

    cfstats: per-table statistics about size and performance (single most useful command)

  • Writing Code - nodetool - cfhistograms:

    cfhistograms: column count and partition size vs. latency distribution

  • Writing Code - nodetool - proxyhistograms:

    proxyhistograms: performance of inter-cluster requests

  • MANAGING ENVIRONMENTS

  • Managing Environments:

    Configuration Management is Essential

  • Managing Environments:

    Laptop to Production with NO Manual Modifications!

  • Managing Environments:

    Running Cassandra during development

  • Managing Environments - Running Cassandra:

    Local Cassandra:
    - easy to set up
    - you control it
    - but then you c