  • Running Cassandra on Amazon ECS

    Anirvan Chakraborty


  • Agenda

    Motivation




    Cassandra on Docker best practices

    Cassandra on ECS

  • Motivation

  • Motivation

    Ease of development

    Support polyglot languages, frameworks and


    Operational simplicity

    Quick feedback loop

  • Docker

  • Docker history

    Came out of dotCloud, a PaaS company

    Was originally written in Python

    Rewritten in Go in February 2013

    Docker 0.1 was released in March 2013

    Docker 1.10 is the latest release

  • Docker tag line

    Build, ship and run any app, anywhere

  • Docker tag line

    Build: package your application in a container

    Ship: move it between machines

    Run: execute that container with your application

    Any application: as long as it runs on Linux

    Anywhere: local VM, bare metal, cloud instances

  • Why Docker?

    Deploy reliably & consistently

    Execution is fast and lightweight

    Simplicity

    Developer friendly workflow

    Fantastic community

  • Apache Cassandra

  • What is Apache Cassandra?

    Fast distributed database

    High Availability

    Linear Scalability

    Predictable performance

    No single point of failure


    Easy to manage

    Can use commodity hardware

    Not a drop in replacement for RDBMS

  • Hash ring

    Data is partitioned around the ring

    Location of data in the ring is determined by the partition key


    Data is replicated to N servers based on RF

    All nodes hold data and can answer read or write requests
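
    The ring placement described on this slide can be sketched in a few lines. This is a simplified illustration, not Cassandra's implementation: it assumes one token per node (no virtual nodes), uses MD5 only as a convenient stand-in hash, and places replicas by walking clockwise from the key's position. The names token, HashRing and replicas are hypothetical.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a position on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Toy hash ring: one token per node, replicas chosen clockwise."""

    def __init__(self, nodes, replication_factor):
        if replication_factor > len(nodes):
            raise ValueError("replication factor cannot exceed node count")
        self.rf = replication_factor
        # Sort nodes by their token so the list represents the ring order.
        self.ring = sorted((token(node), node) for node in nodes)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, partition_key: str):
        """Return the RF nodes responsible for this partition key."""
        # First node whose token is >= the key's token, wrapping past the end.
        start = bisect.bisect_left(self.tokens, token(partition_key)) % len(self.ring)
        return [self.ring[(start + i) % len(self.ring)][1] for i in range(self.rf)]


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
    print(ring.replicas("user:42"))
```

    Any node can compute the same replica list for a key, which is why any node can coordinate a read or write.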



  • CAP Tradeoff

    During network partition it is impossible to be both consistent and highly available

    Latency between data centres also makes consistency impractical

    Cassandra chooses Availability & Partition tolerance over Consistency

  • Replication

    Choose replication factor or RF

    Data is always replicated to each replica

    If a node is down, missing data is replayed via hinted handoff


  • Consistency level


    Per query consistency


    How many replicas must respond OK for a query to succeed
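
    As a small worked example of per-query consistency, the sketch below computes a quorum for a given RF and checks the usual overlap rule: a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number written exceeds RF. The function names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """Majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def reads_see_latest_write(read_replicas: int, write_replicas: int,
                           replication_factor: int) -> bool:
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


if __name__ == "__main__":
    rf = 3
    q = quorum(rf)
    print(q)                                  # quorum size for RF = 3
    print(reads_see_latest_write(q, q, rf))   # QUORUM reads + QUORUM writes
    print(reads_see_latest_write(1, 1, rf))   # ONE reads + ONE writes
```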


  • Amazon EC2 Container Service

  • What is ECS?

    Amazon EC2 Container Service (ECS) is a highly scalable, fast container management service that makes it easy to

    run, stop, and manage Docker containers on a cluster of Amazon EC2 instances.

  • What is ECS?

    Amazon Docker as a Service

  • How does ECS work

  • Cluster

  • Container Instance

  • ECS Agent

  • Task

  • ECS Service

  • Typical ECS workflow

    Build Docker image using whatever you want.

    Push image to registry.

    Create JSON file describing your task definition.

    Register this task definition with ECS.

    Make sure that your cluster has enough resources.


    Start a new task from the task definition.
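
    The "JSON file describing your task definition" step might look like the sketch below, which builds a minimal task definition for a single Cassandra container and prints it as JSON. The image name, memory figure, host path and the function name are assumptions made for illustration; the field names (family, containerDefinitions, portMappings, mountPoints, volumes) follow the ECS task definition format.

```python
import json


def cassandra_task_definition(image: str, memory_mib: int, host_data_dir: str) -> dict:
    """Build a minimal ECS task definition for one Cassandra container."""
    return {
        "family": "cassandra",
        "containerDefinitions": [
            {
                "name": "cassandra",
                "image": image,
                "memory": memory_mib,
                "essential": True,
                "portMappings": [
                    {"containerPort": 9042, "hostPort": 9042},  # CQL native transport
                    {"containerPort": 7000, "hostPort": 7000},  # inter-node traffic
                ],
                "mountPoints": [
                    {
                        "sourceVolume": "cassandra-data",
                        "containerPath": "/var/lib/cassandra",
                    }
                ],
            }
        ],
        # Host volume so data survives container restarts on the same instance.
        "volumes": [
            {"name": "cassandra-data", "host": {"sourcePath": host_data_dir}}
        ],
    }


if __name__ == "__main__":
    # Hypothetical values; adjust to your own image and instance size.
    task = cassandra_task_definition("cassandra:2.2", 4096, "/mnt/cassandra")
    print(json.dumps(task, indent=2))
```

    Saved to a file, the output could then be registered with `aws ecs register-task-definition --cli-input-json file://cassandra-task.json` and started with `aws ecs run-task`.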

  • Tips & tricks

  • Dockerize C* Dev Environment

    Make it run as slow, but as stable as possible!

    Super low memory settings in


    Remove caches in dev mode in cassandra.yaml

    key_cache_size_in_mb: 0
    reduce_cache_sizes_at: 0
    reduce_cache_capacity_to: 0

  • Dockerize C* Production

    Use host networking (--net=host) for better network performance

    Put the data, commitlog and saved_caches directories on volumes mounted from the underlying host

    Run Cassandra in the foreground (-f)

    Tune JVM heap for optimal size

    Tune JVM garbage collector for your workload
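
    The production tips above can be combined into a single docker run invocation. The sketch below only assembles the argument list; it does not run Docker. It assumes an image whose Cassandra directories live under /var/lib/cassandra and whose cassandra-env.sh honours the MAX_HEAP_SIZE and HEAP_NEWSIZE environment variables, as the official image does to the best of our knowledge. The function name and the example values are hypothetical.

```python
import shlex


def cassandra_docker_command(image: str, host_data_dir: str,
                             max_heap: str, heap_newsize: str) -> list:
    """Assemble a `docker run` argument list following the tips on this slide."""
    return [
        "docker", "run", "-d",
        "--net=host",                                  # host networking
        "-v", f"{host_data_dir}:/var/lib/cassandra",   # data, commitlog, saved_caches on the host
        "-e", f"MAX_HEAP_SIZE={max_heap}",             # JVM heap (assumed env var support)
        "-e", f"HEAP_NEWSIZE={heap_newsize}",          # young generation size
        image,
        "cassandra", "-f",                             # stay in the foreground
    ]


if __name__ == "__main__":
    cmd = cassandra_docker_command("cassandra:2.2", "/mnt/cassandra", "4G", "800M")
    # Print a shell-safe version of the command for copy and paste.
    print(" ".join(shlex.quote(part) for part in cmd))
```

    Running in the foreground matters because Docker treats the container as stopped once its main process exits.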

  • Dockerize C* on ECS

    Simple service discovery using ECS API

    Custom Dockerfile and entry-point script to control Cassandra configuration

    Clean up downed node and repair cluster on node failover
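
    One way the "service discovery using ECS API" idea could work is sketched below: list the container instances in the cluster, resolve them to EC2 private IPs, and use a stable subset as Cassandra seeds. This is not the script used in the talk. The function name is hypothetical, the clients are passed in so the logic can be tried without AWS credentials, and pagination is ignored (a single list call is assumed to cover the cluster). The method and field names follow the boto3 ECS and EC2 clients.

```python
def discover_seed_ips(ecs_client, ec2_client, cluster: str, max_seeds: int = 3) -> list:
    """Return up to max_seeds private IPs of the cluster's container instances."""
    arns = ecs_client.list_container_instances(cluster=cluster)["containerInstanceArns"]
    if not arns:
        return []

    described = ecs_client.describe_container_instances(
        cluster=cluster, containerInstances=arns
    )
    instance_ids = [ci["ec2InstanceId"] for ci in described["containerInstances"]]

    reservations = ec2_client.describe_instances(InstanceIds=instance_ids)["Reservations"]
    ips = [
        instance["PrivateIpAddress"]
        for reservation in reservations
        for instance in reservation["Instances"]
        if "PrivateIpAddress" in instance
    ]
    # Sort (as strings) so every node derives the same seed list independently.
    return sorted(ips)[:max_seeds]
```

    With boto3 installed, the clients would be boto3.client("ecs") and boto3.client("ec2"); an entry-point script could then write the returned addresses into the seeds setting of cassandra.yaml before starting Cassandra.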

  • Where to next?

    Consider GlusterFS for Cassandra and Spark on ECS

    Consider Weave for networking with Docker on ECS

  • Thanks!

    Twitter: @cakesolutions

    Tel: 0845 617 1200


  • Resources