Austin Cassandra Users 6/19: Apache Cassandra at Vast

For our June meetup, we'll have our local friends at Vast presenting some of their current use cases for Cassandra. Additionally, Vast will be talking about a non-blocking Scala client that they have developed in-house.

Transcript of Austin Cassandra Users 6/19: Apache Cassandra at Vast

  • 1. June 19, 2014 Cassandra at Vast Graham Sanderson - CTO, David Pratt - Director of Applications

  • 2. Introduction
      • Don't want this to be a data modeling talk
      • We aren't experts - we are learning as we go
      • Hopefully this will be useful to both you and us
      • Informal, questions as we go
      • We will share our experiences so far moving to Cassandra
      • We are working on a bunch of existing and new projects
      • We'll talk about 2 1/2 of them
      • Some dev stuff, some ops stuff
      • Some thoughts for the future
      • Athena Scala Driver

  • 3. Who is Vast?
      • Vast operates white-label, performance-based marketplaces for publishers, and delivers big data mobile applications for automotive and real estate sales professionals
      • Big Data for Big Purchases
      • Marketplaces
          • Large partner sites, including AOL, CARFAX, TrueCar, Realogy, USAA, Yahoo
          • Hundreds of smaller partner sites
      • Analytics
          • Strong team of scarily smart data scientists
          • Integrating analytics everywhere

  • 4. Big Data
      • HDFS - 1100TB
      • Amazon S3 - 275TB
      • Amazon Glacier - 150TB
      • DynamoDB - 12TB
      • Vertica - 2TB
      • Cassandra - 1.5TB
      • SOLR/Lucene - 400GB
      • Zookeeper
      • MySQL
      • Postgres
      • Redis
      • CouchDB

  • 5. Data Flow
      • Flows between different data store types (many include historical data too)
      • Systems of Record (SOR)
          • Both root nodes and leaf nodes
      • Derived data stores (mostly MVCC) for:
          • Real time customer facing queries
          • Real time analytics
          • Alerting
          • Offline analytics
          • Reporting
          • Debugging
      • Mixture of dumps and deltas
      • We have derived SORs
          • Cached smaller subset records/fields for a specific purpose
      • SORs in multiple data centers - some derived SORs shared
      • Data flow is a graph, not a tree - feedback

  • 6. Goals
      • Reduce latency