Lambda at Weather Scale by Robbie Strickland

Posted on 16-Apr-2017



Who Am I?

Robbie Strickland
Director of Engineering, Analytics
rstrickland@weather.com
@rs_atl

An IBM Business

Who Am I?

• Contributor to C* community since 2010
• DataStax MVP 2014/15
• Author, Cassandra High Availability
• Founder, ATL Cassandra User Group


About TWC

~30 billion API requests per day

~120 million active mobile users

#3 most active mobile user base

~360 PB of traffic daily

Most weather data comes from us

Use Case

Billions of events per day (~1.3M per sec):
• Web/mobile beacons
• Logs
• Weather conditions + forecasts
• etc.

Keep data forever

Use Case

• Efficient batch + streaming analysis
• Self-serve data science
• BI / visualization tool support

Architecture

Attempt[0] Architecture

[Architecture diagram; recoverable labels, grouped:]
• Streaming sources (events, 3rd party) → RESTful enqueue service → Kafka → custom ingestion pipeline (stream processing)
• Batch sources (S3, other DBs, 3rd party) → ETL
• Storage and processing
• Data access: SQL, streaming
• Consumers: operational analytics, business analytics, executive dashboards, data discovery, data science, 3rd party system integration

Attempt[0] Data Model

CREATE TABLE events (
  timebucket bigint,
  timestamp bigint,
  eventtype varchar,
  eventid varchar,
  platform varchar,
  userid varchar,
  version int,
  appid varchar,
  useragent varchar,
  eventdata varchar,
  tags set<varchar>,
  devicedata map<varchar, varchar>,
  PRIMARY KEY ((timebucket, eventtype), timestamp, eventid)
) WITH CACHING = 'none'
AND COMPACTION = { 'class' : 'DateTieredCompactionStrategy' };

• Event payload == schema-less JSON
• Partitioned by time bucket + type
• Time-series data is a good fit for DTCS
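The slides don't say how the `timebucket` partition key is derived; a minimal sketch, assuming hourly buckets computed from the event's epoch-millisecond timestamp:

```python
from datetime import datetime, timezone

BUCKET_SECONDS = 3600  # assumption: hourly partitions; the real bucket size isn't in the deck

def time_bucket(ts_millis: int, bucket_seconds: int = BUCKET_SECONDS) -> int:
    """Truncate an epoch-millisecond event timestamp to its bucket start (epoch seconds)."""
    return (ts_millis // 1000) // bucket_seconds * bucket_seconds

# All events in the same hour (of the same type) land in one partition:
ts = int(datetime(2017, 4, 16, 10, 42, 7, tzinfo=timezone.utc).timestamp() * 1000)
print(time_bucket(ts))
```

Any deterministic truncation works; the point is that (timebucket, eventtype) bounds partition size by time window and event type.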

Attempt[0] tl;dr

• C* everywhere
• Streaming data via custom ingest process
• Kafka backed by RESTful service
• Batch data via Informatica
• Spark SQL through ODBC
• Schema-less event payload
• Date-tiered compaction

Attempt[0] Lessons

• Batch loading large data sets into C* is silly… and expensive… and using Informatica to do it is SLOW
• Kafka + REST services == unnecessary
• No viable open source C* Hive driver
• DTCS is broken (see CASSANDRA-9666)
• Schema-less == bad:
  - Must parse JSON to extract key data
  - Expensive to analyze by event type
  - Cannot tune by event type

Attempt[1] Architecture

[Architecture diagram; recoverable labels, grouped:]
• Streaming sources (events, 3rd party) → Amazon SQS → custom ingestion pipeline (stream processing)
• Batch sources (S3, other DBs, 3rd party) → ETL
• Long term raw storage (data lake)
• Short term storage and big data processing
• Data access: SQL, streaming
• Consumers: operational analytics, business analytics, executive dashboards, data discovery, data science, 3rd party system integration

Attempt[1] Data Model

• Each event type gets its own table
• Tables individually tuned based on workload
• Schema applied at ingestion:
  - We're reading everything anyway
  - Makes subsequent analysis much easier
  - Allows us to filter junk early

Attempt[1] tl;dr

Use C* for streaming data:
• Rolling time window (TTL depends on type)
• Real-time access to events
• Data locality makes Spark jobs faster

Attempt[1] tl;dr

Everything else in S3:
• Batch data loads (mostly logs)
• Daily C* backups
• Stored as Parquet
• Cheap, scalable long-term storage
• Easy access from Spark
• Easy to share internally & externally
• Open source Hive support

Attempt[1] tl;dr

Kafka replaced by SQS:
• Scalable & reliable
• Already fronted by a RESTful interface
• Nearly free to operate (nothing to manage)
• Robust security model
• One queue per event type/platform
• Built-in monitoring
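The deck doesn't show the enqueue code; a minimal sketch of what producing one event to a per-type SQS queue might look like. The message field names (`eventtype`, `platform`, `data`) and queue URL are assumptions, not TWC's actual format:

```python
import json

def build_event_message(event_type: str, platform: str, payload: dict) -> dict:
    """Build the kwargs for an SQS SendMessage call for one event.

    Field names are hypothetical; the deck only says there is one queue
    per event type/platform.
    """
    return {
        "MessageBody": json.dumps(
            {"eventtype": event_type, "platform": platform, "data": payload}
        ),
    }

msg = build_event_message("beacon", "ios", {"lat": 33.7})

# Actually sending requires boto3 and AWS credentials (not run here):
# import boto3
# sqs = boto3.client("sqs")
# sqs.send_message(QueueUrl=QUEUE_URL, **msg)
```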

Attempt[1] tl;dr

DTCS replaced by Time-Window Compaction:
• Developed by Jeff Jirsa at CrowdStrike
• Groups similar timestamps/expirations together
• Simply deletes expired sstables
• Improved stability & throughput
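A sketch of what switching the events table to TWCS might look like. The window size is an assumption; the option names are those of the upstream strategy later merged via CASSANDRA-9666 (on C* 2.1/2.2 it shipped as a separate CrowdStrike jar with a different class path):

```sql
-- Hypothetical: switch a per-event-type table from DTCS to TWCS.
ALTER TABLE events WITH COMPACTION = {
  'class' : 'TimeWindowCompactionStrategy',
  'compaction_window_unit' : 'DAYS',   -- assumption: daily windows
  'compaction_window_size' : 1
};
```

With TTL'd time-series data, each window's sstables expire together, so whole sstables can simply be dropped.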

Fine Print

Use C* >= 2.1.8:
• CASSANDRA-9637 - fixes Spark input split computation
• CASSANDRA-9549 - fixes memory leak
• CASSANDRA-9436 - exposes rpc/broadcast addresses for Spark/cloud environments

Version incompatibilities abound (check sbt file for Spark-Cassandra connector)

Fine Print

Two main Spark clusters:
• Co-located with C* for heavy analysis
  - Predictable load
  - Efficient C* access
• Self-serve in same DC but not co-located
  - Unpredictable load
  - Favors mining S3 data
  - Isolated from production jobs

Data Modeling

Partitioning

Opposite strategy from “normal” C* modeling:
• Model for good parallelism… not for single-partition queries
• Avoid shuffling for most cases
  - Shuffles occur when NOT grouping by partition key
  - Partition for your most common grouping
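The shuffle-avoidance point above can be sketched locally. In the deck's model the partition key is (timebucket, eventtype); any aggregation keyed by that pair never needs to combine rows from different partitions, which is what lets the Spark-Cassandra connector serve it without a shuffle. A toy illustration (plain Python, no Spark):

```python
from collections import defaultdict

# Toy events; partition key is (timebucket, eventtype) as in the deck's schema.
events = [
    {"timebucket": 3600, "eventtype": "beacon", "userid": "a"},
    {"timebucket": 3600, "eventtype": "beacon", "userid": "b"},
    {"timebucket": 3600, "eventtype": "log", "userid": "a"},
]

# Counting by partition key: every row contributing to a given count already
# shares that key, i.e. lives in one C* partition on one node, so no
# cross-node data movement is needed.
counts = defaultdict(int)
for e in events:
    counts[(e["timebucket"], e["eventtype"])] += 1

print(dict(counts))  # {(3600, 'beacon'): 2, (3600, 'log'): 1}
```

Grouping by anything else (say, `userid`) would require gathering rows across partitions, which is exactly the shuffle the slides warn about.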

Secondary Indexes

• Useful for C*-level filtering
• Reduces Spark workload and RAM footprint
• Low cardinality is still the rule

Secondary Indexes (Client Access)

Secondary Indexes (with Spark)

Full-text Indexes

• Enabled via Stratio-Lucene custom index (https://github.com/Stratio/cassandra-lucene-index)
• Great for C*-side filters
• Same access pattern as secondary indexes

Full-text Indexes

CREATE CUSTOM INDEX email_index ON emails (lucene)
USING 'com.stratio.cassandra.lucene.Index'
WITH OPTIONS = {
  'refresh_seconds' : '1',
  'schema' : '{
    fields : {
      id      : {type : "integer"},
      user    : {type : "string"},
      subject : {type : "text", analyzer : "english"},
      body    : {type : "text", analyzer : "english"},
      time    : {type : "date", pattern : "yyyy-MM-dd hh:mm:ss"}
    }
  }'
};

Full-text Indexes

SELECT * FROM emails WHERE lucene = '{
  filter : {type:"range", field:"time", lower:"2015-05-26 20:29:59"},
  query  : {type:"phrase", field:"subject", values:["test"]}
}';

SELECT * FROM emails WHERE lucene = '{
  filter : {type:"range", field:"time", lower:"2015-05-26 18:29:59"},
  query  : {type:"fuzzy", field:"subject", value:"thingy", max_edits:1}
}';
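Hand-writing the search expression string is error-prone; a small sketch of building it with `json.dumps` so the JSON handed to the `lucene` pseudo-column is always well-formed (the driver call at the end is illustrative and not run):

```python
import json

def lucene_expr(filter_clause: dict, query_clause: dict) -> str:
    """Serialize a Stratio-Lucene search expression for the 'lucene' pseudo-column."""
    return json.dumps({"filter": filter_clause, "query": query_clause})

expr = lucene_expr(
    {"type": "range", "field": "time", "lower": "2015-05-26 20:29:59"},
    {"type": "phrase", "field": "subject", "values": ["test"]},
)

# With the DataStax Python driver (not run here):
# session.execute("SELECT * FROM emails WHERE lucene = %s", [expr])
```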

Caution: Wide Rows

• It only takes one to ruin your day
• Monitor cfstats for max partition bytes
• Use toppartitions to find hot keys

Avoid Nulls

• Nulls are deletes
• Deletes create tombstones
• Don’t write nulls!
• Beware of nulls in prepared statements
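A minimal sketch of the "don't write nulls" rule: drop None-valued columns before binding the insert, so unset fields are simply omitted rather than written as tombstones. The statement-building at the end is illustrative, not TWC's actual ingest code:

```python
def strip_nulls(row: dict) -> dict:
    """Drop None-valued columns so the INSERT never writes tombstones."""
    return {k: v for k, v in row.items() if v is not None}

row = {"eventid": "e1", "platform": "ios", "useragent": None, "appid": None}
clean = strip_nulls(row)
print(clean)  # {'eventid': 'e1', 'platform': 'ios'}

# Bind only the surviving columns (shape is illustrative):
# cols = ", ".join(clean)
# marks = ", ".join(["%s"] * len(clean))
# session.execute(f"INSERT INTO events ({cols}) VALUES ({marks})", list(clean.values()))
```

This matters most with prepared statements, where binding None for an unset column writes a null (a delete) instead of leaving the column alone.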

Data Exploration

Data Warehouse Paradigm - Old

Ingest → Model → Transform → Design → Visualize

Data Warehouse Paradigm - New

Ingest → Explore → Analyze → Deploy → Visualize

Visualization

• Critical to understanding your data
• Reduced time to visualization… from >1 month to minutes (!!)
• Waterfall to agile

Zeppelin

• Open source Spark notebook
• Interpreters for Scala, Python, Spark SQL, CQL, Hive, Shell, & more
• Data visualizations
• Scheduled jobs


Future Work

FiloDB

• Low latency time-series aggregations using Spark + Cassandra/in-memory storage
• Space efficient – similar to Parquet
• SQL queries using ODBC/JDBC

Direct to Parquet

• Stream to Parquet directly
• Eliminate interim storage
• Currently in R&D

We’re Hiring!

Robbie Strickland
rstrickland@weather.com