Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra


description

Presenter: Pierre-Yves Ritschard, CTO at Exoscale

Graphite is the go-to tool for sysadmins everywhere to store and retrieve time-series data. Cyanite is an alternative Graphite-compatible daemon that uses Cassandra as its main storage engine. The talk focuses on how to build efficient time-series data models in Cassandra and on how the ecosystem of tools around Cassandra can help process time series in batches, and it provides architectural insight into building truly scalable time-series pipelines.

Transcript of Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

Page 1: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

BETTER GRAPHITE STORAGE WITH CYANITE

PIERRE-YVES RITSCHARD (@PYR)

#CASSANDRASUMMIT

Page 2: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

@PYR

- CTO at exoscale, the safe home for your cloud applications
- Open source developer: pithos, cyanite, riemann, collectd…
- Recovering Operations Engineer

Page 3: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

AIM OF THIS TALK

- Presenting graphite and its ecosystem
- Presenting cyanite
- Showcasing simplicity through cassandra

Page 4: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

OUTLINE

- Graphite overview
- The problem with graphite
- Cyanite solutions & internals
- Looking forward

Page 5: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

GRAPHITE OVERVIEW

Page 6: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

FROM THE SITE

Graphite does two things:

1. Store numeric time-series data
2. Render graphs of this data on demand

http://graphite.readthedocs.org

Page 7: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

SCOPE

- A metrics tool
- Not a complete monitoring solution
- Interacts with metric submission tools

Page 8: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

WHY ARE METRICS IMPORTANT

- Outside the scope of this talk
- Narrowing the gap between map and territory

Page 9: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

GRAPHITE COMPONENTS

- whisper
- carbon
- graphite-web

Page 10: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

WHISPER

- RRD-like storage library
- Written in python
- Each file contains different roll-up periods and an aggregation method
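
To make the roll-up idea concrete, here is a minimal sketch of reducing raw points to a coarser resolution with a pluggable aggregation method. It is not whisper's actual code; the function name `roll_up` and the sample data are assumptions made for illustration.

```python
from statistics import mean


def roll_up(points, step, aggregate=mean):
    """Group (timestamp, value) pairs into buckets of `step` seconds
    and reduce each bucket to one value with `aggregate`."""
    buckets = {}
    for timestamp, value in points:
        # Align the timestamp to the start of its bucket.
        bucket_start = timestamp - (timestamp % step)
        buckets.setdefault(bucket_start, []).append(value)
    return [(start, aggregate(values)) for start, values in sorted(buckets.items())]


# Hypothetical 10-second samples rolled up to 60-second resolution.
raw = [(0, 1.0), (10, 3.0), (20, 5.0), (60, 2.0), (70, 4.0)]
per_minute_avg = roll_up(raw, 60)
per_minute_max = roll_up(raw, 60, aggregate=max)
```

The choice of aggregation method matters: averaging suits gauges such as CPU usage, while `max` or `sum` may fit peaks or counters better.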

Page 11: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

CARBON

- Asynchronous (twisted) TCP and UDP service to input time-series data
- Simple storage rules
- Split across several daemons
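
Carbon's plaintext input is one line per data point: metric path, value, and Unix timestamp separated by spaces. The sketch below formats and parses such lines; the helper names are hypothetical and no network connection is made.

```python
import time


def format_plaintext(path, value, timestamp=None):
    """Build one line of the carbon plaintext protocol: 'path value timestamp\\n'."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def parse_plaintext(line):
    """Split a plaintext line back into (path, value, timestamp)."""
    path, value, timestamp = line.split()
    return path, float(value), int(timestamp)


line = format_plaintext("collectd.web01.cpu-0", 0.5, 1400000000)
point = parse_plaintext(line)
```

Sending is then a matter of writing the encoded line to a TCP socket connected to the carbon listener.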

Page 12: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

CARBON-CACHE

- Main carbon daemon
- Temporarily caches values to RAM
- Writes out to whisper

Page 13: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

CARBON-AGGREGATOR

- Aggregates data and forwards to carbon-cache
- Less I/O strain on the filesystem
- At the expense of resolution

Page 14: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

CARBON-RELAY

- Provides sharding and replication
- Forwards to appropriate carbon-cache processes based on a provided hashing method
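
As an illustration of hash-based sharding, the sketch below picks a destination by hashing the metric path modulo the number of destinations. This is plain modulo hashing chosen for brevity, not carbon-relay's own algorithm; the destination names and the function name are assumptions.

```python
import hashlib

# Hypothetical carbon-cache destinations.
DESTINATIONS = ["cache-a:2004", "cache-b:2004", "cache-c:2004"]


def pick_destination(metric_path, destinations):
    """Map a metric path to one destination using a stable hash."""
    digest = hashlib.md5(metric_path.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(destinations)
    return destinations[index]


target = pick_destination("collectd.web01.cpu-0", DESTINATIONS)
```

With modulo hashing, adding or removing a destination remaps most metrics, which is one reason topology changes are painful when shards are tied to hosts.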

Page 15: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

GRAPHITE-WEB

- Simple Django-based HTTP API
- Persists configuration to SQL
- Data query and manipulation through a very simple DSL
- Graph rendering
- Composer client interface to build graphs

    ## sum CPU values
    sumSeries("collectd.web01.cpu-*")

    ## provide memory percentage
    alias(asPercent(web01.mem.used, sumSeries(web01.mem.*)), "mem percent")
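
DSL expressions like these are passed as the `target` parameter of graphite-web's `/render` endpoint. A minimal sketch of building such a URL follows; the host name and the `render_url` helper are assumptions, and no request is sent.

```python
from urllib.parse import urlencode


def render_url(base_url, target, start="-1h", fmt="json"):
    """Build a /render URL for one target expression (URL-encoded)."""
    query = urlencode({"target": target, "from": start, "format": fmt})
    return f"{base_url}/render?{query}"


# Hypothetical host; the target reuses the CPU series pattern from the slide.
url = render_url("http://graphite-host", "sumSeries(collectd.web01.cpu-*)")
```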

Page 16: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

SCREENSHOTS

Page 17: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

SCREENSHOTS

Page 18: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

ARCHITECTURE OVERVIEW

Page 19: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

MODULARITY IN GRAPHITE

- Recently improved
- A module can implement a storage strategy for graphite-web
- Carbon modularity is a bit harder

Page 20: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

THE GRAPHITE ECOSYSTEM

A wealth of tools is now graphite compatible

Page 21: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

STATSD

Very popular metric service to integrate within applications.

- Aggregates events in n-second windows
- Ships off to graphite

    statsd.increment 'session.open'
    statsd.gauge 'session.active', 370
    statsd.timing 'pdf.convert', 320
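
On the wire, client calls like these become short text datagrams of the form `name:value|type`, where `c` is a counter, `g` a gauge, and `ms` a timer. The sketch below shows that encoding only; the helper name `statsd_line` is hypothetical.

```python
def statsd_line(name, value, metric_type):
    """Encode one metric in the statsd text format: name:value|type."""
    return f"{name}:{value}|{metric_type}"


lines = [
    statsd_line("session.open", 1, "c"),      # counter increment
    statsd_line("session.active", 370, "g"),  # gauge
    statsd_line("pdf.convert", 320, "ms"),    # timing in milliseconds
]
```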

Page 22: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

COLLECTD

- Very popular collection daemon with a graphite destination
- Every conceivable system metric
- A wealth of additional metric sources (such as a fast statsd server)

    <plugin write_graphite>
      <carbon>
        Host "graphite-host"
      </carbon>
    </plugin>

Page 23: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

GRAPHITE-API

- Alternative to graphite-web
- Shares data manipulation code
- No persistence of configuration

Page 24: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

GRAFANA

- Increasingly popular alternative to graphite-web, with graphite-api
- Inspired by the kibana project for logstash
- Optional persistence to elasticsearch for configuration

Page 25: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

RIEMANN

Distributed system monitoring solution

(def graph! (graphite {:host "graphite-server"}))

(streams (where (service "http.404") (rate 5 graph!)))

Page 26: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

AND A LOT MORE

- syslog-ng
- logstash
- descartes
- tasseo
- jmxtrans

Page 27: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

HIGH VALUE PROJECT

- Active and friendly developer community
- Growing ecosystem
- Very few contenders

Page 28: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

THE PROBLEM WITH GRAPHITE

Page 29: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

ESSENTIALLY A SINGLE-HOST SOLUTION

- Built in the days when cacti reigned
- Innovative project at the time, which decoupled collection from storage and display

Page 30: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

THE WHISPER FILE FORMAT

- One file per metric
- Optimized for space, not speed
- Plenty of seeks
- Only shared storage option is NFS…
- In many ways can be seen as RRD in python

Page 31: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

SCALING STRATEGIES

- Tacked on after the fact
- The decoupled architecture means that both graphite-web and carbon need upfront knowledge of the location of shards

Page 32: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

SCALING OVERVIEW

Page 33: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

IT GETS A BIT HAIRY

- Cluster topology must be stored on all nodes
- Manual replication mechanism (through carbon-relay)
- Changing cluster topology means re-assigning shards by hand

Page 34: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

WHAT GRAPHITE CAN KEEP

- Persistence of configuration
- Local data manipulation

Page 35: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

WHAT GRAPHITE WOULD NEED

- Automatic shard assignment
- Replication
- Easy management
- Easy cluster topology changes (horizontal scalability)

Page 36: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

THE CYANITE APPROACH

- Leveraging Apache Cassandra to store time-series
- Leveraging Graphite for the interface

Page 37: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

A CASSANDRA-BACKED CARBON REPLACEMENT

- Written in clojure
- Async I/O
- No more whisper files
- Fast storage
- Horizontally scalable
- Interfaced with graphite-web through graphite-cyanite

Page 38: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

CYANITE DUTIES

- Providing graphite-compatible input methods (carbon listeners)
- Providing a way to retrieve metric names and metric time-series

Implemented as two protocols:

- A metric-store
- A path-store

The rest is up to the graphite ecosystem, through graphite-cyanite. The recommended companion is graphite-api.
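
The split into a metric-store and a path-store can be pictured as two small interfaces. The sketch below is a hypothetical Python rendering with an in-memory implementation; the method names are assumptions and do not reflect cyanite's actual Clojure protocols.

```python
from fnmatch import fnmatchcase
from typing import Dict, List, Protocol, Tuple


class PathStore(Protocol):
    """Hypothetical interface: knows which metric names exist."""
    def register(self, path: str) -> None: ...
    def find(self, pattern: str) -> List[str]: ...


class MetricStore(Protocol):
    """Hypothetical interface: stores and returns the time-series data."""
    def insert(self, path: str, timestamp: int, value: float) -> None: ...
    def fetch(self, path: str, start: int, end: int) -> List[Tuple[int, float]]: ...


class InMemoryStore:
    """Toy implementation of both interfaces, for illustration only."""

    def __init__(self) -> None:
        self._series: Dict[str, Dict[int, float]] = {}

    def register(self, path: str) -> None:
        self._series.setdefault(path, {})

    def find(self, pattern: str) -> List[str]:
        # Note: fnmatch's '*' also matches dots, unlike graphite globbing.
        return sorted(p for p in self._series if fnmatchcase(p, pattern))

    def insert(self, path: str, timestamp: int, value: float) -> None:
        self.register(path)
        self._series[path][timestamp] = value

    def fetch(self, path: str, start: int, end: int) -> List[Tuple[int, float]]:
        points = self._series.get(path, {})
        return sorted((t, v) for t, v in points.items() if start <= t <= end)


store = InMemoryStore()
store.insert("collectd.web01.cpu-0", 10, 0.5)
store.insert("collectd.web01.cpu-1", 10, 0.7)
store.insert("collectd.web01.mem.used", 10, 42.0)
```

Keeping name lookup separate from data retrieval lets each side be backed by whatever suits it best.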

Page 39: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

GETTING UP AND RUNNING

A simple configuration file

    carbon:
      host: "127.0.0.1"
      port: 2003
      readtimeout: 30
      rollups:
        - period: 60480
          rollup: 10
        - period: 105120
          rollup: 600
    http:
      host: "0.0.0.0"
      port: 8080
    logging:
      level: info
      files:
        - "/var/log/cyanite/cyanite.log"
    store:
      cluster: 'localhost'
      keyspace: 'metric'
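
The rollup values in this configuration can be read as retention windows, assuming `rollup` is the resolution in seconds and `period` is the number of points kept at that resolution. That reading is an assumption, but the arithmetic fits it neatly:

```python
# Values copied from the configuration; the interpretation is an assumption:
# "rollup" = seconds per point, "period" = number of points retained.
ROLLUPS = [
    {"period": 60480, "rollup": 10},
    {"period": 105120, "rollup": 600},
]

SECONDS_PER_DAY = 86400


def retention_days(rollup):
    """Total time span covered by one rollup definition, in days."""
    return rollup["period"] * rollup["rollup"] / SECONDS_PER_DAY


spans = [retention_days(r) for r in ROLLUPS]
```

Under that assumption, the first rule keeps 10-second points for 7 days and the second keeps 10-minute points for 730 days.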

Page 40: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

GRAPHITE-CYANITE

with graphite-web:

    STORAGE_FINDERS = (
        'cyanite.CyaniteFinder',
    )
    CYANITE_URLS = (
        'http://host:port',
    )

with graphite-api:

    cyanite:
      urls:
        - http://cyanite-host:port
    finders:
      - cyanite.CyaniteFinder

Page 41: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

LEADING ARCHITECTURE DRIVERS

- Simplicity
- Optimize for speed
- As few moving parts as possible
- Multi-tenancy
- Resource efficiency
- Remain compatible with the graphite ecosystem

Page 42: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

CYANITE INTERNALS

Page 43: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

CASSANDRA IS GREAT FOR TIME-SERIES

It bears repeating

- High write-to-read ratio workload
- No manual shard allocation or reassignment
- Sorted wide columns mean efficient retrieval of data

Page 44: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

A NEW STACK

Page 45: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

SIMPLE SCHEMA

    CREATE TABLE "metric" (
      tenant text,
      period int,
      rollup int,
      path text,
      time bigint,
      data list<double>,
      PRIMARY KEY ((tenant, period, rollup, path), time)
    )
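
One way to picture how an incoming point lands in this table: each rollup definition yields one row, with the partition key (tenant, period, rollup, path) and a `time` clustering column aligned to the rollup resolution. The sketch below is an interpretation, not cyanite's code; the tenant name, the row layout, and the TTL calculation are all assumptions.

```python
def rows_for_point(tenant, path, timestamp, value, rollups):
    """Return one hypothetical row per rollup for a single incoming point."""
    rows = []
    for r in rollups:
        # Align the timestamp to the start of its rollup bucket.
        bucket = timestamp - (timestamp % r["rollup"])
        rows.append({
            "partition_key": (tenant, r["period"], r["rollup"], path),
            "time": bucket,
            "value": value,  # would be appended to the row's data list
            "ttl": r["period"] * r["rollup"],  # assumed expiry, in seconds
        })
    return rows


rollups = [{"period": 60480, "rollup": 10}, {"period": 105120, "rollup": 600}]
rows = rows_for_point("acme", "collectd.web01.cpu-0", 1400000123, 0.5, rollups)
```

Because all points of one path at one resolution share a partition and are sorted by `time`, a time-range read becomes a single contiguous slice.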

Page 46: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

TAKING ADVANTAGE OF WIDE COLUMNS

Page 47: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

LOOKING FORWARD

Page 48: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

REPLACING MORE GRAPHITE PARTS, EXTENDING FUNCTIONALITY

- Implement graphite's data manipulation functions
- Remove the need for graphite-api or graphite-web when using grafana
- Finish providing multi-tenancy options

Page 49: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

PICKLE SUPPORT

- Easier integration in existing architectures
- Would allow integration with carbon-relay
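
For context, carbon's pickle protocol sends a pickled list of `(path, (timestamp, value))` tuples, prefixed by a 4-byte big-endian length header. The sketch below encodes and decodes such a message; the helper names are hypothetical, and this is not cyanite code.

```python
import pickle
import struct


def encode_pickle_batch(points):
    """Encode (path, timestamp, value) triples as one carbon pickle message."""
    payload = pickle.dumps(
        [(path, (timestamp, value)) for path, timestamp, value in points],
        protocol=2,
    )
    header = struct.pack("!L", len(payload))  # 4-byte unsigned length prefix
    return header + payload


def decode_pickle_batch(message):
    """Decode one message. Unpickling untrusted input is unsafe in general."""
    (size,) = struct.unpack("!L", message[:4])
    payload = message[4:4 + size]
    return [(path, ts, value) for path, (ts, value) in pickle.loads(payload)]


batch = [("collectd.web01.cpu-0", 1400000000, 0.5),
         ("collectd.web01.cpu-1", 1400000000, 0.7)]
message = encode_pickle_batch(batch)
decoded = decode_pickle_batch(message)
```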

Page 50: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

ALTERNATIVE INPUT METHODS

- Support queue input of metrics
- Collectd already supports shipping graphite data to Apache Kafka
- Support the statsd protocol directly

Page 51: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

PROVIDE A CYANITE LIBRARY

Easy, standard-compliant storage from JVM-based applications

Page 52: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

BATCH OPERATIONS

- Compactions of rolled up series
- Dynamic thresholds
- Great opportunity to leverage the cassandra & spark interaction
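
One possible reading of "compaction" here: if each rolled-up time slot holds a list of raw values, a batch job could collapse every list to a single aggregated value. The sketch below illustrates that reading on plain dictionaries; it is an assumption about the intent, not a description of an existing cyanite feature.

```python
from statistics import mean


def compact(rows, aggregate=mean):
    """Collapse each slot's list of values into a one-element list."""
    return {time: [aggregate(values)] for time, values in rows.items() if values}


# Hypothetical rows: time slot -> values collected during that slot.
rows = {0: [1.0, 3.0], 10: [2.0], 20: []}
compacted = compact(rows)
```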

Page 53: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

A FEW TAKE-AWAYS

- Cassandra enabled a quick win in about 1100 lines of clojure
- Greatly simplified scaling strategy
- Building block for a lot more
- Good way to reduce technology creep if you're already using cassandra

Page 54: Cassandra Summit 2014: Cyanite — Better Graphite Storage with Apache Cassandra

THANKS!

Cyanite owes a lot to:

- Max Penet (@mpenet) for the great alia library
- Bruno Renie (@brutasse) for graphite-api, graphite-cyanite and the initial nudge
- Datastax for the awesome cassandra java-driver
- Its contributors
- Apache Cassandra obviously

@pyr – #CassandraSummit