Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Metrics with Riak: A retrospective, Martin Törnwall

Transcript of Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Page 1: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Metrics with Riak: A retrospective

Martin Törnwall

Page 2: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Many definitions, but here's ours...

Metrics?

Page 3: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

So we can visualize it and search for patterns

Recording things that change over time

Page 4: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

CPU, network, memory and disk usage, ...

OS

Page 5: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Number of requests, errors, events, ...

Application

Page 6: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Text messages or emails sent, customer service calls, ...

External events

Page 7: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

● A named variable: "sys.mem.free"
● With tags: "host=sl075", "code=403", ...

avg("sys.mem.free") from 1 hour ago where host="sl075"

What is a Metric?
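As a rough illustration of this definition, a sample can be modeled as a named variable plus a set of tags, and the example query becomes a filter followed by an average. This is a minimal Python sketch; the `Sample` type and the `avg` helper are hypothetical names chosen for the example, not Metyr's API.

```python
from dataclasses import dataclass, field


@dataclass
class Sample:
    """One recorded value of a named metric (hypothetical model)."""
    name: str                                  # e.g. "sys.mem.free"
    time: int                                  # Epoch seconds
    value: int                                 # integer value
    tags: dict = field(default_factory=dict)   # e.g. {"host": "sl075"}


def avg(samples, name, since, **tags):
    """Average the values of `name` recorded at or after `since`
    whose tags match every given key=value pair."""
    values = [
        s.value
        for s in samples
        if s.name == name
        and s.time >= since
        and all(s.tags.get(k) == v for k, v in tags.items())
    ]
    return sum(values) / len(values) if values else None


now = 1_351_728_000  # arbitrary reference time for the example
samples = [
    Sample("sys.mem.free", now - 7200, 100, {"host": "sl075"}),  # too old
    Sample("sys.mem.free", now - 1800, 200, {"host": "sl075"}),
    Sample("sys.mem.free", now - 600, 400, {"host": "sl075"}),
    Sample("sys.mem.free", now - 600, 900, {"host": "sl076"}),   # other host
]

# avg("sys.mem.free") from 1 hour ago where host="sl075"
result = avg(samples, "sys.mem.free", since=now - 3600, host="sl075")
```

Only the two recent samples from host sl075 contribute to `result`; the older sample and the one from the other host are filtered out.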

Page 8: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Going Technical

Page 9: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Why not have distributed metrics?

We have distributed services

Page 10: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Solutions exist, but they rely on technology stacks we had no experience with (e.g., HBase)

Reinventing the wheel?

Page 11: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Just how hard can it be?

I mean, really...

Page 13: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Our weekend hack, er, glorious metrics storage and processing software

Introducing Metyr

Page 14: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Design Decisions

● Use familiar tools: Erlang, Riak, HTTP
● Not a critical service but ...
● ... Avoid SPOF
● Write performance >> read performance
● Centralized reference clock
● Integer only
● Avoid 2i if possible
● When in doubt, leave it to Riak

Page 15: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

In Theory...

Metyr Metyr Metyr

Riak cluster

Client Client Client

Page 16: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

No SQL, no schemas, no indices (?), no aggregate operations

Storing metrics in Riak

Page 17: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

The naïve way just never works...

Attempt 1

Page 18: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

A bucket per metric; index by Epoch time

Make each sample an object
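The layout of this first attempt can be sketched roughly as follows. A plain dictionary stands in for Riak here, and `write_sample` and `read_range` are hypothetical helper names; no real Riak client calls are used.

```python
from collections import defaultdict

# In-memory stand-in for Riak: bucket name -> {key -> value}.
store = defaultdict(dict)


def write_sample(metric, epoch, value):
    """Store one sample as its own object; the key is the Epoch time."""
    store[metric][str(epoch)] = value


def read_range(metric, start, end):
    """Return (time, value) pairs with start <= time <= end.

    A plain key/value lookup cannot answer this by key alone, so this
    stand-in scans every key in the bucket. In Riak the equivalent
    would be a secondary-index (2i) range query on the timestamp.
    """
    hits = [(int(k), v) for k, v in store[metric].items()
            if start <= int(k) <= end]
    return sorted(hits)


for t, v in [(1000, 5), (1060, 7), (1120, 9), (5000, 1)]:
    write_sample("sys.mem.free", t, v)

in_range = read_range("sys.mem.free", 1000, 1120)
```

Each write touches exactly one key and never needs a prior read, but every sample pays the full per-object overhead of the store.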

Page 19: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Atomicity, write-once, fast range queries

The Good™

Page 20: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Slow, large overhead, requires 2i

The Bad

Page 21: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Combine samples into chunks by time

Attempt 2

Page 22: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Key Points

● One bucket per metric as before
● Split into hour-sized chunks (configurable)
● Chunk key: Epoch time
● Chunk value: List of samples
● To read: Fetch chunks within interval
● To write: Fetch chunk, add sample, write back
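The read and write paths listed above can be sketched as follows. This is a minimal illustration with a dictionary standing in for Riak. The helper names are hypothetical, and taking the chunk key to be the Epoch time at the start of the hour is an assumption about what "Chunk key: Epoch time" means.

```python
from collections import defaultdict

CHUNK_SECONDS = 3600  # hour-sized chunks; configurable

# In-memory stand-in for Riak: bucket -> {chunk key -> list of samples}.
store = defaultdict(dict)


def chunk_key(epoch):
    """Epoch time of the start of the chunk containing `epoch` (assumed)."""
    return epoch - epoch % CHUNK_SECONDS


def write_sample(metric, epoch, value):
    """Fetch the chunk, add the sample, write the chunk back."""
    key = chunk_key(epoch)
    chunk = list(store[metric].get(key, []))   # fetch (as a copy)
    chunk.append((epoch, value))               # add sample
    store[metric][key] = chunk                 # write back


def read_range(metric, start, end):
    """Fetch every chunk overlapping [start, end] and filter its samples."""
    out = []
    for key in range(chunk_key(start), chunk_key(end) + 1, CHUNK_SECONDS):
        for t, v in store[metric].get(key, []):
            if start <= t <= end:
                out.append((t, v))
    return sorted(out)


for t, v in [(3500, 1), (3700, 2), (7100, 3), (7300, 4)]:
    write_sample("sys.mem.free", t, v)

keys = sorted(store["sys.mem.free"])
window = read_range("sys.mem.free", 3600, 7199)
```

Because chunk keys can be computed from the query interval, a range read needs only direct key lookups. The cost moves to the write path, which is now a read-modify-write cycle.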

Page 23: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Chunk Anatomy

One sample: Time (64 bits), Value (64 bits), Tags

Chunk layout: Time0 Value0 Tags0 ... TimeN ValueN TagsN
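A binary layout along these lines could be packed and unpacked as in the sketch below. The 64-bit time and value fields come from the slide. The tag encoding (comma-joined "k=v" pairs with a 16-bit length prefix) is purely an assumption for illustration, since the slide does not say how tags are stored.

```python
import struct

# Big-endian: 64-bit time, 64-bit value, 16-bit tag length (length is assumed).
HEADER = struct.Struct(">qqH")


def encode_sample(time, value, tags):
    """Pack one sample: 64-bit time, 64-bit value, then the tags."""
    blob = ",".join(f"{k}={v}" for k, v in sorted(tags.items())).encode("utf-8")
    return HEADER.pack(time, value, len(blob)) + blob


def decode_chunk(data):
    """Decode a chunk (concatenated samples) into (time, value, tags) tuples."""
    samples, offset = [], 0
    while offset < len(data):
        time, value, n = HEADER.unpack_from(data, offset)
        offset += HEADER.size
        text = data[offset:offset + n].decode("utf-8")
        offset += n
        tags = dict(pair.split("=", 1) for pair in text.split(",")) if text else {}
        samples.append((time, value, tags))
    return samples


chunk = (encode_sample(1351728000, 512, {"host": "sl075"})
         + encode_sample(1351728010, 498, {"host": "sl075", "code": "403"}))
decoded = decode_chunk(chunk)
```

With a layout like this, adding a sample to a chunk is a plain byte concatenation, although the chunk still has to be fetched before it can be written back.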

Page 24: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Writing just got harder

Slower since we must fetch a chunk first; potential race conditions, ...

Page 25: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Tests showed that the solution described so far was inadequate

(Arbitrary) Goal: Write 1K samples/sec

Page 26: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Keep per-metric write buffers, flushed every 10 seconds or so

Buffer them writes
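A per-metric write buffer of this kind might look like the sketch below. The `WriteBuffer` class and its `flush_fn` callback are hypothetical names. To keep the example deterministic, the current time is passed in explicitly and flushing is triggered by writes, whereas a real service would more likely use a timer.

```python
from collections import defaultdict


class WriteBuffer:
    """Accumulate samples per metric and flush them in batches."""

    def __init__(self, flush_fn, interval=10):
        self.flush_fn = flush_fn           # called as flush_fn(metric, samples)
        self.interval = interval           # seconds between flushes
        self.pending = defaultdict(list)   # metric -> buffered samples
        self.last_flush = None

    def write(self, metric, epoch, value, now):
        """Buffer a sample; flush everything once the interval has elapsed."""
        if self.last_flush is None:
            self.last_flush = now
        self.pending[metric].append((epoch, value))
        if now - self.last_flush >= self.interval:
            self.flush(now)

    def flush(self, now):
        """Hand each metric's buffered samples to flush_fn in one batch."""
        for metric, samples in self.pending.items():
            if samples:
                self.flush_fn(metric, samples)
        self.pending.clear()
        self.last_flush = now


flushed = []
buf = WriteBuffer(lambda metric, samples: flushed.append((metric, list(samples))))
for second in range(25):
    buf.write("app.requests", second, 1, now=second)
```

Each flush costs one chunk fetch and one write back per metric instead of one per sample. The trade-off is that samples still sitting in the buffer are lost if the process dies before the next flush.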

Page 27: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

● Race condition on write
● Storage requirements
● Downsampling of old data

Some Remaining Issues

Page 28: Timeseries data in Riak - Riak Meetup Stockholm 1/11/2012

Thank you!