Austin Cassandra Meetup re: Atomic Counters

Cassandra at 46 Labs: Idempotent Counters July 17, 2014

Page 1: Austin Cassandra Meetup re: Atomic Counters

Cassandra at 46 Labs: Idempotent Counters July 17, 2014

Page 2: Austin Cassandra Meetup re: Atomic Counters

Who is this guy?

I’m also the Founder, which in Latin means “everyone else gets paid before me.”

~ Literal Translation

Page 3: Austin Cassandra Meetup re: Atomic Counters

What is 46 Labs?

We build realtime telecom analytics and security solutions for Carriers and Enterprises.

Founded in 2012.

Currently handle around 1/2 billion call billing records per day.

Page 4: Austin Cassandra Meetup re: Atomic Counters

Shout Outs

#Cassandra IRC Channel: “Unbelievable resource”

“Thumbs up for the Startup Program”

Nate McCall: “Helped us in our time of need”

Page 5: Austin Cassandra Meetup re: Atomic Counters

Patent Warning

So…we have parts of this process related to the handling of telecom analytics and billing records patented.

Fair warning to the telecom folks in the room.

To all of you who aren’t in that ballpark…feel free to take the pitch and swing away.

Page 6: Austin Cassandra Meetup re: Atomic Counters

What is idempotence?

Simple Answer:

You can perform an operation several times and the result is the same as if you had performed it once.

Example:

A “set” is idempotent. An “increment” or “decrement” isn’t. Not just with Cassandra, but with anything, by definition.
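
To make the set-versus-increment distinction concrete, here is a minimal sketch. The dictionaries stand in for any key/value store, and the function names are made up for illustration.

```python
def set_value(store, key, value):
    """Idempotent: repeating the call leaves the same final state."""
    store[key] = value


def increment(store, key, delta):
    """Not idempotent: every repeat changes the result again."""
    store[key] = store.get(key, 0) + delta


set_store, inc_store = {}, {}
for _ in range(3):  # pretend the same request was delivered three times
    set_value(set_store, "calls", 10)
    increment(inc_store, "calls", 10)

print(set_store["calls"])  # 10
print(inc_store["calls"])  # 30
```

The set ends at 10 no matter how many times it is replayed. The increment ends at 30 because each replay counts again.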

Page 7: Austin Cassandra Meetup re: Atomic Counters

Why does it matter?

Because counters are NOT atomic in Cassandra.

But why?

Because it is really, really, really hard to do anything atomic and distributed, especially counters.

Page 8: Austin Cassandra Meetup re: Atomic Counters

So…???

Since counters aren’t idempotent, by definition, and not atomic in Cassandra, it means that if you repeated the same counter operation 100 times…you might get different results on each run.
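
One way to picture how repeated runs can disagree is a toy model in which the reply to an increment is sometimes lost and the client retries blindly. This is only an illustration of a non-idempotent operation under retries, not a model of Cassandra internals. The function name, the loss rate, and the seeds are all assumptions.

```python
import random


def run_once(num_ops, ack_loss_rate, rng):
    """Send num_ops increments of 1; the client retries whenever the reply is lost."""
    counter = 0
    for _ in range(num_ops):
        while True:
            counter += 1  # the server applies the increment
            if rng.random() >= ack_loss_rate:
                break  # the reply reached the client
            # reply lost: the client cannot tell, so it sends the increment again
    return counter


# Same workload, five runs, each with its own random source.
results = [run_once(1000, 0.05, random.Random(seed)) for seed in range(5)]
print(results)
```

Every run is asked for exactly 1000 increments, yet each one can only land at 1000 or above, and how far above depends on which replies happened to be lost.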

Page 9: Austin Cassandra Meetup re: Atomic Counters

And…?

It means that you can’t use Cassandra counters for anything requiring precision…like billing balances, voting, statistical analysis or any time-series data that must be exact.

The higher the volume and the more nodes you have, the more inaccurate the counters become.

Page 10: Austin Cassandra Meetup re: Atomic Counters

So I should use MySQL or Couchbase?

If you want atomic counters inside of a database as of today’s date, then maybe.

Hint: We have tried both (and a lot more). They are slow. Like…really slow for this type of operation, and they have hurdles way beyond just being slow.

Page 11: Austin Cassandra Meetup re: Atomic Counters

So, All is Lost?

Is there a chance that a better alternative exists that will allow me to use Cassandra and have atomic and idempotent counters?

Yep!

But it involves some helpers.

Page 12: Austin Cassandra Meetup re: Atomic Counters

How we do it

[Diagram: images joined by “+”, “+” and “=”; the images were not preserved in this transcript]

Page 13: Austin Cassandra Meetup re: Atomic Counters

RabbitMQ

Our call billing records come off our infrastructure and go into a RabbitMQ cluster.

The RabbitMQ queues are a nice and safe place for our messages to sit and wait to be processed.

With RabbitMQ ACKs, we can be sure the messages are fully processed before they are removed.

Hint: you could use Kafka, Redis, 0MQ, etc.
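
The acknowledgement idea can be sketched without a real broker. `ToyQueue` below is a hypothetical in-memory stand-in, not the RabbitMQ client API. It only shows that a message stays owned by the queue until the consumer acknowledges it, so a crash before the ACK leads to redelivery instead of loss.

```python
from collections import deque


class ToyQueue:
    """In-memory stand-in for a broker queue with explicit acknowledgements."""

    def __init__(self, messages):
        self.ready = deque(messages)
        self.unacked = {}
        self.next_tag = 0

    def get(self):
        """Hand out the next message with a delivery tag, or None if empty."""
        if not self.ready:
            return None
        self.next_tag += 1
        self.unacked[self.next_tag] = self.ready.popleft()
        return self.next_tag, self.unacked[self.next_tag]

    def ack(self, tag):
        """The consumer is done; only now is the message really gone."""
        del self.unacked[tag]

    def requeue_unacked(self):
        """Put back everything a dead consumer was holding (appended at the back here)."""
        for tag in sorted(self.unacked):
            self.ready.append(self.unacked[tag])
        self.unacked.clear()


queue = ToyQueue(["cdr-1", "cdr-2", "cdr-3"])
processed = []

tag, msg = queue.get()
processed.append(msg)
queue.ack(tag)  # cdr-1 fully handled, safe to remove

tag, msg = queue.get()  # cdr-2 delivered, but the consumer dies before the ACK
queue.requeue_unacked()  # the queue makes cdr-2 available again

while (item := queue.get()) is not None:  # a fresh consumer drains the rest
    tag, msg = item
    processed.append(msg)
    queue.ack(tag)

print(processed)
```

Nothing is lost even though one consumer died mid-message. The price is that a message may be delivered more than once, which is why the downstream writes need to be idempotent.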

Page 14: Austin Cassandra Meetup re: Atomic Counters

Workers

We wrote Java workers, whose sole job in life is to:

1. Consume messages from Rabbit.

2. Perform in-memory atomic counter operations (increment/decrement).

3. Persist the message to Cassandra.

4. Push a static counter value into Cassandra (i.e. a set instead of an increment) every X seconds.

5. ACK back to Rabbit that the operation is complete.

(You can use whatever language you prefer)
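
The five steps can be sketched as a single-threaded loop. This is not the 46 Labs implementation: plain dictionaries stand in for the Cassandra tables, `ack` is any callable, and the message shape `(msg_id, counter_key, delta)` is an assumption made for the example.

```python
import time


def run_worker(messages, records_table, counters_table, ack, flush_interval=1.0):
    """Process messages one at a time in a single thread and return the counters."""
    counters = dict(counters_table)  # start from the last persisted values
    last_flush = time.monotonic()
    for msg_id, key, delta in messages:  # 1. consume
        counters[key] = counters.get(key, 0) + delta  # 2. in-memory increment/decrement
        records_table[msg_id] = (key, delta)  # 3. persist the message itself
        if time.monotonic() - last_flush >= flush_interval:
            counters_table.update(counters)  # 4. write absolute values (a set)
            last_flush = time.monotonic()
        ack(msg_id)  # 5. tell the queue we are done
    counters_table.update(counters)  # final flush so the sketch ends consistent
    return counters


messages = [(1, "acct-a", 5), (2, "acct-b", 2), (3, "acct-a", -1)]
records, totals, acked = {}, {}, []
run_worker(messages, records, totals, acked.append, flush_interval=0.0)
print(totals)  # {'acct-a': 4, 'acct-b': 2}
```

Both writes to the store are sets keyed by message id or counter name, so replaying either one leaves the store unchanged.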

Page 15: Austin Cassandra Meetup re: Atomic Counters

Why is this special?

1. You can stream analytics in realtime.

2. Being in-memory, it is ridiculously fast and lightweight.

3. It’s atomic because each counter constituent is in a single thread.

4. Cassandra can be used to atomically persist the counter.

5. The counter data matches the underlying data used to generate it exactly.

Page 16: Austin Cassandra Meetup re: Atomic Counters

Wait…

What happens if the worker crashes…it’s all in memory!!!

Refer to step 4 of our worker’s job: “Push a static counter value into Cassandra (i.e. a set instead of an increment) every X seconds.”

Since we push a static counter value into Cassandra, we now have an idempotent way to recover gracefully in the event of a crash. The worker fires up, asks Cassandra what it should have in its memory, then starts its atomic operations again. This backup worker can come up (Zookeeper) on a different physical or virtual host if needed.
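
A rough sketch of that recovery path, with a dictionary standing in for Cassandra and a hypothetical `CounterWorker` class. It only shows that snapshots written as absolute values are safe to repeat and that a replacement worker resumes from them. It does not model increments applied after the last snapshot, since the slides do not say how those are reconciled.

```python
class CounterWorker:
    """Single-threaded worker that owns its counters in memory."""

    def __init__(self, store):
        self.store = store
        self.counters = dict(store)  # ask the store what we should hold in memory

    def apply(self, key, delta):
        self.counters[key] = self.counters.get(key, 0) + delta

    def snapshot(self):
        self.store.update(self.counters)  # absolute values, so repeating is harmless


store = {}  # stand-in for the persisted counter table

worker = CounterWorker(store)
worker.apply("acct-a", 5)
worker.apply("acct-a", 3)
worker.snapshot()
worker.snapshot()  # a retried write changes nothing
del worker  # simulate the crash

backup = CounterWorker(store)  # could be on a different host
backup.apply("acct-a", 2)
backup.snapshot()
print(store["acct-a"])  # 10
```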

Page 17: Austin Cassandra Meetup re: Atomic Counters

You can’t grow!

Since you are limited to a single thread processing a single counter…once you run out of memory or saturate the CPU for that counter, you can’t grow!!

Yep. This is why we shard our data at the application layer and not the worker layer. We abstract scalability further out, knowing we have a finite amount of memory and processing power to play with at the worker level.

We can atomically handle 1M ops/sec from a single worker on a single moderately powered server. If you are taxing that single server, you need to rethink your architecture.
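
The slides do not say how the application-layer sharding is done. One common approach, shown here purely as an assumption, is to hash the counter key to pick a worker, so that every update for a given counter always lands on the same single-threaded owner.

```python
import hashlib


def shard_for(counter_key, num_workers):
    """Map a counter key to a worker index with a hash that is stable across runs."""
    digest = hashlib.sha256(counter_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


# Each dict stands in for the in-memory state of one worker.
workers = [dict() for _ in range(4)]
events = [("acct-a", 1), ("acct-b", 1), ("acct-a", 1), ("acct-c", 1)]

for key, delta in events:
    owner = workers[shard_for(key, len(workers))]
    owner[key] = owner.get(key, 0) + delta

owners_of_a = [i for i, w in enumerate(workers) if "acct-a" in w]
print(owners_of_a)
```

Because the route depends only on the key, no two workers ever hold the same counter, so each counter stays single-threaded while the total load spreads across workers. Changing the worker count remaps keys, which a real deployment would have to plan for.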

Page 18: Austin Cassandra Meetup re: Atomic Counters

Does it work?

Sure it does.

We currently process over 2 million counter operations per second using this method.

Page 19: Austin Cassandra Meetup re: Atomic Counters

Questions?

If you think of any that you forgot to ask, you can email me at [email protected].