Download - Real world tales of repair - Apache BigData BIGDATA - MAY 2017 Alexander Dejanovski @alexanderdeja Consultant Datastax MVP for Apache Cassandra Licensed under a …

Transcript

Real world tales of repair

APACHE BIGDATA - MAY 2017

Alexander Dejanovski @alexanderdeja

Consultant www.thelastpickle.com

Datastax MVP for Apache Cassandra

Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License

About The Last Pickle

We help people deliver and improve Apache Cassandra based solutions.

With staff in 5 countries: New Zealand, Australia, France, Spain, USA

What and why?
Full repair
Incremental repair
How to make it work

What is repair?

A maintenance operation that (briefly) restores strong consistency throughout the cluster

Why do we need repair?

- Eventual consistency
- Downtime / failure recovery
- Safe deletes

Tombstones need repair too

Missing tombstones can lead to zombie data (repair within gc_grace_seconds)
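
The zombie data scenario can be sketched as a toy simulation. This is only an illustration under stated assumptions: three replicas, a delete that reaches two of them, and compaction purging tombstones once gc_grace_seconds has elapsed. The Cell class and the function names are made up for the example and are not Cassandra internals.

```python
from dataclasses import dataclass
from typing import Optional

GC_GRACE_SECONDS = 864000  # Cassandra's default gc_grace_seconds: 10 days


@dataclass
class Cell:
    value: Optional[str]   # None means the cell is a tombstone
    timestamp: int         # write time, in seconds for simplicity


def purge_expired_tombstones(replica: dict, now: int) -> None:
    """Mimic compaction dropping tombstones older than gc_grace_seconds."""
    for key in list(replica):
        cell = replica[key]
        if cell.value is None and now - cell.timestamp > GC_GRACE_SECONDS:
            del replica[key]


def read_latest(replicas: list, key: str) -> Optional[str]:
    """Return the value with the highest timestamp across all replicas."""
    cells = [r[key] for r in replicas if key in r]
    if not cells:
        return None
    return max(cells, key=lambda c: c.timestamp).value


def simulate(repair_in_time: bool) -> Optional[str]:
    # All three replicas hold the original write.
    replicas = [{"k": Cell("v1", timestamp=0)} for _ in range(3)]
    # The delete at t=100 reaches replicas 0 and 1 only; replica 2 was down.
    for r in replicas[:2]:
        r["k"] = Cell(None, timestamp=100)
    if repair_in_time:
        # Repair copies the tombstone to the replica that missed it.
        replicas[2]["k"] = Cell(None, timestamp=100)
    # Long after gc_grace_seconds, compaction purges the tombstones.
    now = 100 + GC_GRACE_SECONDS + 1
    for r in replicas:
        purge_expired_tombstones(r, now)
    return read_latest(replicas, "k")


print(simulate(repair_in_time=True))   # None: the delete holds
print(simulate(repair_in_time=False))  # v1: the deleted value is back
```

Without repair, the only copy left after the purge is the old value on the replica that never saw the delete, so reads return it again.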

What and why?
Full repair
Incremental repair
How to make it work

How does anti-entropy repair work?

- Reads all data
- Calculates hashes
- Compares hashes
- Streams mismatching partitions
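
The four steps above can be sketched with a toy Merkle tree. This is a simplified illustration, not Cassandra's implementation: the token space (0 to 1023), the tree depth and the SHA-256 hashing are all assumptions chosen to keep the example small.

```python
import hashlib

RING_SIZE = 1024   # toy token space: 0 .. 1023
DEPTH = 3          # 2**3 = 8 leaves, each covering 128 tokens


def leaf_index(token: int) -> int:
    """Index of the leaf whose token range contains `token`."""
    return token * (1 << DEPTH) // RING_SIZE


def build_tree(partitions: dict) -> list:
    """Return the tree as a list of levels, leaves first and root last."""
    leaves = [hashlib.sha256() for _ in range(1 << DEPTH)]
    for token in sorted(partitions):          # "reads all data"
        leaves[leaf_index(token)].update(
            f"{token}:{partitions[token]}".encode()
        )
    levels = [[h.digest() for h in leaves]]   # "calculates hashes"
    while len(levels[-1]) > 1:
        below = levels[-1]
        levels.append([
            hashlib.sha256(below[i] + below[i + 1]).digest()
            for i in range(0, len(below), 2)
        ])
    return levels


def mismatching_leaves(a: list, b: list) -> list:
    """Compare two trees from the root down, following only differing nodes."""
    diff = []
    stack = [(len(a) - 1, 0)]                 # (level, index), root first
    while stack:
        level, idx = stack.pop()
        if a[level][idx] == b[level][idx]:
            continue                          # whole subtree is in sync
        if level == 0:
            diff.append(idx)
        else:
            stack.extend([(level - 1, 2 * idx), (level - 1, 2 * idx + 1)])
    return sorted(diff)


replica_a = {10: "a", 200: "b", 700: "c", 710: "d"}
replica_b = {10: "a", 200: "b", 700: "stale", 710: "d"}

leaves_to_stream = mismatching_leaves(build_tree(replica_a), build_tree(replica_b))
tokens_to_stream = sorted(t for t in replica_a if leaf_index(t) in leaves_to_stream)
print(leaves_to_stream, tokens_to_stream)  # [5] [700, 710]
```

Only token 700 differs, yet token 710 is streamed too because it shares leaf 5 with it: the comparison works at leaf granularity, not partition granularity.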

A Merkle tree is requested from all replicas

Validation compaction

Merkle tree comparison

Streaming

How do we run repair?

nodetool repair

Improving repair

Repairing each range once is enough

nodetool repair -pr

Not suitable for node recovery: -pr only covers the node's primary range, not the other ranges it replicates
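
A small sketch of why -pr is enough for routine repair when it runs on every node. The 6-node ring, the tokens and the SimpleStrategy-like placement (no vnodes, replication factor 3) are assumptions made up for the illustration.

```python
from collections import Counter

RF = 3
# Hypothetical 6-node ring, one token per node (no vnodes), sorted by token.
ring = [("n1", 0), ("n2", 100), ("n3", 200), ("n4", 300), ("n5", 400), ("n6", 500)]


def primary_range(i: int) -> tuple:
    """Range (previous token, own token] is the primary range of node i."""
    return (ring[i - 1][1], ring[i][1])


def replicated_ranges(i: int) -> list:
    """With SimpleStrategy-like placement, node i also stores the primary
    ranges of the RF - 1 nodes that precede it on the ring."""
    return [primary_range((i - k) % len(ring)) for k in range(RF)]


def repairs_per_range(primary_only: bool) -> Counter:
    """Count how often each range gets repaired when every node runs repair."""
    count = Counter()
    for i in range(len(ring)):
        ranges = [primary_range(i)] if primary_only else replicated_ranges(i)
        count.update(ranges)
    return count


print(repairs_per_range(primary_only=False))  # every range repaired 3 times
print(repairs_per_range(primary_only=True))   # every range repaired once
print(replicated_ranges(0))                   # what n1 needs after a failure
```

Running plain repair on all nodes repairs every range RF times. With -pr each range is repaired once, but a single recovering node still holds RF ranges, of which -pr would only cover one.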

Sequential or parallel?

Sequential: takes a snapshot on all replicas and computes Merkle trees one replica at a time (on the snapshots)

Parallel: no snapshot, all replicas compute Merkle trees at the same time

Repair too slow?

Sequential repair is the default since C* 2.0

nodetool repair -par

The problem with dense nodes

Overstreaming

Leaves of the Merkle tree contain several partitions.

32k leaves at most.
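
A back-of-the-envelope sketch of overstreaming, assuming partitions spread evenly over the leaves and a tree capped at 2^15 = 32768 leaves per repaired range. The partition counts are hypothetical.

```python
import math

MAX_LEAVES = 2 ** 15  # 32768, the "32k leaves at most" from the slide


def partitions_per_leaf(partitions_in_range: int) -> int:
    """Partitions sharing one leaf, assuming they spread evenly over leaves."""
    return max(1, math.ceil(partitions_in_range / MAX_LEAVES))


def worst_case_streamed(partitions_in_range: int, mismatching: int) -> int:
    """Upper bound on streamed partitions: each mismatching partition sits in
    a different leaf, and every partition of that leaf gets streamed."""
    per_leaf = partitions_per_leaf(partitions_in_range)
    return min(partitions_in_range, mismatching * per_leaf)


# Hypothetical dense node: 500 million partitions, 10 of them out of sync.
whole_range = worst_case_streamed(500_000_000, 10)
# Same mismatches when each repaired range holds only 5 million partitions.
sub_range = worst_case_streamed(5_000_000, 10)
print(whole_range, sub_range)  # 152590 1530
```

The denser the repaired range, the more partitions share a leaf, and the more healthy data travels along with each mismatching partition. Repairing smaller ranges brings the leaves closer to partition granularity.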

The solutions with dense nodes

cassandra_range_repair (Matt Stump & Brian Gallew)

Breaks the repair sessions into n steps

Cassandra reaper (Spotify)

Full orchestration tool for repairs + subrange repair support
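
Breaking a repair into n steps can be sketched as follows. This is not how cassandra_range_repair or Reaper is implemented, only a minimal illustration that splits one non-wrapping token range into equal sub-ranges and prints the matching nodetool calls with the -st and -et options. The keyspace name and the tokens are hypothetical.

```python
def split_range(start: int, end: int, steps: int) -> list:
    """Split the token range (start, end] into `steps` contiguous sub-ranges.

    Assumes a non-wrapping range (start < end) with at least `steps` tokens.
    """
    width = end - start
    if steps < 1 or width < steps:
        raise ValueError("need start < end and 1 <= steps <= end - start")
    bounds = [start + width * i // steps for i in range(steps + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def repair_commands(keyspace: str, start: int, end: int, steps: int) -> list:
    """One subrange repair command per step."""
    return [
        f"nodetool repair -st {s} -et {e} {keyspace}"
        for s, e in split_range(start, end, steps)
    ]


# Hypothetical keyspace and token range.
for command in repair_commands("my_keyspace", 0, 1000, 4):
    print(command)
```

Each command builds its own Merkle tree over a quarter of the data, so leaves cover fewer partitions, and a failed step can be retried without redoing the whole range.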

vnodes: one repair session per vnode

Drawback: if you have many vnodes, repair takes longer

Repair in…

The early days of your cluster

Node density is low, repair works just fine however you run it.

So maybe, like I did, you run "nodetool repair" on all nodes… at the same time

The (not so) early days of your cluster

As nodes get higher in density, repair takes longer… and longer…

… and latencies rise, as repair is a CPU and I/O intensive operation

Your cluster is a grown-up now

… until it breaks your cluster

How can it break?

Load gets too high

You don’t meet your latency SLA anymore

How can it break?

Load gets too high

Streams get stuck

and out of nowhere, all nodes start to eat all your CPU doing nothing

The fun part?

You need to run repair to recover from the repair outage!

The cluster keeps growing

And you realize orchestration is needed to stop blowing up your cluster

Orchestrating repair

Repair must not run on all nodes at the same time
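
One way to picture this constraint is a greedy planner that groups range repairs into waves, so that no node takes part in two repairs at once. This is a hypothetical sketch, not the algorithm of any existing tool. The 6-node ring and replica sets (replication factor 3) are made up.

```python
def plan_waves(segments: list) -> list:
    """Group (name, replicas) segments into waves of non-overlapping repairs.

    Within one wave, no node appears in the replica set of two segments.
    """
    waves = []
    pending = list(segments)
    while pending:
        busy, wave, postponed = set(), [], []
        for name, replicas in pending:
            if replicas & busy:
                postponed.append((name, replicas))   # a replica is already busy
            else:
                wave.append(name)
                busy |= replicas
        waves.append(wave)
        pending = postponed
    return waves


# Hypothetical ring: each range is stored on three consecutive nodes.
segments = [
    ("r1", {"n1", "n2", "n3"}),
    ("r2", {"n2", "n3", "n4"}),
    ("r3", {"n3", "n4", "n5"}),
    ("r4", {"n4", "n5", "n6"}),
    ("r5", {"n5", "n6", "n1"}),
    ("r6", {"n6", "n1", "n2"}),
]
print(plan_waves(segments))  # [['r1', 'r4'], ['r2', 'r5'], ['r3', 'r6']]
```

Launching repair everywhere at once would instead put every node in three concurrent sessions, each with its own validation compaction and streaming.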

Tools to orchestrate repairs

OpsCenter repair service (DSE users)

Cassandra reaper

Cassandra reaper

https://github.com/spotify/cassandra-reaper
https://github.com/thelastpickle/cassandra-reaper

Cassandra reaper

- Performs subrange repair
- Limits repair pressure
- Retries failed sessions
- (Auto-)schedules cyclic repairs
- Optimizes cluster load

Cassandra reaper - with UI (thx Stefan Podkowinski)

GUI screenshots

What and why?
Full repair
Incremental repair
How to make it work
Automated repairs

What if we stopped repairing repaired data?

Here comes the savior!

C* 2.1 introduces incremental repair

Default repair mode since C* 2.2

How does incremental repair work?

Anticompaction

Anticompaction (repair on all ranges on local node)
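
A toy model of anticompaction, as it is generally described for Cassandra versions before 4.0: after a repair session, each SSTable is split into a part covered by the repaired ranges (marked with a repairedAt timestamp) and an unrepaired remainder. The SSTable class and the function names are illustrative assumptions, not Cassandra code.

```python
from dataclasses import dataclass


@dataclass
class SSTable:
    rows: dict            # token -> value
    repaired_at: int = 0  # 0 means "unrepaired"


def in_ranges(token: int, ranges: list) -> bool:
    """True if token falls in one of the (start, end] ranges."""
    return any(start < token <= end for start, end in ranges)


def anticompact(sstable: SSTable, repaired_ranges: list, repaired_at: int) -> tuple:
    """Return (repaired, unrepaired) SSTables; either one may be None."""
    inside = {t: v for t, v in sstable.rows.items() if in_ranges(t, repaired_ranges)}
    outside = {t: v for t, v in sstable.rows.items() if t not in inside}
    repaired = SSTable(inside, repaired_at) if inside else None
    unrepaired = SSTable(outside, 0) if outside else None
    return repaired, unrepaired


table = SSTable({10: "a", 150: "b", 320: "c"})

# Repair covered only part of the data: the SSTable is split in two.
repaired, unrepaired = anticompact(table, [(0, 200)], repaired_at=1493596800)
print(sorted(repaired.rows), sorted(unrepaired.rows))  # [10, 150] [320]

# Repair covered all ranges on the local node: nothing is left unrepaired.
repaired_all, leftover = anticompact(table, [(0, 400)], repaired_at=1493596800)
print(sorted(repaired_all.rows), leftover)  # [10, 150, 320] None
```

The next incremental repair then only has to validate the unrepaired set.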

Incremental repair looks awesome…

…but has flaws and drawbacks

Incremental repair caveats

Carefully prepare your switch to incremental repair

i.e. do not run "nodetool repair -inc" straight away…

It doesn’t handle missing/corrupted data that was already repaired

It splits SSTables into 2 sets that cannot be compacted together

(think tombstone purge)

It is incompatible with subrange repair (anticompaction)

It doesn’t like concurrency very much

Validator.java:261 - Failed creating a merkle tree for [repair #e4c782d0-11fc-11e6-b616-51a3849870bb on table_v2/table_attributes, [(8835460833482333317,8838777311566358575],(-7300486781514672850,-7298192396576668423], (-959298474675167225,-959177964106074209]]], /10.10.10.33 (see log for details)

CompactionManager.java:1320 - Cannot start multiple repair sessions over the same sstables

CASSANDRA-8316

A running anticompaction prevents validation compaction

Do not use -pr with incremental repair

Useless: data is repaired once only anyway

Misleading: anticompaction partially disabled

Incremental repair bugs

CASSANDRA-13153 Fixed in 2.2.10, 3.0.13, 3.11.0, 4.0

Reappearing Data when Mixing Incremental and Full Repairs

CASSANDRA-9143 Fix planned for 4.0

SSTables marked as repaired on some nodes only

Because: a node can fail during anticompaction

or: SSTables can get compacted during repair

CASSANDRA-10446 Fix planned for 4.0

Spotted by Paulo Motta in the comments:

SSTables are streamed with a repairedAt value.

Incremental repair will not…

Fix a poor repair strategy

Prevent you from having to run full repair

Reaper does support incremental repair

github.com/thelastpickle

Reaper and incremental repair

No subrange repair

Single repair thread => no concurrency

What and why?
Full repair
Incremental repair
How to make it work

Repair best practices

Put your repair strategy in place on day 1

Use appropriate tooling or build your own

Spread repair over a gc_grace_seconds cycle
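
A rough sizing sketch for such a cycle, assuming the default gc_grace_seconds of 10 days, an equal number of repair segments per node, and an 80% safety margin. The margin and the cluster figures are assumptions for illustration, not recommendations from the talk.

```python
GC_GRACE_SECONDS = 864_000  # Cassandra's default gc_grace_seconds (10 days)


def cycle_plan(nodes: int, segments_per_node: int, margin_percent: int = 80) -> dict:
    """Time budget per segment so one full cycle ends within gc_grace_seconds.

    margin_percent keeps part of the window free for retries (an assumption).
    """
    budget = GC_GRACE_SECONDS * margin_percent // 100
    segments = nodes * segments_per_node
    return {
        "segments": segments,
        "budget_seconds": budget,
        "seconds_per_segment": budget // segments,
    }


def fits(plan: dict, observed_segment_seconds: int) -> bool:
    """True if segments running one after another still finish in time."""
    return observed_segment_seconds <= plan["seconds_per_segment"]


# Hypothetical cluster: 12 nodes, 64 segments each.
plan = cycle_plan(nodes=12, segments_per_node=64)
print(plan)              # 768 segments, 691200 s budget, 900 s per segment
print(fits(plan, 600))   # True: 10-minute segments leave room
print(fits(plan, 1200))  # False: 20-minute segments overrun the window
```

If observed segment times do not fit, the options are more parallelism across non-overlapping replicas, fewer tables in the repair, or a longer gc_grace_seconds.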

Adjust repair pressure on your cluster (Reaper does that)

Don’t repair everything!

Pick tables with deletes and those with critical data

If all data is critical, then none is ;)

Be tight on your schedule with inc repair

Tombstones and anticompaction

Avoid concurrency with inc repair

One node at a time

Wait for 4.0.x before moving to incremental repair…?

Thanks!

@alexanderdeja