Cassandra South Bay Meetup - Backup And Restore For Apache Cassandra
SOUTH BAY CASSANDRA USERS MARCH 2016
BACKUP AND RESTORE FOR APACHE CASSANDRA
Aaron Morton
@aaronmorton
CEO
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License
About The Last Pickle.
Work with clients to deliver and improve Apache Cassandra based solutions.
Apache Cassandra Committer and DataStax MVPs.
Based in New Zealand, Australia, France & USA.
Why Backup
Commit Log Archiving
Table Snap
Why Backup?
Replication is for Availability.
Replicate good data as fast as bad data.
Three Reasons To Backup…
Business Continuity Planning / Disaster Recovery Planning
(AKA Data Centre is on fire.)
Environment Cloning
(AKA Let’s make a new Data Centre.)
Point In Time Recovery
(AKA Bad deploy.)
Why Backup
Commit Log Archiving
Table Snap
Commit Log
Writes are first written to the Commit Log (on each node).
Commit Log can grow up to 8GB in size.
Commit Log is made up of 32 MB Segments.
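As a rough illustration of how the two figures above relate, the following sketch divides the maximum commit log size by the segment size. The constants simply restate the numbers from the slides; both values are configurable in cassandra.yaml, so treat the result as an example under those defaults rather than a fixed property of a cluster.

```python
# Figures quoted on the slides; both are configurable, so these are
# assumptions about a default setup rather than fixed values.
COMMIT_LOG_TOTAL_MB = 8 * 1024  # "up to 8GB"
SEGMENT_SIZE_MB = 32            # "32 MB Segments"


def max_segments(total_mb: int = COMMIT_LOG_TOTAL_MB,
                 segment_mb: int = SEGMENT_SIZE_MB) -> int:
    """Upper bound on the number of full segments that fit in the commit log."""
    return total_mb // segment_mb


print(max_segments())  # 256
```

With archiving enabled, each of those segments is a candidate for the archive command once it fills up.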
Commit Log contains Mutations, which have row fragments.
Mutations are serialised in the form they are sent over the wire.
Commit Log Archiving
Archive Segment when full.
Restore Segments at startup (if specified).
commitlog_archiving.properties
archive_command=
Run this command when a Segment is full.
restore_directories=
Read all files in this CSV list of directories at startup and run restore_command for each.
restore_point_in_time=
Stop processing mutations with a timestamp higher than this.
precision=MICROSECONDS
Precision used for timestamps.
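To make the settings above concrete, here is a minimal sketch that renders the text of a commitlog_archiving.properties file. The function name and the directory paths are hypothetical, and plain `cp` is used only as a placeholder archive command. The `%path`, `%name`, `%from` and `%to` placeholders and the `yyyy:MM:dd HH:mm:ss` timestamp format follow the comments shipped in Cassandra's default properties file; the sketch assumes the restore point is given in UTC.

```python
from datetime import datetime, timezone


def render_archiving_properties(archive_dir: str,
                                restore_dir: str,
                                restore_point: datetime) -> str:
    """Return the text of a commitlog_archiving.properties file.

    archive_dir and restore_dir are hypothetical paths chosen by the operator.
    """
    lines = [
        # %path is the full path of the full segment, %name is its file name.
        "archive_command=/bin/cp %path " + archive_dir + "/%name",
        # %from is the archived segment, %to is the live commit log location.
        "restore_command=/bin/cp -f %from %to",
        # CSV list of directories scanned at startup.
        "restore_directories=" + restore_dir,
        # Mutations with a later timestamp are not replayed.
        "restore_point_in_time=" + restore_point.strftime("%Y:%m:%d %H:%M:%S"),
        "precision=MICROSECONDS",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_archiving_properties(
        "/backup/commitlog",   # hypothetical archive location
        "/restore/commitlog",  # hypothetical restore location
        datetime(2016, 3, 1, 12, 0, 0, tzinfo=timezone.utc),
    ))
```

In practice the archive command would usually copy segments off the node, for example to object storage, since a copy on the same disk does not survive losing the node.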
Cassandra Parameter
-Dcassandra.replayList=
CSV white list of keyspace.table to replay.
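The two replay filters described above (the point-in-time cut-off and the keyspace.table white list) can be sketched as a single predicate. This is an illustration of the rules as stated on the slides, not Cassandra's replay code: `Mutation`, `parse_replay_list` and `should_replay` are hypothetical names, and timestamps are assumed to be microseconds since the epoch.

```python
from typing import NamedTuple, Set


class Mutation(NamedTuple):
    """Hypothetical stand-in for a mutation read from an archived segment."""
    keyspace: str
    table: str
    timestamp: int  # microseconds since the epoch (precision=MICROSECONDS)


def parse_replay_list(csv: str) -> Set[str]:
    """Split a CSV white list such as 'ks1.users,ks1.events' into a set."""
    return {item.strip() for item in csv.split(",") if item.strip()}


def should_replay(mutation: Mutation,
                  restore_point_micros: int,
                  replay_list: Set[str]) -> bool:
    """Apply the point-in-time cut-off, then the optional white list."""
    if mutation.timestamp > restore_point_micros:
        return False  # newer than the restore point, so skip it
    if replay_list and f"{mutation.keyspace}.{mutation.table}" not in replay_list:
        return False  # a white list was given and this table is not on it
    return True


if __name__ == "__main__":
    allowed = parse_replay_list("ks1.users, ks1.events")
    m = Mutation("ks1", "users", 1_456_833_600_000_000)
    print(should_replay(m, 1_456_833_600_000_000, allowed))
```

An empty white list is treated here as "replay every table", which matches the idea that the parameter is optional.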
Why Backup
Commit Log Archiving
Table Snap
Table Snap
Continually Backup and Restore SSTables to S3.
tablesnap
Watch for files closed or moved into the data directories.
Upload all SSTable components, splitting large files, using multiple threads.
Includes a list of SSTables in the directory.
Skips file if it was removed by compaction during processing.
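The behaviour described on the tablesnap slides can be sketched roughly as follows. This is not tablesnap's code: `chunk_ranges` and `backup_sstable` are hypothetical names, the `upload` callback stands in for the S3 client, and the `-listdir.json` suffix for the directory listing is used here only for illustration. File watching and multi-threading are left out to keep the sketch short.

```python
import json
import os
from typing import Callable, List, Tuple


def chunk_ranges(size: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Split a file of `size` bytes into (offset, length) parts for upload."""
    return [(offset, min(chunk_size, size - offset))
            for offset in range(0, size, chunk_size)]


def backup_sstable(path: str, upload: Callable[[str, bytes], None]) -> bool:
    """Upload one file plus a listing of its directory.

    Returns False, uploading nothing, if the file disappeared (for example
    because compaction removed it) before it could be read.
    """
    directory = os.path.dirname(path)
    try:
        listing = sorted(os.listdir(directory))
        with open(path, "rb") as handle:
            data = handle.read()
    except FileNotFoundError:
        return False  # removed during processing, so skip it
    # The listing records which SSTables were live alongside this file.
    upload(path + "-listdir.json", json.dumps({directory: listing}).encode())
    upload(path, data)
    return True
```

A real uploader would stream each range from `chunk_ranges` as a separate part instead of reading the whole file into memory; the in-memory read is a simplification.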
tablechop
Deletes old files from the backup set to implement a rolling window.
Specify how many days to keep.
Use --debug to reduce the stress.
(AKA Dry Run, does not delete the files.)
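The rolling window and the dry run can be illustrated with a short sketch. It is a simplification under stated assumptions: `chop` is a hypothetical function, `objects` maps a backup key to its last-modified time, and `delete` stands in for the S3 delete call. It does not try to model how the real tool decides whether an old file is still needed by a recent backup set.

```python
from datetime import datetime, timedelta, timezone
from typing import Callable, Dict, List


def chop(objects: Dict[str, datetime],
         days: int,
         now: datetime,
         delete: Callable[[str], None],
         debug: bool = False) -> List[str]:
    """Return keys older than `days`; delete them unless debug (dry run)."""
    cutoff = now - timedelta(days=days)
    expired = sorted(key for key, modified in objects.items()
                     if modified < cutoff)
    if not debug:
        for key in expired:
            delete(key)
    return expired


if __name__ == "__main__":
    now = datetime(2016, 3, 31, tzinfo=timezone.utc)
    objects = {
        "old-Data.db": now - timedelta(days=40),
        "new-Data.db": now - timedelta(days=2),
    }
    # Dry run: report what would be deleted without deleting anything.
    print(chop(objects, days=30, now=now, delete=objects.pop, debug=True))
```

Running with the dry-run flag first and reading the list of candidates is the low-stress way to check the retention period before anything is removed.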
tableslurp
Slurp SSTables from S3 to a local directory for restoring.
Restores the latest backup set, or a named backup set.
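Choosing between the latest and a named backup set could look like the sketch below. `choose_backup_set` is a hypothetical helper, and `sets` is assumed to map each backup set name to the time it was written; how the real tool identifies backup sets in the bucket is not shown in the slides.

```python
from datetime import datetime
from typing import Dict, Optional


def choose_backup_set(sets: Dict[str, datetime],
                      name: Optional[str] = None) -> str:
    """Return the named backup set if given, otherwise the most recent one."""
    if name is not None:
        if name not in sets:
            raise KeyError(name)
        return name
    if not sets:
        raise ValueError("no backup sets available")
    # The set with the latest timestamp wins.
    return max(sets, key=sets.get)
```

Restoring a named set is what makes a point-in-time recovery possible: the operator picks the last set written before the bad deploy rather than the newest one.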
Table Snap Pros
Simple.
Table Snap Cons
No monitoring.
Manual restore into cluster.
No support for topology change.
Thanks.
Aaron Morton
@aaronmorton
Co-Founder & Principal Consultant
www.thelastpickle.com