DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Page 1: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

©2015 DataStax. Do not distribute without consent.

DataStax

Philip Thompson, Software Engineer

The Cassandra Validation Harness: Achieving More Stable Releases

Page 2: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Release Stability

• New Major Releases, X.Y.0

• Historically challenging

• Few functional bugs, many “at scale” bugs

• Problems with doing operator tasks while under load

• CASSANDRA-6285 - HSHA corrupt data

• CASSANDRA-10270 - C* stops compacting

• CASSANDRA-9687 - Wrong partitioner after upgrade

• CASSANDRA-9116 - Indexes lost on upgrade

Page 3: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Tick Tock

• 3.1, 3.2, etc.

• Reducing scope not enough to reduce defect rate

• Need to catch more bugs in testing

Page 4: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Test Retrospective Process

• Analyzing every bug filed to the Apache C* Jira by OSS users

• http://www.mail-archive.com/[email protected]/msg08106.html

Page 5: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Jepsen

• Correctness testing

• Made famous by Aphyr’s Call Me Maybe series

• https://github.com/riptano/jepsen

Page 6: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Cassandra Dtest

• Current non-unit test framework

• github.com/riptano/cassandra-dtest

• Python-based tests, backed by CCM

Page 7: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

CCM

• CCM - Cassandra Cluster Manager

• https://github.com/pcmanus/ccm

• Simulates a C* cluster on localhost

• Starts multiple JVMs, separated with loopback addresses

Page 8: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Problems w/ existing tests

• Mostly short-lived, functional tests

• Difficult to coordinate testing features in parallel

Page 9: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

CVH

• Simulate a real production environment

• Multiple, concurrent administrative actions, while cluster is under load

• Test the entire configuration space

Page 10: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Architecture

• Harness

• Bridges

• Modules

Page 11: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Bridges

• How do we test a real C* cluster?

• Entire lifecycle spin up/destroy

• To talk to nodes, why not use existing tools?

• Inspiration from DataStax’s Java driver’s CCMBridge

Page 12: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Modules

• Totally independent thread pools that perform their own operations and validation, and communicate with the harness

• Signal failures, request shutdown, etc

• Arbitrary modules can run in parallel, serial, or any chain of ordering
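The ordering described above can be pictured with a small Java sketch. It is only an illustration under stated assumptions: the `Module` class and `runGroups` method are hypothetical names, not the harness's actual API, and a single shared pool is used for brevity where the slides describe independent pools per module. Each inner list runs concurrently, and the outer list runs one group after another.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class ModuleGroupsSketch {
    // Hypothetical stand-in for a module: a name plus the work it performs.
    static final class Module {
        final String name;
        final Runnable work;

        Module(String name, Runnable work) {
            this.name = name;
            this.work = work;
        }
    }

    // Runs groups serially; modules inside one group run in parallel.
    // Returns module names in the order they finished.
    static List<String> runGroups(List<List<Module>> groups) {
        List<String> finished = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newCachedThreadPool();
        try {
            for (List<Module> group : groups) {
                List<Callable<Void>> tasks = new ArrayList<>();
                for (Module module : group) {
                    tasks.add(() -> {
                        module.work.run();
                        finished.add(module.name);
                        return null;
                    });
                }
                // invokeAll blocks until every module in the group is done.
                for (Future<Void> result : pool.invokeAll(tasks)) {
                    result.get(); // rethrows a module failure, if any
                }
            }
            return new ArrayList<>(finished);
        } catch (Exception e) {
            throw new IllegalStateException("module run failed", e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        List<List<Module>> plan = Arrays.asList(
            Arrays.asList(new Module("LargeStressWriteModule", () -> {})),
            Arrays.asList(new Module("TableModule", () -> {}),
                          new Module("ArbitraryStressOperationModule", () -> {})));
        System.out.println(runGroups(plan));
    }
}
```

The first group always finishes before the second starts; the order within the second group is not deterministic.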

Page 13: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Module Validation

• Data Correctness

• OS and C* metrics

Page 14: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Test writing process

• Simply write up a .yaml file that defines the modules you need to run, and the options to pass them.

• Code is called via reflection from yaml options

• If the modules you want exist, no need to write new code.
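Calling code by name through reflection might look roughly like the sketch below. This is an assumption-laden illustration, not the harness's code: the `Module` interface, the `instantiate` method, and the idea of resolving names to nested classes are all hypothetical. Only the module names themselves appear in the talk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ReflectionSketch {
    // Hypothetical module contract, for illustration only.
    interface Module {
        String describe();
    }

    static class LargeStressWriteModule implements Module {
        public String describe() { return "large stress write"; }
    }

    static class TableModule implements Module {
        public String describe() { return "table operations"; }
    }

    // Resolves each name (as it would appear in the .yaml file) to a nested
    // class of this sketch and creates an instance via its no-arg constructor.
    static List<Module> instantiate(List<String> names) {
        List<Module> modules = new ArrayList<>();
        for (String name : names) {
            try {
                Class<?> type = Class.forName(ReflectionSketch.class.getName() + "$" + name);
                modules.add((Module) type.getDeclaredConstructor().newInstance());
            } catch (ReflectiveOperationException e) {
                throw new IllegalArgumentException("Unknown module: " + name, e);
            }
        }
        return modules;
    }

    public static void main(String[] args) {
        for (Module module : instantiate(Arrays.asList("LargeStressWriteModule", "TableModule"))) {
            System.out.println(module.getClass().getSimpleName() + ": " + module.describe());
        }
    }
}
```

Because lookup happens by name at run time, a new test only needs new names in the configuration as long as the classes already exist.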

Page 15: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Sample Test

    bridge: CCMBridge
    nodeCount: 3
    modules:
      - [LargeStressWriteModule]
      - [TableModule, ArbitraryStressOperationModule]
    cassandrayaml: !!map
      concurrent_reads: 1000
    moduleArgs: !!map
      ArbitraryStressOperationModule: !!map
        stress_settings: write n=1M
    ignoredErrors:
      - "RejectedExecutionException"
    requiredErrors:
      - "Could not validate"

Page 16: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Executor Services

• Nice way to do concurrent, async message passing

• Asynchronous execution of tasks in the background

• Accepts Runnables as tasks, returns Futures that complete once the task is done

• The calling thread can continue with its own execution, then block on the future to wait for the executor
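The pattern in these bullets can be shown with the standard `java.util.concurrent` API. The sketch below is a minimal example, not code from the harness; the `runInBackground` method and the row-counting workload are hypothetical placeholders for a module's real work.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ExecutorSketch {
    // Submits a Runnable, lets the caller keep working, then blocks on the Future.
    static int runInBackground(int rows) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicInteger written = new AtomicInteger();
        try {
            // Hypothetical workload: pretend to write `rows` rows.
            Runnable task = () -> {
                for (int i = 0; i < rows; i++) {
                    written.incrementAndGet();
                }
            };
            Future<?> done = executor.submit(task);

            // The calling thread is free to continue with its own execution here.

            // Block until the background task completes (or give up after 10 seconds).
            done.get(10, TimeUnit.SECONDS);
            return written.get();
        } catch (Exception e) {
            throw new IllegalStateException("background task failed", e);
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("rows written: " + runInBackground(1000));
    }
}
```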

Page 17: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Harness

• WrappedRunnables

• CompletableFuture

• Receives/Handles signals from Modules

• Passes messages between/to Modules

• JUnit’s Parameterized Runner
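One way a harness could receive a failure signal from a module is through `CompletableFuture`, which the slide names. The sketch below is a guess at the general shape, not the harness's implementation: `runAndReport` and the two module stand-ins are hypothetical, and the error text is borrowed from the sample test only as an example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class HarnessSignalSketch {
    // Runs two hypothetical modules and reports "ok" or the first failure seen.
    static String runAndReport(boolean validatorFails) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Stand-in for a module that applies load.
            CompletableFuture<Void> writer = CompletableFuture.runAsync(() -> { }, pool);

            // Stand-in for a module that validates and signals failure by throwing.
            CompletableFuture<Void> validator = CompletableFuture.runAsync(() -> {
                if (validatorFails) {
                    throw new IllegalStateException("Could not validate");
                }
            }, pool);

            // allOf completes exceptionally if any module failed; join rethrows it
            // wrapped in a CompletionException.
            CompletableFuture.allOf(writer, validator).join();
            return "ok";
        } catch (CompletionException e) {
            return "failed: " + e.getCause().getMessage();
        } finally {
            // A real harness would request an orderly shutdown of the other modules here.
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndReport(false));
        System.out.println(runAndReport(true));
    }
}
```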

Page 18: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Upcoming Improvements

• Bridge for manipulating arbitrary, physical hardware

• Attacking the test retrospective backlog

• “Permanent” cluster testing

• Dynamically created tests

• Performance testing

Page 19: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

CVH in action

• CASSANDRA-9913 - Select * fails to return all data

• Not testable through unit tests

• Caught by the LargeStressWriteModule

Page 20: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

How to Contribute

• For now, work is being done in a fork, in preparation for merging into tree

• https://github.com/ptnapoleon/cassandra/tree/validation-notest

• Eventually, will either be merged into C* trunk, or turned into its own codebase

• Patches/PRs for code, tests, or documentation are welcome and appreciated

Page 21: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

How to Contribute

• Suggestions/Ideas for new areas to test?

• We love those too

Page 22: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Questions?

• Ask for ptnapoleon on freenode IRC #cassandra-dev

• @ptnapoleon

Page 23: DataStax: The Cassandra Validation Harness: Achieving More Stable Releases

Thank you