Apache Cassandra and Drivers

Apache Cassandra and Drivers Overview of Apache Cassandra and DataStax Drivers Bulat Shakirzyanov @avalanche123 Sandeep Tamhankar @stamhankar999 https://goo.gl/cBsRVv

Transcript of Apache Cassandra and Drivers

Page 1: Apache Cassandra and Drivers

Apache Cassandra and Drivers
Overview of Apache Cassandra and DataStax Drivers

Bulat Shakirzyanov @avalanche123

Sandeep Tamhankar @stamhankar999

https://goo.gl/cBsRVv

Page 2: Apache Cassandra and Drivers

Introduction
Cassandra Overview

Page 3: Apache Cassandra and Drivers

© 2015 DataStax, All Rights Reserved.

Cassandra Topology

[Diagram: one cluster spanning two datacenters, each with four nodes and two clients.]

Page 4: Apache Cassandra and Drivers

Request Coordinator

[Diagram: two datacenters of nodes and clients; one node is labelled as the coordinator.]

Coordinator node: Forwards requests to corresponding replicas

Page 5: Apache Cassandra and Drivers

Row Replica

[Diagram: two datacenters of nodes and clients; one node is the coordinator and three nodes are labelled as replicas.]

Replica node: Stores a slice of the total rows of each keyspace

Page 6: Apache Cassandra and Drivers

Token Ring

[Diagram: a ring of twelve token positions, numbered 1 to 12 like a clock face.]

Page 7: Apache Cassandra and Drivers

Token Ring

Murmur3 Partitioner token range: -2^63 … (+2^63 - 1)

Page 8: Apache Cassandra and Drivers

Token Ring

[Diagram: twelve nodes placed around the ring, each labelled with one token range.]

Node: 11…12
Node: 12…1
Node: 1…2
Node: 2…3
Node: 3…4
Node: 4…5
Node: 5…6
Node: 6…7
Node: 7…8
Node: 8…9
Node: 9…10
Node: 10…11

Murmur3 Partitioner token range: -2^63 … (+2^63 - 1)

Page 9: Apache Cassandra and Drivers

Keyspaces

CREATE KEYSPACE default
  WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 3 }

Page 10: Apache Cassandra and Drivers

Data Partitioning

[Diagram: a row of a keyspace placed on the C* ring, with token(PK) = 1 and RF = 3.]

Partitioner: Gets a token by hashing the partition key of a row (the leading part of its primary key)
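
To make the token idea concrete, here is a minimal Ruby sketch of hashing a key to a token and finding the node that owns it. Everything in it is an assumption for illustration: the four-node TOY_RING, the names toy_token and owner_for, the 1 to 12 token range borrowed from the clock-face diagram, and MD5 as a stand-in hash (Cassandra's default partitioner is Murmur3, which this does not reproduce).

```ruby
require 'digest'

# Hypothetical ring: [token, node name] pairs sorted by token.
# Tokens run 1..12 here to match the clock-face diagram, not the real Murmur3 range.
TOY_RING = [[2, 'node-a'], [5, 'node-b'], [8, 'node-c'], [11, 'node-d']].freeze

# Stand-in partitioner. MD5 is used only because it ships with Ruby;
# it is NOT the hash Cassandra uses.
def toy_token(partition_key)
  Digest::MD5.hexdigest(partition_key.to_s).to_i(16) % 12 + 1
end

# The owner is the first node whose token is >= the row's token.
# If no such node exists, wrap around to the first node on the ring.
def owner_for(token, ring = TOY_RING)
  entry = ring.find { |node_token, _name| node_token >= token } || ring.first
  entry.last
end

owner_for(4)   # => "node-b"
owner_for(12)  # => "node-a" (wraps around the ring)
owner_for(toy_token('avalanche123'))  # some node on the ring
```

The same key always hashes to the same token, so every client and node can agree on where a row lives without a central lookup.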

Page 11: Apache Cassandra and Drivers

Replication Strategy

[Diagram: a row of Keyspace 1 placed on the C* ring, with token(PK) = 1 and RF = 3.]

Replication strategy: Determines the first replica for the row

Page 12: Apache Cassandra and Drivers

Replication Factor

[Diagram: a row of a keyspace copied to three nodes on the C* ring, with token(PK) = 1 and RF = 3.]

Replication factor: Specifies total number of replicas for each row
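
As a rough illustration of how a replication factor turns one token into several replicas, the sketch below walks a toy ring clockwise, in the spirit of SimpleStrategy. The ring contents and the name replicas_for are hypothetical, and real placement (in particular with multiple datacenters) is more involved.

```ruby
# Hypothetical four-node ring: [token, node name] pairs sorted by token.
REPLICA_RING = [[2, 'node-a'], [5, 'node-b'], [8, 'node-c'], [11, 'node-d']].freeze

# Find the first replica for the token, then keep walking clockwise
# until replication_factor nodes have been collected.
def replicas_for(token, replication_factor, ring = REPLICA_RING)
  start = ring.index { |node_token, _name| node_token >= token } || 0
  count = [replication_factor, ring.size].min  # cannot have more replicas than nodes
  (0...count).map { |i| ring[(start + i) % ring.size].last }
end

replicas_for(1, 3)  # => ["node-a", "node-b", "node-c"]
replicas_for(9, 3)  # => ["node-d", "node-a", "node-b"]
```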

Page 13: Apache Cassandra and Drivers

Consistency Level

RF = 3, CL = Quorum

[Diagram: an application, a coordinator node, three replicas and two other nodes.]

Page 14: Apache Cassandra and Drivers

Consistency Level

RF = 3, CL = Quorum

[Diagram: the same application, coordinator, replicas and nodes, with an INSERT in flight.]
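
The arithmetic behind "RF = 3, CL = Quorum" can be sketched in a few lines of Ruby. A quorum is a majority of the replicas, so with three replicas two acknowledgements are enough. The method names quorum and write_succeeds? are made up for this example.

```ruby
# Quorum is a majority of the replicas: floor(RF / 2) + 1.
def quorum(replication_factor)
  replication_factor / 2 + 1  # integer division floors for positive integers
end

# At CL = Quorum the request succeeds once a majority of replicas has acknowledged it.
def write_succeeds?(acks, replication_factor)
  acks >= quorum(replication_factor)
end

quorum(3)              # => 2
write_succeeds?(2, 3)  # => true
write_succeeds?(1, 3)  # => false
```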

Page 18: Apache Cassandra and Drivers

DataStax Drivers
Smart clients for Apache Cassandra

Page 19: Apache Cassandra and Drivers

Goals of DataStax Drivers

• Consistent set of features across languages
• Asynchronous execution of requests
• Load balancing
• Fault tolerant
• Address Resolution (multi-region!)
• Automatic cluster discovery and reconnection
• Flexible to the core
• Consistent terminology
• Open source

Page 21: Apache Cassandra and Drivers

Asynchronous Execution
IO Reactor, Request Pipelining and Future Composition

Page 22: Apache Cassandra and Drivers

Asynchronous Core

[Diagram labels: Application Thread (Business Logic, Driver); Background Thread (IO Reactor).]

Page 23: Apache Cassandra and Drivers

Request Pipelining

[Diagram: two client/server timelines for requests 1, 2 and 3, one without request pipelining and one with request pipelining.]
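
One way to picture pipelining: several requests share a connection, each tagged with an id, and responses are matched back by that id so they may arrive in any order. The ToyConnection class below is a hypothetical in-memory sketch of that bookkeeping only. It performs no network IO, and the slide itself does not show this mechanism.

```ruby
# Toy model of one pipelined connection (illustrative, not driver code).
class ToyConnection
  def initialize
    @pending = {}     # stream id => callback waiting for that response
    @next_stream = 0
  end

  # Register a request without waiting for earlier responses.
  def send_request(&on_response)
    id = @next_stream
    @next_stream += 1
    @pending[id] = on_response
    id
  end

  # Hand a response to whichever request is waiting on that stream id.
  def receive(stream_id, body)
    callback = @pending.delete(stream_id)
    callback&.call(body)
  end

  def in_flight
    @pending.size
  end
end

results = []
conn = ToyConnection.new
ids = 3.times.map { |n| conn.send_request { |body| results << [n + 1, body] } }
conn.in_flight             # => 3, all requests are outstanding at once
conn.receive(ids[2], 'c')  # responses may come back out of order
conn.receive(ids[0], 'a')
conn.receive(ids[1], 'b')
results                    # => [[3, "c"], [1, "a"], [2, "b"]]
```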

Page 24: Apache Cassandra and Drivers

What is a Future?

• Represents the result of an asynchronous operation
• Returned by any *_async method in the Ruby driver
    • execute_async
    • prepare_async
• Will block if asked for the true result
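
The behaviour described above can be imitated with plain threads. ToyFuture below is a hypothetical stand-in written for this transcript, not the driver's Cassandra::Future: get blocks until the value is ready, "then" chains a follow-up step (waiting on it if it also returns a ToyFuture), and ToyFuture.all gathers several futures into one.

```ruby
# Toy future backed by a Thread (illustrative only).
class ToyFuture
  def initialize(&work)
    @thread = Thread.new(&work)
  end

  # Blocks the caller until the background work has finished.
  def get
    @thread.value
  end

  # Returns a new future holding the block's result.
  # If the block itself returns a ToyFuture, that future is waited on as well.
  def then(&block)
    ToyFuture.new do
      result = block.call(get)
      result.is_a?(ToyFuture) ? result.get : result
    end
  end

  # Combines many futures into one future of an array of values.
  def self.all(futures)
    ToyFuture.new { futures.map(&:get) }
  end
end

lookups = [1, 2, 3].map do |id|
  user = ToyFuture.new { sleep 0.01; "user#{id}" }    # pretend query
  user.then { |name| ToyFuture.new { name.upcase } }  # chained pretend query
end

ToyFuture.all(lookups).get  # => ["USER1", "USER2", "USER3"]
```

Nothing blocks until the final get, so the three pretend lookups overlap instead of running one after another.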

Page 25: Apache Cassandra and Drivers

Future Composition

select_user = session.prepare('SELECT * FROM users WHERE id = ?')
select_page = session.prepare('SELECT * FROM pages WHERE slug = ?')
user_ids    = [1, 2, 3, 4]

futures = user_ids.map do |id|
  future = session.execute_async(select_user, arguments: [id])

  future.then do |users|
    user   = users.first
    future = session.execute_async(select_page, arguments: [user['username']])

    future.then do |pages|
      page = pages.first

      User.new(user, Page.new(page))
    end
  end
end

Cassandra::Future.all(futures).get

Page 31: Apache Cassandra and Drivers

Future Composition

[#<User @id=1 @username="avalanche123"; @page=#<Page @slug="avalanche123" ... > ... >, ... ]

Page 32: Apache Cassandra and Drivers

Pop Quiz: How to make this faster?

select_user = session.prepare('SELECT * FROM users WHERE id = ?')
select_page = session.prepare('SELECT * FROM pages WHERE slug = ?')
user_ids    = [1, 2, 3, 4]

futures = user_ids.map do |id|
  future = session.execute_async(select_user, arguments: [id])

  future.then do |users|
    user   = users.first
    future = session.execute_async(select_page, arguments: [user['username']])

    future.then do |pages|
      page = pages.first

      User.new(user, Page.new(page))
    end
  end
end

Cassandra::Future.all(futures).get

Page 33: Apache Cassandra and Drivers

Pop Quiz: How to make this faster?

user_future = session.prepare_async('SELECT * FROM users WHERE id = ?')
page_future = session.prepare_async('SELECT * FROM pages WHERE slug = ?')
user_ids    = [1, 2, 3, 4]

futures = user_ids.map do |id|
  future = session.execute_async(user_future.get, arguments: [id])

  future.then do |users|
    user   = users.first
    future = session.execute_async(page_future.get, arguments: [user['username']])

    future.then do |pages|
      page = pages.first

      User.new(user, Page.new(page))
    end
  end
end

Cassandra::Future.all(futures).get

Page 34: Apache Cassandra and Drivers

Load Balancing
Principles and Implementations

Page 35: Apache Cassandra and Drivers

Load Balancing

[Diagram: on the client, three application threads use the driver's session; the load balancing policy sits between the session and four connection pools, which connect to the four nodes of the cluster.]

Page 38: Apache Cassandra and Drivers

DataCenter Aware Balancing

[Diagram: two datacenters, each with its own nodes and clients.]

Local nodes are queried first; if none are available, the request can be sent to a remote node.
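
The ordering rule above can be sketched as a function that builds a "query plan", meaning the list of hosts to try in order. The Host struct, its fields and dc_aware_plan are hypothetical names for this example and do not mirror the driver's policy classes.

```ruby
# Hypothetical host record: address, datacenter name, and whether the host is up.
Host = Struct.new(:address, :datacenter, :up)

# Live local hosts come first, live remote hosts are kept as a fallback.
def dc_aware_plan(hosts, local_dc)
  live = hosts.select(&:up)
  local, remote = live.partition { |host| host.datacenter == local_dc }
  local + remote
end

hosts = [
  Host.new('10.0.0.1', 'dc1', true),
  Host.new('10.0.1.1', 'dc2', true),
  Host.new('10.0.0.2', 'dc1', false),  # down, so it is skipped
  Host.new('10.0.0.3', 'dc1', true)
]

dc_aware_plan(hosts, 'dc1').map(&:address)
# => ["10.0.0.1", "10.0.0.3", "10.0.1.1"]
```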

Page 39: Apache Cassandra and Drivers

Token Aware Balancing

Route request directly to Replicas

[Diagram: a client sending its request straight to the three replica nodes, bypassing the other nodes.]

Uses prepared statement metadata to get the token
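
Assuming the replicas for a statement's token are already known, token-aware routing amounts to putting those replicas at the front of the list of nodes to try. The method name token_aware_plan and the node names are illustrative only.

```ruby
# Replicas for the statement's token go first; the remaining nodes are fallbacks.
def token_aware_plan(all_nodes, replicas)
  preferred = all_nodes & replicas  # keep only replicas that are known nodes
  preferred + (all_nodes - preferred)
end

token_aware_plan(%w[n1 n2 n3 n4], %w[n3 n4])
# => ["n3", "n4", "n1", "n2"]
```

Sending the request to a replica first saves the extra hop through a coordinator that does not hold the data.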

Page 40: Apache Cassandra and Drivers

Other built-in policies

• Round Robin Policy
    • ignores topology
• White List Policy
    • only connects to certain hosts
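
Both policies are simple enough to sketch in plain Ruby. ToyRoundRobin and white_list are hypothetical names for this example and are not the driver's own classes.

```ruby
# Round robin: every plan starts one host later than the previous plan.
class ToyRoundRobin
  def initialize(hosts)
    @hosts = hosts
    @index = 0
  end

  def plan
    rotated = @hosts.rotate(@index)
    @index = (@index + 1) % @hosts.size
    rotated
  end
end

# White list: filter the hosts before any other policy sees them.
def white_list(hosts, allowed)
  hosts.select { |host| allowed.include?(host) }
end

policy = ToyRoundRobin.new(%w[a b c])
policy.plan  # => ["a", "b", "c"]
policy.plan  # => ["b", "c", "a"]

white_list(%w[a b c], %w[a c])  # => ["a", "c"]
```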

Page 41: Apache Cassandra and Drivers

Fault Tolerance
Sources of Failure and Error Handling

Page 42: Apache Cassandra and Drivers

Fault Tolerance

[Diagram: the application (business logic and driver), a coordinator node, three replicas and two other nodes.]

Page 43: Apache Cassandra and Drivers

Possible Failures

• Invalid Requests
• Network Timeouts
• Server Errors

[Diagram: the application (business logic and driver), a coordinator node, three replicas and two other nodes.]

Page 44: Apache Cassandra and Drivers

Automatic Retry of Server Errors

[Diagram: on the client, three application threads use the driver's session; the load balancing policy sits between the session and four connection pools, which connect to the four nodes of the cluster.]
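
A minimal sketch of the retry idea, assuming the driver moves on to the next host in the load balancing plan when a host reports a server error. ToyServerError and execute_with_retry are made-up names; the drivers' real retry behaviour is configurable and more nuanced than this.

```ruby
# Hypothetical error type standing in for a server-side failure.
class ToyServerError < StandardError; end

# Try each host in the plan in order and move on when one reports a server error.
def execute_with_retry(plan)
  last_error = nil
  plan.each do |host|
    begin
      return yield(host)  # first success wins
    rescue ToyServerError => e
      last_error = e      # remember the failure and try the next host
    end
  end
  raise last_error || ToyServerError.new('no hosts available')
end

attempts = []
result = execute_with_retry(%w[n1 n2 n3]) do |host|
  attempts << host
  raise ToyServerError, "#{host} failed" if host == 'n1'
  "ok from #{host}"
end

result    # => "ok from n2"
attempts  # => ["n1", "n2"]
```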

Page 47: Apache Cassandra and Drivers

Unreachable Consistency

[Diagram: the application (business logic and driver), a coordinator node, three replicas and two other nodes.]

Page 48: Apache Cassandra and Drivers

Read / Write Timeout Error

[Diagram: the application (business logic and driver), a coordinator node, three replicas and two other nodes.]

Page 50: Apache Cassandra and Drivers

Read / Write Timeout Error

[Diagram: the same application, coordinator, replicas and nodes, with one arrow labelled "read / write timeout".]

Page 51: Apache Cassandra and Drivers

Unavailable Error

[Diagram: the application (business logic and driver), a coordinator node, three replicas and two other nodes.]

Page 52: Apache Cassandra and Drivers

Unavailable Error

[Diagram: the same application, coordinator, replicas and nodes, with one arrow labelled "unavailable".]

Page 53: Apache Cassandra and Drivers

Error Handling

Page 54: Apache Cassandra and Drivers

Address Resolution
Topology Aware Client

Page 55: Apache Cassandra and Drivers

Multiple Addresses

[Diagram: two datacenters, each with four nodes and two clients.]

Within Datacenter: Private IPs

Across Datacenters: Public IPs
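
The private versus public rule can be written as a single decision. NodeAddress, its fields and resolve_address are hypothetical names chosen for this sketch; the drivers expose this as a pluggable address resolution policy whose interface is not reproduced here.

```ruby
# Hypothetical record of the two addresses a node may be known by.
NodeAddress = Struct.new(:private_ip, :public_ip, :datacenter)

# Same datacenter as the client: use the private IP.
# Different datacenter: use the public IP.
def resolve_address(node, client_dc)
  node.datacenter == client_dc ? node.private_ip : node.public_ip
end

node = NodeAddress.new('10.0.0.5', '203.0.113.5', 'us-east')

resolve_address(node, 'us-east')  # => "10.0.0.5"
resolve_address(node, 'eu-west')  # => "203.0.113.5"
```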

Page 56: Apache Cassandra and Drivers

Address Resolution

[Diagram labels: Client (Application with three application threads, Driver); Cluster.]

Page 57: Apache Cassandra and Drivers

Address Resolution

[Diagram labels: Client (Application with three application threads, Driver with Cluster object and Address Resolution Policy); Cluster with one node.]

Page 58: Apache Cassandra and Drivers

Address Resolution

[Diagram labels: Client (Application with three application threads, Driver with Cluster object, Address Resolution Policy and Control Connection); Cluster with four nodes.]

Page 60: Apache Cassandra and Drivers

Address Resolution

[Diagram labels: Client (Application with three application threads, Driver with Cluster object, Address Resolution Policy, Control Connection, Session and four Pools); Cluster with four nodes.]

Page 61: Apache Cassandra and Drivers

EC2 Multi-Region Address Resolution

Page 62: Apache Cassandra and Drivers

More

• Request Tracing
• Execution Information
    • which node was used, # retries for query, etc.
• State Listeners
    • node goes down/comes up, schema changes, etc.
• Result Paging
• SSL and Authentication

Page 63: Apache Cassandra and Drivers

Questions