Apache Cassandra and Drivers
Overview of Apache Cassandra and DataStax Drivers
Bulat Shakirzyanov (@avalanche123)
Sandeep Tamhankar (@stamhankar999)
https://goo.gl/cBsRVv
Introduction
Cassandra Overview
© 2015 DataStax, All Rights Reserved.
Cassandra Topology
[Diagram: a cluster of two datacenters, each with four nodes and two clients.]
Request Coordinator
[Diagram: two datacenters with their nodes and clients; one node is marked as the coordinator.]
Coordinator node: forwards requests to the corresponding replicas.
Row Replica
[Diagram: two datacenters; the coordinator and the nodes marked as replicas for a row.]
Replica node: stores a slice of the total rows of each keyspace.
Token Ring
[Diagram: a ring of twelve tokens, 1 through 12. Each of twelve nodes owns one range between consecutive tokens: 1…2, 2…3, and so on up to 12…1.]
Murmur3 Partitioner: tokens range from -2^63 to +2^63 - 1.
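The ring can be modeled in a few lines of Ruby. This is only an illustrative sketch, not Cassandra's implementation: the twelve node names, the evenly spaced tokens and the `owner_of` helper are assumptions made for the example. It uses the convention that a node owns the range ending at its own token, and that tokens past the last node wrap around to the first.

```ruby
# Token bounds of the Murmur3 partitioner, as given on the slide.
MIN_TOKEN = -2**63
MAX_TOKEN = 2**63 - 1
RING_SIZE = 2**64
NODE_COUNT = 12

# Hypothetical ring: twelve nodes with evenly spaced tokens, sorted ascending.
TOKEN_RING = (0...NODE_COUNT).map do |i|
  [MIN_TOKEN + (i * RING_SIZE) / NODE_COUNT, "node#{i + 1}"]
end

# A node owns the range (previous token, own token]. A token larger than
# every node token wraps around to the first node of the ring.
def owner_of(token)
  entry = TOKEN_RING.bsearch { |candidate| candidate.first >= token }
  (entry || TOKEN_RING.first).last
end
```

With this layout, `owner_of(0)` resolves to the seventh node, because its token is exactly 0, and `owner_of(MAX_TOKEN)` wraps around to the first node.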
Keyspaces
CREATE KEYSPACE default WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': 3 }
Data Partitioning
[Diagram: a row in a keyspace with RF = 3; token(PK) = 1.]
Partitioner: gets a token by hashing the partition key (the first part of the primary key) of a row.
Replication Strategy
[Diagram: Keyspace 1 with RF = 3; a row with token(PK) = 1.]
Replication strategy: determines the first replica for the row.
Replication Factor
[Diagram: a row in a keyspace with RF = 3; token(PK) = 1.]
Replication factor: specifies the total number of replicas for each row.
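How the token, the replication strategy and the replication factor combine can be sketched as follows. The sketch assumes SimpleStrategy, which places the first replica on the node that owns the token and the remaining replicas on the next nodes clockwise around the ring. The six node names and the `replicas_for` helper are hypothetical.

```ruby
# Hypothetical ring of six nodes, listed in clockwise token order.
RING_NODES = %w[node1 node2 node3 node4 node5 node6].freeze

# Returns the nodes holding a row: the owner of the row's token first,
# then the next nodes clockwise until the replication factor is reached.
def replicas_for(owner_index, replication_factor)
  count = [replication_factor, RING_NODES.size].min
  Array.new(count) { |i| RING_NODES[(owner_index + i) % RING_NODES.size] }
end

# A row whose token is owned by node5, stored with RF = 3.
replicas = replicas_for(4, 3)
```

Here the row ends up on node5, node6 and node1, which shows the wrap-around at the end of the ring.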
Consistency Level
[Diagram, built up in steps: the application sends an INSERT to the coordinator; three of the nodes are replicas for the row.]
RF = 3, CL = Quorum
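The arithmetic behind "RF = 3, CL = Quorum" is short enough to write out. A quorum is a majority of the replicas, that is floor(RF / 2) + 1. The helper names below are hypothetical and only illustrate the calculation.

```ruby
# Quorum is a majority of the replicas: floor(RF / 2) + 1.
def quorum(replication_factor)
  replication_factor / 2 + 1
end

# Hypothetical helper: the request meets the consistency level once
# enough replicas have acknowledged it.
def satisfied?(acks, replication_factor)
  acks >= quorum(replication_factor)
end
```

With RF = 3 the quorum is 2, so the INSERT can succeed while one of the three replicas is down.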
DataStax Drivers
Smart clients for Apache Cassandra
Goals of DataStax Drivers
• Consistent set of features across languages
• Asynchronous execution of requests
• Load balancing
• Fault tolerance
• Address Resolution (multi-region!)
• Automatic cluster discovery and reconnection
• Flexible to the core
• Consistent terminology
• Open source
Asynchronous Execution
IO Reactor, Request Pipelining and Future Composition
Asynchronous Core
[Diagram: the business logic runs on the application thread; the driver's IO reactor runs on a background thread.]
Request Pipelining
[Diagram: requests 1, 2 and 3 between client and server, shown once without request pipelining and once with request pipelining.]
What is a Future?
• Represents the result of an asynchronous operation
• Returned by any *_async method in the Ruby driver
  • execute_async
  • prepare_async
• Will block if asked for the true result
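The blocking behaviour of a future can be illustrated without the driver. The `SketchFuture` class below is a hypothetical stand-in built on a plain Ruby thread; it is not how the Ruby driver implements its futures, it only shows that work proceeds in the background until the caller asks for the result.

```ruby
# Hypothetical stand-in for a driver future; not the driver's implementation.
class SketchFuture
  # Starts the given block on a background thread right away.
  def initialize(&work)
    @thread = Thread.new(&work)
  end

  # Blocks the caller until the background work finishes, then returns
  # the value the block produced.
  def get
    @thread.value
  end
end

future = SketchFuture.new do
  sleep 0.05 # simulated network round trip
  6 * 7
end

# The application thread is free to do other work here.
answer = future.get # blocks only now, until the result is available
```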
Future Composition
select_user = session.prepare('SELECT * FROM users WHERE id = ?')
select_page = session.prepare('SELECT * FROM pages WHERE slug = ?')
user_ids = [1, 2, 3, 4]

futures = user_ids.map do |id|
  future = session.execute_async(select_user, arguments: [id])

  future.then do |users|
    user = users.first
    future = session.execute_async(select_page, arguments: [user['username']])

    future.then do |pages|
      page = pages.first

      User.new(user, Page.new(page))
    end
  end
end

Cassandra::Future.all(futures).get
Future Composition: Result
[#<User @id=1 @username="avalanche123"; @page=#<Page @slug="avalanche123" ... > ... >, ... ]
Pop Quiz: How to make this faster?
# Faster: prepare both statements concurrently instead of one after the other.
user_future = session.prepare_async('SELECT * FROM users WHERE id = ?')
page_future = session.prepare_async('SELECT * FROM pages WHERE slug = ?')
user_ids = [1, 2, 3, 4]

futures = user_ids.map do |id|
  future = session.execute_async(user_future.get, arguments: [id])

  future.then do |users|
    user = users.first
    future = session.execute_async(page_future.get, arguments: [user['username']])

    future.then do |pages|
      page = pages.first

      User.new(user, Page.new(page))
    end
  end
end

Cassandra::Future.all(futures).get
Load Balancing
Principles and Implementations
Load Balancing
[Diagram, built up in steps: on the client, application threads share a driver session; the session holds one connection pool per node of the cluster, and the Load Balancing Policy sits between the session and the pools.]
DataCenter Aware Balancing
[Diagram: two datacenters, each with three nodes and three clients.]
Local nodes are queried first; if none are available, the request can be sent to a remote node.
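The "local first, remote as fallback" rule can be sketched as a query plan, meaning the order in which hosts would be tried. This is a minimal illustration under assumed names: the `Host` struct, its fields and the `query_plan` helper are hypothetical and do not mirror the driver's policy classes.

```ruby
# Hypothetical host record: address, datacenter name and liveness flag.
Host = Struct.new(:address, :datacenter, :up)

# Builds the order in which hosts would be tried: live local hosts first,
# then live remote hosts as a fallback.
def query_plan(hosts, local_dc)
  live = hosts.select(&:up)
  local, remote = live.partition { |host| host.datacenter == local_dc }
  local + remote
end

hosts = [
  Host.new('10.0.0.1', 'dc1', false), # local but down
  Host.new('10.0.1.1', 'dc2', true),  # remote
  Host.new('10.0.0.2', 'dc1', true)   # local and up
]
plan = query_plan(hosts, 'dc1').map(&:address)
```

For a client in dc1 the plan starts with the one live local host and only then lists the remote host.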
Token Aware Balancing
Route requests directly to replicas.
[Diagram: a client and a cluster in which three of the nodes are replicas for the requested row.]
Uses prepared statement metadata to get the token.
Other built-in policies
• Round Robin Policy
  • ignores topology
• White List Policy
  • only connects to certain hosts
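Both policies are simple enough to sketch. The class and helper below are hypothetical illustrations of the two ideas, not the driver's own policy classes; the round-robin sketch is also not thread-safe, which a real policy would need to be.

```ruby
# Hypothetical round-robin policy: every call to `plan` starts one host
# later than the previous call and ignores topology entirely.
class RoundRobinSketch
  def initialize(hosts)
    @hosts = hosts
    @offset = 0
  end

  def plan
    return [] if @hosts.empty?

    rotated = @hosts.rotate(@offset)
    @offset = (@offset + 1) % @hosts.size
    rotated
  end
end

# Hypothetical white list: only the allowed hosts are ever considered.
def white_list(hosts, allowed)
  hosts.select { |host| allowed.include?(host) }
end

policy = RoundRobinSketch.new(%w[a b c])
first_plan = policy.plan
second_plan = policy.plan
```

Successive plans start at a, then b, then c, so requests spread evenly across the hosts.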
Fault Tolerance
Sources of Failure and Error Handling
Fault Tolerance
[Diagram: the application (business logic and driver), a coordinator, and three replicas among the cluster's nodes.]
Possible Failures
[Diagram: the same application, coordinator and replicas, annotated with the failures below.]
• Invalid Requests
• Network Timeouts
• Server Errors
Automatic Retry of Server Errors
[Diagram, built up in steps: application threads, the session with one pool per node, and the Load Balancing Policy in front of the cluster's nodes.]
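One way to read the diagram is that a request which fails with a server error is retried on the next host of the load balancing plan. The sketch below illustrates that reading only: `ServerErrorSketch` and `execute_with_retry` are hypothetical names, and which errors a real driver retries is not stated on the slide.

```ruby
# Hypothetical error standing in for a server-side failure on one node.
class ServerErrorSketch < StandardError; end

# Tries each host of the plan in order and returns the first successful
# result together with the number of retries that were needed.
def execute_with_retry(plan)
  retries = 0
  plan.each do |host|
    begin
      return [yield(host), retries]
    rescue ServerErrorSketch
      retries += 1
    end
  end
  raise ServerErrorSketch, 'all hosts in the plan failed'
end

# Simulated request: node1 fails, every other node answers.
result, retries = execute_with_retry(%w[node1 node2 node3]) do |host|
  raise ServerErrorSketch, "#{host} failed" if host == 'node1'

  "ok from #{host}"
end
```

The application sees a single successful result; the retry count is the kind of detail the deck later lists under execution information.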
Unreachable Consistency
[Diagram: the application (business logic and driver), a coordinator, and three replicas.]
Read / Write Timeout Error
[Diagram, built up in steps: the application, a coordinator, and three replicas; the final step is labeled "read / write timeout".]
Unavailable Error
[Diagram, built up in steps: the application, a coordinator, and three replicas; the final step is labeled "unavailable".]
Error Handling
Address Resolution
Topology Aware Client
Multiple Addresses
[Diagram: two datacenters, each with four nodes and two clients.]
Within Datacenter: Private IPs
Across Datacenters: Public IPs
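The rule on this slide amounts to a one-line decision per node. The sketch below is a hypothetical illustration: the `NodeAddresses` struct, the `resolve` helper, the datacenter names and the IP addresses are all made up for the example.

```ruby
# Hypothetical record of the two addresses a node can be reached at.
NodeAddresses = Struct.new(:datacenter, :private_ip, :public_ip)

# Picks the address a client should connect to: the private IP when the
# node is in the client's own datacenter, the public IP otherwise.
def resolve(node, client_dc)
  node.datacenter == client_dc ? node.private_ip : node.public_ip
end

node = NodeAddresses.new('us-east', '10.0.0.5', '203.0.113.5')
local_address = resolve(node, 'us-east')
remote_address = resolve(node, 'eu-west')
```

A client in the node's own datacenter gets the private address, while a client in another datacenter gets the public one.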
Address Resolution
[Diagram, built up in steps: application threads use a Cluster object; then the Address Resolution Policy appears, then the Control Connection to the cluster's nodes, and finally the Session with one pool per node.]
EC2 Multi-Region Address Resolution
More
• Request Tracing
• Execution Information
  • which node was used, # retries for query, etc.
• State Listeners
  • node goes down/comes up, schema changes, etc.
• Result Paging
• SSL and Authentication
Questions