Using Apache Cassandra: What is this thing, and how do I use it?
©2013 DataStax. Do not distribute without consent.
@zanson
Jeremiah Jordan
Lead Software Engineer/Support
Using Apache Cassandra for Big Data
What is this thing, and how do I use it?
Monday, October 14, 2013
Who I am
• Jeremiah Jordan
• Lead Software Engineer in Support at DataStax
• Previously Senior Architect at Morningstar, Inc.
• Using Cassandra since 0.6
• Before that, wrote code for the F-22
Cassandra - An introduction
Cassandra - Intro
• Based on Amazon Dynamo and Google BigTable papers
• Shared nothing
• Distributed
• Data safe as possible
• Predictable scaling
Cassandra - More than one server
• All nodes participate in a cluster
• Shared nothing
• Add or remove as needed
• More capacity? Add a server
• Each node owns a number of tokens
• Tokens denote a range of keys
• 4 nodes? -> Key range/4
• Each node owns 1/4 the data
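The "key range/4" idea above can be sketched in a few lines of Python. This is a simplified illustration, not Cassandra's implementation: the ring size of 2**64 tokens, the node names, and the helper functions token_ranges and owner_of are all assumptions made for the example. A real cluster hashes each key with its partitioner to get the token.

```python
def token_ranges(node_names, ring_size=2**64):
    """Split a hypothetical ring of `ring_size` tokens evenly among nodes."""
    step = ring_size // len(node_names)
    ranges = {}
    for i, name in enumerate(node_names):
        start = i * step
        # The last node absorbs any remainder so the whole ring is covered.
        end = ring_size if i == len(node_names) - 1 else (i + 1) * step
        ranges[name] = (start, end)  # half-open interval [start, end)
    return ranges


def owner_of(token, ranges):
    """Return the node whose range contains `token`."""
    for name, (start, end) in ranges.items():
        if start <= token < end:
            return name
    raise ValueError("token outside the ring")


# Four nodes: each one ends up owning a quarter of the token space.
ranges = token_ranges(["node1", "node2", "node3", "node4"])
```

Adding a fifth name to the list shrinks every range to a fifth of the ring, which is the "more capacity? add a server" behavior in miniature.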
Cassandra - Locally Distributed
• Client writes to any node
• Node coordinates with others
• Data replicated in parallel
• Replication factor (RF): How many copies of your data?
• RF = 3 here
Each node stores 3/4 of the cluster’s total data (RF = 3 across 4 nodes).
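The 3/4 figure can be checked with a small sketch. It assumes four equally sized key ranges and a simple placement rule (the primary node plus the next RF-1 nodes around the ring); the node names and the replicas_for helper are hypothetical and only serve the illustration.

```python
def replicas_for(primary_index, nodes, rf):
    """Return the primary node plus the next rf-1 nodes clockwise on the ring."""
    return [nodes[(primary_index + i) % len(nodes)] for i in range(rf)]


nodes = ["node1", "node2", "node3", "node4"]
rf = 3

# Count how many of the four equal key ranges each node ends up storing.
stored = {name: 0 for name in nodes}
for primary_index in range(len(nodes)):
    for name in replicas_for(primary_index, nodes, rf):
        stored[name] += 1

# Fraction of the cluster's total data held by each node.
fraction = {name: count / len(nodes) for name, count in stored.items()}
```

Every node appears in three of the four replica sets, so each entry in fraction comes out to 0.75.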
Cassandra - Geographically Distributed
• Client writes local
• Data syncs across WAN
• Replication Factor per DC
Single coordinator
Cassandra - Consistency
• Consistency Level (CL)
• Client specifies per read or write
• ALL = All replicas ack
• QUORUM = A majority (> 50%) of replicas ack
• LOCAL_QUORUM = A majority (> 50%) of replicas in the local DC ack
• ONE = Only one replica acks
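To make the levels above concrete, here is a rough sketch of how many acknowledgements each one needs. The two data centers "dc1" and "dc2" with RF = 3 each are assumed values for illustration, and required_acks is a hypothetical helper, not part of any driver.

```python
def quorum(replication_factor):
    """A majority of replicas: more than half, rounded up to a whole node."""
    return replication_factor // 2 + 1


def required_acks(level, rf_per_dc, local_dc):
    """Number of replica acks needed before the request is reported successful."""
    total = sum(rf_per_dc.values())
    if level == "ALL":
        return total
    if level == "QUORUM":
        return quorum(total)
    if level == "LOCAL_QUORUM":
        return quorum(rf_per_dc[local_dc])
    if level == "ONE":
        return 1
    raise ValueError("unknown consistency level: %s" % level)


# Assumed layout: two data centers, three replicas in each.
rf_per_dc = {"dc1": 3, "dc2": 3}
acks = {level: required_acks(level, rf_per_dc, "dc1")
        for level in ("ALL", "QUORUM", "LOCAL_QUORUM", "ONE")}
```

With this layout, LOCAL_QUORUM only waits for two replicas in the local data center, which is why it avoids a WAN round trip on the request path.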
Cassandra - Transparent to the application
• A single node failure shouldn’t bring failure
• Replication Factor + Consistency Level = Success
• This example:
• RF = 3
• CL = QUORUM
A majority (2 of 3) of replicas acked, so we are good!
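The RF = 3, CL = QUORUM example can be played through with a toy model. The list of booleans standing for replica health and the request_succeeds helper are assumptions for this sketch; it only counts which replicas could answer.

```python
def acks_received(replica_up):
    """Count the replicas that are up and can acknowledge the request."""
    return sum(1 for up in replica_up if up)


def request_succeeds(replica_up, consistency_level):
    """True if enough replicas are up to satisfy the consistency level."""
    rf = len(replica_up)
    needed = {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}[consistency_level]
    return acks_received(replica_up) >= needed


# RF = 3: one entry per replica, True meaning the node is up.
one_down = [True, True, False]
two_down = [True, False, False]

quorum_with_one_down = request_succeeds(one_down, "QUORUM")   # 2 of 3 ack
quorum_with_two_down = request_succeeds(two_down, "QUORUM")   # only 1 of 3 acks
```

Losing one replica leaves QUORUM requests working, while ALL would fail in the same situation. That trade-off is what the client picks per read or write.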
Application Example - Layout
• Active-Active
• Service based DNS routing
Cassandra Replication
Application Example - Uptime
• Normal server maintenance
• Application is unaware
Cassandra Replication
Application Example - Failure
• Data center failure
• Data is safe. Route traffic.
Another happy user!
Five Years of Cassandra
[Timeline figure: Cassandra releases 0.1, 0.3, 0.6, 0.7, 1.0, 1.2 and 2.0, plus DSE, plotted on an axis running from Jul-08 to Jul-13]
Cassandra 2.0 - Big new features
Lightweight transactions: the problem

Session 1:
SELECT * FROM users WHERE username = 'jbellis'
[empty resultset]
INSERT INTO users (username, password) VALUES ('jbellis', 'xdg44hh')

Session 2:
SELECT * FROM users WHERE username = 'jbellis'
[empty resultset]
INSERT INTO users (username, password) VALUES ('jbellis', '8dhh43k')

It’s a race! Who wins?
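The race can be reproduced with an in-memory stand-in for the users table. This is a single-process simulation under stated assumptions: the users dict and the select_user and insert_user helpers are hypothetical, and the winner here is decided by execution order, whereas a real cluster resolves the two writes by timestamp.

```python
users = {}  # in-memory stand-in for the users table


def select_user(username):
    """Return the stored password, or None for an empty result set."""
    return users.get(username)


def insert_user(username, password):
    """A plain INSERT behaves like an upsert: the later write replaces the earlier one."""
    users[username] = password


# Both sessions check first, and both see an empty result set.
session1_sees = select_user("jbellis")
session2_sees = select_user("jbellis")

# Both then conclude the name is free and insert.
if session1_sees is None:
    insert_user("jbellis", "xdg44hh")
if session2_sees is None:
    insert_user("jbellis", "8dhh43k")

winner = users["jbellis"]
```

Neither session gets an error, yet only one password survives. Session 1 believes it created the account and is silently overwritten.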
LWT: details
• 4 round trips vs 1 for normal updates
• Paxos - Paxos made easy
• Immediate consistency with no leader election or failover
• For reads, ConsistencyLevel.SERIAL
• http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
Using LWT
• Don’t overwrite an existing record

INSERT INTO USERS (username, email, ...)
VALUES ('jbellis', '[email protected]', ... )
IF NOT EXISTS;

• Only update record if condition is met

UPDATE USERS SET email = '[email protected]', ...
WHERE username = 'jbellis'
IF email = '[email protected]';
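The semantics of the two conditional statements can be sketched with a toy in-memory table. This is only a model of the behavior, assuming a single process: UsersTable and its methods are hypothetical names, and Cassandra itself reaches the same outcome by running Paxos among the replicas.

```python
class UsersTable:
    """Toy in-memory table that mimics IF NOT EXISTS and IF <condition>."""

    def __init__(self):
        self.rows = {}

    def insert_if_not_exists(self, username, **columns):
        """Insert only when no row exists yet; report whether it was applied."""
        if username in self.rows:
            return False
        self.rows[username] = dict(columns)
        return True

    def update_if(self, username, expected, **changes):
        """Update only when every expected column still holds the expected value."""
        row = self.rows.get(username)
        if row is None or any(row.get(k) != v for k, v in expected.items()):
            return False
        row.update(changes)
        return True


table = UsersTable()
first = table.insert_if_not_exists("jbellis", password="xdg44hh")
second = table.insert_if_not_exists("jbellis", password="8dhh43k")
updated = table.update_if("jbellis", {"password": "xdg44hh"}, password="new-secret")
stale = table.update_if("jbellis", {"password": "xdg44hh"}, password="other")
```

The second insert and the stale update are both rejected, so the caller learns that it lost the race and nothing is silently overwritten.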
LWT: Use with caution
• Great for 1% of your application
• Eventual consistency is your friend
• http://www.slideshare.net/planetcassandra/c-summit-2013-eventual-consistency-hopeful-consistency-by-christos-kalantzis
Installing Cassandra
Download Cassandra
Extract Cassandra
Set Up Data and Log Directories
Start Cassandra
Installing Cassandra Python Driver
Python Cassandra Driver
Install Python Cassandra Driver
Connect and Create a Keyspace

import logging

from cassandra.cluster import Cluster

# Log INFO messages to the console so the steps below are visible.
logging.basicConfig(level=logging.INFO)
log = logging.getLogger()

# Connect to a single local node.
cluster = Cluster(['127.0.0.1'])
session = cluster.connect()

log.info("creating keyspace...")
KEYSPACE = "testkeyspace"
session.execute("""
    CREATE KEYSPACE IF NOT EXISTS %s
    WITH replication = { 'class': 'SimpleStrategy', 'replication_factor': '1' }
    """ % KEYSPACE)
Create a Table

log.info("setting keyspace...")
session.set_keyspace(KEYSPACE)

log.info("creating table...")
session.execute("""
    CREATE TABLE IF NOT EXISTS mytable (
        thekey text,
        col1 text,
        col2 text,
        PRIMARY KEY (thekey, col1)
    )
    """)
Insert a Row

from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

# The consistency level is set per statement.
query = SimpleStatement("""
    INSERT INTO mytable (thekey, col1, col2)
    VALUES ('key1', 'a', 'b')
    """, consistency_level=ConsistencyLevel.ONE)

log.info("inserting row")
session.execute(query)
Insert Rows (Prepared Statement)

prepared = session.prepare("""
    INSERT INTO mytable (thekey, col1, col2)
    VALUES (?, ?, ?)
    """)

for i in range(10):
    log.info("inserting row %d" % i)
    bound = prepared.bind(("key%d" % i, "b%d" % i, "c%d" % i))
    session.execute(bound)
Query Results

future = session.execute_async("""
    SELECT * FROM mytable WHERE thekey='key1'
    """)
rows = future.result()

log.info("key\tcol1\tcol2")
log.info("---\t----\t----")
for row in rows:
    log.info("\t".join(row))
Run It
Cassandra Applications - Drivers
• DataStax Drivers for Cassandra
• Java
• C#
• Python
• More on the way
Find Out More
Cassandra: http://cassandra.apache.org
DataStax Drivers: https://github.com/datastax
Documentation: http://www.datastax.com/docs
Getting Started: http://www.datastax.com/documentation/gettingstarted/index.html
Developer Blog: http://www.datastax.com/dev/blog
Cassandra Community Site: http://planetcassandra.org
Download: http://planetcassandra.org/Download/DataStaxCommunityEdition
Webinars: http://planetcassandra.org/Learn/CassandraCommunityWebinars
Cassandra Summit Talks: http://planetcassandra.org/Learn/CassandraSummit