Apache Cassandra, part 3 – machinery, work with Cassandra

  • Date posted: 25-May-2015
  • Category: Technology


description

The aim of this presentation is to give an enterprise architect enough information to decide whether Cassandra should be the project's data store. The presentation describes the nuances of Cassandra's architecture and ways to design data and work with it.

Transcript of Apache Cassandra, part 3 – machinery, work with Cassandra

  • 1. Apache Cassandra, part 3 – machinery, work with Cassandra

2. V. Architecture (part 2)
3. SEDA Architecture
SEDA: staged event-driven architecture
Every unit of work is split into several stages that are executed in parallel threads.
Each stage consists of an input and an output event queue, an event handler, and a stage controller.
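The stage structure described above can be sketched in a few lines of Python. This is only an illustration of the SEDA idea, not Cassandra's code: the class name `Stage`, the stage names "parse" and "store", and the helper `run_pipeline` are hypothetical, and the stage controller (which would tune the thread count and admit or reject events) is left out.

```python
import queue
import threading


class Stage:
    """One stage: an input event queue, an event handler, and worker threads."""

    def __init__(self, name, handler, next_stage=None, workers=2):
        self.name = name
        self.handler = handler          # event handler for this stage
        self.next_stage = next_stage    # results are queued on the next stage
        self.inbox = queue.Queue()      # input event queue
        for _ in range(workers):
            threading.Thread(target=self._run, daemon=True).start()

    def submit(self, event):
        self.inbox.put(event)

    def _run(self):
        while True:
            event = self.inbox.get()
            try:
                result = self.handler(event)
                if self.next_stage is not None:
                    self.next_stage.submit(result)
            finally:
                self.inbox.task_done()

    def drain(self):
        """Block until every event submitted so far has been handled."""
        self.inbox.join()


def run_pipeline(events):
    """Push events through two hypothetical stages and return the results."""
    results = []
    store = Stage("store", results.append)
    parse = Stage("parse", str.upper, next_stage=store)
    for event in events:
        parse.submit(event)
    parse.drain()   # every parsed event is now queued on the store stage
    store.drain()
    return sorted(results)
```

Because each stage has its own queue and its own threads, a slow stage only backs up its own queue instead of blocking the threads of the other stages.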
4. SEDA Architecture advantages
Well-conditioned system load
Prevents resources from being overcommitted
5. SEDA in Cassandra - Usages
Read
Mutation
Gossip
Anti Entropy
6. SEDA in Cassandra - Design
The stage manager holds a map between stage names and Java 5 thread pool executors.
Each controller, together with its queue, is represented by a ThreadPoolExecutor that can be configured through JMX.
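As a rough analogue of this design, the sketch below maps stage names to thread pools using Python's `concurrent.futures.ThreadPoolExecutor`. The class `StageManager`, its method names, the stage names, and the pool sizes are assumptions made for illustration. Cassandra itself uses Java's ThreadPoolExecutor, and the JMX configuration is not modelled here.

```python
from concurrent.futures import ThreadPoolExecutor


class StageManager:
    """Hypothetical registry: stage name -> thread pool (with its own queue)."""

    def __init__(self, pool_sizes):
        self._stages = {
            name: ThreadPoolExecutor(max_workers=size, thread_name_prefix=name)
            for name, size in pool_sizes.items()
        }

    def get_stage(self, name):
        return self._stages[name]

    def shutdown(self):
        for executor in self._stages.values():
            executor.shutdown(wait=True)


# Pool sizes are illustrative only.
manager = StageManager({"READ": 4, "MUTATION": 4, "GOSSIP": 1})

# Work for a stage is submitted to that stage's executor.
future = manager.get_stage("READ").submit(lambda key: f"row for {key}", "ivan")
result = future.result()
manager.shutdown()
```

Keeping one executor per stage means each stage's thread count can be tuned separately.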
7. VI. Working with Cassandra
8. Installing and launching Cassandra
Download from http://cassandra.apache.org/download/
9. Installing and launching Cassandra
Launching server:
bin/cassandra.bat
Use the -f flag to run the server in the foreground, so that all server logs are printed to standard output.
The server starts as a single-node cluster called Test Cluster, listening on port 9160.
10. Installing and launching Cassandra
Starting command-line client interface:
bin/cassandra-cli.bat
You will see the prompt [default@unknown] at the beginning of every line.
11. Creating a cluster
In the configuration file cassandra.yaml, specify:
seeds: the list of seed nodes for the cluster
rpc_address and listen_address: network addresses
12. Creating a cluster
initial_token: defines the node's token range
auto_bootstrap: enables automatic migration of data to the new node
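One common way to pick initial_token values is to space them evenly around the ring. The sketch below assumes the RandomPartitioner, whose token space runs from 0 to 2^127. The slides do not say which partitioner is in use, so treat the ring size as an assumption; the function name `initial_tokens` is hypothetical.

```python
def initial_tokens(node_count, ring_size=2**127):
    """Return evenly spaced tokens, one per node (RandomPartitioner assumed)."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    return [i * ring_size // node_count for i in range(node_count)]


# Print one initial_token value per node for a four-node cluster.
for node, token in enumerate(initial_tokens(4)):
    print(f"node {node}: initial_token = {token}")
```

For a two-node cluster this gives tokens 0 and 2^126 (85070591730234615865843651857942052864), so each node owns half of the ring.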
13. nodetool ring
Use nodetool to view the ring configuration:
~$ nodetool -h localhost -p 8080 ring
Address Status State Load Owns Range Ring
850705
10.203.71.154 Up Normal 2.53 KB 50.000||
14. Connecting to server
Connect from the command line:
connect <hostname>/<port> [<username> <password>];
Examples:
connect localhost/9160;
connect 127.0.0.1/9160 user password;
Connect when starting the command-line client:
cassandra-cli
-h, --host
-p, --port
-k, --keyspace
-u, --username
-pw, --password
15. Describing environment
show cluster name;
show keyspaces;
show api version;
describe cluster;
describe keyspace [<keyspace>];
16. Create keyspace
create keyspace <keyspace>;
create keyspace <keyspace> with <att1> = <value1> and <att2> = <value2> ...;
Attributes:
placement_strategy
replication_factor

17. Create keyspace
Example:
create keyspace Keyspace1 with placement_strategy = org.apache.cassandra.locator.RackUnawareStrategy and replication_factor = 4;
18. Update keyspace
Update the attributes of an existing keyspace:
update keyspace <keyspace> with <att1> = <value1> and <att2> = <value2> ...;
19. Switch to keyspace
use <keyspace>;
use <keyspace> [<username> <password>];
If you don't specify a username and password, the credentials supplied to the connect statement will be used.
If the server doesn't support authentication, it will ignore the credentials.
20. Switch to keyspace
Example:
use Keyspace1 user1 qwerty123;
Once you switch to a keyspace, you'll see [user1@Keyspace1] at the beginning of every line.
21. Create column family
create column family <name>;
create column family <name> with <att1> = <value1> and <att2> = <value2> ...;
Example:
create column family Users with column_type = Super and
comparator = UTF8Type and
rows_cached = 1000;
22. Update column family
Once a column family is created, you can update its attributes:
update column family <name> with <att1> = <value1> and <att2> = <value2> ...;
23. Comparators and validators
Comparators compare column names
Validators validate column values
24. Comparators and validators
You can specify one comparator for a column family; it applies to all columns (and subcolumns) in that column family.
You can specify a validator for each known column of a column family.
You can specify a default validator for a column family; it is used for columns that have no validator of their own.
You can specify a key validator, which validates row keys.
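The difference between the two roles can be modelled in a few lines of Python. This is a simplified sketch, not Cassandra's implementation: all function names, column names, and values are hypothetical. The comparator decides the sort order of column names, and a validator decides whether a column value is acceptable.

```python
def validate_utf8(value: bytes) -> bool:
    """Validator in the spirit of UTF8Type: the value must decode as UTF-8."""
    try:
        value.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def validate_any(value: bytes) -> bool:
    """Default validator in the spirit of BytesType: any bytes are accepted."""
    return True


# Per-column validators for known columns; everything else uses the default.
column_validators = {"name": validate_utf8}
default_validator = validate_any


def is_valid(column: str, value: bytes) -> bool:
    return column_validators.get(column, default_validator)(value)


def sort_columns(names, comparator):
    """The comparator (modelled here as a sort key) orders column names."""
    return sorted(names, key=comparator)


names = ["10", "9", "2"]
as_text = sort_columns(names, str)     # text ordering of the names
as_numbers = sort_columns(names, int)  # numeric ordering of the names
```

Here `as_text` is `['10', '2', '9']` while `as_numbers` is `['2', '9', '10']`. The choice of comparator fixes how columns are ordered within a row.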
25. Attributes of column family
column_type: can be Standard or Super (default: Standard)
comparator: specifies how column names are compared for sort order
column_metadata: defines the validation and indexes for known columns
default_validation_class: validator to use for values in columns that are not listed in column_metadata (default: BytesType)
key_validation_class: validator for row keys
26. Column metadata
You can define validators for each known column in the family
create column family User with
column_metadata = [
{column_name: name, validation_class: UTF8Type},
{column_name: age, validation_class: IntegerType},
{column_name: birth, validation_class: UTF8Type}
];
Columns not listed in this section are validated with default_validation_class
27. Secondary indexes
Allow queries by value:
get users where name = 'Some user';
Can be created in the background
28. Creating index
Define it in column metadata
For example, in cassandra-cli:
create column family users with comparator=UTF8Type and column_metadata=[{column_name: birth_date, validation_class: LongType, index_type: KEYS}];
29. Some restrictions
Cassandra uses hash indexes instead of B-tree indexes. Thus, the where condition must contain at least one indexed field with the = operator.
So you can't use:
get users where birth_date > 1970;
but you can use:
get users where birth_date = 1990 and karma > 50;
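The restriction follows from how a hash index works, which the Python sketch below models with a dictionary. It is a simplified illustration, not Cassandra's implementation. The sample rows and the names `build_index` and `query` are hypothetical.

```python
import operator
from collections import defaultdict

# Hypothetical sample rows: row key -> columns.
rows = {
    "ivan": {"birth_date": 1990, "karma": 70},
    "makar": {"birth_date": 1990, "karma": 10},
    "olga": {"birth_date": 1975, "karma": 90},
}


def build_index(rows, column):
    """Hash index: column value -> set of row keys holding that value."""
    index = defaultdict(set)
    for key, columns in rows.items():
        if column in columns:
            index[columns[column]].add(key)
    return index


def query(rows, index, equals, filters=()):
    """Look up candidates by equality, then apply the remaining conditions."""
    candidates = index.get(equals, set())
    return sorted(
        key for key in candidates
        if all(col in rows[key] and op(rows[key][col], val)
               for col, op, val in filters)
    )


birth_index = build_index(rows, "birth_date")

# Equivalent of: where birth_date = 1990 and karma > 50
matches = query(rows, birth_index, 1990, [("karma", operator.gt, 50)])
```

A dictionary can only answer "which rows have exactly this value", so a range condition such as birth_date > 1970 has no entry to look up. The equality condition narrows the candidates first, and the other conditions are then checked row by row.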
30. Index types
KEYS
BITMAP (will be supported in future releases)
31. Writing data
To write data, use the set command:
set Customers[ivan][name] = Ivan;
set Customers[makar][info][age] = 96;
32. Reading data
To read data, use the get command:
get Customers[ivan][name];
- this will display Ivan
get Customers[makar];
- this will display all columns for key makar
33. Reading data
To list a range of rows, use the list command:
list Customers;
list Customers[a:];
list Customers[a:c] limit 40;
- you can specify limit of rows that will be displayed (default - 100)
34. Reading data
To get the number of columns, use the count command:
count Customers[ivan];
- this will display the number of columns for key ivan
35. Deleting data
To delete a row, a column, or a subcolumn, use the del command:
del Customers[ivan];
- this will delete all columns for key ivan
del Customers[ivan][name];
- this will delete column name for key ivan
del Customers[ivan][accounts][2312784829312343];
- this will delete a subcolumn with an account number from accounts column for key ivan
36. Deleting data
To delete all data in a column family, use the truncate command:
truncate Customers;
37. Drop column family or keyspace
drop column family Customers;
drop keyspace Keyspace1;
38. Q&A
39. Resources
Home of Apache Cassandra Project http://cassandra.apache.org/
Apache Cassandra Wiki http://wiki.apache.org/cassandra/
Documentation provided by DataStax http://www.datastax.com/docs/0.8/
A good explanation of creating secondary indexes http://www.anuff.com/2010/07/secondary-indexes-in-cassandra.html
Eben Hewitt, Cassandra: The Definitive Guide, O'Reilly, 2010, ISBN: 978-1-449-39041-9
40. Authors
Lev Sivashov
Andrey Lomakin, twitter: @Andrey_Lomakin, LinkedIn: http://www.linkedin.com/in/andreylomakin
Artem Orobets, twitter: @Dr_EniSh
Anton Veretennik