Introduce Apache Cassandra - JavaTwo Taiwan, 2012

Senior Engineer, Ruckus Wireless

Boris Yen

Expert Session B: A Brief Introduction to Apache Cassandra

Outline

• Cassandra vs SQL Server

• Overview

• Data in Cassandra

• Data Partitioning

• Data Replication

• Data Consistency

• Client Libraries

Cassandra vs SQL Server

• Cassandra
  o More servers = more capacity.
  o Scaling concerns are transparent to the application.
  o No single point of failure.
  o Horizontal scaling.

• SQL Server
  o More powerful machine = more capacity.
  o Adding capacity requires manual labor from ops people and substantial downtime.
  o There is a limit on how big you can go.
  o Vertical scaling, at the pace of Moore’s law.

Overview

• Features come from Dynamo and BigTable

• Distributed
  o Data is partitioned among all nodes

• Extremely scalable
  o Add new node = add more capacity
  o Easy to add a new node

• Fault tolerant
  o All nodes are the same
  o Read/write anywhere
  o Automatic data replication

• High Performance

Overview

http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html

http://www.datastax.com/dev/blog/whats-new-in-cassandra-1-0-performance

http://www.cubrid.org/blog/dev-platform/nosql-benchmarking/







Data in Cassandra

Keyspace

ColumnFamily

ID Addr Phone

1 ... Taiwan 09.....

Key: Boris

{ column: Phone, value: 09..., timestamp: 1000}

The timestamp is used to resolve conflicts.

• Keyspace ~ Database in RDBMS

• Column Family ~ Table in RDBMS

Data in Cassandra

• Keyspace
  o Where the replication strategy and replication factor are defined.

CREATE KEYSPACE keyspace_name
  WITH strategy_class = 'SimpleStrategy'
  AND strategy_options:replication_factor = 2;

• ColumnFamily

CREATE COLUMNFAMILY user (
  id uuid PRIMARY KEY,
  address text,
  userName text
) WITH comment = ''
  AND comparator = text
  AND read_repair_chance = 0.100000
  AND gc_grace_seconds = 864000
  AND default_validation = text
  AND min_compaction_threshold = 4
  AND max_compaction_threshold = 32
  AND replicate_on_write = True
  AND compaction_strategy_class = 'SizeTieredCompactionStrategy'
  AND compression_parameters:sstable_compression = 'org.apache.cassandra.io.compress.SnappyCompressor';

Data in Cassandra

• Commit log
  o Captures write activity. This is what assures data durability.

• Memtable
  o Stores the most recent write activity in memory.

• SSTable
  o When a memtable is flushed to disk, it becomes an SSTable.

Data Read/Write

• Write
  o Data is written to the commit log and to a memtable.
  o When the memtable is flushed, it becomes an SSTable.

• Read
  o Search the row cache. If there is a hit, return the result; no further action is needed.
  o If there is no hit in the row cache, read from the memtable(s) and SSTable(s) that might contain the requested key, then collate the results and return them.
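The write and read paths on this slide can be sketched as a toy in-memory model. This is a minimal illustration, not Cassandra's storage engine: the class name ToyStore, the flushThreshold parameter, and the one-value-per-key data model are assumptions made for the example, and the row cache is left out.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of the write/read path (hypothetical names, not Cassandra code). */
class ToyStore {
    private final List<String> commitLog = new ArrayList<>();              // append-only record of writes
    private TreeMap<String, String> memtable = new TreeMap<>();            // most recent writes, sorted by key
    private final List<Map<String, String>> sstables = new ArrayList<>();  // flushed, read-only tables
    private final int flushThreshold;                                      // assumed: max entries before a flush

    ToyStore(int flushThreshold) {
        this.flushThreshold = flushThreshold;
    }

    void write(String key, String value) {
        commitLog.add(key + "=" + value);   // 1. record the write for durability
        memtable.put(key, value);           // 2. apply it to the memtable
        if (memtable.size() >= flushThreshold) {
            flush();                        // 3. a full memtable becomes an SSTable
        }
    }

    private void flush() {
        sstables.add(Collections.unmodifiableMap(memtable));
        memtable = new TreeMap<>();
    }

    /** Checks the memtable first, then SSTables from newest to oldest. */
    String read(String key) {
        if (memtable.containsKey(key)) {
            return memtable.get(key);
        }
        for (int i = sstables.size() - 1; i >= 0; i--) {
            String value = sstables.get(i).get(key);
            if (value != null) {
                return value;
            }
        }
        return null; // key not found anywhere
    }

    int sstableCount() {
        return sstables.size();
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore(2);
        store.write("boris", "092xxx");
        store.write("alice", "093xxx");   // second entry triggers a flush
        store.write("boris", "094xxx");   // newer value lives in the new memtable
        System.out.println(store.read("boris"));   // prints 094xxx
        System.out.println(store.sstableCount());  // prints 1
    }
}
```

Reading newest-first is what lets the toy model return the latest value without rewriting older SSTables.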

Data Compaction

sstable1
Boris: { name: boris (t1), phone: 092xxx (t1), addr: tainan (t1) }

sstable2
Boris: { name: boris.yen (t2), sex: male (t2), email: y@gmail (t2) }

...

sstableX (result of compaction)
Boris: { addr: tainan (t1), email: y@gmail (t2), name: boris.yen (t2), phone: 092xxx (t1), sex: male (t2) }

Because t2 > t1, the newer value name: boris.yen replaces name: boris.
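The merge shown on this slide can be sketched as a per-column "highest timestamp wins" rule. This is only an illustration under assumptions: the CompactionSketch and Cell names are hypothetical, and real compaction also deals with tombstones and on-disk formats, which are not modeled here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch of merging row versions by timestamp (hypothetical names). */
class CompactionSketch {
    /** A column value together with its write timestamp. */
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return value + " (t" + timestamp + ")";
        }
    }

    /** Keeps, for every column name, the cell with the highest timestamp. */
    static Map<String, Cell> compact(List<Map<String, Cell>> rowVersions) {
        Map<String, Cell> merged = new TreeMap<>(); // sorted by column name
        for (Map<String, Cell> version : rowVersions) {
            for (Map.Entry<String, Cell> entry : version.entrySet()) {
                Cell current = merged.get(entry.getKey());
                if (current == null || entry.getValue().timestamp > current.timestamp) {
                    merged.put(entry.getKey(), entry.getValue());
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Cell> sstable1 = new TreeMap<>();
        sstable1.put("name", new Cell("boris", 1));
        sstable1.put("phone", new Cell("092xxx", 1));
        sstable1.put("addr", new Cell("tainan", 1));

        Map<String, Cell> sstable2 = new TreeMap<>();
        sstable2.put("name", new Cell("boris.yen", 2));
        sstable2.put("sex", new Cell("male", 2));
        sstable2.put("email", new Cell("y@gmail", 2));

        // prints {addr=tainan (t1), email=y@gmail (t2), name=boris.yen (t2), phone=092xxx (t1), sex=male (t2)}
        System.out.println(compact(Arrays.asList(sstable1, sstable2)));
    }
}
```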

Data Partitioning

• The total data managed by the cluster is represented as a circular space, or ring.

• Before a node can join the ring, it must be assigned a token.

• The token determines the node’s position on the ring and the range of data it is responsible for.

• Partitioning strategy
  o Random partitioning
    - Default and recommended
  o Ordered partitioning
    - Sequential writes can cause hot spots
    - More administrative overhead to load-balance the cluster

[Figure: a ring with node tokens t1, t2, t3, t4, t5 placed clockwise.]

Data Partitioning

[Figure: random partitioning. Each key (k1 to k4) is placed on the ring at its hash value, hash(k1) to hash(k4).]
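Random partitioning can be sketched as "hash the key, then find the first node token at or after that hash, wrapping around the ring". The code below is an illustration under assumptions: the RingSketch class and its method names are hypothetical, MD5 is used as the hash function, and each node holds exactly one token.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Sketch of key-to-node lookup on a token ring (hypothetical names). */
class RingSketch {
    private final TreeMap<BigInteger, String> ring = new TreeMap<>(); // token -> node name

    void addNode(String name, BigInteger token) {
        ring.put(token, name);
    }

    /** Hashes a key to a non-negative token using MD5 (an assumption of this sketch). */
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available", e);
        }
    }

    /** Returns the node whose token is the first one at or after the given token. */
    String ownerOf(BigInteger token) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("ring has no nodes");
        }
        Map.Entry<BigInteger, String> entry = ring.ceilingEntry(token);
        if (entry == null) {
            entry = ring.firstEntry(); // wrap around the ring
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        RingSketch ring = new RingSketch();
        BigInteger max = BigInteger.ONE.shiftLeft(127);
        int nodes = 5;
        for (int i = 0; i < nodes; i++) {
            // Evenly spaced tokens so that each node owns a similar share of the ring.
            BigInteger t = max.multiply(BigInteger.valueOf(i)).divide(BigInteger.valueOf(nodes));
            ring.addNode("node" + (i + 1), t);
        }
        for (String key : new String[] {"k1", "k2", "k3", "k4"}) {
            System.out.println(key + " -> " + ring.ownerOf(token(key)));
        }
    }
}
```

Because the hash spreads keys roughly uniformly, evenly spaced tokens give each node a similar amount of data, which is why sequential keys do not create hot spots under this strategy.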

Data Replication

• Replication ensures fault tolerance and no single point of failure.

• Replication is controlled by two parameters of a keyspace: the replication factor and the replication strategy.

• The replication factor controls how many copies of a row are stored in the cluster.

• The replication strategy controls how the data is replicated.

Data Replication

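[Figure: replication on a ring with tokens t1 to t5 under random partitioning with RF=3. A coordinator node receives Data: k1, which maps to position hash(k1) on the ring.]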
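One simple way to picture replica placement with RF=3 is to start at the node that owns the key's position and keep walking clockwise until RF distinct nodes are collected. The sketch below assumes this simple ring-walk placement, small integer tokens, and one token per node; the ReplicaSketch and replicasFor names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Sketch of ring-walk replica placement (hypothetical names). */
class ReplicaSketch {
    /**
     * Returns up to rf nodes: the owner of keyToken first,
     * then the next nodes clockwise on the ring.
     */
    static List<String> replicasFor(TreeMap<Integer, String> ring, int keyToken, int rf) {
        // Nodes at or after the key's position, followed by the wrapped-around rest of the ring.
        List<String> clockwise = new ArrayList<>(ring.tailMap(keyToken, true).values());
        clockwise.addAll(ring.headMap(keyToken, false).values());
        int wanted = Math.min(rf, clockwise.size());
        return new ArrayList<>(clockwise.subList(0, wanted));
    }

    public static void main(String[] args) {
        TreeMap<Integer, String> ring = new TreeMap<>();
        ring.put(10, "n1");
        ring.put(20, "n2");
        ring.put(30, "n3");
        ring.put(40, "n4");
        ring.put(50, "n5");

        System.out.println(replicasFor(ring, 25, 3)); // prints [n3, n4, n5]
        System.out.println(replicasFor(ring, 45, 3)); // prints [n5, n1, n2]
    }
}
```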

Data Consistency

• Cassandra supports tunable data consistency.

• Choose between strong and eventual consistency depending on the need.

• The level can be chosen per operation, for both reads and writes.

• Handles multi-data center operations.

Consistency Level

Write          Read
-----          ----
Any            (n/a)
One            One
Quorum         Quorum
Local_Quorum   Local_Quorum
Each_Quorum    Each_Quorum
All            All
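The trade-off behind these levels can be illustrated with a little arithmetic: a quorum is a majority of the replicas, and a read is guaranteed to overlap the latest acknowledged write when the number of replicas written plus the number read exceeds the replication factor. The ConsistencySketch class and its method names below are hypothetical and only illustrate that arithmetic.

```java
/** Sketch of quorum arithmetic for tunable consistency (hypothetical names). */
class ConsistencySketch {
    /** A quorum is a majority of the replicas. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True when every read set must share at least one replica with every write set. */
    static boolean overlaps(int writeReplicas, int readReplicas, int replicationFactor) {
        return writeReplicas + readReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println(q);                   // prints 2
        System.out.println(overlaps(q, q, rf));  // prints true  (Quorum write + Quorum read)
        System.out.println(overlaps(1, 1, rf));  // prints false (One write + One read)
        System.out.println(overlaps(rf, 1, rf)); // prints true  (All write + One read)
    }
}
```

Levels whose read and write sets overlap give strong consistency; the others rely on eventual consistency in exchange for lower latency and higher availability.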

Built-in Consistency Repair Features

• Read Repair

• Hinted Handoff

• Anti-Entropy Node Repair

http://www.datastax.com/docs/0.8/dml/data_consistency#builtin-consistency

http://www.datastax.com/docs/0.8/dml/data_consistency

Client Library for Java

• Hector
  o https://github.com/hector-client/hector.git
  o https://github.com/hector-client/hector/wiki/User-Guide

• Astyanax
  o https://github.com/Netflix/astyanax.git

• CQL + JDBC
  o http://code.google.com/a/apache-extras.org/p/cassandra-jdbc/

Hector

• High-level, simple object-oriented interface to Cassandra

• Failover behavior on the client side

• Connection pooling for improved performance and scalability

• Automatic retry of downed hosts

• ...

Hector


// Slice query: read selected columns of one row.
// ko is the Keyspace, se a StringSerializer, cf the column family name (all defined elsewhere).
SliceQuery<String, String, String> q = HFactory.createSliceQuery(ko, se, se, se);
q.setColumnFamily(cf)
 .setKey("jsmith")
 .setColumnNames("first", "last", "middle");
QueryResult<ColumnSlice<String, String>> r = q.execute();

// Multi-get: read up to 3 columns from each of several rows.
MultigetSliceQuery<String, String, String> multigetSliceQuery =
    HFactory.createMultigetSliceQuery(keyspace, stringSerializer, stringSerializer, stringSerializer);
multigetSliceQuery.setColumnFamily("Standard1");
multigetSliceQuery.setKeys("fake_key_0", "fake_key_1", "fake_key_2", "fake_key_3", "fake_key_4");
multigetSliceQuery.setRange("", "", false, 3);
QueryResult<Rows<String, String, String>> result = multigetSliceQuery.execute();

// Batch operation: queue several insertions and send them together.
Mutator<String> mutator = HFactory.createMutator(keyspace, stringSerializer);
mutator.addInsertion("jsmith", "Standard1", HFactory.createStringColumn("first", "John"))
       .addInsertion("jsmith", "Standard1", HFactory.createStringColumn("last", "Smith"))
       .addInsertion("jsmith", "Standard1", HFactory.createStringColumn("middle", "Q"));
mutator.execute();


CQL+JDBC

// HOST, PORT and KEYSPACE are constants defined elsewhere.
// Load the driver and connect to the "system" keyspace.
Class.forName("org.apache.cassandra.cql.jdbc.CassandraDriver");
String URL = String.format("jdbc:cassandra://%s:%d/%s", HOST, PORT, "system");
System.out.println("Connection URL = '" + URL + "'");

Connection con = DriverManager.getConnection(URL);
Statement stmt = con.createStatement();

// Create the keyspace
String createKS = String.format(
    "CREATE KEYSPACE %s WITH strategy_class = SimpleStrategy AND strategy_options:replication_factor = 1;",
    KEYSPACE);
stmt.execute(createKS);

// Create the target column family
String createCF = "CREATE COLUMNFAMILY RegressionTest (keyname text PRIMARY KEY,"
    + "bValue boolean, "
    + "iValue int "
    + ") WITH comparator = ascii AND default_validation = bigint;";
stmt.execute(createCF);

https://code.google.com/a/apache-extras.org/p/cassandra-jdbc/source/browse/src/test/java/org/apache/cassandra/cql/jdbc/JdbcRegressionTest.java



CQL+JDBC

Statement statement = con.createStatement();

// Remove any existing rows
String truncate = "TRUNCATE RegressionTest;";
statement.execute(truncate);

// Insert two rows; the second one leaves iValue unset
String insert1 = "INSERT INTO RegressionTest (keyname,bValue,iValue) VALUES ('key0',true, 2000);";
statement.executeUpdate(insert1);

String insert2 = "INSERT INTO RegressionTest (keyname,bValue) VALUES( 'key1',false);";
statement.executeUpdate(insert2);

// Read everything back
String select = "SELECT * from RegressionTest;";
ResultSet result = statement.executeQuery(select);
ResultSetMetaData metadata = result.getMetaData();
...




Useful Tools

• cassandra-cli
  o <cassandra-dir>/bin
  o http://www.datastax.com/docs/1.0/dml/using_cli

• cqlsh
  o <cassandra-dir>/bin
  o http://www.datastax.com/docs/1.0/references/cql/index

• nodetool
  o <cassandra-dir>/bin
  o http://www.datastax.com/docs/1.0/references/nodetool

• stress
  o <cassandra-dir>/tools/bin
  o http://www.datastax.com/docs/1.0/references/stress_java

Useful Tools

• OpsCenter
  o http://www.datastax.com/products/opscenter

• sstableloader
  o <cassandra-dir>/bin
  o http://www.datastax.com/dev/blog/bulk-loading

• More tools
  o http://en.wikipedia.org/wiki/Apache_Cassandra#Tools_for_Cassandra

Questions?
