About "Apache Cassandra"
APACHE CASSANDRA
Scalability, Performance and Fault Tolerance in Distributed Databases
Jihyun.An (jihyun.an@kt.com)
18 June 2013
TABLE OF CONTENTS
Preface
Basic Concepts
P2P Architecture
Primitive Data Model & Architecture
Basic Operations
Fault Management
Consistency
Performance
Problem handling
TABLE OF CONTENTS (NEXT TIME)
Maintaining
Cluster Management
Node Management
Problem Handling
Tuning
Playing (for Development, Client stance)
Designing
Client
Thrift
Native
CQL
3rd party
Hector
OCM
Extension
Baas.io
Hadoop
PREFACE
OUR WORLD
Traditional DBMSs are very valuable
Storage (and memory) and computational resources are cheaper than before
But we face new challenges
Big data
(Near) real time
Complex and varied requirements
Recommendation
Find FOAF
Event-driven triggering
User session
OUR WORLD (CONT)
Complex applications combine different types of problems
Different languages -> more productive
ex: functional languages, languages optimized for multiprocessing
Polyglot persistence layer
Performance vs Durability?
Reliability?
TRADITIONAL DBMS
Relational Model
Well-defined Schema
Access with Selection/Projection
Derived data from joining/grouping/aggregating (counting ..)
Small data (already refined)
But
Painful data model changes
Hard to scale out
Ineffective at handling large volumes of data
Not designed with the hardware in mind
TRADITIONAL DBMS (CONT)
Has many constraints to enforce ACID
PK/FK checking
Domain type checking
.. and more checking
Lots of I/O and processing
OODBMS, ORDBMS
Good, but even more checking and processing
Does not work well with disk I/O
NOSQL
Key-value store
Column: Cassandra, HBase, Bigtable
Others: Redis, Dynamo, Voldemort, Hazelcast
Document oriented
MongoDB, CouchDB
Graph store
Neo4j, OrientDB, BigOWL, FlockDB ..
NOSQL (CONT)
Benefits
Higher performance
Higher scalability
Flexible data model
More effective for some cases
Less administrative overhead
Drawbacks
Limited transactions
Relaxed Consistency
Unconstrained data
Limited ad-hoc query capabilities
Limited administrative aid tools
CAP
Brewer's theorem
We can pick only two of
Consistency
Availability
Partition tolerance
[Figure: CAP triangle with corners C, A and P, with the following systems placed on it]
Amazon Dynamo derivatives: Cassandra, Voldemort, CouchDB, Riak
Neo4j, Bigtable
Bigtable derivatives: MongoDB, HBase, Hypertable, Redis
Relational: MySQL, MSSQL, Postgres
Cassandra = Dynamo (architecture) + BigTable (data model)
(Apache) Cassandra is a free, open-source, highly scalable,
distributed database system for managing large amounts of data
Written in Java
Runs on the JVM
References :
BigTable (http://static.googleusercontent.com/external_content/untrusted_dlcp/research.google.com/en//archive/bigtable-osdi06.pdf)
Dynamo (http://web.archive.org/web/20120129154946/http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf)
DESIGN GOALS
Simple key/value (column) store
Limited to storage
Supports only basic operations (CRUD, range access); no aggregating or grouping
But extensible
Hadoop (MR, HDFS, Pig, Hive ..)
ESP
Distributed Processing Interface (ex: BSP, MR)
Baas.io
DESIGN GOALS (CONT)
High Availability
Decentralized
Every node can serve client requests
Replication, with access to the replicas
Multi-DC support
Eventual consistency
Low write complexity
Audit and repair on read
Tunable -> trade-offs between consistency, durability and latency
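The consistency/latency trade-off above is commonly reasoned about with replica counts. The following is a minimal sketch, assuming N replicas per key, W replicas acknowledging each write and R replicas answering each read; the function names are illustrative and are not part of any Cassandra API. A read is guaranteed to reach at least one replica holding the latest acknowledged write only when R + W > N.

```python
def quorum(replication_factor: int) -> int:
    """Size of a majority of replicas (the usual definition of a quorum)."""
    return replication_factor // 2 + 1


def read_sees_latest_write(n: int, w: int, r: int) -> bool:
    """True when every set of r read replicas overlaps every set of w write replicas."""
    return r + w > n


if __name__ == "__main__":
    n = 3
    q = quorum(n)
    # Writing and reading a single replica is fast but only eventually consistent.
    print("W=1, R=1:", read_sees_latest_write(n, w=1, r=1))
    # Quorum writes plus quorum reads always share at least one replica.
    print("W=Q, R=Q:", read_sees_latest_write(n, w=q, r=q))
```

Lower W and R reduce latency because fewer replicas must respond; higher values buy consistency at the cost of waiting for more nodes.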
DESIGN GOALS (CONT)
Incremental scalability
Equal members
Linear scalability
Unlimited space
Write/read throughput increases linearly as nodes (members) are added
Low total cost
Minimize administrative work
Automatic partitioning
Flush / compaction
Data balancing / moving
Virtual nodes (since v1.2)
Mid-range nodes give good performance
Nodes working together provide high performance and huge space
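Automatic partitioning with virtual nodes can be pictured as a hash ring. This is a toy sketch and not Cassandra's implementation: the `Ring` class, the use of MD5 as the hash, and the default of 8 virtual nodes per node are all assumptions made for illustration. Each node claims several positions on the ring, and a key is stored on the first distinct nodes found walking clockwise from the key's position.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a string to a ring position (MD5 is only a stand-in hash here)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class Ring:
    def __init__(self, vnodes_per_node: int = 8):
        self.vnodes_per_node = vnodes_per_node
        self._tokens = []  # sorted ring positions
        self._owner = {}   # ring position -> node name

    def add_node(self, node: str) -> None:
        # Each physical node claims several positions (virtual nodes).
        for i in range(self.vnodes_per_node):
            t = token(f"{node}#{i}")
            bisect.insort(self._tokens, t)
            self._owner[t] = node

    def replicas(self, key: str, replication_factor: int = 3) -> list:
        """Walk clockwise from the key's position, collecting distinct nodes."""
        if not self._tokens:
            return []
        start = bisect.bisect_left(self._tokens, token(key))
        result = []
        for step in range(len(self._tokens)):
            position = self._tokens[(start + step) % len(self._tokens)]
            node = self._owner[position]
            if node not in result:
                result.append(node)
            if len(result) == replication_factor:
                break
        return result
```

Adding a node only inserts new positions, so a key either keeps its first replica or moves to the new node. That is why data can be rebalanced without reshuffling everything.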
FOUNDER & HISTORY
Founders
Avinash Lakshman (one of the authors of Amazon's Dynamo)
Prashant Malik (Facebook engineer)
Developers
About 50
History
Open sourced by Facebook in July 2008
Became an Apache Incubator project in March 2009
Graduated to a top-level project in Feb 2010
0.6 released (added support for integrated caching, and Apache Hadoop MapReduce) in Apr 2010
0.7 released (added secondary indexes and online schema change) in Jan 2011
0.8 released (added the Cassandra Query Language (CQL), self-tuning memtables, and support for zero-downtime upgrades) in Jun 2011
1.0 released (added integrated compression, leveled compaction, and improved read performance) in Oct 2011
1.1 released (added self-tuning caches, row-level isolation, and support for mixed SSD/spinning disk deployments) in Apr 2012
1.2 released (added clustering across virtual nodes, inter-node communication, atomic batches, and request tracing) in Jan 2013
PROMINENT USERS
User | Cluster size | Node count | Usage | Now
Facebook | >200 | ? | Inbox search | Abandoned, moved to HBase
Cisco WebEx | ? | ? | User feed, activity | OK
Netflix | ? | ? | Backend | OK
Formspring | ? (26 million accounts with 10 million responses per day) | ? | Social-graph data | OK
Others: Urban Airship, Rackspace, OpenX, Twitter (preparing to move to Cassandra)
BASIC CONCEPTS
P2P ARCHITECTURE
All nodes are the same (equal peers)
No single point of failure / Decentralized
Compare with
MongoDB
Broker structure (CUBRID)
Master / slave
P2P ARCHITECTURE (CONT)
Drives linear scalability
References :
http://dev.kthcorp.com/2011/12/07/cassandra-on-aws-100-million-writ/
http://readwrite.com/2011/11/24/netflix-benchmarks-cassandra-o
http://techblog.netflix.com/2011/11/benchmarking-cassandra-scalability-on.html
PRIMITIVE DATA MODEL & ARCHITECTURE
COLUMN
Basic and primitive type (the smallest increment of data)
A tuple containing a name, a value and a timestamp
Timestamp is important
Provided by the client
Determines the most recent version
On a conflict, the DBMS chooses the latest one
Name
Value
Timestamp
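The role of the timestamp can be sketched as a "last write wins" rule. This is a minimal illustration under stated assumptions: the `Column` tuple and the `reconcile` function are hypothetical names, and breaking ties on the value is a choice made for this sketch so that the result is deterministic.

```python
from typing import NamedTuple


class Column(NamedTuple):
    name: str
    value: bytes
    timestamp: int  # supplied by the client, e.g. microseconds since the epoch


def reconcile(a: Column, b: Column) -> Column:
    """Pick the winning version of one column: the newer timestamp wins."""
    if a.name != b.name:
        raise ValueError("can only reconcile versions of the same column")
    # Tie-break on the value so that every replica picks the same winner.
    return max(a, b, key=lambda c: (c.timestamp, c.value))


if __name__ == "__main__":
    old = Column("email", b"a@example.com", 1000)
    new = Column("email", b"b@example.com", 2000)
    print(reconcile(old, new).value)
```

Because the client supplies the timestamp, clients with skewed clocks can make an older write win, which is one reason clock synchronization matters.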
COLUMN (CONT)
Types
Standard: A column has a name (UUID or UTF8)
Composite: A column has a composite name (UUID+UTF8)
Expiring: Marked with a TTL
Counter: Has only a name and a value; the timestamp is managed by the server
Super: Used to manage wide rows, inferior to using composite columns (DO NOT USE, all sub-columns are serialized)
[Diagrams: counter column (Name, Value); super column (Name) holding columns of (Name, Value, Timestamp)]
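An expiring column can be sketched as a column that carries a time-to-live. This is a simplified model, not Cassandra's storage code: the names are hypothetical, times are plain integers in seconds, and details such as tombstones are left out. An expired column stops being visible to reads immediately, and is physically dropped when the data is next rewritten.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Column:
    name: str
    value: str
    timestamp: int             # write time, in seconds for this sketch
    ttl: Optional[int] = None  # lifetime in seconds; None means no expiry

    def is_live(self, now: int) -> bool:
        return self.ttl is None or now < self.timestamp + self.ttl


def read(row: dict, name: str, now: int) -> Optional[str]:
    """Expired columns are invisible to reads even before they are purged."""
    col = row.get(name)
    return col.value if col is not None and col.is_live(now) else None


def compact(row: dict, now: int) -> dict:
    """Rewrite the row, dropping columns whose TTL has elapsed."""
    return {n: c for n, c in row.items() if c.is_live(now)}
```

A typical use is session data: write the column with a TTL and let it disappear without an explicit delete.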
COLUMN (CONT)
Types (CQL3 based)
Standard: Has a single-column primary key.
Composite: Has a primary key made of more than one column; recommended for managing wide rows.
Expiring: Gets deleted during compaction once its TTL has passed.
Counter: Counts occurrences of an event.
Super: Used to manage wide rows, inferior to using composite columns (DO NOT USE, all sub-columns are serialized)
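DDL:
CREATE TABLE test (
    user_id varchar,      -- first primary key component: partition (row) key
    article_id uuid,      -- second primary key component: clustering column
    content varchar,
    PRIMARY KEY (user_id, article_id)
);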
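One way to picture how this composite primary key produces a wide row: the first key component (`user_id`) selects the row, and the second (`article_id`) becomes part of a composite column name inside that row. The sketch below is a toy in-memory model, not a Cassandra client; `storage`, `insert` and `select_by_user` are hypothetical names, and the ordering shown is Python's UUID ordering, which is not necessarily the order Cassandra uses for the `uuid` type.

```python
import uuid
from collections import defaultdict

# row key (user_id) -> {composite column name (article_id, field): value}
storage = defaultdict(dict)


def insert(user_id: str, article_id: uuid.UUID, content: str) -> None:
    """Store one table row as a cell inside the wide row for user_id."""
    storage[user_id][(article_id, "content")] = content


def select_by_user(user_id: str) -> list:
    """Return (article_id, content) pairs of one wide row, sorted by article_id."""
    cells = storage.get(user_id, {})
    return [(aid, val) for (aid, _field), val in sorted(cells.items())]


if __name__ == "__main__":
    first, second = uuid.UUID(int=1), uuid.UUID(int=2)
    insert("kim", second, "second post")
    insert("kim", first, "first post")
    for article_id, content in select_by_user("kim"):
        print(article_id, content)
```

All articles of one user live in the same row, so reading them is a single-row range access rather than a scan across the cluster.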