Cassandra - Research Paper Overview
Cassandra: A Decentralized Structured Storage System
Avinash Lakshman (Facebook), Prashant Malik (Facebook)
Presented by Sameera Nelson
Outline …
Introduction
Data Model
System Architecture
Bootstrapping & Scaling
Local Persistence
Conclusion
What is Cassandra?
Distributed Storage System
Manages Structured Data
Highly available, no single point of failure (SPoF)
Not a Relational Data Model
Handles high write throughput
◦ Without sacrificing read efficiency
Motivation
Operational requirements at Facebook
◦ Performance
◦ Reliability/ Dealing with Failures
◦ Efficiency
◦ Continuous growth
Application
◦ Inbox Search problem, Facebook
Related Work
Google File System
◦ Distributed FS, single master/slave
Ficus/Coda
◦ Distributed FS
Farsite
◦ Distributed FS, no centralized server
Bayou
◦ Distributed relational DB system
Dynamo
◦ Distributed storage system
Data Model
Figure from Eben Hewitt’s slides.
• Table
◦ Multidimensional map indexed by key
• Columns
◦ Grouped into column families
◦ Simple
◦ Super (nested column families)
• Each column has
◦ Name / Value / Timestamp
Data Model
Supported Operations
insert(table, key, rowMutation)
get(table, key, columnName)
delete(table, key, columnName)
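The three operations can be illustrated with a minimal in-memory sketch. It is not Cassandra's implementation: the `Table` class, the nested-dictionary layout, the shape of `row_mutation`, and the "family:column" naming convention are all assumptions made for illustration. Real deletes are also more involved than removing a dictionary entry.

```python
import time


class Table:
    """Toy table: row key -> column family -> column name -> (value, timestamp)."""

    def __init__(self):
        self.rows = {}

    def insert(self, key, row_mutation, timestamp=None):
        # row_mutation is assumed to look like {column_family: {column_name: value}}.
        ts = time.time() if timestamp is None else timestamp
        row = self.rows.setdefault(key, {})
        for family, columns in row_mutation.items():
            cells = row.setdefault(family, {})
            for name, value in columns.items():
                old = cells.get(name)
                # Keep whichever write carries the newest timestamp.
                if old is None or ts >= old[1]:
                    cells[name] = (value, ts)

    def get(self, key, column_name):
        # column_name is written "family:column" (an illustrative convention).
        family, name = column_name.split(":", 1)
        cell = self.rows.get(key, {}).get(family, {}).get(name)
        return None if cell is None else cell[0]

    def delete(self, key, column_name):
        family, name = column_name.split(":", 1)
        self.rows.get(key, {}).get(family, {}).pop(name, None)


users = Table()
users.insert("1745", {"name": {"fname": "john", "lname": "smith"}}, timestamp=1)
first_name = users.get("1745", "name:fname")
```

Each column stores its value together with a timestamp, which is what lets an older write be ignored when it arrives after a newer one.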
Query Language
CREATE TABLE users
( user_id int PRIMARY KEY,
fname text,
lname text );
INSERT INTO users
(user_id, fname, lname) VALUES (1745, 'john', 'smith');
SELECT * FROM users;
System Architecture
Fully distributed … no single point of failure
Cassandra Architecture
Partitioning
◦ Data distribution across nodes
Replication
◦ Data duplication across nodes
Cluster Membership
◦ Node management in the cluster (adding/removing nodes)
Partitioning
The Token Ring
◦ Partitions using consistent hashing
◦ Each key is assigned to the relevant partition
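A rough sketch of how a key could be mapped onto a token ring with consistent hashing. The `TokenRing` class, the use of MD5 as the hash, and the small integer tokens in the usage lines are illustrative assumptions, not details taken from the slides.

```python
import bisect
import hashlib


def token_for(key: str) -> int:
    # MD5 is used here only as a stable, well-spread illustrative hash.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class TokenRing:
    def __init__(self, node_tokens):
        # node_tokens: {node_name: token}; nodes are kept sorted by token.
        self._ring = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in self._ring]

    def owner_of_token(self, token: int) -> str:
        # Walk clockwise: the first node whose token is >= the given token
        # owns it, wrapping to the first node past the end of the ring.
        index = bisect.bisect_left(self._tokens, token) % len(self._ring)
        return self._ring[index][1]

    def owner(self, key: str) -> str:
        return self.owner_of_token(token_for(key))


ring = TokenRing({"node-a": 100, "node-b": 200, "node-c": 300})
owner_150 = ring.owner_of_token(150)   # falls between node-a and node-b
owner_wrap = ring.owner_of_token(301)  # past the last token, wraps around
```

Because a node owns only the range between its predecessor's token and its own, adding or removing a node affects its immediate neighbours rather than the whole cluster.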
Replication
Based on configured replication factor
Different Replication Policies
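◦ Rack Unaware
Replicate at the N-1 successors of the coordinator on the ring
◦ Rack Aware
A leader, elected via Zookeeper, tells nodes which ranges they replicate
◦ Datacenter Aware
Similar to Rack Aware, with the leader chosen at the datacenter level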
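The Rack Unaware policy is simple enough to sketch directly: the coordinator for a key plus the next N-1 nodes in ring order hold the replicas. The function name and the list-based ring representation are assumptions for illustration.

```python
def rack_unaware_replicas(ring_nodes, coordinator, n):
    """Return the coordinator followed by its N-1 successors.

    ring_nodes: node names listed in ring (token) order.
    n: the configured replication factor.
    """
    start = ring_nodes.index(coordinator)
    count = min(n, len(ring_nodes))  # cannot place more replicas than nodes
    return [ring_nodes[(start + i) % len(ring_nodes)] for i in range(count)]


replicas = rack_unaware_replicas(["a", "b", "c", "d"], coordinator="c", n=3)
```

The rack-aware and datacenter-aware policies need knowledge of the physical topology, which this sketch leaves out.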
Cluster Membership
Based on Scuttlebutt
Efficient gossip-based mechanism
Inspired by real-life rumor spreading
Anti-entropy protocol
◦ Repairs replicated data by comparing and
reconciling differences
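The idea of spreading state by gossip and reconciling differences can be sketched as follows. This is a simplified push-pull exchange, not the Scuttlebutt protocol itself: the versioned `(version, value)` entries, `merge_state`, and `gossip_round` are hypothetical names chosen for illustration.

```python
import random


def merge_state(local, remote):
    """Copy into `local` every entry that `remote` knows at a higher version.

    Both arguments map endpoint -> (version, value).
    """
    for endpoint, (version, value) in remote.items():
        if endpoint not in local or local[endpoint][0] < version:
            local[endpoint] = (version, value)


def gossip_round(states, rng):
    """Each node exchanges state with one randomly chosen peer."""
    nodes = list(states)
    for node in nodes:
        peer = rng.choice([other for other in nodes if other != node])
        merge_state(states[peer], states[node])  # push
        merge_state(states[node], states[peer])  # pull


states = {
    "a": {"a": (1, "up")},
    "b": {"b": (1, "up")},
    "c": {"c": (2, "up")},
}
rng = random.Random(0)
for _ in range(5):
    gossip_round(states, rng)
```

After a few rounds every node holds the same view of the cluster, without any node acting as a central registry.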
Failure Detection
◦ Accrual Failure Detector
◦ If a node is faulty, the suspicion level increases:
Φ(t) → k as t → k, where k is a threshold variable
◦ If the node is correct:
Φ(t) = 0
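A minimal sketch of an accrual-style detector, assuming heartbeat inter-arrival times follow an exponential distribution, so that Φ is the negative base-10 logarithm of the probability that the next heartbeat is still to come. The class name, the threshold of 8, and the use of a plain mean over all observed intervals are illustrative assumptions.

```python
import math


class AccrualFailureDetector:
    def __init__(self, threshold=8.0):
        self.threshold = threshold  # hypothetical default, tune per deployment
        self.intervals = []         # observed gaps between heartbeats
        self.last_heartbeat = None

    def heartbeat(self, now):
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now):
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        elapsed = now - self.last_heartbeat
        # P(heartbeat arrives later than `elapsed`) = exp(-elapsed / mean),
        # so phi = -log10(P) = elapsed / (mean * ln 10).
        return elapsed / (mean * math.log(10))

    def is_suspected(self, now):
        return self.phi(now) >= self.threshold


detector = AccrualFailureDetector()
for t in (0.0, 1.0, 2.0, 3.0):
    detector.heartbeat(t)
phi_now = detector.phi(3.0)     # just heard from the node
phi_later = detector.phi(30.0)  # long silence, suspicion has grown
```

The suspicion value grows continuously while a node stays silent and drops back to zero at the next heartbeat, so each application can choose how much suspicion it tolerates instead of receiving a fixed up/down answer.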
Bootstrapping & Scaling
Bootstrapping
◦ Node selects a random token
◦ Token is persisted locally and gossiped to the cluster
Scaling
◦ Cassandra bootstrap algorithm is initiated by an
operator
◦ New node gets a split of the range of a heavily
loaded node
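One way to picture the scaling step: find the most heavily loaded node and give the new node a token in the middle of that node's range, so the range is split in two. The function name, the load figures, and the choice of the exact midpoint are assumptions for illustration only.

```python
def split_heaviest_range(tokens, load, ring_size):
    """Pick a token for a new node that halves the busiest node's range.

    tokens: {node: token}; load: {node: load measure};
    ring_size: size of the token space (tokens are in [0, ring_size)).
    """
    ordered = sorted(tokens.items(), key=lambda item: item[1])
    names = [name for name, _ in ordered]
    heavy = max(names, key=lambda name: load[name])
    index = names.index(heavy)
    end = ordered[index][1]
    start = ordered[index - 1][1]  # predecessor; index -1 wraps around the ring
    width = (end - start) % ring_size
    if width == 0:                 # single-node ring owns the whole token space
        width = ring_size
    return (start + width // 2) % ring_size


new_token = split_heaviest_range(
    tokens={"a": 0, "b": 40, "c": 80},
    load={"a": 1, "b": 9, "c": 2},
    ring_size=100,
)
```

In this example node "b" owns the range from 0 to 40, so a new node placed at the midpoint takes over the lower half of that range.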
Local Persistence
Write Operation
◦ In-memory data is flushed to disk once it passes a threshold
◦ Writes to disk are sequential, with an index per data file
◦ Data files are merged in the background
◦ Commit logs are rolled over
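The write path can be sketched with in-memory stand-ins for the on-disk pieces. The `LocalStore` class, the entry-count threshold, and the list-based commit log and data files are assumptions for illustration; a real store writes these structures to disk.

```python
class LocalStore:
    def __init__(self, flush_threshold=3):
        self.commit_log = []   # stand-in for an append-only log on disk
        self.memtable = {}     # in-memory table of recent writes
        self.data_files = []   # each file: sorted (key, value) list, newest last
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # log first, for durability
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        if not self.memtable:
            return
        # Entries are written out in key order as one immutable data file.
        self.data_files.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.commit_log = []  # flushed data no longer needs the old log

    def compact(self):
        merged = {}
        for data_file in self.data_files:  # oldest first, so newer values win
            merged.update(data_file)
        self.data_files = [sorted(merged.items())] if merged else []


store = LocalStore(flush_threshold=2)
for key, value in [("b", 1), ("a", 2), ("a", 3), ("c", 4)]:
    store.write(key, value)
store.compact()
```

Merging keeps the number of data files small, which matters because a read may otherwise have to consult every file.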
Local Persistence
Read Operation
◦Indexes all data on primary key
◦Maintain column indices
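A read can be sketched as a lookup in the in-memory table followed by the data files from newest to oldest, using a sorted key index in each file. The function name and the data layout (a dictionary plus sorted lists of pairs) are illustrative assumptions.

```python
import bisect


def read(key, memtable, data_files):
    """Look up `key` in memory first, then in data files, newest first.

    memtable: dict of recent writes.
    data_files: list of sorted (key, value) lists, oldest first.
    """
    if key in memtable:
        return memtable[key]
    for data_file in reversed(data_files):
        keys = [k for k, _ in data_file]       # per-file index on the key
        index = bisect.bisect_left(keys, key)  # binary search in sorted keys
        if index < len(keys) and keys[index] == key:
            return data_file[index][1]
    return None


files = [[("a", 1), ("b", 2)], [("a", 3), ("c", 4)]]
value_a = read("a", memtable={}, data_files=files)          # newest file wins
value_b = read("b", memtable={"b": 9}, data_files=files)    # memory wins
```

A production system would build the key index once per file rather than on every read; it is rebuilt here only to keep the sketch short.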
Conclusion
Proven high scalability, performance, and
wide applicability
Very high update throughput while delivering
low latency
Future work
◦ Adding compression
◦ Support atomicity across keys
◦ Secondary index support
Thank You