Cassandra Core Concepts and Design Internals


Presented at the New Delhi Cassandra Users Meetup, November 2014

By: Salil Kalia


We’re going to talk about:

1. What is Cassandra?

2. High Level Architecture

3. Data Modeling

4. Write Path

5. Read Path

6. Tools

7. Q/A


What is Cassandra?

A database that is:

✓ Highly available

✓ Fully distributed, with no single point of failure

✓ Free & open source, with deep developer support

✓ High-performing, with near-linear horizontal scaling

✓ Replicated & durable


Cassandra – Features

Key features:

Elastic Scalability

Distributed

Decentralized

Fault Tolerant

Column Oriented

Tunable Consistency

Highly Available

Open Source


Cassandra Evolution

Google Bigtable (data model)

Amazon Dynamo (distributed design)

[Facebook] Cassandra


High Level Architecture

✓ Ring-based data distribution

✓ Only one type of server

✓ Highly distributed

✓ All nodes hold data

✓ All nodes answer queries

✓ All nodes are replicas

✓ Built-in multi-datacenter (DC) support

✓ Built-in snitch feature


A Few Common Terms

✓ Nodes and virtual nodes

✓ Primary & secondary range

✓ Partition Key (Hash)

✓ Partitioner

✓ Client & Coordinator

✓ Replication Factor (RF)

✓ Consistency Level (CL)
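
To see how several of these terms relate, here is a minimal Python sketch of a token ring. It is not Cassandra's implementation: it assumes an MD5-based hash in place of the real partitioner, one token per node (no virtual nodes), and a simple placement that walks the ring clockwise to pick RF replicas. The names `token_for`, `Ring`, and `replicas_for`, and the node names, are hypothetical.

```python
import bisect
import hashlib
from typing import Dict, List

RING_SIZE = 2 ** 32  # toy token space, chosen only for this sketch


def token_for(partition_key: str) -> int:
    """Hash a partition key to a token (MD5 stands in for the real partitioner)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % RING_SIZE


class Ring:
    def __init__(self, node_tokens: Dict[str, int], replication_factor: int):
        # Sort nodes by token so the ring can be walked clockwise.
        self.entries = sorted((tok, node) for node, tok in node_tokens.items())
        self.tokens = [tok for tok, _ in self.entries]
        self.rf = replication_factor

    def replicas_for(self, partition_key: str) -> List[str]:
        """Return the nodes that hold replicas of this partition."""
        tok = token_for(partition_key)
        # The first node whose token is >= the key's token owns the primary range;
        # the modulo wraps keys past the last token back to the first node.
        start = bisect.bisect_left(self.tokens, tok) % len(self.entries)
        count = min(self.rf, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(
    {"node-a": 0, "node-b": 2 ** 30, "node-c": 2 ** 31, "node-d": 3 * 2 ** 30},
    replication_factor=3,
)
replicas = ring.replicas_for("Vishal")
print(replicas)  # three distinct nodes; the first one owns the primary range
```

Any node can act as coordinator for a client request: it computes the same replica list and forwards the read or write to those nodes.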


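Immediate Consistency

Magic formula:

Write CL + Read CL > RF

Here each CL counts the replicas that must acknowledge the operation, and RF is the replication factor.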
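
The formula can be checked with a few lines of Python. This is a small illustrative sketch: the function names are hypothetical, RF = 3 is an assumed example value, and consistency levels are represented simply as replica counts (QUORUM being a majority of RF).

```python
def quorum(replication_factor: int) -> int:
    """Replicas needed for QUORUM: a majority of RF."""
    return replication_factor // 2 + 1


def is_immediately_consistent(write_cl: int, read_cl: int, replication_factor: int) -> bool:
    """True when every read replica set must overlap every write replica set."""
    return write_cl + read_cl > replication_factor


RF = 3  # assumed replication factor for this example
levels = (("ONE", 1), ("QUORUM", quorum(RF)), ("ALL", RF))

for write_name, write_cl in levels:
    for read_name, read_cl in levels:
        ok = is_immediately_consistent(write_cl, read_cl, RF)
        print(f"write={write_name:<6} read={read_name:<6} immediate={ok}")
```

With RF = 3, writing and reading at QUORUM (2 + 2 > 3) satisfies the formula, while ONE and ONE (1 + 1 = 2) does not.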


Data Modeling

Hierarchy, from outermost to innermost:

Keyspace

Table

Partition

Row

Column


Data Modeling

✓ Like an RDBMS, Cassandra uses tables to store data

✓ Tables are divided into partitions

✓ Each partition holds one or more rows

✓ CQL is used to create tables & query data

✓ Partition keys determine where a partition is found

✓ Clustering keys determine the ordering of rows within a partition


Example: Single Row Partition

name    age  occupation
Salil   32   Tech Lead
Vishal  25   Software Engineer
Akshay  45   Actor
Sheri   29   Singer

cqlsh:demo> create table user (name text primary key, age int, occupation text);

cqlsh:demo> select * from user WHERE name = 'Vishal';

✓ User identified by name (PK)

✓ Single row per partition

✓ RDBMS-like structure


Example: Multiple Rows Partition

video_id  comment_id  comment
5         1           Nice pic
5         2           Which place?
5         3           lol
6         4           Great!

cqlsh:demo> create table comment (video_id int, comment_id int, comment text, primary key (video_id, comment_id));

cqlsh:demo> select * from comment WHERE video_id = 5;

• video_id – partition key

• comment_id – clustering key

* In the real world, use UUIDs instead of int for primary key columns
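
The way rows group under a partition key and stay ordered by clustering key can be mimicked in plain Python. This is only an in-memory sketch, not how Cassandra stores data: `partitions`, `insert_comment`, and `select_by_video` are hypothetical names, and inserting an existing comment_id again would add a second row here rather than overwrite it.

```python
import bisect
from collections import defaultdict

# partition key (video_id) -> rows kept sorted by clustering key (comment_id)
partitions = defaultdict(list)


def insert_comment(video_id: int, comment_id: int, comment: str) -> None:
    """Place a row in its partition, preserving clustering-key order."""
    bisect.insort(partitions[video_id], (comment_id, comment))


def select_by_video(video_id: int) -> list:
    """Mimic the select above: read one partition, rows in clustering order."""
    return [(video_id, cid, text) for cid, text in partitions.get(video_id, [])]


# Rows arrive out of order but are read back sorted by comment_id.
insert_comment(5, 3, "lol")
insert_comment(6, 4, "Great!")
insert_comment(5, 1, "Nice pic")
insert_comment(5, 2, "Which place?")

rows = select_by_video(5)
print(rows)
```

The query touches a single partition (video_id = 5) and never needs to look at the rows for video 6.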


Data Modeling – Best Practices

Design the queries before modeling the data

Denormalize the data

Create multiple views into your data

Cassandra is built for fast writes

Better: as few reads as possible
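
As a rough illustration of denormalizing into multiple views, the Python sketch below writes each comment twice, once per query it must serve. Everything in it is invented for illustration: the two "tables", the `author` field, and the sample values do not come from the talk.

```python
from collections import defaultdict

# Two hypothetical query-specific "tables", each keyed by the partition key its query needs.
comments_by_video = defaultdict(list)
comments_by_author = defaultdict(list)


def add_comment(video_id: int, author: str, comment: str) -> None:
    """Write the same fact twice so that each query reads a single partition."""
    comments_by_video[video_id].append((author, comment))
    comments_by_author[author].append((video_id, comment))


add_comment(5, "alice", "Nice pic")
add_comment(5, "bob", "Which place?")
add_comment(6, "alice", "Great!")

print(comments_by_video[5])         # query 1: all comments on a video
print(comments_by_author["alice"])  # query 2: all comments by an author
```

The extra write is the trade-off: writes are cheap, so each read can be answered from one partition without a join.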


Key components of the Write Path

CommitLog – append-only log on disk

Memtables – in-memory tables

SSTables – files created when a memtable is flushed to disk

Compaction – process that merges SSTables
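
The four components can be wired together in a toy Python model. This is a sketch under simplifying assumptions, not Cassandra's storage engine: lists and dicts stand in for files, there are no timestamps or tombstones, and the flush threshold, class name, and sample values are made up for the example.

```python
class ToyStorageEngine:
    """Toy write path: commit log -> memtable -> SSTable flush -> compaction."""

    def __init__(self, memtable_limit: int = 2):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # in memory, latest value per key
        self.sstables = []     # immutable, sorted snapshots standing in for disk files
        self.memtable_limit = memtable_limit

    def write(self, key: str, value: str) -> None:
        self.commit_log.append((key, value))  # 1. log the write first
        self.memtable[key] = value            # 2. then update the in-memory table
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        """Turn the memtable into a sorted, immutable SSTable."""
        if self.memtable:
            self.sstables.append(tuple(sorted(self.memtable.items())))
            self.memtable = {}
            self.commit_log.clear()  # flushed data no longer needs replay

    def compact(self) -> None:
        """Merge all SSTables into one; newer SSTables win on key conflicts."""
        merged = {}
        for sstable in self.sstables:  # oldest first, so later values overwrite
            merged.update(sstable)
        self.sstables = [tuple(sorted(merged.items()))] if merged else []


engine = ToyStorageEngine(memtable_limit=2)
engine.write("Salil", "Tech Lead")
engine.write("Vishal", "Software Engineer")  # second key triggers a flush
engine.write("Salil", "Architect")           # illustrative newer value for the same key
engine.flush()
engine.compact()
print(engine.sstables)
```

Note that a write never modifies an existing SSTable: the newer value lands in a new SSTable, and compaction later discards the older one.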


Key components of the Read Path

✓ Memtables – in-memory tables

✓ Row Cache – in-memory cache that stores recently read rows

✓ Bloom Filters – report whether a partition key may be found in the corresponding SSTable

✓ Key Caches – in memory (on heap)

✓ Partition Summaries – in memory (on heap)

✓ Partition Indexes – on disk

✓ SSTables – on disk
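
The role of the Bloom filter is easiest to see in code. The Python sketch below is a simplified illustration, not Cassandra's read path: it omits the caches, partition summaries, and indexes, returns the first match instead of merging results by timestamp, and uses hypothetical names (`BloomFilter`, `read`) with arbitrarily chosen filter parameters.

```python
import hashlib


class BloomFilter:
    """Answers 'definitely absent' or 'possibly present' for a partition key."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = [False] * size_bits

    def _positions(self, key: str):
        # Derive several bit positions by salting the key with the hash index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits[pos] = True

    def might_contain(self, key: str) -> bool:
        return all(self.bits[pos] for pos in self._positions(key))


def read(key, memtable, sstables):
    """Check the memtable, then SSTables newest to oldest, skipping filtered-out ones."""
    if key in memtable:
        return memtable[key]
    for bloom, data in reversed(sstables):
        # A negative Bloom filter answer lets us skip the SSTable entirely.
        if bloom.might_contain(key) and key in data:
            return data[key]
    return None


bloom = BloomFilter()
data = {"Salil": 32, "Vishal": 25}
for name in data:
    bloom.add(name)
sstables = [(bloom, data)]

print(read("Vishal", {}, sstables))              # served from the SSTable
print(read("Akshay", {"Akshay": 45}, sstables))  # served from the memtable
```

A Bloom filter never gives a false negative, so skipping an SSTable on a negative answer is safe; a false positive only costs one unnecessary lookup.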
