Introduction to NoSQL and Cassandra



Intro to NoSQL, Cassandra and Hector, a talk I gave at Globant Laminar in Buenos Aires, Argentina, on Dec 13th, 2012.

Transcript of Introduction to NoSQL and Cassandra

Page 1: Introduction to NoSQL and Cassandra

Introduction to NoSQL and Apache Cassandra

Patricio Echagüe [email protected]

@patricioe

Page 2: Introduction to NoSQL and Cassandra

About me

Present: RelateIQ (Data Processing and Scalability)

Hector committer

Past: DataStax (The Cassandra Company)

Cassandra/Hadoop distribution (former Brisk)

Cassandra FS

CQL connection pool

Cassandra contributions

Page 3: Introduction to NoSQL and Cassandra

Trends: “NoSQL”

Page 4: Introduction to NoSQL and Cassandra

2011

Page 5: Introduction to NoSQL and Cassandra

2012

Page 6: Introduction to NoSQL and Cassandra

What is “NoSQL”?

Systems able to store and retrieve large quantities of data with little or no information about the relationships among the data items.

Generally they don't have a SQL-like language for data manipulation, and their schema is more relaxed than that of traditional RDBMSs.

Full ACID guarantees are often not provided.

Page 7: Introduction to NoSQL and Cassandra

Brewer's CAP theorem

Consistency: all replicas agree on the same value

Availability: always get an answer from a replica

Partition Tolerance: the system works even if replicas can't talk

You can have 2 of these

Page 8: Introduction to NoSQL and Cassandra

Brewer's CAP theorem

Page 9: Introduction to NoSQL and Cassandra

CAP Classification

Consistency

Partitioning

Availability

Page 10: Introduction to NoSQL and Cassandra

Types

- Relational
- Key-Value stores
- Columnar (column-oriented)
- Graph databases
- Document

Page 11: Introduction to NoSQL and Cassandra

What's eventual consistency?

It is a promise that eventually, in the absence of new writes, all replicas that are responsible for a data item will agree on the same version

Page 12: Introduction to NoSQL and Cassandra

How eventual is eventual?

Write to 1 replica and read from 1 replica of a total of 3

Page 13: Introduction to NoSQL and Cassandra

How eventual is eventual?

Write to 2 replicas and read from 2 replicas of a total of 3

Page 14: Introduction to NoSQL and Cassandra

Why is it good?

Because, by contacting fewer replicas, read and write operations complete more quickly, lowering latency.

Page 15: Introduction to NoSQL and Cassandra

Cassandra is a distributed, fault-tolerant, scalable, column-oriented data store with tunable consistency.

Page 16: Introduction to NoSQL and Cassandra

Cassandra has

C A P

But C is tunable

Page 17: Introduction to NoSQL and Cassandra

What is Apache Cassandra?

Page 18: Introduction to NoSQL and Cassandra

Key Concepts

Multi-Master, Multi-DC

Linearly scalable

Integrated Caching

Performs well with Larger-than-memory Datasets

Tunable consistency

Idempotent (client clock)

Schema Optional

No ACID transactions, No Locking

Page 19: Introduction to NoSQL and Cassandra

Generally complements other systems

(Not intended to be one-size-fits-all)

You should always use the right tool for the right job

Page 20: Introduction to NoSQL and Cassandra

Speaking Cassandra

Page 21: Introduction to NoSQL and Cassandra

Data Model

“4-Dimensional Hash Table”

A Keyspace contains a collection of Column Families (controls replication)

A Column Family contains Rows

A Row has a key, and each row has columns (no need to define the columns beforehand)

Each column has a name and a value and a timestamp

(TTL is optional)

Page 22: Introduction to NoSQL and Cassandra

Data Model – (RDBMS)

Keyspace (Schema)

Column Family(CF) (table)

Row (row)

Column (column*) → may not be present in all rows

Page 23: Introduction to NoSQL and Cassandra

Data Model – Column Family

Static Column Family
- Model my object data

Dynamic Column Family
- Precalculated / Prematerialized query results

Nothing stopping you from mixing them!

Page 24: Introduction to NoSQL and Cassandra

Data Model – Static Column Family

Page 25: Introduction to NoSQL and Cassandra

Data Model – Dynamic CF

stats for a specific date

Page 26: Introduction to NoSQL and Cassandra

Data Model – Dynamic CF

- Timeline of tweets by a user
- Timeline of tweets by all of the people a user is following
- List of comments sorted by score
- List of friends grouped by state
- Metrics for a time bucket

Page 27: Introduction to NoSQL and Cassandra

...

Let's store “foo”

Page 28: Introduction to NoSQL and Cassandra

...

Let's store “foo”

Foo

Page 29: Introduction to NoSQL and Cassandra

But if that node is down?

Foo

Page 30: Introduction to NoSQL and Cassandra

...

Let's store “foo” in 3 nodes.

This is the Replication Factor (N)

Foo

Foo

Foo

Page 31: Introduction to NoSQL and Cassandra

...

Now we need to know which nodes the key was written to so we can read it later

Page 32: Introduction to NoSQL and Cassandra

...

The Initial Token specifies the upper value of the key range each node is responsible for

#1 <= 'd'

#2 <= 'k'

#3 <= 'p'

#4 <= 'u'

#5 <= 'z'

a b c d e f g h i j k l m n …. z

'e f g h i j k'
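The token lookup on this slide can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class `TokenRing` and its methods are hypothetical names, keys are compared as plain strings (so it behaves like an order-preserving layout, not a hashed one), and the ring is assumed to be non-empty.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative token ring: each node owns the keys up to and including its token. */
final class TokenRing {
    // Sorted map from token (upper bound of the range) to node name.
    private final TreeMap<String, String> tokenToNode = new TreeMap<>();

    void addNode(String token, String node) {
        tokenToNode.put(token, node);
    }

    /** Returns the node whose token is the smallest one >= key, wrapping around the ring. */
    String ownerOf(String key) {
        Map.Entry<String, String> entry = tokenToNode.ceilingEntry(key);
        if (entry == null) {
            // Key sorts after the last token: wrap to the first node on the ring.
            entry = tokenToNode.firstEntry();
        }
        return entry.getValue();
    }

    /** Builds the five-node ring shown on the slide. */
    static TokenRing slideExample() {
        TokenRing ring = new TokenRing();
        ring.addNode("d", "#1");
        ring.addNode("k", "#2");
        ring.addNode("p", "#3");
        ring.addNode("u", "#4");
        ring.addNode("z", "#5");
        return ring;
    }

    public static void main(String[] args) {
        TokenRing ring = slideExample();
        System.out.println("foo is owned by node " + ring.ownerOf("foo"));
    }
}
```

With these tokens, "foo" sorts between 'd' and 'k', so it lands on node #2, which matches the 'e f g h i j k' range in the diagram.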

Page 33: Introduction to NoSQL and Cassandra

...

Gossip is the protocol Cassandra uses to interchange information with nodes in the cluster (a.k.a. Ring)

Page 34: Introduction to NoSQL and Cassandra

Gossip is the protocol Cassandra uses to interchange information with nodes in the cluster (a.k.a. Ring)

For example, which node owns the key “foo”?

Page 35: Introduction to NoSQL and Cassandra

...

Gossip is the protocol Cassandra uses to interchange information with nodes in the cluster (a.k.a. Ring)

For example, which node owns the key “foo”?

#1 <= 'd'

#2 <= 'k'

#3 <= 'p'

#4 <= 'u'

#5 <= 'z'

Client

'foo'

Read 'foo'

'e f g h i j k'

Page 36: Introduction to NoSQL and Cassandra

...

A Partitioner is used to transform the key. “foo1” and “foo2” may end up in different nodes

Page 37: Introduction to NoSQL and Cassandra

...

A Partitioner is used to transform the key. “foo1” and “foo2” may end up in different nodes

The most commonly used is Random Partitioner

“foo1” → md5(“foo1”) → “A99A0B....”
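A rough sketch of the idea behind a hash-based partitioner follows. It assumes the token is simply the MD5 digest of the key read as a non-negative integer; the class name `Md5Token` is hypothetical and this is not Cassandra's own code.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative MD5-based token computation (an assumption, not Cassandra's implementation). */
final class Md5Token {
    /** Hashes the key with MD5 and interprets the digest as a non-negative integer token. */
    static BigInteger tokenFor(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this should not happen.
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    public static void main(String[] args) {
        // Similar keys produce unrelated tokens, so they may land on different nodes.
        System.out.println("foo1 -> " + tokenFor("foo1").toString(16));
        System.out.println("foo2 -> " + tokenFor("foo2").toString(16));
    }
}
```

Because the hash scatters keys that are lexically close, rows spread evenly around the ring, at the cost of losing ordered range scans over keys.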

Page 38: Introduction to NoSQL and Cassandra

...

A Partitioner is used to transform the key. “foo1” and “foo2” may end up in different nodes

The most commonly used is Random Partitioner

#1

#2

#3

#5

#4

'foo1'

'foo2'

Page 39: Introduction to NoSQL and Cassandra

...

A Replica Placement Strategy determines which nodes contain replicas

Page 40: Introduction to NoSQL and Cassandra

...

A Replica Placement Strategy determines which nodes contain replicas

Simple Strategy places them clockwise

#1

#2

#3

#5

#4

'foo1'

'foo1'

'foo1'
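The clockwise placement can be sketched as below. This is a minimal illustration, assuming the ring is given as a list of node names in token order and that the index of the node owning the key is already known; `SimplePlacement` and `replicasFor` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative clockwise replica placement on a ring of nodes. */
final class SimplePlacement {
    /**
     * Starts at the node that owns the key and walks clockwise,
     * collecting replicationFactor nodes (or the whole ring if it is smaller).
     */
    static List<String> replicasFor(List<String> ring, int primaryIndex, int replicationFactor) {
        int count = Math.min(replicationFactor, ring.size());
        List<String> replicas = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            replicas.add(ring.get((primaryIndex + i) % ring.size()));
        }
        return replicas;
    }

    public static void main(String[] args) {
        List<String> ring = Arrays.asList("#1", "#2", "#3", "#4", "#5");
        // Key owned by node #4 (index 3), replication factor 3.
        System.out.println(replicasFor(ring, 3, 3));
    }
}
```

Starting at index 3 with a replication factor of 3, the walk wraps around the end of the list and returns #4, #5 and #1.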

Page 41: Introduction to NoSQL and Cassandra

...

A Replica Placement Strategy determines which nodes contain replicas

Network Topology Strategy places them in different DCs

#1

#2

#4

#3

#5

'foo1'

'foo1'

'foo1'

#1

#2

#4

#3

#5

'foo1'

DC1:3 DC2:1

Page 42: Introduction to NoSQL and Cassandra

...

Consistency Level determines how many replicas to contact

Page 43: Introduction to NoSQL and Cassandra

...

Consistency Level determines how many replicas to contact

CL = 1

#1

#2

#3

#5

#4

'foo1'

'foo1'

'foo1'

Client

Page 44: Introduction to NoSQL and Cassandra

...

Consistency Level determines how many replicas to contact

CL = QUORUM

#1

#2

#3

#5

#4

'foo1'

'foo1'

'foo1'

Client

Page 45: Introduction to NoSQL and Cassandra

Consistency For Writes

ANY

ONE

TWO

THREE

QUORUM

LOCAL_QUORUM

EACH_QUORUM

ALL

Page 46: Introduction to NoSQL and Cassandra

Consistency For Reads

ONE

TWO

THREE

QUORUM

LOCAL_QUORUM

EACH_QUORUM

ALL

Page 47: Introduction to NoSQL and Cassandra

Consistency in Math Terms

Cassandra guarantees strong consistency if

R + W > N

(nodes_read + nodes_written) > replication_factor
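The rule can be checked with a few lines of arithmetic. The sketch below is illustrative only: `ConsistencyMath` and its method names are hypothetical, and QUORUM is taken to mean a strict majority of the replication factor.

```java
/** Illustrative check of the R + W > N rule. */
final class ConsistencyMath {
    /** Replicas needed for QUORUM: a strict majority of the replication factor. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True when any set of replicas read must overlap any set of replicas written. */
    static boolean isStronglyConsistent(int nodesRead, int nodesWritten, int replicationFactor) {
        return nodesRead + nodesWritten > replicationFactor;
    }

    public static void main(String[] args) {
        int n = 3;
        System.out.println("R=1, W=1, N=3 strong? " + isStronglyConsistent(1, 1, n));
        System.out.println("R=2, W=2, N=3 strong? " + isStronglyConsistent(2, 2, n));
        System.out.println("QUORUM for N=3: " + quorum(n));
    }
}
```

With N = 3, reading and writing one replica each gives 1 + 1 = 2, which is not greater than 3, so a read may miss the latest write. Reading and writing at QUORUM gives 2 + 2 = 4 > 3, so at least one replica in every read has seen the latest write.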

Page 48: Introduction to NoSQL and Cassandra

Back to the example..

Consistency Level determines how many replicas to contact

CL = QUORUM

#1

#2

#3

#5

#4

'foo1'

'foo1'

'foo1'

Client

Page 49: Introduction to NoSQL and Cassandra

...

But what if node #3 is down?

Page 50: Introduction to NoSQL and Cassandra

...

But what if node #3 is down?

#1

#2

#3

#5

#4

'foo1' hint

'foo1'

Client

Page 51: Introduction to NoSQL and Cassandra

...

But what if node #3 is down?

The coordinator node stores a hint and replays that mutation when the down node comes back up.

This is known as Hinted Handoff.
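A toy model of the hint-and-replay idea is sketched below. It is a simplification under stated assumptions: the whole cluster lives in one object, hints are kept in memory, and replay happens the moment a node is marked up. `HintedHandoffSketch` and its methods are hypothetical names, not part of Cassandra or Hector.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model: a coordinator keeps hints for down replicas and replays them later. */
final class HintedHandoffSketch {
    private final Map<String, Map<String, String>> nodeData = new HashMap<>();
    private final Set<String> downNodes = new HashSet<>();
    // Pending {key, value} mutations, grouped by the replica that missed them.
    private final Map<String, List<String[]>> hints = new HashMap<>();

    HintedHandoffSketch(String... nodes) {
        for (String node : nodes) {
            nodeData.put(node, new HashMap<>());
        }
    }

    void markDown(String node) {
        downNodes.add(node);
    }

    /** Writes to every live replica and stores a hint for each replica that is down. */
    void write(List<String> replicas, String key, String value) {
        for (String replica : replicas) {
            if (downNodes.contains(replica)) {
                hints.computeIfAbsent(replica, r -> new ArrayList<>()).add(new String[] {key, value});
            } else {
                nodeData.get(replica).put(key, value);
            }
        }
    }

    /** Brings a node back and replays the mutations it missed. */
    void markUp(String node) {
        downNodes.remove(node);
        List<String[]> pending = hints.remove(node);
        if (pending != null) {
            for (String[] hint : pending) {
                nodeData.get(node).put(hint[0], hint[1]);
            }
        }
    }

    String read(String node, String key) {
        return nodeData.get(node).get(key);
    }

    public static void main(String[] args) {
        HintedHandoffSketch cluster = new HintedHandoffSketch("#3", "#4", "#5");
        cluster.markDown("#3");
        cluster.write(Arrays.asList("#3", "#4", "#5"), "foo1", "some value");
        System.out.println("#3 while down: " + cluster.read("#3", "foo1"));
        cluster.markUp("#3");
        System.out.println("#3 after replay: " + cluster.read("#3", "foo1"));
    }
}
```

The sketch ignores what happens if the node holding the hints itself fails before replay, which is the case the next slides cover.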

Page 52: Introduction to NoSQL and Cassandra

...

Node #5 will replay the hint to node #3 when it comes back online

#1

#2

#3

#5

#4

'foo1' hint

'foo1'

Client

'foo1'

Page 53: Introduction to NoSQL and Cassandra

...

And if node #5 dies before sending the hints to node #3?

#1

#2

#3

#5

#4

'foo1' hint

'foo1'

Client

Page 54: Introduction to NoSQL and Cassandra

...

If using QUORUM, node #4 will request 'foo' from all the replicas

#1

#2

#3

#5

#4

'foo1' hint

'foo1'

Client

''

Page 55: Introduction to NoSQL and Cassandra

...

If the results received do not match, a Read Repair process is performed in the background

#1

#2

#3

#5

#4

'foo1' hint

'foo1'

Client

''

Page 56: Introduction to NoSQL and Cassandra

...

The missing or stale value is then pushed to the out-of-date node (#3 in this case)

#1

#2

#3

#5

#4

'foo1' hint

'foo1'

Client

'foo'

'foo' != ''
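The compare-and-push step of Read Repair can be sketched as follows. This is an illustration only: replicas are modeled as in-memory maps, the repair runs inline (the slides describe it as a background process), and `ReadRepairSketch`, `Versioned` and `readWithRepair` are hypothetical names.

```java
import java.util.List;
import java.util.Map;

/** Illustrative read repair: the newest value wins and is pushed to stale replicas. */
final class ReadRepairSketch {
    /** A value together with the timestamp of the write that produced it. */
    static final class Versioned {
        final String value;
        final long timestamp;

        Versioned(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }
    }

    /** Reads the key from every replica, returns the newest version and repairs the others. */
    static Versioned readWithRepair(List<Map<String, Versioned>> replicas, String key) {
        Versioned newest = null;
        for (Map<String, Versioned> replica : replicas) {
            Versioned candidate = replica.get(key);
            if (candidate != null && (newest == null || candidate.timestamp > newest.timestamp)) {
                newest = candidate;
            }
        }
        if (newest != null) {
            for (Map<String, Versioned> replica : replicas) {
                Versioned current = replica.get(key);
                if (current == null || current.timestamp < newest.timestamp) {
                    // Missing or stale copy: push the newest version to this replica.
                    replica.put(key, newest);
                }
            }
        }
        return newest;
    }
}
```

In the slide's scenario, node #3 has no copy of the key, so the value held by the other replicas is copied to it during the read.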

Page 57: Introduction to NoSQL and Cassandra

...

The last feature to achieve consistency is the Anti Entropy Service (AES)

It should be run periodically as part of cluster maintenance, or after a node has been down

Page 58: Introduction to NoSQL and Cassandra

Recap Consistency Features

Read Repair

Anti Entropy Service (AES)

Hinted Handoff

Page 59: Introduction to NoSQL and Cassandra

scaling

“z”

“t”

“e”

“o”

“j”

Page 60: Introduction to NoSQL and Cassandra

scaling

“z”

“t”

“e”

“o”

“j”

“?”

Page 61: Introduction to NoSQL and Cassandra

scaling

“z”

“t”

“e”

“o”

“j”

“g”

Nodetool move ?

Page 62: Introduction to NoSQL and Cassandra

Want 2x performance ?!

Add 2x nodes

'No downtime' included!

Page 63: Introduction to NoSQL and Cassandra

Want 2x performance ?!

“z”

“t”

“e”

“o”

“j”

Page 64: Introduction to NoSQL and Cassandra

Want 2x performance ?!

“z”

“t”

“e”

“o”

“j”

“g”

“l”

“q”

“v”

“b”

Page 65: Introduction to NoSQL and Cassandra

With RF= 3 we could lose

“z”

“t”

“e”

“o”

“j”

“g”

“l”

“q”

“v”

“b”

XX

X

Page 66: Introduction to NoSQL and Cassandra

With RF= 3 we could lose

“z”

“t”

“e”

“o”

“j”

“g”

“l”

“q”

“v”

“b”

XX

X

X ?

Page 67: Introduction to NoSQL and Cassandra

Vs others

z

t

e

o

j

g

l

q

v

b

Page 68: Introduction to NoSQL and Cassandra

Recap

- Replication Factor
- Tokens
- Gossip
- Partitioner
- Replica Placement
- Consistency
- Hinted Handoff
- Read Repair
- AES
- Clustering

Page 69: Introduction to NoSQL and Cassandra
Page 70: Introduction to NoSQL and Cassandra

Performance

Reads on par with writes

Page 71: Introduction to NoSQL and Cassandra

Scalability

Page 72: Introduction to NoSQL and Cassandra

Internals

Page 73: Introduction to NoSQL and Cassandra

Read and Write path

Page 74: Introduction to NoSQL and Cassandra

Storage - SSTable

- SSTables are sorted

- Immutable (“Merge on read”)

- Newest timestamp wins

Page 75: Introduction to NoSQL and Cassandra

Storage – Compaction

Page 76: Introduction to NoSQL and Cassandra

Storage – Compaction

Merges SSTables together into a larger SSTable

Removes Tombstones

Rebuilds primary and secondary indexes
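A simplified sketch of what a compaction pass does to the data is shown below. It makes several assumptions: an SSTable is modeled as a sorted in-memory map, a tombstone is a cell with a null value, and tombstones are dropped immediately (a real system would keep them for a grace period). Index rebuilding is left out. `CompactionSketch`, `Cell` and `compact` are hypothetical names.

```java
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

/** Illustrative compaction: merge sorted tables, keep the newest cell, drop tombstones. */
final class CompactionSketch {
    /** A cell value with its write timestamp; a null value marks a tombstone (deletion). */
    static final class Cell {
        final String value;
        final long timestamp;

        Cell(String value, long timestamp) {
            this.value = value;
            this.timestamp = timestamp;
        }

        boolean isTombstone() {
            return value == null;
        }
    }

    /** Merges immutable input tables into one new sorted table; the inputs are not modified. */
    static SortedMap<String, Cell> compact(List<? extends SortedMap<String, Cell>> sstables) {
        TreeMap<String, Cell> merged = new TreeMap<>();
        for (SortedMap<String, Cell> table : sstables) {
            for (Map.Entry<String, Cell> entry : table.entrySet()) {
                Cell current = merged.get(entry.getKey());
                // Highest timestamp wins, whether the cell is a value or a tombstone.
                if (current == null || entry.getValue().timestamp > current.timestamp) {
                    merged.put(entry.getKey(), entry.getValue());
                }
            }
        }
        // Deleted cells are physically removed only at this point.
        merged.values().removeIf(Cell::isTombstone);
        return Collections.unmodifiableSortedMap(merged);
    }
}
```

The same highest-timestamp rule is what a read applies when it has to merge several SSTables on the fly, which is why fewer, larger SSTables make reads cheaper.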

Page 77: Introduction to NoSQL and Cassandra

Storage – Compaction

Two types:

- Size-tiered compaction

- Leveled compaction

Page 78: Introduction to NoSQL and Cassandra

Storage – Compaction

Size-tiered compaction

- Performance not guaranteed
- A row may be spread across many SSTables
- Waste of space
- Good for write-heavy ops
- Rows are written once
- 100% more space than SSTables

Page 79: Introduction to NoSQL and Cassandra

Storage – Compaction

Leveled compaction

- Grouped into levels
- No overlapping within a level
- Each level is ten times as large as the previous one
- 90% of reads satisfied with 1 SSTable
- Twice as much I/O

Page 80: Introduction to NoSQL and Cassandra

Recap

- SSTable
- Memtable
- Row Cache
- Compaction

Page 81: Introduction to NoSQL and Cassandra

Before - 48 Cassandra on m2.4xlarge. 36 EVcache on m2.xlarge

After - 12 Cassandra on hi1.4xlarge

SSDs and caching

Page 82: Introduction to NoSQL and Cassandra

API Operations

Page 83: Introduction to NoSQL and Cassandra

Five general categories

Retrieving

Write/Update/Remove (all the same op!)

Increment counters

Meta Information

Schema Manipulation

CQL Execution

Page 84: Introduction to NoSQL and Cassandra

Insertion/Deletion => Mutation

Again: Every mutation is an insert!
- Merge on read
- SSTables are immutable
- Highest timestamp wins

Page 85: Introduction to NoSQL and Cassandra

CQL

INSERT INTO Hollywood.NerdMovies (user_uuid, fan) VALUES ('cfd66ccc-d857-4e90-b1e5-df98a3d40cd6', 'johndoe') USING CONSISTENCY LOCAL_QUORUM AND TTL 86400;

Page 86: Introduction to NoSQL and Cassandra

Hadoop

Page 87: Introduction to NoSQL and Cassandra

Using a Client

- Hector

http://hector-client.org

- Astyanax

https://github.com/Netflix/astyanax

- Pelops

https://github.com/s7/scale7-pelops

Page 88: Introduction to NoSQL and Cassandra

Using a Client → Hector

- Most popular Java client

- In use at very large installations

- A number of tools and utilities built on top

- Very active community

- MIT Licensed

Page 89: Introduction to NoSQL and Cassandra

Features

- High Level API

- Failover behavior

- High-performance connection pool

- JMX counters for management

- Discoverability of new nodes

- Automatic retry of downed hosts

- Suspension of nodes after several timeouts

- Load Balancing: Configurable and extensible

- Locking (Beta)

Page 90: Introduction to NoSQL and Cassandra

Hector's Architecture

Page 91: Introduction to NoSQL and Cassandra

vs JDBC

Hector is operation-oriented

Whereas

JDBC is connection-oriented

Page 92: Introduction to NoSQL and Cassandra

API Abstractions

Thrift

Mutator

Templates

Page 93: Introduction to NoSQL and Cassandra

ColumnFamilyTemplate

Familiar, type-safe approach

- based on template-method design pattern

- generic: ColumnFamilyTemplate<K,N>

(K is the key type, N the column name type)

ColumnFamilyTemplate template = new ThriftColumnFamilyTemplate(keyspaceName, columnFamilyName, StringSerializer.get(), StringSerializer.get());

*** (no generics for clarity)

Page 94: Introduction to NoSQL and Cassandra

ColumnFamilyTemplate

new ThriftColumnFamilyTemplate(keyspaceName,
    columnFamilyName,
    StringSerializer.get(),   // Key Format
    StringSerializer.get());  // Column Name Format

Column Name Format
- Cassandra calls this a “comparator”
- Remember: defines column order in on-disk format

Page 95: Introduction to NoSQL and Cassandra

ColumnFamilyTemplate

ColumnFamilyResult<String, String> res = template.queryColumns("patricioe");

String value = res.getString("email");

Date startDate = res.getDate("DateOfBirth");

Key Format

Column Name Format

Page 96: Introduction to NoSQL and Cassandra

ColumnFamilyTemplate

ColumnFamilyUpdater updater = template.createUpdater("pato");

updater.setString("companyName", "Relateiq");
updater.addKey("sabina");
updater.setString("companyName", "Globant");

template.update(updater);

Inserting data with ColumnFamilyUpdater

Page 97: Introduction to NoSQL and Cassandra

ColumnFamilyTemplate

template.deleteColumn("zznate", "notNeededStuff");
template.deleteColumn("zznate", "somethingElse");
template.deleteColumn("patricioe", "aDifferentColumnName");
...
template.deleteRow("someuser");

template.executeBatch();

Deleting Data with ColumnFamilyTemplate

Page 98: Introduction to NoSQL and Cassandra

Integrating with existing patterns

Hector Object Mapper -> Apache Gora
https://github.com/hector-client/hector/tree/master/object-mapper

Hector JPA*:
https://github.com/riptano/hector-jpa

Spring IOC

CQL: JDBC Driver and Pool in 1.0!

JdbcTemplate FTW!

Page 99: Introduction to NoSQL and Cassandra

Development Resources

Hector Documentation (http://hector-client.org)

Cassandra Unit
https://github.com/jsevellec/cassandra-unit

Cassandra Maven Plugin
http://mojo.codehaus.org/cassandra-maven-plugin/

CCM localhost cassandra cluster
https://github.com/pcmanus/ccm

OpsCenter
http://www.datastax.com/products/opscenter

Cassandra AMIs
https://github.com/riptano/CassandraClusterAMI

Page 100: Introduction to NoSQL and Cassandra

Want to contribute?

git clone git@github.com:hector-client/hector.git

Page 101: Introduction to NoSQL and Cassandra

Summary

- Take advantage of strengths
- Idempotence and asynchronicity are your friends
- If it's not in the API, you are probably doing it wrong
- Seek death is still possible if you model incorrectly
- Try denormalizing (append-only model?)

Page 102: Introduction to NoSQL and Cassandra

Patricio Echagüe

[email protected]

@patricioe

Page 103: Introduction to NoSQL and Cassandra

Credits

Nate McCall

Aaron Morton (http://thelastpickle.com)

Datastax (http://www.datastax.com)

http://www.slideshare.net/mikiobraun/cassandra-an-introduction

Page 104: Introduction to NoSQL and Cassandra

Additional Resources

DataStax Documentation: http://www.datastax.com/docs

Apache Cassandra project wiki: http://wiki.apache.org/cassandra/

“The Dynamo Paper”
http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

P. Helland. “Building on Quicksand”
http://arxiv.org/pdf/0909.1788

P. Helland. “Life Beyond Distributed Transactions”
http://www.ics.uci.edu/~cs223/papers/cidr07p15.pdf

S. Anand. “Netflix's Transition to High-Availability Storage Systems”
http://media.amazonwebservices.com/Netflix_Transition_to_a_Key_v3.pdf

“The Megastore Paper”
http://research.google.com/pubs/archive/36971.pdf