Introduction to NoSQL and Cassandra

Introduction to NoSQL and Apache Cassandra

Patricio Echagüe patricioe@gmail.com

@patricioe

About me

Present: Relateiq (Data Processing and Scalability)

Hector committer

Past: DataStax (The Cassandra Company)

Cassandra/Hadoop distribution (former Brisk)

Cassandra FS

CQL connection pool

Cassandra contributions

Trends: “NoSQL”

What is “NoSQL”?

Systems able to store and retrieve large quantities of data with little or no information about the relationships between them.

Generally they lack a SQL-like language for data manipulation, and their schema is more relaxed than that of a traditional RDBMS.

Full ACID is often not guaranteed.

Brewer's CAP theorem

Consistency: all replicas agree on the same value

Availability: always get an answer from a replica

Partition Tolerance: the system works even if replicas can't talk

You can have 2 of these

Brewer's CAP theorem

CAP Classification

[Figure: database families placed on the CAP diagram (Consistency, Availability, Partitioning)]

- Relational

- Key-Value stores

- Columnar (column-oriented)

- Graph databases

- Document

What's eventual consistency?

It is a promise that eventually, in the absence of new writes, all replicas that are responsible for a data item will agree on the same version

How eventual is eventual?

Write to 1 replica and Read from 1 replica of a total

How eventual is eventual?

Write to 2 replicas and Read from 2 replicas of a total

Why is it good?

Because read and write operations that contact fewer replicas complete more quickly, lowering latency.

Cassandra is a distributed, fault-tolerant, scalable, column-oriented data store with tunable consistency.

Cassandra has C A P

But C is tunable

What is Apache Cassandra?

Key Concepts

Multi-Master, Multi-DC

Linearly scalable

Integrated Caching

Performs well with Larger-than-memory Datasets

Tunable consistency

Idempotent (client clock)

Schema Optional

No ACID transactions, No Locking

Generally complements other systems (not intended to be one-size-fits-all)

You should always use the right tool for the right job

Speaking Cassandra

Data Model

“4-Dimensional Hash Table”

A Keyspace contains a collection of Column Families (and controls replication)

A Column Family contains Rows

A Row has a key, and each row has columns (no need to define the columns beforehand)

Each Column has a name, a value and a timestamp (TTL is optional)
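The “4-dimensional hash table” idea can be pictured as four nested maps. The following is only an illustrative sketch in plain Java, not how Cassandra stores data; the class name DataModelSketch and its methods are hypothetical names made up for this example.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: keyspace -> column family -> row key -> column name -> column. */
final class DataModelSketch {
    /** A column carries a value and the timestamp supplied by the client. */
    static final class Column {
        final String value;
        final long timestamp;
        Column(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    private final Map<String, Map<String, Map<String, Map<String, Column>>>> keyspaces = new TreeMap<>();

    /** Stores a column, creating the keyspace, column family and row on demand. */
    void put(String keyspace, String columnFamily, String rowKey,
             String columnName, String value, long timestamp) {
        keyspaces.computeIfAbsent(keyspace, k -> new TreeMap<>())
                 .computeIfAbsent(columnFamily, k -> new TreeMap<>())
                 .computeIfAbsent(rowKey, k -> new TreeMap<>())
                 .put(columnName, new Column(value, timestamp));
    }

    /** Returns the column, or null if any of the four dimensions is missing. */
    Column get(String keyspace, String columnFamily, String rowKey, String columnName) {
        Map<String, Map<String, Map<String, Column>>> families = keyspaces.get(keyspace);
        if (families == null) return null;
        Map<String, Map<String, Column>> rows = families.get(columnFamily);
        if (rows == null) return null;
        Map<String, Column> columns = rows.get(rowKey);
        return columns == null ? null : columns.get(columnName);
    }

    public static void main(String[] args) {
        DataModelSketch db = new DataModelSketch();
        db.put("Hollywood", "NerdMovies", "patricioe", "email", "patricioe@gmail.com", 1L);
        System.out.println(db.get("Hollywood", "NerdMovies", "patricioe", "email").value);
    }
}
```

Rows in the same column family may hold different column names, which is why nothing has to be declared up front in this sketch.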

Data Model – (RDBMS)

Keyspace (Schema)

Column Family(CF) (table)

Row (row)

Column (column*) → may not be present in all rows

Data Model – Column Family

Static Column Family

- Model my object data

Dynamic Column Family

- Precalculated / Prematerialized query results

Nothing stopping you from mixing them!

Data Model – Static Column Family

Data Model – Dynamic CF

stats for a specific date

Data Model – Dynamic CF

- Timeline of tweets by a user

- Timeline of tweets by all of the people a user is following

- List of comments sorted by score

- List of friends grouped by state

- Metrics for a time bucket
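A dynamic column family row, such as a timeline of tweets, can be pictured as a sorted map whose column names are the sort key. This is a hedged sketch in plain Java; TimelineRowSketch, addTweet and latest are hypothetical names, and using the tweet timestamp as the column name is an assumption made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Illustrative only: one wide row whose columns stay sorted by column name. */
final class TimelineRowSketch {
    // column name = tweet timestamp, column value = tweet text
    private final TreeMap<Long, String> columns = new TreeMap<>();

    /** Adding a tweet is just adding one more column to the row. */
    void addTweet(long timestamp, String text) {
        columns.put(timestamp, text);
    }

    /** Reads up to 'limit' tweets, newest first, by walking the columns in reverse order. */
    List<String> latest(int limit) {
        List<String> result = new ArrayList<>();
        for (String text : columns.descendingMap().values()) {
            if (result.size() == limit) break;
            result.add(text);
        }
        return result;
    }

    public static void main(String[] args) {
        TimelineRowSketch row = new TimelineRowSketch();
        row.addTweet(1L, "first");
        row.addTweet(2L, "second");
        row.addTweet(3L, "third");
        System.out.println(row.latest(2));
    }
}
```

The query result is already materialized in the row, so reading the timeline is a single ordered slice rather than a join.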

Let's store “foo”

But if that node is down?

Let's store “foo” in 3 nodes.

This is the Replication Factor (N)

Now we need to know what nodes the key was written to so we can read it later

The Initial Token specifies the upper value of the key range each node is responsible for

#1 <= 'd'

#2 <= 'k'

#3 <= 'p'

#4 <= 'u'

#5 <= 'z'

a b c d e f g h i j k l m n …. z

Node #2 owns 'e f g h i j k'
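The token ranges above can be sketched as a sorted map from token to node. This is a minimal illustration in plain Java using the letter tokens from the slide; TokenRingSketch, addNode and ownerOf are hypothetical names, and real tokens are not single letters.

```java
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: each node owns the keys up to and including its token. */
final class TokenRingSketch {
    // token (upper bound of the node's range) -> node id, kept sorted by token
    private final TreeMap<String, Integer> ring = new TreeMap<>();

    void addNode(int nodeId, String token) {
        ring.put(token, nodeId);
    }

    /** Finds the first token >= key; keys past the last token wrap to the first node. Assumes a non-empty ring. */
    int ownerOf(String key) {
        Map.Entry<String, Integer> entry = ring.ceilingEntry(key);
        if (entry == null) {
            entry = ring.firstEntry();
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        TokenRingSketch ring = new TokenRingSketch();
        ring.addNode(1, "d");
        ring.addNode(2, "k");
        ring.addNode(3, "p");
        ring.addNode(4, "u");
        ring.addNode(5, "z");
        System.out.println("foo is owned by node #" + ring.ownerOf("foo"));
    }
}
```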

Gossip is the protocol Cassandra uses to exchange information with the other nodes in the cluster (a.k.a. the Ring)

For example, which nodes own the key “foo”

[Figure: a Client sends Read 'foo' to the five-node ring]

A Partitioner is used to transform the key. “foo1” and “foo2” may end up on different nodes

The most commonly used is the Random Partitioner

“foo1” → md5(“foo1”) → “A99A0B....”
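The key-to-token transformation can be sketched with the JDK's MD5 implementation. This is a hedged illustration, not Cassandra's code: Md5TokenSketch and token are hypothetical names, and taking the absolute value of the digest as a big integer is an assumption about how the hash becomes a token.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Illustrative only: turns a row key into a numeric token by hashing it with MD5. */
final class Md5TokenSketch {
    static BigInteger token(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            // Assumption: interpret the 16 digest bytes as a signed integer and drop the sign.
            return new BigInteger(digest).abs();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 is not available on this JVM", e);
        }
    }

    public static void main(String[] args) {
        // Similar keys hash to unrelated tokens, so they can land on different nodes.
        System.out.println("foo1 -> " + token("foo1").toString(16));
        System.out.println("foo2 -> " + token("foo2").toString(16));
    }
}
```

Because the hash spreads keys evenly over the token space, load is balanced across nodes, at the cost of losing key ordering.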

[Figure: 'foo1' and 'foo2' land on different nodes of the ring]

A Replica Placement Strategy determines which nodes contain replicas

Simple Strategy places them clockwise around the ring

Network Topology Strategy places them in different DCs (for example DC1:3 DC2:1)
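Clockwise placement can be sketched as a walk around the ring starting at the node that owns the key. This is a simplified illustration in plain Java; SimplePlacementSketch and replicas are hypothetical names, and the ring is reduced to a list of node ids in token order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative only: picks replicas by walking clockwise from the owning node. */
final class SimplePlacementSketch {
    /**
     * @param ring              node ids in clockwise token order
     * @param ownerIndex        position in 'ring' of the node that owns the key
     * @param replicationFactor number of replicas wanted (capped at the ring size)
     */
    static List<Integer> replicas(List<Integer> ring, int ownerIndex, int replicationFactor) {
        int count = Math.min(replicationFactor, ring.size());
        List<Integer> result = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            // The modulo wraps the walk from the last node back to the first.
            result.add(ring.get((ownerIndex + i) % ring.size()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> ring = Arrays.asList(1, 2, 3, 4, 5);
        // Key owned by node #5 (index 4) with a replication factor of 3.
        System.out.println(replicas(ring, 4, 3));
    }
}
```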

Consistency Level determines how many replicas to contact

CL = 1

CL = QUORUM

Consistency For Writes

QUORUM

LOCAL_QUORUM

EACH_QUORUM

Consistency For Reads

QUORUM

LOCAL_QUORUM

EACH_QUORUM

Consistency in Mathematical Terms

Cassandra guarantees strong consistency if

R + W > N

that is, (nodes_written + nodes_read) > replication_factor
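The rule can be checked with a few lines of arithmetic. The sketch below is plain Java; ConsistencyMathSketch and its method names are hypothetical, and the quorum size is taken to be a majority of the replicas.

```java
/** Illustrative only: the R + W > N rule as code. */
final class ConsistencyMathSketch {
    /** A quorum is a majority of the replicas. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True when every read set must overlap every write set in at least one replica. */
    static boolean isStronglyConsistent(int replicasRead, int replicasWritten, int replicationFactor) {
        return replicasRead + replicasWritten > replicationFactor;
    }

    public static void main(String[] args) {
        int n = 3;
        int q = quorum(n);
        System.out.println("QUORUM read + QUORUM write: " + isStronglyConsistent(q, q, n));
        System.out.println("ONE read + ONE write:       " + isStronglyConsistent(1, 1, n));
    }
}
```

With N = 3, quorum reads and quorum writes give 2 + 2 > 3, so at least one replica contacted by a read has seen the latest write; reading and writing a single replica gives 1 + 1, which is not greater than 3.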

Back to the example...

CL = QUORUM

But what if node #3 is down?

The coordinator node stores a hint and replays that mutation when the down node comes back up.

This is known as Hinted Handoff

Node #5 will replay the hint to node #3 when it comes back online
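Hinted Handoff can be sketched as a coordinator that queues writes for unreachable replicas and replays them later. This is a toy model in plain Java with in-memory maps standing in for nodes; HintedHandoffSketch, write and nodeCameBack are hypothetical names, and hint expiry and failure detection are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: a coordinator that stores hints for replicas that are down. */
final class HintedHandoffSketch {
    final Map<Integer, Map<String, String>> nodeData = new HashMap<>(); // node id -> key -> value
    final Set<Integer> downNodes = new HashSet<>();
    final Map<Integer, List<String[]>> hints = new HashMap<>();         // target node -> {key, value} pairs

    /** Writes to every live replica and keeps a hint for each replica that is down. */
    void write(List<Integer> replicas, String key, String value) {
        for (int node : replicas) {
            if (downNodes.contains(node)) {
                hints.computeIfAbsent(node, n -> new ArrayList<>()).add(new String[] {key, value});
            } else {
                nodeData.computeIfAbsent(node, n -> new HashMap<>()).put(key, value);
            }
        }
    }

    /** Marks the node as up again and replays the mutations it missed. */
    void nodeCameBack(int node) {
        downNodes.remove(node);
        List<String[]> pending = hints.remove(node);
        if (pending == null) return;
        for (String[] hint : pending) {
            nodeData.computeIfAbsent(node, n -> new HashMap<>()).put(hint[0], hint[1]);
        }
    }

    public static void main(String[] args) {
        HintedHandoffSketch cluster = new HintedHandoffSketch();
        cluster.downNodes.add(3);
        cluster.write(Arrays.asList(3, 4, 5), "foo1", "bar");
        cluster.nodeCameBack(3);
        System.out.println("node #3 now has: " + cluster.nodeData.get(3));
    }
}
```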

And if node #5 dies before sending the hints to node #3?

If using QUORUM, node #4 will request 'foo' from all the replicas

If the results received do not match, a Read Repair process is performed in the background

The missing or out-of-date value is then pushed to the out-of-date node, #3 in this case

[Figure: the replicas return 'foo', 'foo' and ''; 'foo' != '']
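Read Repair can be sketched as: read from the replicas, keep the version with the newest timestamp, and push it to any replica that is missing it. This is a simplified illustration in plain Java; ReadRepairSketch, Versioned and readWithRepair are hypothetical names, and the repair runs inline here instead of in the background.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: newest timestamp wins, stale replicas get the winning value. */
final class ReadRepairSketch {
    static final class Versioned {
        final String value;
        final long timestamp;
        Versioned(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    /** Each replica is modelled as a map from key to its locally stored version. */
    static Versioned readWithRepair(List<Map<String, Versioned>> replicas, String key) {
        Versioned newest = null;
        for (Map<String, Versioned> replica : replicas) {
            Versioned seen = replica.get(key);
            if (seen != null && (newest == null || seen.timestamp > newest.timestamp)) {
                newest = seen;
            }
        }
        if (newest == null) return null; // no replica has the key
        for (Map<String, Versioned> replica : replicas) {
            Versioned seen = replica.get(key);
            if (seen == null || seen.timestamp < newest.timestamp) {
                replica.put(key, newest); // repair the missing or out-of-date copy
            }
        }
        return newest;
    }

    public static void main(String[] args) {
        List<Map<String, Versioned>> replicas = new ArrayList<>();
        for (int i = 0; i < 3; i++) replicas.add(new HashMap<>());
        replicas.get(0).put("foo", new Versioned("foo", 10L));
        replicas.get(1).put("foo", new Versioned("foo", 10L));
        // The third replica missed the write.
        System.out.println(readWithRepair(replicas, "foo").value);
        System.out.println("third replica repaired: " + replicas.get(2).containsKey("foo"));
    }
}
```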

The last feature to achieve consistency is the Anti Entropy Service (AES)

It should be run periodically as part of cluster maintenance, or after a node has been down

Recap Consistency Features

Read Repair

Anti Entropy Service (AES)

Hinted Handoff

scaling

[Figure: ring with tokens “e”, “j”, “o”, “t”, “z”; a new node “?” joins and is given token “g”]

nodetool move ?

Want 2x performance?!

Add 2x nodes

'No downtime' included!

[Figure: the ring grows from five nodes (“e”, “j”, “o”, “t”, “z”) to ten (“b”, “e”, “g”, “j”, “l”, “o”, “q”, “t”, “v”, “z”)]

With RF = 3 we could lose

[Figure: the ten-node ring with tokens “b”, “e”, “g”, “j”, “l”, “o”, “q”, “t”, “v”, “z”]

Vs others

Replication Factor

Tokens

Gossip

Partitioner

Replica Placement

Consistency

Hinted Handoff

Read Repair

AES

Clustering

Performance

Reads on par with writes

Scalability

Internals

Read and Write path

Storage - SSTable

- SSTables are sorted

- Immutable (“Merge on read”)

- Newest timestamp wins
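“Merge on read” can be sketched as combining the columns of one row found in several immutable SSTables and keeping, for each column name, the cell with the newest timestamp. This is an illustrative model in plain Java; MergeOnReadSketch, Cell and mergeRow are hypothetical names, and each SSTable is reduced to a map of column name to cell for a single row.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Illustrative only: reads one row by merging its fragments from several SSTables. */
final class MergeOnReadSketch {
    static final class Cell {
        final String value;
        final long timestamp;
        Cell(String value, long timestamp) { this.value = value; this.timestamp = timestamp; }
    }

    /** The inputs are never modified; the merged view is built fresh on every read. */
    static Map<String, Cell> mergeRow(List<Map<String, Cell>> sstables) {
        Map<String, Cell> merged = new TreeMap<>(); // sorted by column name, like an SSTable
        for (Map<String, Cell> sstable : sstables) {
            for (Map.Entry<String, Cell> entry : sstable.entrySet()) {
                Cell current = merged.get(entry.getKey());
                if (current == null || entry.getValue().timestamp > current.timestamp) {
                    merged.put(entry.getKey(), entry.getValue());
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        Map<String, Cell> older = new TreeMap<>();
        older.put("companyName", new Cell("DataStax", 1L));
        older.put("email", new Cell("patricioe@gmail.com", 1L));
        Map<String, Cell> newer = new TreeMap<>();
        newer.put("companyName", new Cell("Relateiq", 2L));

        List<Map<String, Cell>> sstables = new ArrayList<>();
        sstables.add(older);
        sstables.add(newer);
        System.out.println(mergeRow(sstables).get("companyName").value);
    }
}
```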

Storage – Compaction

Merges SSTables together into larger SSTables

Removes Tombstones

Rebuilds primary and secondary indexes

Two types:

- Size-tiered compaction

- Leveled compaction

Size-tiered compaction

- Performance not guaranteed

- A row may be spread across many SSTables

- Waste of space

- Good for write-heavy operations

- Rows are written once

- 100% more space than SSTables

Leveled compaction

- Grouped into levels

- No overlapping within a level

- Each level is ten times as large as the previous one

- 90% of reads satisfied with 1 SSTable

- Twice as much I/O

SSTable

Memtable

Row Cache

Compaction

Before - 48 Cassandra on m2.4xlarge. 36 EVcache on m2.xlarge

After - 12 Cassandra on hi1.4xlarge

SSDs and caching

API Operations

Five general categories

Retrieving

Write/Update/Remove (all the same op!)

Increment counters

Meta Information

Schema Manipulation

CQL Execution

Insertion/Deletion => Mutation

Again: Every mutation is an insert!

- Merge on read

- SSTables are immutable

- Highest timestamp wins
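The idea that a delete is also an insert can be sketched with an append-only log in which a deletion is recorded as a marker (a tombstone) carrying its own timestamp. This is a toy model in plain Java; MutationLogSketch and its methods are hypothetical names, and representing the tombstone as a null value is an assumption made to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative only: writes, updates and deletes are all appended, never applied in place. */
final class MutationLogSketch {
    static final class Mutation {
        final String column;
        final String value;   // null marks a tombstone in this sketch
        final long timestamp; // supplied by the client
        Mutation(String column, String value, long timestamp) {
            this.column = column; this.value = value; this.timestamp = timestamp;
        }
    }

    private final List<Mutation> log = new ArrayList<>();

    void insert(String column, String value, long timestamp) {
        log.add(new Mutation(column, value, timestamp));
    }

    /** A delete appends a tombstone instead of removing anything. */
    void delete(String column, long timestamp) {
        log.add(new Mutation(column, null, timestamp));
    }

    /** Merge on read: the mutation with the highest timestamp decides the result. */
    String read(String column) {
        Mutation newest = null;
        for (Mutation m : log) {
            if (m.column.equals(column) && (newest == null || m.timestamp > newest.timestamp)) {
                newest = m;
            }
        }
        return newest == null ? null : newest.value;
    }

    public static void main(String[] args) {
        MutationLogSketch row = new MutationLogSketch();
        row.insert("email", "patricioe@gmail.com", 1L);
        row.delete("email", 2L);
        System.out.println("after delete: " + row.read("email"));
    }
}
```

Because the timestamp comes from the client, replaying the same mutation or delivering an older one late does not change the outcome, which is what makes the operations idempotent.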

INSERT INTO Hollywood.NerdMovies (user_uuid, fan) VALUES ('cfd66ccc-d857-4e90-b1e5-df98a3d40cd6', 'johndoe') USING CONSISTENCY LOCAL_QUORUM AND TTL 86400;

Hadoop

Using a Client

- Hector

http://hector-client.org

- Astyanax

https://github.com/Netflix/astyanax

- Pelops

https://github.com/s7/scale7-pelops

Using a Client → Hector

- Most popular Java client

- In use at very large installations

- A number of tools and utilities built on top

- Very active community

- MIT Licensed

Features

- High Level API

- Failover behavior

- High-performance connection pool

- JMX counters for management

- Discoverability of new nodes

- Automatic retry of downed hosts

- Suspension of nodes after several timeouts

- Load Balancing: Configurable and extensible

- Locking (Beta)

Hector's Architecture

vs JDBC

Hector is operation-oriented, whereas JDBC is connection-oriented

API Abstractions

Thrift

Mutator

Templates

ColumnFamilyTemplate

Familiar, type-safe approach

- based on template-method design pattern

- generic: ColumnFamilyTemplate<K,N>

(K is the key type, N the column name type)

ColumnFamilyTemplate template = new ThriftColumnFamilyTemplate(keyspaceName, columnFamilyName, StringSerializer.get(), StringSerializer.get());

*** (no generics for clarity)

new ThriftColumnFamilyTemplate(keyspaceName,
    columnFamilyName,
    StringSerializer.get(),   // Key Format
    StringSerializer.get());  // Column Name Format

Column Name Format

- Cassandra calls this a “comparator”

- Remember: defines column order in on-disk format

ColumnFamilyResult<String, String> res = template.queryColumns("patricioe");

String value = res.getString("email");

Date startDate = res.getDate("DateOfBirth");

In ColumnFamilyResult<String, String>, the first type parameter is the Key Format and the second is the Column Name Format

ColumnFamilyUpdater updater = template.createUpdater("pato");

updater.setString("companyName", "Relateiq");
updater.addKey("sabina");
updater.setString("companyName", "Globant");

template.update(updater);

Inserting data with ColumnFamilyUpdater

template.deleteColumn("zznate", "notNeededStuff");
template.deleteColumn("zznate", "somethingElse");
template.deleteColumn("patricioe", "aDifferentColumnName");
...
template.deleteRow("someuser");

template.executeBatch();

Deleting Data with ColumnFamilyTemplate

Integrating with existing patterns

Hector Object Mapper -> Apache Gora: https://github.com/hector-client/hector/tree/master/object-mapper

Hector JPA*: https://github.com/riptano/hector-jpa

Spring IOC

CQL: JDBC Driver and Pool in 1.0!

JdbcTemplate FTW!

Development Resources

Hector Documentation (http://hector-client.org)

Cassandra Unit: https://github.com/jsevellec/cassandra-unit

Cassandra Maven Plugin: http://mojo.codehaus.org/cassandra-maven-plugin/

CCM localhost cassandra cluster: https://github.com/pcmanus/ccm

OpsCenter: http://www.datastax.com/products/opscenter

Cassandra AMIs: https://github.com/riptano/CassandraClusterAMI

Want to contribute?

git clone git@github.com:hector-client/hector.git

Summary

- Take advantage of strengths

- Idempotence and asynchronicity are your friends

- If it's not in the API, you are probably doing it wrong

- Seek death is still possible if you model incorrectly

- Try denormalizing (append-only model?)

Patricio Echagüe patricioe@gmail.com

@patricioe

Credits

Nate McCall

Aaron Morton (http://thelastpickle.com)

Datastax (http://www.datastax.com)

http://www.slideshare.net/mikiobraun/cassandra-an-introduction

Additional Resources

DataStax Documentation: http://www.datastax.com/docs

Apache Cassandra project wiki: http://wiki.apache.org/cassandra/

“The Dynamo Paper”: http://www.allthingsdistributed.com/files/amazon-dynamo-sosp2007.pdf

P. Helland. “Building on Quicksand”: http://arxiv.org/pdf/0909.1788

P. Helland. “Life Beyond Distributed Transactions”: http://www.ics.uci.edu/~cs223/papers/cidr07p15.pdf

S. Anand. “Netflix's Transition to High-Availability Storage Systems”: http://media.amazonwebservices.com/Netflix_Transition_to_a_Key_v3.pdf

“The Megastore Paper”: http://research.google.com/pubs/archive/36971.pdf
