A Guide to the Post Relational Revolution

A GUIDE TO THE POST RELATIONAL REVOLUTION

@iconara

speakerdeck.com/u/iconara (real time!)

Theo / @iconara

Chief Architect at

Co-organizer of the local Ruby, Scala and JavaScript user groups

More rep on StackOverflow than both Jeff & Joel

THE WORLD ISN’T FLAT

OUT IS THE NEW UP

when scaling up you’re constrained by Moore’s Law

DISTRIBUTED SYSTEMS ARE ABOUT TRADEOFFS

WHO NEEDS ACID, ANYWAY?

banks, perhaps

JOINS ARE A CRUTCH

why split up your data, if all you’re going to do is assemble it over and over again?

OBJECTS DON’T FIT IN TABLES

can you say “impedance mismatch”?

40 YEARS IS A LONG TIME

you didn’t have 256 gigabytes of RAM in 1970

THE RELATIONAL MODEL ISN’T A GOLDEN HAMMER

the existence of object relational mappers should be proof enough

WELCOME TO THE POST RELATIONAL REVOLUTION

POST RELATIONAL STORAGE

KEY/VALUE STORES

the simplest possible database, not exactly a new idea

KEY -> VALUE (OPAQUE)

Riak, Voldemort, LevelDB, Tokyo Cabinet, Berkeley DB

STRUCTURED KEY/VALUE STORES

sometimes you need just a little bit more

the Bigtable model, “column oriented”, “sparse tables” found in Cassandra and HBase

ROW KEY -> COLUMN KEY, COLUMN KEY (+ TIMESTAMP, SORTED)
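
A minimal sketch of the structure on this slide, assuming the usual reading of the Bigtable model: each row key maps to a sorted set of column keys, and each cell carries a timestamp so the newest write wins. The class name `SparseTable` and the keys used below are hypothetical; this is not the Cassandra or HBase API.

```python
from bisect import bisect_left, insort


class SparseTable:
    """Toy model: row key -> sorted column keys -> (timestamp, value)."""

    def __init__(self):
        self.rows = {}  # row key -> {"keys": sorted list, "cells": dict}

    def put(self, row_key, column_key, value, timestamp):
        row = self.rows.setdefault(row_key, {"keys": [], "cells": {}})
        if column_key not in row["cells"]:
            insort(row["keys"], column_key)  # keep column keys sorted
        current = row["cells"].get(column_key)
        # Last write wins: keep the cell with the highest timestamp.
        if current is None or timestamp >= current[0]:
            row["cells"][column_key] = (timestamp, value)

    def slice(self, row_key, start, end):
        """Return (column key, value) pairs with start <= column key < end."""
        row = self.rows.get(row_key)
        if row is None:
            return []
        lo = bisect_left(row["keys"], start)
        hi = bisect_left(row["keys"], end)
        return [(k, row["cells"][k][1]) for k in row["keys"][lo:hi]]


table = SparseTable()
table.put("user:1", "visit:2012-02-27", "/home", timestamp=1)
table.put("user:1", "visit:2012-02-28", "/pricing", timestamp=2)
table.put("user:1", "name", "example", timestamp=3)
table.put("user:1", "visit:2012-02-28", "/signup", timestamp=4)  # newer cell wins
# ";" sorts right after ":", so this slice covers every "visit:..." column.
visits = table.slice("user:1", "visit:", "visit;")
```

Rows are sparse because a row only stores the columns it actually has, and sorted column keys make slices like the one above cheap.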

“datastructure server”, e.g. Redis

KEY -> VALUE, VALUE, VALUE (LIST OR SET)

KEY -> KEY: VALUE, KEY: VALUE, KEY: VALUE (SORTED SET OR HASH)

KEY -> VALUE (INCREMENT, APPEND, SLICE, CAS)
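
To make the four operations concrete, here is a small in-memory sketch. The class `TinyStructureServer` and its method names are hypothetical and only mimic the idea of a data structure server; they are not Redis commands.

```python
class TinyStructureServer:
    """In-memory stand-in for a server that understands its values."""

    def __init__(self):
        self.data = {}

    def incr(self, key, amount=1):
        # INCREMENT: the server does the arithmetic, the client never reads first.
        self.data[key] = self.data.get(key, 0) + amount
        return self.data[key]

    def append(self, key, value):
        # APPEND: add to the end of a list, creating it if needed.
        self.data.setdefault(key, []).append(value)
        return len(self.data[key])

    def slice(self, key, start, stop):
        # SLICE: return part of a list without shipping the whole value.
        return self.data.get(key, [])[start:stop]

    def cas(self, key, expected, new):
        # CAS (compare-and-set): write only if the current value is the expected one.
        if self.data.get(key) != expected:
            return False
        self.data[key] = new
        return True


server = TinyStructureServer()
server.incr("hits")
hits = server.incr("hits", 5)
for event in ["a", "b", "c"]:
    server.append("log", event)
first_two = server.slice("log", 0, 2)
got_lock = server.cas("owner", None, "worker-1")    # key is unset, so this succeeds
stolen = server.cas("owner", None, "worker-2")      # already taken, so this fails
```

The point of the model is that the operation travels to the data, rather than the value travelling to the client and back.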

DOCUMENT DATABASES

object databases, but for hipsters

complex objects with lists, numbers, strings, secondary indexes* and partial updates,

MongoDB, CouchDB, RavenDB, Lotus Notes

* subject to availability

{
  "firstName": "John",
  "lastName": "Smith",
  "age": 25,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021"
  },
  "phoneNumber": [
    { "type": "home", "number": "212 555-1234" },
    { "type": "cell", "number": "646 555-4567" }
  ]
}
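
A rough sketch of what "partial updates" and "secondary indexes" mean for a document like the one above, using plain dictionaries. The helper names `partial_update` and `build_index`, the dotted-path convention and the changed values are assumptions for illustration, not the API of any of the databases listed.

```python
import copy


def partial_update(document, changes):
    """Return a copy of document with the fields at dotted paths replaced."""
    updated = copy.deepcopy(document)
    for path, value in changes.items():
        *parents, leaf = path.split(".")
        target = updated
        for name in parents:
            target = target.setdefault(name, {})
        target[leaf] = value
    return updated


def build_index(documents, path):
    """Map each value found at a dotted path to the ids of the documents holding it."""
    index = {}
    for doc_id, doc in documents.items():
        value = doc
        for name in path.split("."):
            value = value.get(name) if isinstance(value, dict) else None
        if value is not None:
            index.setdefault(value, set()).add(doc_id)
    return index


john = {
    "firstName": "John",
    "lastName": "Smith",
    "age": 25,
    "address": {"city": "New York", "state": "NY"},
}
# Change two fields without rewriting the rest of the document.
moved = partial_update(john, {"address.city": "Boston", "age": 26})
by_city = build_index({"john": john, "john-moved": moved}, "address.city")
```

A secondary index is what lets you ask "who lives in New York?" without scanning every document.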

GRAPH DATABASES

relational, for real

traversal algorithms, extreme data complexity, Neo4j, AllegroGraph, FlockDB

NAME + PROPERTIES
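
As an illustration of a traversal over named nodes with properties, here is a breadth-first shortest-path sketch over an adjacency list. The node names, properties and the `shortest_path` helper are made up for the example; graph databases expose their own traversal APIs and query languages.

```python
from collections import deque

# Nodes: name + properties. Edges: source -> list of (relationship name, target).
nodes = {
    "alice": {"kind": "person"},
    "bob": {"kind": "person"},
    "carol": {"kind": "person"},
    "dave": {"kind": "person"},
    "erin": {"kind": "person"},
}
edges = {
    "alice": [("follows", "bob"), ("follows", "carol")],
    "bob": [("follows", "dave")],
    "carol": [("follows", "dave")],
    "dave": [],
    "erin": [("follows", "alice")],
}


def shortest_path(edges, start, goal):
    """Breadth-first search: returns the first shortest path found, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for _relationship, neighbour in edges.get(path[-1], []):
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(path + [neighbour])
    return None


path = shortest_path(edges, "alice", "dave")
unreachable = shortest_path(edges, "dave", "erin")
```

Expressing the same question with tables means one self-join per hop, which is why this kind of query is awkward in SQL.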

DIVERSITY

I haven’t even mentioned search & indexing systems like Solr and Elasticsearch, or distributed filesystems

SOMETIMES TABLES ARE GREAT, TOO

but mostly when you rely heavily on GROUP BY, SUM, AVG, etc. and can’t precompute

POST RELATIONAL SCALING

CONSISTENCY, AVAILABILITY, PARTITION TOLERANCE

(choose any two)

PARTITION TOLERANCE ISN’T OPTIONAL

CONSISTENCY VS. AVAILABILITY

(but in reality, it’s not even that simple)

CONSISTENCY

you can always read what you just wrote, but keys may become unavailable

AVAILABILITY

you can always read and write, but you may not always get the latest value

NOT EITHER OR

most databases let you choose on a query-by-query basis
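
One way to picture the per-query choice is a toy register stored on several replicas, where each write says how many replicas must accept it (`w`) and each read says how many to ask (`r`). This is a sketch loosely modelled on the read/write quorum idea; the class `ReplicatedRegister` and its parameters are hypothetical and do not correspond to any specific database's API.

```python
class ReplicatedRegister:
    """One key stored on n replicas as (version, value) pairs."""

    def __init__(self, n):
        self.replicas = [(0, None)] * n

    def write(self, value, w, reachable):
        """Write to the reachable replicas; succeed only if at least w can take it."""
        if len(reachable) < w:
            return False  # the consistent choice: refuse rather than diverge
        # Simplification: the coordinator magically knows the latest version.
        version = max(v for v, _ in self.replicas) + 1
        for i in reachable:
            self.replicas[i] = (version, value)
        return True

    def read(self, r, reachable):
        """Ask r reachable replicas and return the newest value among the answers."""
        if len(reachable) < r:
            raise RuntimeError("not enough replicas reachable")
        answers = [self.replicas[i] for i in reachable[:r]]
        return max(answers, key=lambda answer: answer[0])[1]


register = ReplicatedRegister(3)
register.write("v1", w=2, reachable=[0, 1, 2])          # healthy cluster
register.write("v2", w=1, reachable=[0])                # partition: only replica 0 reachable
stale = register.read(r=1, reachable=[2])               # available, but returns the old value
fresh = register.read(r=3, reachable=[0, 1, 2])         # asks everyone, sees the latest value
refused = register.write("v3", w=3, reachable=[0])      # consistent, but unavailable
```

With `n` replicas, any read and write whose `r` and `w` add up to more than `n` must overlap in at least one replica, which is what makes the second read see the latest value.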

SHARDING

scaling writes in a consistent system

divide the keyspace into shards, or regions (and store each one redundantly)

[diagram: KEYSPACE divided into SHARDs, each with a REPLICA]

DIVIDED BY DATA SIZE

split a shard when it grows too big, move one of the new shards onto a new node
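
The splitting rule can be sketched with a range-partitioned store that splits a shard in half once it holds too many keys. The class `RangeShardedStore` and the tiny threshold are assumptions for illustration; real systems split on data size and move the new shard to another node.

```python
from bisect import bisect_right


class RangeShardedStore:
    def __init__(self, max_keys_per_shard):
        self.max_keys = max_keys_per_shard
        self.starts = [""]   # sorted lower bound of each shard's key range
        self.shards = [{}]   # shards[i] holds keys >= starts[i] and < starts[i + 1]

    def _shard_index(self, key):
        return bisect_right(self.starts, key) - 1

    def put(self, key, value):
        i = self._shard_index(key)
        self.shards[i][key] = value
        if len(self.shards[i]) > self.max_keys:
            self._split(i)

    def get(self, key):
        return self.shards[self._shard_index(key)].get(key)

    def _split(self, i):
        keys = sorted(self.shards[i])
        middle = keys[len(keys) // 2]
        upper = {k: v for k, v in self.shards[i].items() if k >= middle}
        self.shards[i] = {k: v for k, v in self.shards[i].items() if k < middle}
        # In a real cluster the upper half would now be moved onto a new node.
        self.starts.insert(i + 1, middle)
        self.shards.insert(i + 1, upper)


store = RangeShardedStore(max_keys_per_shard=4)
for letter in "abcdefgh":
    store.put(letter, letter.upper())
shard_sizes = [len(shard) for shard in store.shards]
```

Note that keys inserted in sorted order always land in the last shard, so that shard does all the work and all the splitting; this is one reason key design matters for sharding.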

in reality there are chunks, tablets or “virtual shards” that are distributed over physical shards

HBASE, MONGODB

sharding is easy in theory, hard in practice, lots of data needs to be moved when adding nodes

CONSISTENT HASHING

scaling writes in an available system

each node is responsible for a range of the keyspace, keys are hashed and mapped to the first following node, (optionally) replicated to subsequent nodes
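
A minimal sketch of that lookup: nodes and keys are hashed onto the same ring, a key belongs to the first node after its position, and replicas go to the nodes that follow. The `Ring` class, the node names and the use of MD5 as the hash are assumptions for the example, not how Cassandra or Riak implement it.

```python
import hashlib
from bisect import bisect_right


def ring_position(name):
    """Hash a string to a position on the ring (a large integer)."""
    return int(hashlib.md5(name.encode()).hexdigest(), 16)


class Ring:
    def __init__(self, nodes):
        # One point per node, sorted by position on the ring.
        self.points = sorted((ring_position(node), node) for node in nodes)

    def nodes_for(self, key, replicas=1):
        """The first node after hash(key), followed by the next nodes on the ring."""
        positions = [position for position, _ in self.points]
        # The modulo wraps around: past the last point we start again at the first.
        start = bisect_right(positions, ring_position(key)) % len(self.points)
        count = min(replicas, len(self.points))
        return [self.points[(start + i) % len(self.points)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c"])
owners = ring.nodes_for("user:42", replicas=2)  # primary first, then one replica
```

Any client that knows the node list can compute the owners itself, so no central directory of key locations is needed.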

[diagram: KEYSPACE ring, hash(key), replication]

[diagram: KEYSPACE ring with a NEW NODE]

when a new node is added, only part of the keyspace needs to be moved

in practice, “virtual nodes” are evenly distributed over the keyspace, and then mapped onto physical nodes
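
The effect of virtual nodes can be checked with a short experiment: place several virtual nodes per physical node on the ring, add a fourth physical node, and count how many keys change owner. The function names, the 64 virtual nodes per node and the MD5 hash are assumptions chosen for the sketch.

```python
import hashlib
from bisect import bisect_right


def ring_position(name):
    return int(hashlib.md5(name.encode()).hexdigest(), 16)


def build_ring(nodes, vnodes_per_node=64):
    """Place several virtual nodes per physical node on the ring."""
    return sorted(
        (ring_position(f"{node}#{i}"), node)
        for node in nodes
        for i in range(vnodes_per_node)
    )


def owner(ring, key):
    """Physical node behind the first virtual node after hash(key)."""
    positions = [position for position, _ in ring]
    return ring[bisect_right(positions, ring_position(key)) % len(ring)][1]


keys = [f"key-{i}" for i in range(2000)]
before = build_ring(["node-a", "node-b", "node-c"])
after = build_ring(["node-a", "node-b", "node-c", "node-d"])
moved = [key for key in keys if owner(before, key) != owner(after, key)]
moved_fraction = len(moved) / len(keys)
```

Every key that moves ends up on the new node, and with four nodes the moved share should land somewhere around a quarter; a naive `hash(key) % node_count` scheme would instead reshuffle most keys.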

CASSANDRA, RIAK

perfect balance, in theory, but rings may still need rebalancing

GOSSIP, HINTED HANDOFF, LOG STRUCTURED STORAGE, COMPACTION, VECTOR CLOCKS, READ REPAIR, JOURNALING, QUORUMS, EVENTUAL CONSISTENCY, DYNAMO, MAP/REDUCE, 2PC

a few of the things I haven’t mentioned, look them up

LESSONS LEARNED

EVERYTHING THEY TAUGHT YOU ABOUT DATABASES AT UNIVERSITY IS WRONG

almost

THINK ABOUT YOUR QUERIES FIRST

don’t optimize for insertion, denormalize heavily, disk is cheap, this ain’t 1970

GIVE A LOT OF THOUGHT TO YOUR PRIMARY KEYS

range queries over cleverly designed primary keys can be very powerful, good keys required for efficient sharding

M04L7NOC5NQS
M04L7O05MIU2
M04NX42YFUCR
M04NYR7VWKJC
M04NZA8MJOOA
M04NZB88CT14
M04NZPOCE8DM
M04NZQ9G2T0S
M04NZQE7E5VX
M04NZSK4V3JN
M04NZTRG661R
M04NZTSUITJ7
M04NZUAILUS5
M04NZUG4DTXN
M04NZWB9VV0C
M04NZWW52T8N
M04NZX2JEVO9
M04NZX7WD77W
M04NZXGOLDEX
M04NZXKNQWB3
M04NZXLGJ3M6
M04NZY7GO39G
M04NZZ2SQF1I
M04O013HN9L9
M04O014DASE6
M04O02PE8AD3
M04O02PGJBR1
M04O03UPTRWG
M04O04833ZTL
M04O04GH21JF
M04O04JQ8B57
M04O04UHK3U4
M04O056QBNBH
M04O05E8XO8N
M04O069O8CDK
M04O06MG47WK
M04O07BHELVD
M04O07F30WYX
M04O0B39DGEA

M04NZW B9VV0C
timestamp (2012-02-28 23:59:56 UTC) + random number (681 731 004)

B9VV0C M04NZW
random number (681 731 004) + timestamp (2012-02-28 23:59:56 UTC)
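
The keys on these slides decode as base 36: `M04NZW` is the Unix timestamp 1330473596 and `B9VV0C` is 681731004. A sketch of generating such keys and turning a time window into a key range follows; the helper names are hypothetical, and the fixed width of six digits per part is an assumption read off the examples.

```python
import random
import string
import time
from bisect import bisect_left
from datetime import datetime, timezone

DIGITS = string.digits + string.ascii_uppercase  # base 36: 0-9, then A-Z


def to_base36(number, width):
    encoded = ""
    while number:
        number, remainder = divmod(number, 36)
        encoded = DIGITS[remainder] + encoded
    return encoded.rjust(width, "0")  # fixed width keeps string order == numeric order


def make_key(timestamp=None, rnd=None):
    """Six base-36 digits of Unix time followed by six random ones."""
    timestamp = int(time.time()) if timestamp is None else timestamp
    rnd = random.randrange(36 ** 6) if rnd is None else rnd
    return to_base36(timestamp, 6) + to_base36(rnd, 6)


def key_range(start, end):
    """Key bounds covering start <= timestamp < end (timezone-aware datetimes)."""
    return to_base36(int(start.timestamp()), 6), to_base36(int(end.timestamp()), 6)


moment = datetime(2012, 2, 28, 23, 59, 56, tzinfo=timezone.utc)
key = make_key(int(moment.timestamp()), 681731004)  # "M04NZWB9VV0C", as on the slide

# A few of the keys from the slide, already in sorted order.
keys = ["M04L7NOC5NQS", "M04NX42YFUCR", "M04NYR7VWKJC", "M04NZWB9VV0C", "M04O013HN9L9"]
low, high = key_range(
    datetime(2012, 2, 28, 23, 59, tzinfo=timezone.utc),
    datetime(2012, 2, 29, tzinfo=timezone.utc),
)
last_minute = keys[bisect_left(keys, low):bisect_left(keys, high)]
```

With the timestamp first, a time window becomes a contiguous key range. Putting the random part first, as in the second slide, presumably trades that away for writes that spread evenly over the keyspace.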

CONSISTENCY IS OVERRATED

when you need it you need it, but most of the time you don’t

DELETING DATA IS NOT TRIVIAL

sometimes delete operations can be more costly than inserts, design your cleaning process early

REDIS, MONGODB, CASSANDRA

our current toolbox

REDIS

swiss army knife, we use it for “virtual memory”, counters and even messaging

REDIS

not distributed (yet), no automatic failover

MONGODB

a very good replacement for MySQL, replication and automatic failover are fantastic

MONGODB

global write lock kills performance, easily fragmented, sharding is complex and (has been) very buggy

MONGODB

we use it for precomputing and storing metrics for our reporting app

MONGODB

we’re currently pushing around 5K updates/s over three replica sets, each update incrementing up to 20 numbers
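
The pattern of precomputing metrics by incrementing counters can be sketched in memory like this. The `MetricsStore` class, the field names and the sample visits are invented for illustration; MongoDB's `$inc` update operator plays the role of `increment` in a real deployment.

```python
from collections import defaultdict


class MetricsStore:
    """One document of counters per (site, day), updated in place."""

    def __init__(self):
        self.documents = defaultdict(lambda: defaultdict(int))

    def increment(self, document_id, increments):
        document = self.documents[document_id]
        for field, amount in increments.items():
            document[field] += amount


def record_visit(store, site, day, visit):
    # One update per event, touching every counter a report will need later.
    store.increment((site, day), {
        "visits": 1,
        f"browser.{visit['browser']}": 1,
        f"country.{visit['country']}": 1,
        "seconds_on_site": visit["seconds"],
    })


store = MetricsStore()
record_visit(store, "example.com", "2012-02-28",
             {"browser": "firefox", "country": "SE", "seconds": 30})
record_visit(store, "example.com", "2012-02-28",
             {"browser": "chrome", "country": "SE", "seconds": 12})
report = dict(store.documents[("example.com", "2012-02-28")])
```

Reading a report is then a single lookup by key, with no GROUP BY at query time; the cost has been paid at write time instead.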

CASSANDRA

low level building blocks, no single point of failure, great horizontal scalability, TTL on values

CASSANDRA

we use it to store data about website visits, indexing it to support complex queries

CASSANDRA

millions of rows, some with millions of columns, adding ~1K new every second

one million writes per second

LEARN SOMETHING NEW TODAY

nosql.mypopescu.com

highscalability.com

nosqltapes.com

KTHXBAI

twitter.com/iconara

speakerdeck.com/u/iconara

architecturalatrocities.com

burtcorp.com
