Advanced CouchDB

CouchDB relax


http://joind.in/2495 PHPBenelux conference January 2011

Transcript of Advanced CouchDB

Page 1: Advanced CouchDB

CouchDB relax

Page 2: Advanced CouchDB


Sander van de Graaf (@svdgraaf)

Focus -> practical usage examples

Page 3: Advanced CouchDB

http://joind.in/talk/view/2495

second talk ever, please provide feedback

Page 4: Advanced CouchDB
Page 5: Advanced CouchDB

CONTENTS

• Introduction

• PHP Usage

• Replication/Scalability

• Backend usage

• Couchapps

• Other stuff

Page 6: Advanced CouchDB


Page 7: Advanced CouchDB
Page 8: Advanced CouchDB

NOSQL

Page 9: Advanced CouchDB

IT’S A MOVEMENT

Movement, definitions vary

Page 10: Advanced CouchDB

1998

Back in the day...

Page 11: Advanced CouchDB
Page 12: Advanced CouchDB
Page 13: Advanced CouchDB

Lame movie 1

Page 14: Advanced CouchDB

Another one

Page 15: Advanced CouchDB

And then some more...

Page 16: Advanced CouchDB

XML was introduced

Page 17: Advanced CouchDB

Some game was published

Page 18: Advanced CouchDB

McDonald’s Happy Meal

Page 19: Advanced CouchDB

Carlo Strozzi

Released NOSQL open source DB

Page 20: Advanced CouchDB

NOSQL == Not Only SQL

Page 21: Advanced CouchDB

“[The NoSQL movement] departs from the relational model altogether, it should therefore have been called more appropriately ‘NoREL’, or something to that effect.”

- Carlo Strozzi

Page 22: Advanced CouchDB


Page 23: Advanced CouchDB
Page 24: Advanced CouchDB
Page 25: Advanced CouchDB
Page 26: Advanced CouchDB
Page 27: Advanced CouchDB

Ubuntu One, contacts sync

Page 28: Advanced CouchDB

NUTSHELL

Page 29: Advanced CouchDB

SPEED

Optimized for speed, not disk space (see cleanup)

Page 30: Advanced CouchDB

APPEND ONLY

Append only storage, happy cup of coffee!

Page 31: Advanced CouchDB

NO REPAIR NEEDED

Page 32: Advanced CouchDB

COMPACTING

Page 33: Advanced CouchDB

HTTP SERVER

Caching and load balancing at no extra cost :D

Page 34: Advanced CouchDB

CAP

Page 35: Advanced CouchDB

CAP

CouchDB

Page 36: Advanced CouchDB

EVENTUALLY CONSISTENT

CouchDB

CouchDB’s focus is on availability and partition tolerance (the A and P in CAP); it becomes consistent after replication.

Page 37: Advanced CouchDB

FULL REST API

Page 38: Advanced CouchDB

REST

• GET => SELECT

• PUT => UPDATE

• POST => INSERT

• DELETE => DELETE

• COPY => ...

Page 39: Advanced CouchDB
Page 40: Advanced CouchDB

JSON

{
  total_rows: 2,
  offset: 0,
  rows: [
    {
      id: '_design/foobar',
      key: '_design/foobar',
      value: { rev: '5-982b2fc36835715b2aae54609b5d5f1e' }
    },
    {
      id: 'f0e1fd96eb6e094f74dda8d949000a6a',
      key: 'f0e1fd96eb6e094f74dda8d949000a6a',
      value: { rev: '1-86bca407fce8234a63c90ff549b56b10' }
    }
  ]
}

Javascript == awesome! :D

Page 41: Advanced CouchDB

REPLICATION

Key feature: CouchDB is relaxed about replication issues and version conflicts

Page 42: Advanced CouchDB

Welcome to Futon, I prefer a UI

Page 43: Advanced CouchDB

http-console rocks the socks out of telnet

Page 44: Advanced CouchDB

Berkeley

Page 45: Advanced CouchDB

CONTENTS

• Introduction

• PHP Usage

• Replication/Scalability

• Backend usage

• Couchapps

• Other stuff

Page 46: Advanced CouchDB

PHP USAGE

Page 47: Advanced CouchDB

PHP LIBRARIES

• PHPillow (LGPL)

• PHP Object Freezer (BSD)

• PHP On Couch (GPL 2 / 3)

• PHP CouchDB Extension (PHP license)

• SAG for CouchDB (Apache)

• Doctrine 2 CouchDB ODM

All are quite nice, although Doctrine has some rough edges. I use PHP On Couch with a custom patch for easy Zend autoloading.

Page 48: Advanced CouchDB

<?php

// set up the connection to CouchDB
$client = new Couchdb_Client('http://ponies.couchone.com:5984', 'rainbows');

// fetch a document
$doc = $client->getDoc('awesome_pony');

// update the document
$doc->newproperty = array("type", "awesome");

try {
    $client->storeDoc($doc);
} catch (Exception $e) {
    echo "Document storage failed : " . $e->getMessage();
}

PHP On Couch with small ZF autoloader fix

Page 49: Advanced CouchDB

CONTENTS

• Introduction

• PHP Usage

• Replication/Scalability

• Backend usage

• Couchapps

• Other stuff

Page 50: Advanced CouchDB

REPLICATION

Page 51: Advanced CouchDB

DEFINITION

“Replication is the process of sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility.”

Source: wikipedia

Page 52: Advanced CouchDB
Page 53: Advanced CouchDB
Page 54: Advanced CouchDB

[Diagram: one CouchDB node]

Page 55: Advanced CouchDB

[Diagram: two CouchDB nodes]

Page 56: Advanced CouchDB

CouchDB

Page 57: Advanced CouchDB

[Diagram: four CouchDB nodes]

MySQL can do this

Page 58: Advanced CouchDB

[Diagram: two CouchDB nodes]

Master-master replication
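
In CouchDB, master-master replication is usually set up as two one-way replications, one in each direction. A minimal sketch of the two requests, assuming two hypothetical hosts (couch-a.example.com and couch-b.example.com); the helper replicationRequest is made up for this illustration, it only builds the POST /_replicate request and does not send anything:

```javascript
// Hypothetical helper: describes one one-way replication request.
// The body fields (source, target, continuous) are those of POST /_replicate.
function replicationRequest(source, target) {
  return {
    method: "POST",
    path: "/_replicate",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ source: source, target: target, continuous: true }),
  };
}

// Assumed example hosts, not taken from the slides.
const nodeA = "http://couch-a.example.com:5984/rainbows";
const nodeB = "http://couch-b.example.com:5984/rainbows";

// Master-master = A replicates to B, and B replicates to A.
const requests = [
  replicationRequest(nodeA, nodeB),
  replicationRequest(nodeB, nodeA),
];

for (const req of requests) {
  console.log(req.method, req.path, req.body);
}
```

Each request would be sent to one of the nodes with any HTTP client; conflicts that arise from concurrent writes on both sides still have to be resolved by the application.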

Page 59: Advanced CouchDB

[Diagram: three CouchDB nodes]

Page 60: Advanced CouchDB

[Diagram: three CouchDB nodes, located in US, NL and BE]

Not only locally

Page 61: Advanced CouchDB

P2P WEB

Page 62: Advanced CouchDB

“World Domination”

Page 63: Advanced CouchDB

CLUSTERING: “The fun stuff”

Page 64: Advanced CouchDB

CouchDB doesn’t support partitioning (sharding) itself, but because CouchDB is HTTP based there are lots of possibilities

Page 65: Advanced CouchDB

[Diagram: load balancer in front of CouchDB instances 1...n]

The basics are always the same, and easy: CouchDB instances 1..n behind a load balancer

Page 66: Advanced CouchDB

CHALLENGES

• Large amounts of data

• Large views (with big/long map/reduce queries)

• LOTS of traffic

• Location based partitions

• For fun and profit

Page 67: Advanced CouchDB

MAP/REDUCE

Page 68: Advanced CouchDB

INPUT

IP Bytes

212.122.174.13 18271

212.122.174.13 191726

212.122.174.13 198

74.119.8.111 91272

74.119.8.111 8371

212.122.174.13 43

Map/Reduce example

Page 69: Advanced CouchDB

MAPPER => REDUCER

IP Bytes

212.122.174.13 18271

212.122.174.13 191726

212.122.174.13 198

212.122.174.13 43

74.119.8.111 91272

74.119.8.111 8371

Page 70: Advanced CouchDB

AFTER REDUCE

IP Bytes

212.122.174.13 210238

74.119.8.111 99643
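
The three slides above can be replayed locally. This is a minimal sketch in plain JavaScript: the map and reduce functions are written in the style of a CouchDB view, but the runView helper, the emit callback passed as a second argument, and the ip/bytes field names are assumptions made for this illustration (in a real view, emit is a global and map receives only the document).

```javascript
// Input rows from the INPUT slide (field names ip/bytes are assumed).
const docs = [
  { ip: "212.122.174.13", bytes: 18271 },
  { ip: "212.122.174.13", bytes: 191726 },
  { ip: "212.122.174.13", bytes: 198 },
  { ip: "74.119.8.111", bytes: 91272 },
  { ip: "74.119.8.111", bytes: 8371 },
  { ip: "212.122.174.13", bytes: 43 },
];

// Map step: emit one (key, value) pair per document.
function map(doc, emit) {
  emit(doc.ip, doc.bytes);
}

// Reduce step: sum all values that share a key.
function reduce(keys, values) {
  return values.reduce((total, value) => total + value, 0);
}

// Hypothetical local runner: groups emitted pairs by key, then reduces each group.
function runView(docs, map, reduce) {
  const groups = new Map();
  for (const doc of docs) {
    map(doc, (key, value) => {
      if (!groups.has(key)) groups.set(key, []);
      groups.get(key).push(value);
    });
  }
  const result = {};
  for (const [key, values] of groups) {
    result[key] = reduce([key], values);
  }
  return result;
}

const totals = runView(docs, map, reduce);
console.log(totals);
```

The totals match the AFTER REDUCE slide: 210238 bytes for 212.122.174.13 and 99643 bytes for 74.119.8.111.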

Page 71: Advanced CouchDB

PARTITION INPUT

Partition IP Bytes

0 212.122.174.13 18271

0 212.122.174.13 191726

0 212.122.174.13 198

1 74.119.8.111 91272

1 74.119.8.111 8371

0 212.122.174.13 43

Map/Reduce example

Page 72: Advanced CouchDB

MAPPER => REDUCER

Partition IP Bytes

0 212.122.174.13 18271

0 212.122.174.13 191726

0 212.122.174.13 198

0 212.122.174.13 43

1 74.119.8.111 91272

1 74.119.8.111 8371

If data is big enough, you could even need a re-re-re-reducer

Page 73: Advanced CouchDB

AFTER REDUCE

IP Bytes

212.122.174.13 210238

74.119.8.111 99643
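
With partitions, every node reduces its own rows first and a rereduce step merges the partial results. A minimal sketch of that idea in plain JavaScript; the function names reducePartition and rereduce and the data layout are assumptions for this illustration, not the API of any of the clustering tools:

```javascript
// Rows per partition, as on the PARTITION INPUT slide: [ip, bytes].
const partitions = {
  0: [
    ["212.122.174.13", 18271],
    ["212.122.174.13", 191726],
    ["212.122.174.13", 198],
    ["212.122.174.13", 43],
  ],
  1: [
    ["74.119.8.111", 91272],
    ["74.119.8.111", 8371],
  ],
};

// Reduce inside one partition: sum bytes per IP.
function reducePartition(rows) {
  const partial = {};
  for (const [ip, bytes] of rows) {
    partial[ip] = (partial[ip] || 0) + bytes;
  }
  return partial;
}

// Rereduce: merge the partial results of all partitions.
// The same key may appear in several partials, so values are summed again.
function rereduce(partials) {
  const merged = {};
  for (const partial of partials) {
    for (const [ip, bytes] of Object.entries(partial)) {
      merged[ip] = (merged[ip] || 0) + bytes;
    }
  }
  return merged;
}

const partials = Object.values(partitions).map(reducePartition);
const merged = rereduce(partials);
console.log(merged);
```

Because rereduce accepts the same shape that it returns, its output can be fed into another rereduce, which is the "re-re-re-reducer" case for very large data sets.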

Page 74: Advanced CouchDB

CLUSTERING OPTIONS

• CouchDB Lounge

• Pillow

• BigCouch

Page 75: Advanced CouchDB

LOUNGE

• Partitioning/clustering

• Nginx module

• meebo.com

• ‘easy’

• http://tilgovi.github.com/couchdb-lounge/

Page 76: Advanced CouchDB

LOUNGE

• dumb_proxy => proxy for simple PUT/GET’s

• smart_proxy => proxy for map/reduce over shards

• replicator => updates all copies, redundantly

It can make sure that there are N copies of a document at every moment

Page 77: Advanced CouchDB

[Diagram: nginx with dumb_proxy in front of CouchDB nodes 1...n]

dumb_proxy == ONLY GET/PUT

Page 78: Advanced CouchDB

[Diagram: nginx with smart_proxy in front of CouchDB nodes 1...n]

smart_proxy takes care of the map/reduce and re-reducers over multiple nodes

Page 79: Advanced CouchDB

Bonus:

other nginx modules work too

mod_cache, mod_expire, etc.

Page 80: Advanced CouchDB

PILLOW

• Erlang based

• Router/rereducer (map/reduce over multiple systems)

• In development (but promising!)

• https://github.com/khellan/Pillow

Page 81: Advanced CouchDB

BIGCOUCH

• Fork

• 100% API compatible

• Open Source/Commercial

• https://cloudant.com/#!/solutions/bigcouch

Page 82: Advanced CouchDB

CONTENTS

• Introduction

• PHP Usage

• Replication/Scalability

• Backend usage

• Couchapps

• Other stuff

Page 83: Advanced CouchDB

BACKEND USAGE

Page 84: Advanced CouchDB

PROXIED

Proxied via middleware, or via mod_proxy or similar

Page 85: Advanced CouchDB

DIRECT

Or direct: because CouchDB is HTTP based, content is directly available in JavaScript

Page 86: Advanced CouchDB

NOSQL && SQL HYBRID

• onSave, onCommit hooks available in every major framework

• onSave -> make a JSON representation of your object, and PUT it to CouchDB (#protip: only ‘public’ data)

• The SQL DB is leading, so you don’t have to care about versioning in CouchDB

• You can use your data directly from CouchDB within your frontend JavaScript

Page 87: Advanced CouchDB

MODEL

<?php
class Pony extends Application_models
{
    public function toArray()
    {
        $data = $this->_getData();
        unset($data['created_on']);
        unset($data['created_by']);
        unset($data['access_level']);
        unset($data['private_data']);
        $data['tags'] = $this->getTags();
        $data['categories'] = $this->getCategories();
        $data['rainbows'] = 'double';
        return $data;
    }
}

Page 88: Advanced CouchDB

AFTER_SAVE

<?php
class article_module extends admin_module
{
    public function after_save()
    {
        parent::after_save();
        $data = $this->toJson();
        $res = CouchDB::put($data);
        $this->_id = $res->_id;
        $this->_rev = $res->_rev;
    }
}

Page 89: Advanced CouchDB

PROXY

RewriteEngine On
RewriteRule /data/(.*) http://127.0.0.1:5984/db/$1 [P,L]

Proxy the calls (to work around the sandbox/other-domain error), or use JSONP

Page 90: Advanced CouchDB

JAVASCRIPT

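<script type="text/javascript">
$.getJSON("/db/ponies/_design/ponies/_view/best-ponies?include_docs=true", function (res) {
    // iterate by index and declare locals, so nothing leaks into the global scope
    for (var i = 0; i < res.rows.length; i++) {
        var doc = res.rows[i].doc;
        // do stuff
    }
});
</script>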

Page 91: Advanced CouchDB

CONTENTS

• Introduction

• PHP Usage

• Replication/Scalability

• Backend usage

• Couchapps

• Other stuff

Page 92: Advanced CouchDB

COUCHAPP

CouchDB has its own structure for “distributed, scalable web applications”, called couchapps

Page 93: Advanced CouchDB

“Distributed, scalable, web applications you say?

omgwtfbbq!?!1!!!11!1!eleven”

Page 94: Advanced CouchDB

_attachments

the magic is in _attachments

Page 95: Advanced CouchDB

[Diagram: three CouchDB nodes]

Distribution via replication

Page 96: Advanced CouchDB

INSTALLATION

Couchapp 0.7.0

installation is easy

Page 97: Advanced CouchDB

$ couchapp init

init a project

Page 98: Advanced CouchDB

LAYOUT

creates a default folder

Page 99: Advanced CouchDB

$ couchapp push http://ponies.couchone.com:5984/rainbows

Page 100: Advanced CouchDB

https://github.com/brandon-beacher/couchapp-tmbundle

TextMate bundle: runs couchapp push on save

Page 101: Advanced CouchDB

CONTENTS

• Introduction

• PHP Usage

• Replication/Scalability

• Backend usage

• Couchapps

• Other stuff

Page 102: Advanced CouchDB

OTHER STUFF

Page 103: Advanced CouchDB

REWRITES

Page 104: Advanced CouchDB

_REWRITE

Page 106: Advanced CouchDB

such urls make us a sad panda

Page 107: Advanced CouchDB

{
  ....
  "rewrites": [
    {
      "from": "/best-5-ponies",
      "to": "ponies/_view/best-ponies",
      "method": "GET",
      "query": {
        "descending": true,
        "limit": 5,
        "key": "foobar"
      }
    }
  ]
}

Page 109: Advanced CouchDB

$ curl "http://ponies.couchone.com/rainbows/_design/ponies/_rewrite/best-5-ponies"

to this

Page 110: Advanced CouchDB

[vhosts]
awesomeponies.com = /rainbows/_design/ponies/_rewrite

Page 111: Advanced CouchDB

$ curl "http://ponies.couchone.com/rainbows/_design/ponies/_rewrite/best-5-ponies"

rewrite this

Page 113: Advanced CouchDB
Page 114: Advanced CouchDB

_CHANGES

Page 115: Advanced CouchDB
Page 116: Advanced CouchDB

$ curl -X GET "http://ponies.couchone.com/rainbows/_changes"

Page 117: Advanced CouchDB

{"results":[

],"last_seq":0}

Page 118: Advanced CouchDB

$ curl -X PUT http://ponies.couchone.com/rainbows/foobar -d '{"type":"awesome"}'

Page 119: Advanced CouchDB

{"results":[{"seq":1,"id":"foobar","changes":[{"rev":"1-aaa8e2a031bca334f50b48b6682fb486"}]}],"last_seq":1}

Page 120: Advanced CouchDB

{"results":[{"seq":1,"id":"foobar","changes":[{"rev":"1-aaa8e2a031bca334f50b48b6682fb486"}]},{"seq":2,"id":"foobar2","changes":[{"rev":"1-e18422e6a82d0f2157d74b5dcf457997"}]}],"last_seq":2}

Page 121: Advanced CouchDB

_CHANGES OPTIONS

• ?since

• Longpolling

• Continuous

Page 122: Advanced CouchDB

$ curl -X GET "http://ponies.couchone.com/rainbows/_changes?since=20"
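
A client normally does not hard-code the since value; it remembers last_seq from the previous response and passes it on the next request. A minimal sketch in plain JavaScript, using the response body from the earlier _changes slide; the helper name nextChangesUrl is an assumption for this illustration:

```javascript
// Response body as shown on the earlier _changes slide.
const body =
  '{"results":[' +
  '{"seq":1,"id":"foobar","changes":[{"rev":"1-aaa8e2a031bca334f50b48b6682fb486"}]},' +
  '{"seq":2,"id":"foobar2","changes":[{"rev":"1-e18422e6a82d0f2157d74b5dcf457997"}]}' +
  '],"last_seq":2}';

const response = JSON.parse(body);

// Ids of the documents that changed in this batch.
const changedIds = response.results.map((row) => row.id);

// Hypothetical helper: URL for the next poll, starting after the last seen sequence.
function nextChangesUrl(dbUrl, lastSeq) {
  return dbUrl + "/_changes?since=" + encodeURIComponent(lastSeq);
}

const nextUrl = nextChangesUrl("http://ponies.couchone.com/rainbows", response.last_seq);
console.log(changedIds, nextUrl);
```

Storing last_seq somewhere durable lets a consumer resume after a restart without reprocessing old changes.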

Page 123: Advanced CouchDB

$ curl -X GET "http://ponies.couchone.com/rainbows/_changes?feed=longpoll&since=2"

Long polling: good when updates are infrequent. The connection stays open until a change arrives, then it is closed and you need to reconnect, so lots of updates mean lots of reconnects.

Page 124: Advanced CouchDB

$ curl -X GET "http://ponies.couchone.com/rainbows/_changes?feed=continuous&since=2"

The connection stays open, and you get updates on the fly!
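
On a continuous feed every change arrives as one JSON object per line, and empty lines act as heartbeats, so a consumer has to buffer partial lines between network chunks. A minimal sketch of such a parser in plain JavaScript; createChangesParser is a hypothetical helper, and the ids and revs in the simulated chunks are made up for this illustration:

```javascript
// Hypothetical helper: turns raw text chunks of a continuous feed into change objects.
function createChangesParser(onChange) {
  let buffer = "";
  return function push(chunk) {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop(); // keep the incomplete last line for the next chunk
    for (const line of lines) {
      if (line.trim() === "") continue; // heartbeat: nothing to parse
      onChange(JSON.parse(line));
    }
  };
}

const seen = [];
const push = createChangesParser((change) => seen.push(change));

// Simulated network chunks; the second change is split across two chunks.
push('{"seq":3,"id":"foobar3","changes":[{"rev":"1-aaa"}]}\n{"seq":4,"id":"foo');
push('bar4","changes":[{"rev":"1-bbb"}]}\n\n');

console.log(seen.map((change) => change.id));
```

In a real client the push function would be called from the data event of whatever HTTP library holds the connection open.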

Page 125: Advanced CouchDB
Page 126: Advanced CouchDB

FILTERS

Filters can be used to filter documents from the output

Page 127: Advanced CouchDB

function(doc, req) {
    if (doc.priority == 'high') {
        return true;
    }
    return false;
}

We only want high priority documents

Page 129: Advanced CouchDB

function(doc, req) {
    if (doc.name == req.query.name) {
        return true;
    }
    return false;
}

You can use req for request based filters
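
Both filters can be tried out locally before they go into a design document. A minimal sketch in plain JavaScript: the function names byPriority and byName, the applyFilter helper and the sample documents are assumptions for this illustration; applyFilter only imitates the idea that the server calls the filter once per changed document.

```javascript
// The two filter functions from the slides, given names so they can be called.
function byPriority(doc, req) {
  return doc.priority == "high";
}

function byName(doc, req) {
  return doc.name == req.query.name;
}

// Hypothetical stand-in for the server: keep only the documents the filter accepts.
function applyFilter(filter, docs, query) {
  const req = { query: query || {} };
  return docs.filter((doc) => filter(doc, req)).map((doc) => doc._id);
}

// Made-up sample documents.
const sampleDocs = [
  { _id: "a", priority: "high", name: "foobar" },
  { _id: "b", priority: "low", name: "foobar" },
  { _id: "c", priority: "high", name: "other" },
];

const highPriority = applyFilter(byPriority, sampleDocs);
const namedFoobar = applyFilter(byName, sampleDocs, { name: "foobar" });

console.log(highPriority); // ids a and c
console.log(namedFoobar);  // ids a and b
```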

Page 130: Advanced CouchDB

$ curl -X GET "http://ponies.couchone.com/rainbows/_changes?feed=continuous&filter=app/name&name=foobar"

Page 131: Advanced CouchDB

SHOWS

Page 132: Advanced CouchDB

function(doc, req) {
    return {
        body: "Hello World"
    }
}

Page 133: Advanced CouchDB

$ curl -X GET "http://ponies.couchone.com/rainbows/_design/foobar/_show/showfunction/docid"

Page 134: Advanced CouchDB

function(doc) {
    return {
        "code": 302,
        "body": "See other",
        "headers": {
            "Location": doc.target
        }
    };
}

You can also define HTTP headers. We used this to translate public IDs into private storage IDs. This way CouchDB took care of all the headers and HTTP handling, and we could use a regular nginx proxy module.
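
The redirect show function is a pure function of the document, so it can be exercised outside CouchDB. A minimal sketch in plain JavaScript; the name redirectShow, the sample document and its target URL are assumptions for this illustration:

```javascript
// Show function from the slide, given a name so it can be called directly.
function redirectShow(doc, req) {
  return {
    code: 302,
    body: "See other",
    headers: { Location: doc.target },
  };
}

// Made-up document that maps a public id to a private storage URL.
const ponyDoc = {
  _id: "public-123",
  target: "http://storage.example.com/private/abc",
};

const showResponse = redirectShow(ponyDoc, {});
console.log(showResponse.code, showResponse.headers.Location);
```

A proxy in front of CouchDB then only has to follow or pass on the Location header; it needs no knowledge of the id mapping.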

Page 135: Advanced CouchDB

LUCENE

Page 136: Advanced CouchDB

[external]
fti=/path/to/python /path/to/couchdb-lucene/tools/couchdb-external-hook.py

[httpd_db_handlers]
_fti = {couch_httpd_external, handle_external_req, <<"fti">>}

Page 137: Advanced CouchDB

function(doc) {
    var ret = new Document();
    ret.add(doc.message);
    ret.add(new Date(doc.datetime));
    return ret;
}

Page 139: Advanced CouchDB

GEOCOUCH

https://github.com/vmx/couchdb

Page 140: Advanced CouchDB

See Derick’s talk yesterday

Page 141: Advanced CouchDB

GEOCOUCH

• Supports bbox

• Fork

• Outputs via lists, GeoRSS possible

• Directly usable by Google Maps

• Can read GIS data

• Combined with _changes makes an interesting use case

- bbox => all items within a certain bounding box; polygon support is in the works
- Currently a fork of CouchDB; an external module is in the works
- Output can be set up separately
- Google Maps can use GeoRSS
- GIS: Geographic Information System (used worldwide?)

Page 142: Advanced CouchDB

SPATIAL INDEX in spatial/points.js

function(doc) {
    if (doc.geo && doc.geo.latitude != '' && doc.geo.longitude != '') {
        emit(
            {
                type: "Point",
                coordinates: [parseFloat(doc.geo.latitude), parseFloat(doc.geo.longitude)]
            },
            [doc._id, doc]
        );
    }
}

Page 143: Advanced CouchDB

http://ponies.couchone.com/rainbows/_design/unicorns/_spatial/points?bbox=0,0,180,90

Worldwide search

{
  "update_seq": 3,
  "rows": [
    {
      "id": "augsburg",
      "bbox": [10.898333, 48.371667, 10.898333, 48.371667],
      "value": ["augsburg", [10.898333, 48.371667]]
    }
  ]
}
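
What a bbox query means can be shown with a brute-force check. A minimal sketch in plain JavaScript; inBbox is a hypothetical helper that only illustrates the semantics of the query (a real spatial index avoids scanning every point), and the second bounding box is made up for contrast:

```javascript
// bbox is [minX, minY, maxX, maxY], the same order as in the ?bbox= query above.
function inBbox(point, bbox) {
  const [x, y] = point;
  const [minX, minY, maxX, maxY] = bbox;
  return x >= minX && x <= maxX && y >= minY && y <= maxY;
}

// Coordinates of the "augsburg" row from the response above.
const augsburg = [10.898333, 48.371667];

const insideQueryBox = inBbox(augsburg, [0, 0, 180, 90]);   // box from the query above
const insideOtherBox = inBbox(augsburg, [-180, -90, 0, 0]); // made-up box for contrast

console.log(insideQueryBox, insideOtherBox);
```

A point is stored with a degenerate bounding box (min equals max), which is why the row above repeats the same two coordinates in its bbox field.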

Page 144: Advanced CouchDB

GEORSS && GOOGLE MAPS

if (GBrowserIsCompatible()) {
    map = new GMap2(document.getElementById('map'));
    var geoXML = new GGeoXml('http://ponies.couchone.com/rainbows/url-to-georss-view');
    map.addOverlay(geoXML);
}

Page 145: Advanced CouchDB

$ curl -X GET "http://ponies.couchone.com/rainbows/_design/alarmeringen/_spatial/points?bbox=51.711369,4.218407,52.136520,4.745740"

Page 146: Advanced CouchDB

Q?

Page 147: Advanced CouchDB

http://www.couchone.com/get

Page 148: Advanced CouchDB

http://joind.in/talk/view/2495

second talk ever, please provide feedback