MongoDB is the MashupDB

MongoDB is the MashupDB. Mongo Austin, February 15, 2011. Wynn Netherland. Stash the Hash!


My slides from MongoAustin 2011

Transcript of MongoDB is the MashupDB

Page 1: MongoDB is the MashupDB

MongoDB is the MashupDB

Mongo Austin ★ February 15, 2011

WYNN NETHERLAND

Stash the Hash!

Page 2: MongoDB is the MashupDB

whoami

Page 3: MongoDB is the MashupDB

+

Page 4: MongoDB is the MashupDB
Page 5: MongoDB is the MashupDB

Two years ago...

Page 6: MongoDB is the MashupDB
Page 7: MongoDB is the MashupDB

No SQL?

Page 8: MongoDB is the MashupDB

MongoDB

Page 9: MongoDB is the MashupDB

Let's just skip to here already!


Page 10: MongoDB is the MashupDB

One year ago

Page 11: MongoDB is the MashupDB

NoSQL Smackdown!

CTO of Amazon

Page 12: MongoDB is the MashupDB

Today

Page 13: MongoDB is the MashupDB

is it web scale?

Page 14: MongoDB is the MashupDB


Page 15: MongoDB is the MashupDB

How do we get to here?

Page 16: MongoDB is the MashupDB

Enlightenment

Page 17: MongoDB is the MashupDB

Almost two years with

Page 18: MongoDB is the MashupDB

Our journey

Page 20: MongoDB is the MashupDB

“We the Tweeple of the United States, in order to form a more perfect government, establish communication, and promote transparency do hereby tweet the Congress of the United States of America.”

Page 21: MongoDB is the MashupDB

The idea: A Twitter directory and aggregator for the US Congress

Page 23: MongoDB is the MashupDB

When does NoSQL make sense?

★Your data is stored and retrieved mainly by primary key, without complex joins.

★You have a non-trivial amount of data, and the thought of managing lots of RDBMS shards and replication failure scenarios gives you the fear.

http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/

Page 25: MongoDB is the MashupDB

Tokyo Cabinet

Page 26: MongoDB is the MashupDB
Page 27: MongoDB is the MashupDB

CouchDB

Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API.

http://couchdb.apache.org/

Page 28: MongoDB is the MashupDB

SQL | CouchDB

Predefined, explicit schema | Dynamic, implicit schema

Uniform tables of data | Collection of named documents with varying structure

Normalized. Objects spread across tables. Duplication reduced. | Denormalized. Docs usually self contained. Data often duplicated.

Must know schema to read/write a complete object | Must know only document name

Dynamic queries of static schemas | Static queries of dynamic schemas

http://damienkatz.net/files/What is CouchDB.pdf

Page 29: MongoDB is the MashupDB


The devil's in the details

Page 30: MongoDB is the MashupDB

why I

Page 31: MongoDB is the MashupDB

Mongo speaks JSON

Page 32: MongoDB is the MashupDB

Document Storage (BSON)

{
  author: 'joe',
  created: Date('03-28-2009'),
  title: 'Yet another blog post',
  text: 'Here is the text...',
  tags: [ 'example', 'joe' ],
  comments: [
    { author: 'jim', comment: 'I disagree' },
    { author: 'nancy', comment: 'Good post' }
  ]
}

http://www.mongodb.org/display/DOCS/BSON

Sure wish JSON did this...

B is for Binary
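
The "Sure wish JSON did this..." callout points at the date field: plain JSON has no date type, while BSON does. A minimal Ruby sketch using only the standard library (no MongoDB involved; the post hash simply mirrors the slide's document) shows the type getting lost in a JSON round trip:

```ruby
require 'json'
require 'time'

# The slide's blog post as a plain Ruby hash (illustrative values).
post = {
  'author'   => 'joe',
  'created'  => Time.utc(2009, 3, 28),
  'title'    => 'Yet another blog post',
  'tags'     => ['example', 'joe'],
  'comments' => [
    { 'author' => 'jim',   'comment' => 'I disagree' },
    { 'author' => 'nancy', 'comment' => 'Good post' }
  ]
}

# JSON has no date type, so the Time is written out as a string...
round_tripped = JSON.parse(JSON.generate(post))

# ...and comes back as a String that the application has to parse again.
created = round_tripped['created']
puts created.class
puts Time.parse(created).utc.year
```

With a typed date in the stored document, that extra parsing step goes away and date range queries compare real dates instead of strings.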

Page 33: MongoDB is the MashupDB

That looks like JSON

Page 34: MongoDB is the MashupDB

versatility

Page 35: MongoDB is the MashupDB
Page 37: MongoDB is the MashupDB

Introducing

★Built For Speed

★Dynamic Queries and Indexes

★Replication and Failover

★Sharding

★Map / Reduce

★Geospatial stuffs

can be very fast

Page 38: MongoDB is the MashupDB

MongoDB is great for

★Websites

★Caching

★High volume, low value

★High scalability

★Storage of program objects and JSON

stash the hash

Your own queryable API mirror
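
"Stash the hash" in practice: parse an API's JSON response and store each resulting hash unchanged, keyed by its id, so it can be queried locally later. The sketch below is an in-memory illustration only. `ApiMirror`, its methods, and the sample payload are hypothetical, and a plain Ruby Hash stands in for a collection:

```ruby
require 'json'

# Hypothetical in-memory stand-in for a collection, keyed by _id.
class ApiMirror
  def initialize
    @docs = {}
  end

  # Store the parsed hash unchanged; stashing the same id again overwrites
  # the old copy (the effect an upsert would have).
  def stash(hash)
    doc = hash.merge('_id' => hash.fetch('id'))
    @docs[doc['_id']] = doc
  end

  # Query the mirror locally instead of calling the remote API again.
  def where(criteria)
    @docs.values.select do |doc|
      criteria.all? { |key, value| doc[key] == value }
    end
  end

  def count
    @docs.size
  end
end

# Made-up payload shaped like a typical JSON API response.
response = '[{"id": 1, "screen_name": "alice", "party": "I"},
             {"id": 2, "screen_name": "bob", "party": "D"}]'

mirror = ApiMirror.new
JSON.parse(response).each { |hash| mirror.stash(hash) }

# A later fetch returns a changed record; stashing it replaces the old one.
mirror.stash('id' => 2, 'screen_name' => 'bob', 'party' => 'I')

independents = mirror.where('party' => 'I').map { |doc| doc['screen_name'] }
puts mirror.count
puts independents.inspect
```

No schema is declared anywhere: whatever fields the API returns are stored and become queryable.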

Page 39: MongoDB is the MashupDB

Not as great for

★Highly transactional

★Financial stuffs

★Problems requiring SQL

Page 40: MongoDB is the MashupDB

Installation

★mkdir -p /data/db

★download pre-built for OSX and unzip to /usr/local/

★cp -R /usr/local/pathtomongo/bin /usr/local/bin

★gem install mongo

★gem install mongo_ext

★gem install mongo_mapper

Native C extensions ( go turbo! )

Ruby driver for MongoDB

Page 41: MongoDB is the MashupDB

brew install mongodb

Page 42: MongoDB is the MashupDB

Contents of mongo/bin

★mongod - The MongoDB server

★mongo - the JavaScript interactive shell

★mongoexport - export data as JSON or csv

★mongoimport - As advertised

★mongodump - Like mysqldump

★mongorestore - Restore from mongodump files

★mongos - Auto-sharding module (getting better with every build)

what's in the box?

why? What did you think it was?

Page 43: MongoDB is the MashupDB

Some new terms

Page 44: MongoDB is the MashupDB

Well that one isn't new...

When I say database, think database

Page 45: MongoDB is the MashupDB

Databases in MongoDB

★Made up of multiple collections

★Are created on-the-fly when first referenced

Page 46: MongoDB is the MashupDB

When I say collection, think table

Page 47: MongoDB is the MashupDB

Collections in MongoDB

★Schema-less

★For grouping documents into smaller query sets (speed)

★Indexable by one or more keys

★Are created on-the-fly when first referenced

★Capped collections: Fixed size, older records dropped after limit reached

but typed!
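
The capped-collection behavior (fixed size, oldest records dropped first) can be pictured with a small in-memory model. `CappedList` is a hypothetical class for illustration. Real capped collections are sized in bytes, with an optional document-count limit, whereas this sketch limits by document count only:

```ruby
# Hypothetical in-memory model of a capped collection: fixed capacity,
# insertion order preserved, oldest documents dropped first.
class CappedList
  def initialize(max_docs)
    @max_docs = max_docs
    @docs = []
  end

  # Append the document, then evict from the front until back under the cap.
  def insert(doc)
    @docs << doc
    @docs.shift while @docs.size > @max_docs
    doc
  end

  # Documents in insertion order, oldest surviving document first.
  def to_a
    @docs.dup
  end
end

log = CappedList.new(3)
(1..5).each { |n| log.insert('n' => n) }

remaining = log.to_a.map { |doc| doc['n'] }
puts remaining.inspect
```

This is why capped collections suit logs and recent-activity feeds: the data ages out on its own, with no cleanup job.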

Page 48: MongoDB is the MashupDB

When I say document, think record or row

Page 49: MongoDB is the MashupDB

The Ruby ecosystem

Page 50: MongoDB is the MashupDB

The Ruby driver

Page 51: MongoDB is the MashupDB

Querying

db.collection.find({'first_name': 'John'}) # finds all Johns

db.collection.find({'first_name': /^wynn/i}) # regex

db.collection.find_first({'_id':1}) # finds first with _id of 1

db.collection.find({'age': {'$gte': 21}}) # finds possible drinkers

db.collection.find({'author.first_name':'John'}) # subdocument

db.collection.find({'$where': 'this.age >= 6 && this.age <= 18'}) # JavaScript expression
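
To make the shape of these query documents concrete, here is a rough in-memory matcher for a small subset of them: exact values, regular expressions, dotted sub-document paths, and a few comparison operators. It is an illustrative sketch, not the driver's or the server's implementation. `OPERATORS`, `dig_path`, `matches?`, and the sample data are all hypothetical names:

```ruby
# Hypothetical matcher for a subset of query documents. Real queries
# support many more operators and also match inside arrays.
OPERATORS = {
  '$gt'  => ->(value, arg) { !value.nil? && value > arg },
  '$gte' => ->(value, arg) { !value.nil? && value >= arg },
  '$lt'  => ->(value, arg) { !value.nil? && value < arg },
  '$lte' => ->(value, arg) { !value.nil? && value <= arg },
  '$ne'  => ->(value, arg) { value != arg },
  '$in'  => ->(value, arg) { arg.include?(value) }
}.freeze

# Follow a dotted path such as 'author.first_name' into nested hashes.
def dig_path(doc, path)
  path.split('.').reduce(doc) do |node, key|
    node.is_a?(Hash) ? node[key] : nil
  end
end

# A document matches when every key in the query matches.
def matches?(doc, query)
  query.all? do |path, condition|
    value = dig_path(doc, path)
    case condition
    when Regexp then value.is_a?(String) && !(condition =~ value).nil?
    when Hash   then condition.all? { |op, arg| OPERATORS.fetch(op).call(value, arg) }
    else value == condition
    end
  end
end

people = [
  { 'first_name' => 'John', 'age' => 34 },
  { 'first_name' => 'Wynn', 'age' => 19 },
  { 'first_name' => 'Jane', 'age' => 21, 'author' => { 'first_name' => 'John' } }
]

drinkers  = people.select { |person| matches?(person, 'age' => { '$gte' => 21 }) }
wynns     = people.select { |person| matches?(person, 'first_name' => /^wynn/i) }
by_author = people.select { |person| matches?(person, 'author.first_name' => 'John') }

puts drinkers.size, wynns.size, by_author.size
```

The point is that a query is itself just a hash, so it can be built, merged, and passed around like any other data.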

Page 52: MongoDB is the MashupDB

Querying

$in, $nin, $all, $ne, $gt, $gte, $lt, $lte, $size, $where

:fields (like :select in active record)

:limit, :offset for pagination

:sort ascending or descending [['foo', 1], ['bar', -1]]

count and group (uses map/reduce)

db.collection.mapReduce(mapfunction,reducefunction[,options]);

More
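
The mapReduce call takes a map function and a reduce function. As a rough picture of the data flow only (the real thing runs JavaScript functions on the server), here is a single-process Ruby imitation: map emits key/value pairs, the values are grouped by key, and reduce folds each group to one value. `map_reduce`, `map_tags`, `sum_counts`, and the sample posts are hypothetical:

```ruby
# Hypothetical single-process imitation of the map/reduce flow.
def map_reduce(docs, map_fn, reduce_fn)
  # Each document may emit any number of [key, value] pairs.
  emitted = docs.flat_map { |doc| map_fn.call(doc) }
  grouped = emitted.group_by { |key, _value| key }
  grouped.each_with_object({}) do |(key, pairs), out|
    out[key] = reduce_fn.call(key, pairs.map { |_key, value| value })
  end
end

posts = [
  { 'title' => 'One',   'tags' => %w[ruby mongodb] },
  { 'title' => 'Two',   'tags' => %w[mongodb] },
  { 'title' => 'Three', 'tags' => %w[ruby nosql mongodb] }
]

# Emit a count of 1 for every tag on every post.
map_tags = ->(doc) { doc['tags'].map { |tag| [tag, 1] } }

# Sum the counts emitted for one tag.
sum_counts = ->(_key, values) { values.reduce(0) { |sum, v| sum + v } }

tag_counts = map_reduce(posts, map_tags, sum_counts)
puts tag_counts.inspect
```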

Page 53: MongoDB is the MashupDB

Updating

$inc, $set, $unset, $push, $pushAll, $addToSet, $pop, $pull, $pullAll, $rename, $bit
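
The modifiers describe a change to a document instead of replacing it wholesale. A hedged in-memory sketch of a few of them, applied to a plain Ruby Hash, shows the intent. `MODIFIERS` and `apply_update` are hypothetical names, only top-level fields are handled, and this is not how the server implements updates:

```ruby
# Hypothetical in-memory versions of a few update modifiers.
MODIFIERS = {
  '$inc'      => ->(doc, field, arg) { doc[field] = (doc[field] || 0) + arg },
  '$set'      => ->(doc, field, arg) { doc[field] = arg },
  '$unset'    => ->(doc, field, _arg) { doc.delete(field) },
  '$push'     => ->(doc, field, arg) { (doc[field] ||= []) << arg },
  '$addToSet' => ->(doc, field, arg) { doc[field] = (doc[field] || []) | [arg] },
  '$pull'     => ->(doc, field, arg) { (doc[field] || []).delete(arg) }
}.freeze

# Apply an update document of the form { modifier => { field => argument } }.
def apply_update(doc, update)
  update.each do |modifier, changes|
    changes.each do |field, arg|
      MODIFIERS.fetch(modifier).call(doc, field, arg)
    end
  end
  doc
end

post = {
  'title' => 'Stash the hash',
  'views' => 1,
  'tags'  => ['mongodb'],
  'flags' => ['spam', 'new'],
  'draft' => true
}

apply_update(post,
             '$inc'      => { 'views' => 1 },
             '$set'      => { 'title' => 'Stash the Hash!' },
             '$unset'    => { 'draft' => 1 },
             '$push'     => { 'comments' => 'Good post' },
             '$addToSet' => { 'tags' => 'mongodb' },
             '$pull'     => { 'flags' => 'spam' })

puts post.inspect
```

Each modifier in the example targets a different field, which mirrors how a single update is normally written.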

Page 54: MongoDB is the MashupDB
Page 55: MongoDB is the MashupDB

Breakfast with

John Nunemaker

Page 56: MongoDB is the MashupDB

MongoMapper

★Mongo is not MySQL

★DSL for modeling domain should also teach you Mongo

★Now in version 0.8.6

from @jnunemaker

I voted for "Nunemapper"

Page 57: MongoDB is the MashupDB

Features

★Typecasting

★Callbacks (ActiveSupport Callbacks)

★Validations

★Connection and database can differ per document

★Create and Update with single or multiple

★Delete and Destroy and _all counterparts

★Find: id, ids, :all, :first, :last

★Associations

Be careful. Ordering can be tricky.

Page 58: MongoDB is the MashupDB

Example

class User
  include MongoMapper::Document
  key :name, String, :required => true, :length => 5..100
  key :email, String, :required => true, :index => true
  key :age, Integer, :numeric => true
  key :active, Boolean, :default => true
  one :address
  many :articles
end

class Address
  include MongoMapper::Document
  key :street, String
  key :city, String
  key :state, String, :length => 2
  key :zip, Integer, :numeric => true, :length => 5
end

Included as module, not subclassed

Page 59: MongoDB is the MashupDB

Queries

User.where(:age.gt => 27).sort(:age).all
User.where(:age.gt => 27).sort(:age.desc).all
User.where(:age.gt => 27).sort(:age).limit(1).all
User.where(:age.gt => 27).sort(:age).skip(1).limit(1).all

Page 60: MongoDB is the MashupDB

Scopes

class User
  include MongoMapper::Document

  # plain old vanilla scopes with fancy queries
  scope :johns, where(:name => 'John')

  # plain old vanilla scopes with hashes
  scope :bills, :name => 'Bill'

  # dynamic scopes with parameters
  scope :by_name, lambda { |name| where(:name => name) }
  scope :by_ages, lambda { |low, high| where(:age.gte => low, :age.lte => high) }

  # Yep, even plain old methods work as long as they return a query
  def self.by_tag(tag)
    where(:tags => tag)
  end

  # You can even make a method that returns a scope
  def self.twenties; by_ages(20, 29) end

  key :name, String
  key :tags, Array
end

Page 61: MongoDB is the MashupDB

Scopes

# simple scopes
pp User.johns.first
pp User.bills.first

# scope with arg
pp User.by_name('Frank').first

# scope with two args
pp User.by_ages(20, 29).all

# chaining class methods on scopes
pp User.by_ages(20, 40).by_tag('ruby').all

# scope made using method that returns scope
pp User.twenties.all

Page 63: MongoDB is the MashupDB

Mongomatic

Page 64: MongoDB is the MashupDB

Mongomatic

require 'mongomatic'

class User < Mongomatic::Base
  def validate
    self.errors.add "name", "can't be empty" if self["name"].blank?
    self.errors.add "email", "can't be empty" if self["email"].blank?
  end
end

# set the db for all models:
Mongomatic.db = Mongo::Connection.new.db("mongomatic_test")
# or you can set it for a specific model:
User.db = Mongo::Connection.new.db("mongomatic_test_user")

Page 65: MongoDB is the MashupDB

Mongomatic

User.empty?
=> true
u = User.new(:name => "Ben")
=> #<User:0x00000100d0cbf8 @doc={"name"=>"Ben"}, @removed=false>
u.valid?
=> false
u["email"] = "[email protected]"
u.valid?
=> true
u.insert
=> BSON::ObjectId('4c32834f0218236321000001')
User.empty?
=> false

Page 66: MongoDB is the MashupDB

MongoDB is open source

Page 67: MongoDB is the MashupDB

Mongo runs in the cloud

Page 68: MongoDB is the MashupDB

#mongodb

Page 69: MongoDB is the MashupDB

How can you help?

★We need an awesome admin GUI

★Port some plugins (might get easier with ActiveModel support coming soon)

★Build something cool

Page 70: MongoDB is the MashupDB

Lessons learned in production

★The laws of computing are still in effect

★ Indexes are important no matter what the salesman told ya about performance

★Data modeling. Deep or Wide?

★MongoDB and MongoMapper are in active development

the fine print

The answer is yes!

Very responsive yet very volatile changes!

Upserts FTW!

Page 71: MongoDB is the MashupDB

10gen

Page 72: MongoDB is the MashupDB

Get the backstory on