MongoDB is the MashupDB
Mongo Austin ★ February 15, 2011
WYNN NETHERLAND
Stash the Hash!
whoami
+
Two years ago...
No SQL?
MongoDB
Let's just skip to here already!
One year ago
NoSQL Smackdown!
CTO of Amazon
Today
is it web scale?
How do we get to here?
Enlightenment
Almost two years with MongoDB
Our journey
“We the Tweeple of the United States, in order to form a more perfect government, establish communication, and promote transparency do hereby tweet the Congress of the United States of America.”
The idea: A Twitter directory and aggregator for the US Congress
When does NOSQL make sense?
★Your data is stored and retrieved mainly by primary key, without complex joins.
★You have a non-trivial amount of data, and the thought of managing lots of RDBMS shards and replication failure scenarios gives you the fear.
http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores/
Key value stores
Project Voldemort, Ringo, Scalaris, Kai, Dynomite, MemcacheDB, ThruDB, CouchDB, Cassandra, HBase, Hypertable, Redis, Tokyo Cabinet/Tyrant, Riak, Neo4J
Some very cool projects in this space
Tokyo Cabinet
CouchDB
Apache CouchDB is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API.
http://couchdb.apache.org/
SQL | CouchDB
Predefined, explicit schema | Dynamic, implicit schema
Uniform tables of data | Collection of named documents with varying structure
Normalized. Objects spread across tables. Duplication reduced. | Denormalized. Docs usually self contained. Data often duplicated.
Must know schema to read/write a complete object | Must know only document name
Dynamic queries of static schemas | Static queries of dynamic schemas
http://damienkatz.net/files/What is CouchDB.pdf
The devil's in the details
why I
Mongo speaks JSON
Document Storage (BSON)
{
  author: 'joe',
  created: Date('03-28-2009'),
  title: 'Yet another blog post',
  text: 'Here is the text...',
  tags: [ 'example', 'joe' ],
  comments: [
    { author: 'jim', comment: 'I disagree' },
    { author: 'nancy', comment: 'Good post' }
  ]
}
http://www.mongodb.org/display/DOCS/BSON
Sure wish JSON did this...
B is for Binary
That looks like JSON
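A small sketch of why the binary format matters: plain JSON has no date type, whereas BSON adds types such as dates and binary data. The example below uses only Ruby's standard `json` library (no MongoDB driver), and the document is the blog post from the slide rewritten as a Ruby hash. It is an illustration of the type loss, not of how the driver serializes documents.

```ruby
require 'json'

# The blog post from the slide, written as a plain Ruby hash.
post = {
  'author'   => 'joe',
  'created'  => Time.utc(2009, 3, 28),
  'title'    => 'Yet another blog post',
  'tags'     => ['example', 'joe'],
  'comments' => [
    { 'author' => 'jim',   'comment' => 'I disagree' },
    { 'author' => 'nancy', 'comment' => 'Good post' }
  ]
}

# Round-trip through plain JSON text.
round_tripped = JSON.parse(JSON.generate(post))

# The Time went out as text and comes back as a String, not a Time.
puts round_tripped['created'].class

# Arrays and nested documents survive the trip unchanged.
puts round_tripped['comments'].length
```

Strings, arrays, and embedded documents come back intact; only the richer type is lost, which is the gap BSON is meant to fill.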
versatility
Runs like Hayes.Hits like Mays.
Introducing
★Built For Speed
★Dynamic Queries and Indexes
★Replication and Failover
★Sharding
★Map / Reduce
★Geospatial stuffs
can be very fast
MongoDB is great for
★Websites
★Caching
★High volume, low value
★High scalability
★Storage of program objects and JSON
stash the hash
Your own queryable API mirror
Not as great for
★Highly transactional
★Financial stuffs
★Problems requiring SQL
Installation
★mkdir -p /data/db
★download pre-built for OSX and unzip to /usr/local/
★cp -R /usr/local/pathtomongo/bin /usr/local/bin
★gem install mongo
★gem install mongo_ext
★gem install mongo_mapper
Native C extensions ( go turbo! )
Ruby driver for MongoDB
brew install mongodb
Contents of mongo/bin
★mongod - The MongoDB server
★mongo - the JavaScript interactive shell
★mongoexport - export data as JSON or csv
★mongoimport - As advertised
★mongodump - Like mysqldump
★mongorestore - Restore from mongodump files
★mongos - Auto-sharding module (getting better with every build)
what's in the box?
why? What did you think it was?
Some new terms
Well that one isn't new...
When I say database, think database
Databases in MongoDB
★Made up of multiple collections
★Are created on-the-fly when first referenced
When I say collection, think table
Collections in MongoDB
★Schema-less
★For grouping documents into smaller query sets (speed)
★Indexable by one or more keys
★Are created on-the-fly when first referenced
★Capped collections: Fixed size, older records dropped after limit reached
but typed!
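The capped-collection behavior can be pictured with a toy model in plain Ruby. This is only a sketch: `CappedCollectionSketch` is a made-up class, and it counts documents for simplicity, whereas MongoDB sizes capped collections in bytes.

```ruby
# Toy model of a capped collection: fixed capacity, insertion order kept,
# oldest documents dropped once the limit is reached.
# Assumption: capacity is counted in documents, not bytes.
class CappedCollectionSketch
  def initialize(max_docs)
    @max_docs = max_docs
    @docs = []
  end

  # Append a document, then evict from the front until we fit again.
  def insert(doc)
    @docs << doc
    @docs.shift while @docs.size > @max_docs
    self
  end

  # Return a copy so callers cannot change the stored order.
  def all
    @docs.dup
  end
end

log = CappedCollectionSketch.new(3)
(1..5).each { |n| log.insert('n' => n) }
p log.all
```

After five inserts into a collection capped at three, only the three newest documents remain, in insertion order. That makes the structure a natural fit for logs and recent-activity feeds.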
When I say document, think record or row
The Ruby ecosystem
The Ruby driver
Querying
db.collection.find({'first_name': 'John'}) # finds all Johns
db.collection.find({'first_name': /^wynn/i}) # regex
db.collection.find_one({'_id': 1}) # finds first with _id of 1
db.collection.find({'age': {'$gte': 21}}) # finds possible drinkers
db.collection.find({'author.first_name': 'John'}) # subdocument
db.collection.find({'$where': 'this.age >= 6 && this.age <= 18'}) # JavaScript expression
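To make the selector idea concrete, here is a rough in-memory sketch in plain Ruby of how a query document can be read as a predicate over documents. `dig_path` and `matches?` are hypothetical helpers written for this illustration; they cover only equality, regular expressions, dotted paths, and three operators, and say nothing about how the server evaluates queries.

```ruby
# Hypothetical helper: follow a dotted path such as 'author.first_name'.
def dig_path(doc, path)
  path.split('.').reduce(doc) { |node, key| node.is_a?(Hash) ? node[key] : nil }
end

# Hypothetical helper: does one document satisfy a selector hash?
# Supports equality, Regexp values, and the $gte / $lte / $in operators only.
def matches?(doc, selector)
  selector.all? do |path, condition|
    value = dig_path(doc, path)
    case condition
    when Regexp
      value.is_a?(String) && !(condition =~ value).nil?
    when Hash
      condition.all? do |op, operand|
        case op
        when '$gte' then !value.nil? && value >= operand
        when '$lte' then !value.nil? && value <= operand
        when '$in'  then operand.include?(value)
        else raise ArgumentError, "unsupported operator: #{op}"
        end
      end
    else
      value == condition
    end
  end
end

people = [
  { 'first_name' => 'John', 'age' => 34, 'author' => { 'first_name' => 'John' } },
  { 'first_name' => 'Wynn', 'age' => 19 },
  { 'first_name' => 'Jane', 'age' => 21 }
]

adults = people.select { |doc| matches?(doc, { 'age' => { '$gte' => 21 } }) }
wynns  = people.select { |doc| matches?(doc, { 'first_name' => /^wynn/i }) }
nested = people.select { |doc| matches?(doc, { 'author.first_name' => 'John' }) }

puts adults.map { |doc| doc['first_name'] }.join(', ')
puts wynns.length
puts nested.length
```

The point of the sketch is that a query is itself just a hash, so it can be built, stored, and passed around like any other data.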
Querying
$in, $nin, $all, $ne, $gt, $gte, $lt, $lte, $size, $where
:fields (like :select in active record)
:limit, :offset for pagination
:sort ascending or descending [['foo', 1], ['bar', -1]]
count and group (uses map/reduce)
db.collection.mapReduce(mapfunction,reducefunction[,options]);
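The shape of a map/reduce job can be sketched in plain Ruby. In MongoDB the map and reduce functions are JavaScript and the map function calls `emit(key, value)`; in this illustration the map lambda returns key/value pairs instead, and `map_reduce_sketch` is a hypothetical name, not a driver method.

```ruby
# Map phase: each document yields [key, value] pairs.
# Reduce phase: all values emitted for a key are folded into one result.
def map_reduce_sketch(docs, map_fn, reduce_fn)
  emitted = Hash.new { |hash, key| hash[key] = [] }
  docs.each do |doc|
    map_fn.call(doc).each { |key, value| emitted[key] << value }
  end
  emitted.each_with_object({}) do |(key, values), out|
    out[key] = reduce_fn.call(key, values)
  end
end

posts = [
  { 'title' => 'One',   'tags' => ['example', 'joe'] },
  { 'title' => 'Two',   'tags' => ['example'] },
  { 'title' => 'Three', 'tags' => ['ruby', 'joe'] }
]

# Emit a 1 for every tag on every post, then add the 1s up per tag.
emit_tags = ->(doc) { doc['tags'].map { |tag| [tag, 1] } }
sum_up    = ->(_key, values) { values.inject(0, :+) }

tag_counts = map_reduce_sketch(posts, emit_tags, sum_up)
p tag_counts
```

Counting and grouping fall out of the same pattern: pick the grouping key in the map step and the aggregate in the reduce step.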
More
Updating
$inc, $set, $unset, $push, $pushAll, $addToSet, $pop, $pull, $pullAll, $rename, $bit
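A rough picture of what a few of these modifiers do to a document, sketched in plain Ruby on an in-memory hash. `apply_update` is a hypothetical helper for illustration: it handles top-level keys only and five of the modifiers, and it returns a modified copy rather than changing anything in place on a server.

```ruby
# Hypothetical helper: apply a handful of update modifiers to a copy of a
# document. Top-level keys only; dotted paths are not handled here.
def apply_update(doc, update)
  result = Marshal.load(Marshal.dump(doc)) # deep copy, original untouched
  update.each do |op, fields|
    fields.each do |key, arg|
      case op
      when '$set'   then result[key] = arg
      when '$unset' then result.delete(key)
      when '$inc'   then result[key] = (result[key] || 0) + arg
      when '$push'  then (result[key] ||= []) << arg
      when '$addToSet'
        list = (result[key] ||= [])
        list << arg unless list.include?(arg)
      else
        raise ArgumentError, "unsupported modifier: #{op}"
      end
    end
  end
  result
end

post = { 'title' => 'Yet another blog post', 'views' => 1,
         'tags' => ['example'], 'draft' => true }

changes = {
  '$inc'      => { 'views' => 1 },
  '$push'     => { 'tags' => 'joe' },
  '$addToSet' => { 'tags' => 'example' }, # already present, so no change
  '$set'      => { 'title' => 'Renamed' },
  '$unset'    => { 'draft' => 1 }
}

updated = apply_update(post, changes)
p updated
```

The appeal of modifiers is that the client describes the change, for example "add one to views", instead of reading the whole document, editing it, and writing it back.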
Breakfast with
John Nunemaker
MongoMapper
★Mongo is not MySQL
★DSL for modeling domain should also teach you Mongo
★Now in version 0.8.6
from @jnunemaker
I voted for "Nunemapper"
Features
★Typecasting
★Callbacks (ActiveSupport Callbacks)
★Validations
★Connection and database can differ per document
★Create and Update with single or multiple
★Delete and Destroy and _all counterparts
★Find: id, ids, :all, :first, :last
★Associations
Be careful. Ordering can be tricky.
Example
class User
  include MongoMapper::Document
  key :name, String, :required => true, :length => 5..100
  key :email, String, :required => true, :index => true
  key :age, Integer, :numeric => true
  key :active, Boolean, :default => true
  one :address
  many :articles
end

class Address
  include MongoMapper::Document
  key :street, String
  key :city, String
  key :state, String, :length => 2
  key :zip, Integer, :numeric => true, :length => 5
end
Included as module, not subclassed
Queries
User.where(:age.gt => 27).sort(:age).all
User.where(:age.gt => 27).sort(:age.desc).all
User.where(:age.gt => 27).sort(:age).limit(1).all
User.where(:age.gt => 27).sort(:age).skip(1).limit(1).all
Scopes
class User
  include MongoMapper::Document

  # plain old vanilla scopes with fancy queries
  scope :johns, where(:name => 'John')

  # plain old vanilla scopes with hashes
  scope :bills, :name => 'Bill'

  # dynamic scopes with parameters
  scope :by_name, lambda { |name| where(:name => name) }
  scope :by_ages, lambda { |low, high| where(:age.gte => low, :age.lte => high) }

  # Yep, even plain old methods work as long as they return a query
  def self.by_tag(tag)
    where(:tags => tag)
  end

  # You can even make a method that returns a scope
  def self.twenties; by_ages(20, 29) end

  key :name, String
  key :tags, Array
end
Scopes
# simple scopes
pp User.johns.first
pp User.bills.first

# scope with arg
pp User.by_name('Frank').first

# scope with two args
pp User.by_ages(20, 29).all

# chaining class methods on scopes
pp User.by_ages(20, 40).by_tag('ruby').all

# scope made using method that returns scope
pp User.twenties.all
Mongomatic
Mongomatic
require 'mongomatic'

class User < Mongomatic::Base
  def validate
    self.errors.add "name", "can't be empty" if self["name"].blank?
    self.errors.add "email", "can't be empty" if self["email"].blank?
  end
end

# set the db for all models:
Mongomatic.db = Mongo::Connection.new.db("mongomatic_test")
# or you can set it for a specific model:
User.db = Mongo::Connection.new.db("mongomatic_test_user")
Mongomatic
User.empty?
=> true
u = User.new(:name => "Ben")
=> #<User:0x00000100d0cbf8 @doc={"name"=>"Ben"}, @removed=false>
u.valid?
=> false
u["email"] = "[email protected]"
u.valid?
=> true
u.insert
=> BSON::ObjectId('4c32834f0218236321000001')
User.empty?
=> false
MongoDB is open source
Mongo runs in the cloud
#mongodb
How can you help?
★We need an awesome admin GUI
★Port some plugins (might get easier with ActiveModel support coming soon)
★Build something cool
Lessons learned in production
★The laws of computing are still in effect
★ Indexes are important no matter what the salesman told ya about performance
★Data modeling. Deep or Wide?
★MongoDB and MongoMapper are in active development
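The point about indexes can be illustrated without a database at all. In the sketch below a Ruby Hash stands in for an index; that is an assumption made for brevity, since MongoDB's indexes are B-tree based and also serve range queries. The helper names and the sample data are invented for the example.

```ruby
# Sample data, invented for the illustration.
users = (1..1_000).map { |n| { '_id' => n, 'email' => "user#{n}@example.com" } }

# Without an index: look at documents one by one until something matches.
def scan(docs, field, value)
  docs.find { |doc| doc[field] == value }
end

# With an "index": pay once to build a lookup structure keyed by the field,
# then answer each query with a single lookup.
def build_index(docs, field)
  docs.each_with_object({}) { |doc, index| index[doc[field]] = doc }
end

email_index = build_index(users, 'email')
target = 'user1000@example.com'

by_scan  = scan(users, 'email', target) # walks all 1,000 documents
by_index = email_index[target]          # one hash lookup

puts by_scan.equal?(by_index)
```

Both paths return the same document. The difference is how much work each query costs as the collection grows, which is why an unindexed query that felt fine in development can hurt in production.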
the fine print
The answer is yes!
Very responsive yet very volatile changes!
Upserts FTW!
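An upsert means "update the matching document, or insert one if nothing matches", which suits an aggregator that keeps seeing the same accounts. A minimal in-memory sketch in plain Ruby follows; `upsert` here is a hypothetical helper over an array of hashes, roughly mirroring a `$set` update with the upsert option, and the account data is made up.

```ruby
# Hypothetical helper: update the first document matching the selector,
# or insert a new document built from the selector plus the new fields.
def upsert(collection, selector, fields)
  existing = collection.find do |doc|
    selector.all? { |key, value| doc[key] == value }
  end

  if existing
    existing.merge!(fields)
    :updated
  else
    collection << selector.merge(fields)
    :inserted
  end
end

members = []

# First sighting of an account inserts it; later sightings update it.
first  = upsert(members, { 'screen_name' => 'example_rep' }, { 'followers' => 10 })
second = upsert(members, { 'screen_name' => 'example_rep' }, { 'followers' => 12 })

puts first
puts second
puts members.length
```

The caller never has to ask "does this record exist yet?" first, which removes a read and a race from every sync run.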
10gen
Get the backstory on