Schema design with MongoDB (Dwight Merriman)


Transcript of Schema design with MongoDB (Dwight Merriman)

Page 1: Schema design with MongoDB (Dwight Merriman)

Schema Design with MongoDB

Dwight Merriman, CEO

10gen

Page 2: Schema design with MongoDB (Dwight Merriman)

What is document-oriented?

• JSON objects
• Not relational
• Not OODB
  – Database schema != program “schema”

Page 3: Schema design with MongoDB (Dwight Merriman)

Terms

• Row -> JSON document
• Tables -> collections
• Indexes -> index
• Join -> embedding and linking

Page 4: Schema design with MongoDB (Dwight Merriman)

Choose a schema that

• makes queries easy
• makes queries fast
• facilitates atomicity
• facilitates sharding

Page 5: Schema design with MongoDB (Dwight Merriman)

Key question: embed vs. link

• “Contains relationship”: embed
• Embed = “pre-joined”
• Links: client/server turnarounds
• On a close call, embed. Use rich documents.
• Note the 4MB object size limit
  – Arbitrary limit but pushes one towards good designs
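
A minimal sketch of the embed vs. link trade-off, using plain in-memory JavaScript rather than a live server. The collections, the field names (author_id, comments) and the fetchById helper are assumptions made for illustration; each call to fetchById stands in for one client/server turnaround.

```javascript
// In-memory stand-ins for two collections; all names here are illustrative.
const authors = new Map([[1, { _id: 1, name: "eliot" }]]);
const posts = new Map([
  // Linked design: the post stores only the author's _id.
  [100, { _id: 100, title: "Schema design", author_id: 1 }],
  // Embedded design: the comments live inside the post ("contains" relationship).
  [101, {
    _id: 101,
    title: "Schema design",
    comments: [
      { by: "mathias", text: "..." },
      { by: "mike", text: "..." },
    ],
  }],
]);

let roundTrips = 0;
// Each lookup models one client/server turnaround.
function fetchById(collection, id) {
  roundTrips += 1;
  return collection.get(id);
}

// Linked: two turnarounds (the post, then the author it points to).
roundTrips = 0;
const linkedPost = fetchById(posts, 100);
const linkedAuthor = fetchById(authors, linkedPost.author_id);
const linkedTrips = roundTrips;

// Embedded: one turnaround, the comments arrive "pre-joined".
roundTrips = 0;
const embeddedPost = fetchById(posts, 101);
const embeddedTrips = roundTrips;

console.log(linkedTrips, embeddedTrips, linkedAuthor.name, embeddedPost.comments.length);
```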

Page 11: Schema design with MongoDB (Dwight Merriman)

Map-Reduce

Page 12: Schema design with MongoDB (Dwight Merriman)

{ _id : … }

• Should be:
  – Unique
  – Invariant
  – Ideally, not reused (delete; insert)
  – ObjectID type often best for a sharded collection

Page 14: Schema design with MongoDB (Dwight Merriman)

Single “Table” Inheritance Works Well

> t.find()
{ type: 'irregular-shape', area: 99 }
{ type: 'circle', area: 3.14, radius: 1 }
{ type: 'square', area: 4, d: 2 }
{ type: 'rect', area: 8, x: 2, y: 4 }

> t.find( { radius : { $gt : 2.0 } } )

> t.ensureIndex( { radius : 1 } ) // fine
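
A rough in-memory illustration of why mixed shapes can share one collection: a query on a field that only some documents have simply skips the others. The radiusGreaterThan helper and the extra circle with radius 3 are assumptions added for the example; the filter only approximates what find({ radius : { $gt : 2.0 } }) would match.

```javascript
// The four documents from the slide, plus one extra circle added for illustration.
const shapes = [
  { type: "irregular-shape", area: 99 },
  { type: "circle", area: 3.14, radius: 1 },
  { type: "square", area: 4, d: 2 },
  { type: "rect", area: 8, x: 2, y: 4 },
  { type: "circle", area: 28.27, radius: 3 }, // hypothetical extra document
];

// Approximation of find({ radius: { $gt: min } }):
// documents without a numeric "radius" field do not match.
function radiusGreaterThan(docs, min) {
  return docs.filter((d) => typeof d.radius === "number" && d.radius > min);
}

const bigCircles = radiusGreaterThan(shapes, 2.0);
console.log(bigCircles);
```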

Page 15: Schema design with MongoDB (Dwight Merriman)

(1) Full tree in one document

{ comments: [
    {by: "mathias", text: "...", replies: []},
    {by: "eliot", text: "...", replies: [
        {by: "mike", text: "...", replies: []}
    ]}
] }

Pros:
• Single document to fetch per page
• One location on disk for whole tree
• You can see full structure easily

Cons:
• Hard to search
• Hard to get back partial results
• 4MB limit
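
To make the "hard to search" point concrete, here is a small sketch of the client-side walk needed to locate one author's comments once the whole tree sits in a single document. The findByAuthor helper and its index-path return format are assumptions for illustration, not part of any MongoDB API.

```javascript
// The comment tree from the slide, held as one document.
const page = {
  comments: [
    { by: "mathias", text: "...", replies: [] },
    { by: "eliot", text: "...", replies: [
      { by: "mike", text: "...", replies: [] },
    ] },
  ],
};

// Depth-first walk that returns the index path of every comment by `author`.
// The nesting depth is unbounded, so the search has to recurse in the client.
function findByAuthor(comments, author, path = []) {
  const hits = [];
  comments.forEach((c, i) => {
    const here = path.concat(i);
    if (c.by === author) hits.push(here);
    hits.push(...findByAuthor(c.replies, author, here));
  });
  return hits;
}

const mikeHits = findByAuthor(page.comments, "mike");
console.log(JSON.stringify(mikeHits));
```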

Page 16: Schema design with MongoDB (Dwight Merriman)

(2) Parent Links

> t = db.tree1;

> t.find()
{ "_id" : 1 }
{ "_id" : 2, "parent" : 1 }
{ "_id" : 3, "parent" : 1 }
{ "_id" : 4, "parent" : 2 }
{ "_id" : 5, "parent" : 4 }
{ "_id" : 6, "parent" : 4 }

> // find children of node 4
> t.ensureIndex({parent:1})
> t.find( {parent : 4 } )
{ "_id" : 5, "parent" : 4 }
{ "_id" : 6, "parent" : 4 }

// hard to get all descendants
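
One way to see why descendants are hard with parent links: the client has to issue one query per level of the tree. The sketch below simulates that over an in-memory copy of the documents; findByParents is a hypothetical stand-in for a find on the parent field with $in, and the queries counter exists only to show the cost.

```javascript
// In-memory copy of the tree1 documents from the slide.
const tree1 = [
  { _id: 1 },
  { _id: 2, parent: 1 },
  { _id: 3, parent: 1 },
  { _id: 4, parent: 2 },
  { _id: 5, parent: 4 },
  { _id: 6, parent: 4 },
];

let queries = 0;
// Hypothetical stand-in for t.find({ parent: { $in: ids } }).
function findByParents(ids) {
  queries += 1;
  return tree1.filter((d) => ids.includes(d.parent));
}

// Breadth-first descent: one query per level until a level comes back empty.
function descendants(rootId) {
  const found = [];
  let frontier = [rootId];
  while (frontier.length > 0) {
    frontier = findByParents(frontier).map((d) => d._id);
    found.push(...frontier);
  }
  return found;
}

const under2 = descendants(2);
console.log(under2, "queries:", queries);
```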

Page 17: Schema design with MongoDB (Dwight Merriman)

(3) Array of Ancestors

> t = db.mytree;

> t.find()
{ "_id" : "a" }
{ "_id" : "b", "ancestors" : [ "a" ], "parent" : "a" }
{ "_id" : "c", "ancestors" : [ "a", "b" ], "parent" : "b" }
{ "_id" : "d", "ancestors" : [ "a", "b" ], "parent" : "b" }
{ "_id" : "e", "ancestors" : [ "a" ], "parent" : "a" }
{ "_id" : "f", "ancestors" : [ "a", "e" ], "parent" : "e" }
{ "_id" : "g", "ancestors" : [ "a", "b", "d" ], "parent" : "d" }

> t.ensureIndex( { ancestors : 1 } )

> // find all descendants of b:
> t.find( { ancestors : 'b' })
{ "_id" : "c", "ancestors" : [ "a", "b" ], "parent" : "b" }
{ "_id" : "d", "ancestors" : [ "a", "b" ], "parent" : "b" }
{ "_id" : "g", "ancestors" : [ "a", "b", "d" ], "parent" : "d" }

> // get all ancestors of f:
> anc = db.mytree.findOne({_id:'f'}).ancestors
[ "a", "e" ]
> db.mytree.find( { _id : { $in : anc } } )
{ "_id" : "a" }
{ "_id" : "e", "ancestors" : [ "a" ], "parent" : "a" }
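
The slide does not show how the ancestors array gets filled in. One plausible approach, sketched here in memory, is to copy the parent's array and append the parent's _id at insert time. The insertNode helper and the Map-based store are assumptions for illustration.

```javascript
// In-memory stand-in for the mytree collection, keyed by _id.
const mytree = new Map();

// Insert a node; its ancestors are the parent's ancestors plus the parent itself.
function insertNode(id, parentId) {
  if (parentId === undefined) {
    mytree.set(id, { _id: id }); // root node has no ancestors or parent
    return;
  }
  const parent = mytree.get(parentId);
  if (!parent) throw new Error("unknown parent: " + parentId);
  const ancestors = (parent.ancestors || []).concat(parentId);
  mytree.set(id, { _id: id, ancestors, parent: parentId });
}

insertNode("a");
insertNode("b", "a");
insertNode("d", "b");
insertNode("g", "d");

// Approximation of find({ ancestors: "b" }): every descendant in one pass.
const underB = [...mytree.values()]
  .filter((n) => (n.ancestors || []).includes("b"))
  .map((n) => n._id);

console.log(mytree.get("g").ancestors, underB);
```

The cost of this layout is on the write side: moving a subtree means rewriting the ancestors array of every node beneath it.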

Page 18: Schema design with MongoDB (Dwight Merriman)

Atomicity

• Atomicity at the document level
  – $operators
  – Compare and swap

Page 19: Schema design with MongoDB (Dwight Merriman)

Compare and Swap

> t = db.inventory
> s = t.findOne( {sku:'abc'} )
> qty_old = s.qty;   // remember the value we read
> --s.qty;
> t.update({_id:s._id, qty:qty_old}, s);
> db.getLastErrorObj()
{"err" : null , "updatedExisting" : true , "n" : 1 , "ok" : 1} // it worked

Page 20: Schema design with MongoDB (Dwight Merriman)

Compare and Swap

> t = db.inventory
> s = t.findOne( {sku:'abc'} )
> qty_old = s.qty;   // remember the value we read
> --s.qty;
> t.update({_id:s._id, qty:qty_old}, s);
> db.getLastErrorObj()
{"err" : null , "updatedExisting" : true , "n" : 1 , "ok" : 1} // it worked

Oops?

Page 21: Schema design with MongoDB (Dwight Merriman)

Compare and Swap - Better

> t = db.inventory
> s = t.findOne( {sku:'abc'} )
> obj_old = Object.extend({}, s);
> --s.qty;
> // t.update({_id:s._id, qty:qty_old}, s);
> t.update( obj_old , s);
> // "ok" is 1 even when nothing matched, so check "n" instead
> print( db.getLastErrorObj().n ? "worked" : "try again" );
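
A small simulation of why matching on the whole old object is safer than matching on qty alone. It assumes a second client changes a hypothetical price field between the read and the write; the update function below is an in-memory stand-in that replaces the stored document only when every queried field still matches, and returns n.

```javascript
// One stored document; the "price" field is hypothetical, added to show the problem.
let stored = { _id: 1, sku: "abc", qty: 5, price: 10 };

// Stand-in for t.update(query, replacement): replace the document only if
// every field in `query` equals the stored value. Returns n (0 or 1).
function update(query, replacement) {
  const matches = Object.keys(query).every((k) => stored[k] === query[k]);
  if (!matches) return 0;
  stored = { ...replacement };
  return 1;
}

// Client A reads and decrements its local copy.
const s = { ...stored };
const objOld = { ...s };
s.qty -= 1;

// Client B changes the price before A writes back.
update({ _id: 1 }, { ...stored, price: 12 });
const afterB = { ...stored };

// Matching on qty alone succeeds and silently restores the stale price.
const nQtyOnly = update({ _id: s._id, qty: objOld.qty }, s);
const clobberedPrice = stored.price;

// Replay with a whole-object match: the stale write is rejected.
stored = afterB;
const nWholeObject = update(objOld, s);

console.log(nQtyOnly, clobberedPrice, nWholeObject, stored.price);
```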

Page 22: Schema design with MongoDB (Dwight Merriman)

Compare and Swap – versions

update( { _id : myid, ver : last_ver },
        { $set : { x : "abc", y : 99 }, $inc : { ver : 1 } } )
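
A version check like this is usually wrapped in a retry loop. The sketch below simulates that in memory; updateIfVersion stands in for the update above, and setWithRetry and its interfere hook (which plays the part of a concurrent writer) are hypothetical helpers, not shell functions.

```javascript
// In-memory stand-in for one stored document with a version counter.
let doc = { _id: 7, x: "old", y: 1, ver: 3 };

// Stand-in for update({ _id, ver }, { $set: fields, $inc: { ver: 1 } }).
// Applies the change only if the version is unchanged. Returns n (0 or 1).
function updateIfVersion(id, ver, fields) {
  if (doc._id !== id || doc.ver !== ver) return 0;
  doc = { ...doc, ...fields, ver: doc.ver + 1 };
  return 1;
}

// Retry loop: re-read the version and try again until the write lands.
// Returns the number of attempts used, or -1 if maxAttempts was exhausted.
function setWithRetry(id, fields, maxAttempts, interfere) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const lastVer = doc.ver;            // "read" the current version
    if (interfere) interfere(attempt);  // simulate a concurrent writer
    if (updateIfVersion(id, lastVer, fields) === 1) return attempt;
  }
  return -1;
}

// A competing writer bumps the version during the first attempt only.
const attempts = setWithRetry(7, { x: "abc", y: 99 }, 5, (attempt) => {
  if (attempt === 1) updateIfVersion(7, doc.ver, { y: 2 });
});

console.log(attempts, doc);
```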

Page 23: Schema design with MongoDB (Dwight Merriman)

Of course, don’t bother doing CAS when you don’t have to

> t.update( { sku : "abc", qty : {$gt:0} }, { $inc : { qty : -1 } } )

> db.getLastErrorObj()

{ "updatedExisting" : true , "n" : 1 , "ok" : 1 }   // qty was decremented

{ "updatedExisting" : false , "n" : 0 , "ok" : 1 }  // nothing matched: unknown sku, or qty not > 0
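
The two results can be reproduced with a small in-memory model of the conditional decrement. The decrementIfInStock helper is a hypothetical stand-in for the update with $gt and $inc; the point is that the test and the decrement happen in a single step, so no retry loop is needed.

```javascript
// In-memory stand-in for the inventory collection, with one unit in stock.
const inventory = [{ sku: "abc", qty: 1 }];

// Stand-in for update({ sku, qty: { $gt: 0 } }, { $inc: { qty: -1 } }).
// Returns an object shaped like the getLastError results shown above.
function decrementIfInStock(sku) {
  const item = inventory.find((d) => d.sku === sku && d.qty > 0);
  if (!item) return { updatedExisting: false, n: 0, ok: 1 };
  item.qty -= 1;
  return { updatedExisting: true, n: 1, ok: 1 };
}

const first = decrementIfInStock("abc");  // takes the last unit
const second = decrementIfInStock("abc"); // nothing left to take

console.log(first, second, inventory[0].qty);
```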

Page 26: Schema design with MongoDB (Dwight Merriman)

Sharding and Schemas

• Shard key selection
• Restrictions on unique indexes
• Consider using many collections when that is natural

Page 27: Schema design with MongoDB (Dwight Merriman)

Other Considerations

• Capped collections

Page 28: Schema design with MongoDB (Dwight Merriman)

Questions?

Get involved with the MongoDB project!

Coding, drivers, frameworks, documentation, translation, consulting, evangelism, suggestions, vote on jira… spread the word.