Node.js, MongoDB and You: Part I
Mitch Pirtle jsDay 2014, Verona Italy - @jsdayit
First, tell me about yourselves.
New to Node?
Come on, be honest!
New to Node?
New to MongoDB?
Does your JavaScript totally suck?
Ok, my JavaScript totally sucks.
Now about me.
Mitch Pirtle
• Recovering Joomla! founder
• Mongo Master
• Starting companies since 1995
• Musician, skate punk, football coach
• American idiot living in Turin
Important Mitch Facts
• I am not cool. However I have been called perky.
• I am not Rich. My name is Mitch. Such is life.
• I am internet famous. Just to be clear: Internet Famous + $1.50 = $1.50
About this talk.
Ok, technically there are three talks today.
• Session 1: All about MongoDB (this one)!
• Session 2: All about Node.js (that’s next)
• Session 3: The coolness of both together
That’s a lotta lotta stuff to cover
STAY ALERT
All About MongoDB
• Brief introduction to MongoDB
• CONSOLE!
• Really cool discoveries and surprises
• Shameful admissions and painful stories
In The Beginning
• We had relational databases. Back then they were called “databases” and that’s where you stored your data.
• Primary focus: atomicity, consistency, reliability.
• Was normal to spend 6 hours. ON ONE QUERY.
• I love vacuum tubes, keep you warm in winter.
• Life was good.
What Happened
• Hello, Internet!
• Databases became an immediate source of pain for scale and performance
• Traffic grew, and along with it came bigger expectations, infinitely more complexity, a slew of new platforms, and Big Data™
That sure looks like a nail to me.
Troubled Relations
• Web languages gravitated toward objects, not 3NF entities/relations
• Size of data needed to live on more than one physical machine
• Performance requirements needed to be far better
Along came sharding
• Can split your data across multiple machines
• Also splits your query load across multiple machines
• Like RAID for your data, right?
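A minimal sketch of the idea behind sharding, assuming a simple hash-modulo routing scheme. The shard names, the `shardFor` and `distribute` helpers, and the sample users are all hypothetical, and this is not how any particular database balances data internally; it only shows how a key can decide which machine a record lives on.

```javascript
const crypto = require("crypto");

// Hypothetical shard names; a real cluster has its own topology.
const SHARDS = ["shard0", "shard1", "shard2"];

// Route a shard key to a shard: hash the key, then take the hash modulo N.
function shardFor(key, shards = SHARDS) {
  const digest = crypto.createHash("md5").update(String(key)).digest();
  // Read the first 4 bytes of the digest as an unsigned 32-bit integer.
  const bucket = digest.readUInt32BE(0) % shards.length;
  return shards[bucket];
}

// Place each document on the shard chosen by its key field.
function distribute(docs, keyField, shards = SHARDS) {
  const placement = {};
  for (const name of shards) placement[name] = [];
  for (const doc of docs) {
    placement[shardFor(doc[keyField], shards)].push(doc);
  }
  return placement;
}

const users = [
  { email: "ada@example.com" },
  { email: "grace@example.com" },
  { email: "alan@example.com" },
  { email: "linus@example.com" },
];

const placement = distribute(users, "email");
console.log(placement);
```

The same key always lands on the same shard, so both the data and the query load for that key go to one machine. Changing the number of shards changes the modulo result for most keys, which is one reason real systems use smarter schemes than this.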
What sharding brought along for the ride
• How do you back this stuff up?
• How do you spread a group query across N machines again?
• How do you run a join query that spans a sharded table?
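To make the group query question concrete, here is a rough scatter-gather sketch: every shard computes a partial result on its own data, and a coordinator merges the partials. The in-memory `shards` array and the `partialGroup` and `mergePartials` helpers are hypothetical stand-ins for real machines and a real query router.

```javascript
// Hypothetical per-shard data: each inner array stands in for one machine.
const shards = [
  [{ country: "IT", total: 10 }, { country: "US", total: 5 }],
  [{ country: "IT", total: 7 }],
  [{ country: "US", total: 3 }, { country: "FR", total: 4 }],
];

// Scatter: each shard sums its own documents per group.
function partialGroup(docs, groupField, sumField) {
  const partial = {};
  for (const doc of docs) {
    const group = doc[groupField];
    partial[group] = (partial[group] || 0) + doc[sumField];
  }
  return partial;
}

// Gather: the coordinator adds up the partial sums from every shard.
function mergePartials(partials) {
  const merged = {};
  for (const partial of partials) {
    for (const [group, sum] of Object.entries(partial)) {
      merged[group] = (merged[group] || 0) + sum;
    }
  }
  return merged;
}

const result = mergePartials(
  shards.map((docs) => partialGroup(docs, "country", "total"))
);
console.log(result); // { IT: 17, US: 8, FR: 4 }
```

Sums and counts merge cleanly like this. Joins across shards do not, which is where the pain in the list above comes from.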
All those hours, spent mastering 3NF and procedural programming
IMPORTANT LESSON:
It is REALLY hard to scale a relational database engine.
The common approach pushed logic out of the database back into the application tier.
Then why use a relational database in the first place?
Then there was…
The Promises of MongoDB
• Speed - crazy whack-daddy fast
• Simplicity - JSON documents FTW
• Embedded documents
• 16MB limit per document
• Scale - sharding, replication out of the box
• Yes, I said whack-daddy.
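For a feel of what "JSON documents with embedded documents" means, here is a small sketch. The `product` document and its fields are made up for illustration. The size check uses JSON length as a rough proxy; the real limit applies to the stored BSON size, which will differ somewhat.

```javascript
// Hypothetical product document: variants and tags live inside the
// document itself instead of in separate joined tables.
const product = {
  _id: "sku-1001",
  name: "Baby blanket",
  price: { amount: 19.99, currency: "USD" },
  variants: [
    { color: "blue", stock: 12 },
    { color: "pink", stock: 0 },
  ],
  tags: ["nursery", "cotton"],
};

// 16MB per-document limit mentioned above.
const MAX_DOC_BYTES = 16 * 1024 * 1024;

// Rough size estimate: JSON bytes, used here only as an approximation.
function approxSizeBytes(doc) {
  return Buffer.byteLength(JSON.stringify(doc), "utf8");
}

// Everything needed is already in the one document; no join required.
function inStockColors(doc) {
  return doc.variants.filter((v) => v.stock > 0).map((v) => v.color);
}

console.log(inStockColors(product)); // [ 'blue' ]
console.log(approxSizeBytes(product) < MAX_DOC_BYTES); // true
```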
ENOUGH TALK
BRING ON THE CONSOLE
Wait, there’s more
• Fulltext: Allows for compound indexes, supports many languages
• Sharding: You can scale collections across N machines
• GridFS: Simple interface to store files in your database (CONSOLE!)
• Replication: Replica Sets (one primary plus secondaries, so not true multimaster) make read slaves, failover and redundancy possible
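The GridFS bullet is easier to picture with a sketch of the storage shape: one metadata document per file, plus a series of numbered chunk documents. This is an in-memory illustration, not the driver API. The field names (`files_id`, `n`, `data`, `length`, `chunkSize`) follow the GridFS convention; the 255 KB chunk size, the string file id, and the `toGridFsShape` and `reassemble` helpers are assumptions for the example.

```javascript
// Assumed chunk size; check what your driver actually defaults to.
const CHUNK_SIZE = 255 * 1024;

// Split a buffer into one "files" document and N "chunks" documents.
function toGridFsShape(filename, buffer, chunkSize = CHUNK_SIZE) {
  const fileId = filename; // hypothetical id; real GridFS generates an ObjectId
  const chunks = [];
  for (let offset = 0, n = 0; offset < buffer.length; offset += chunkSize, n++) {
    chunks.push({
      files_id: fileId,
      n, // chunk sequence number, starting at 0
      data: buffer.subarray(offset, offset + chunkSize),
    });
  }
  const file = { _id: fileId, filename, length: buffer.length, chunkSize };
  return { file, chunks };
}

// Reading a file back means sorting its chunks by n and concatenating them.
function reassemble(chunks) {
  const ordered = [...chunks].sort((a, b) => a.n - b.n);
  return Buffer.concat(ordered.map((chunk) => chunk.data));
}

const payload = Buffer.alloc(600 * 1024, 7); // 600 KB of dummy bytes
const { file, chunks } = toGridFsShape("photo.jpg", payload);
console.log(file.length, chunks.length); // 614400 3
```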
Now some cool stories
Mini Case Study: Totsy
• First ecommerce site to rely on MongoDB for all data. Everything. Even product images and associated media.
• I suspected it would be fast.
• I suspected we could develop quickly. (This was important, as they only let me hire one guy.)
So how fast was it?
Launch story
• Went live with MongoDB on a quad-core consumer grade el-cheapo machine, only 2GB RAM.
• I was terrified.
• Over a million moms waiting for the launch.
• Upon launch, load was 0.05. Highest it ever got was around 0.5.
Was development quicker?
Development impact
• Simple models make for less code. There were no sixteen-table joins, no ORM, one result had all the data needed from a single query.
• Less code makes for less bugs. No more six-hour query debugging marathons. No more learning why UNION was faster than JOIN…
• Less bugs leaves time for more code. Did I mention they only let me hire one guy?
Even moar impact
• Used GridFS for all media storage.
• Allowed free MD5 checking for duplicates.
• Allowed storage of metadata per file (views, comments, rates, whatever else we wanted).
• No need for NFS, clumsy rsync cronjobs, high costs of NAS or iSCSI.
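A sketch of how an MD5 per file can catch duplicates and how metadata can ride along with each file. The `MediaStore` class is a hypothetical in-memory stand-in for a files collection, not the GridFS API; only `crypto.createHash` is a real Node.js call here.

```javascript
const crypto = require("crypto");

// Hex MD5 digest of a file's bytes.
function md5(buffer) {
  return crypto.createHash("md5").update(buffer).digest("hex");
}

// Hypothetical in-memory stand-in for a files collection, indexed by MD5.
class MediaStore {
  constructor() {
    this.byHash = new Map();
  }

  // Store a file unless the same bytes were already saved.
  save(filename, buffer, metadata = {}) {
    const hash = md5(buffer);
    const existing = this.byHash.get(hash);
    if (existing) {
      return { duplicate: true, file: existing };
    }
    const file = { filename, md5: hash, length: buffer.length, metadata };
    this.byHash.set(hash, file);
    return { duplicate: false, file };
  }
}

const store = new MediaStore();
const first = store.save("blanket.jpg", Buffer.from("same bytes"), { views: 0 });
const second = store.save("blanket-copy.jpg", Buffer.from("same bytes"));

console.log(first.duplicate, second.duplicate); // false true
console.log(second.file.filename); // blanket.jpg
```

MD5 is fine for spotting accidental duplicates like this; it is not a safe choice where someone might craft collisions on purpose.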
Now some sad stories.
The perils of schemaless
• Started prototyping quickly enough
• Made a couple changes to user model
• Made some more changes…
• WHUPS WHY FIFTEEN KINDS OF USER?!?!
Remember: Always update existing data when changing models.
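One way to avoid fifteen kinds of user is to stamp each document with a schema version and migrate old shapes whenever the model changes. The sketch below runs over an in-memory array; the field names (`name`, `isAdmin`, `firstName`, `lastName`, `roles`, `schemaVersion`) and the v1 to v2 change are invented for illustration, not taken from the actual project.

```javascript
const CURRENT_VERSION = 2;

// Hypothetical change: v1 stored a single `name` and an `isAdmin` flag,
// v2 stores `firstName`, `lastName` and a `roles` array.
function migrateUser(doc) {
  const user = { ...doc }; // never mutate the input document
  const version = user.schemaVersion || 1; // unversioned docs count as v1
  if (version < 2) {
    const [firstName = "", ...rest] = (user.name || "").split(" ");
    user.firstName = firstName;
    user.lastName = rest.join(" ");
    delete user.name;
    user.roles = user.roles || (user.isAdmin ? ["admin"] : ["customer"]);
    delete user.isAdmin;
  }
  user.schemaVersion = CURRENT_VERSION;
  return user;
}

// Three different shapes that piled up during prototyping.
const users = [
  { _id: 1, name: "Ada Lovelace", isAdmin: true },
  { _id: 2, name: "Grace", roles: ["customer"] },
  { _id: 3, firstName: "Alan", lastName: "Turing", roles: ["customer"], schemaVersion: 2 },
];

const migrated = users.map(migrateUser);

// Count distinct shapes by their sorted field names.
const shapes = new Set(migrated.map((u) => Object.keys(u).sort().join(",")));
console.log(shapes.size); // 1
```

Against a real database the same function would be applied to every existing document in the collection as part of the model change, not left for later.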
Everything in the database!
• Backups were brutal
• Forgot to separate GridFS data from main database
• Totally unprepared for the operational impact
Remember: Plan for operational impact BEFORE you launch.
Stump the Geek™
Thanks!
• AboutMe
• @mitchitized - Twitter
• spacemonkey - GitHub
• LinkedIn - I’M AVAILABLE!