An Introduction to MongoDB
Introduction to MongoDB
Matthew Bates
Solution Architect, MongoDB EMEA
2
MongoDB
The leading NoSQL database
Document Database
Open-Source
General Purpose
3
Relational Database Challenges
Data Types
• Unstructured data
• Semi-structured data
• Polymorphic data
Volume of Data
• Petabytes of data
• Trillions of records
• Millions of queries per second
Agile Development
• Iterative
• Short development cycles
• New workloads
New Architectures
• Horizontal scaling
• Commodity servers
• Cloud computing
4
Document Data Model
Relational - Tables
Document - Collections
{
  first_name: 'Paul',
  surname: 'Miller',
  city: 'London',
  location: {
    type: 'Point',
    coordinates: [-0.128, 51.507]
  },
  cars: [
    { model: 'Bentley', year: 1973, value: 100000, … },
    { model: 'Rolls Royce', year: 1965, value: 330000, … }
  ]
}
5
Dynamic Schema Documents
{name: "jeff", eyes: "blue", height: 72, boss: "ben"}
{name: "brendan", aliases: ["el diablo"]}
{name: "ben", hat: "yes"}
{name: "matt", pizza: "DiGiorno", height: 72, boss: "555.555.1212"}
{name: "will", eyes: "blue", birthplace: "NY", aliases: ["bill", "la ciacco"], gender: "???", boss: "ben"}
6
Document Model
• Agility and flexibility
  – Dynamic schema
  – Data models can evolve easily
  – Companies can adapt to changes quickly
• Intuitive, natural data representation
  – Removes the impedance mismatch
  – Many types of applications are a good fit
• Reduces the need for joins and disk seeks
  – Programming is simpler
  – Performance can be delivered at scale
7
Simplify Development
8
Simplify Development
9
Rich Database Interaction
10
MongoDB: An Operational Database
11
Shell and Drivers
Shell: command-line shell for interacting directly with the database
> db.collection.insert({company: "10gen", product: "MongoDB"})
>
> db.collection.findOne()
{
  "_id" : ObjectId("5106c1c2fc629bfe52792e86"),
  "company" : "10gen",
  "product" : "MongoDB"
}
Drivers: drivers for most popular programming languages and frameworks
Java
Python
Perl
Ruby
Haskell
JavaScript
12
MongoDB is Full Featured
• Ad-hoc queries
• Real-time aggregation
• Rich query capabilities
• Index support - secondary, compound, text, geo and more
• Strong (traditional) consistency
• Geospatial features
• Support for most programming languages
• Flexible schema
13
Features In Practice
Queries
• Find Paul's cars
• Find everybody in London with a car built between 1970 and 1980
Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.
Text Search
• Find all the cars described as having leather seats
Aggregation
• Calculate the average value of Paul's car collection
Map Reduce
• What is the ownership pattern of colors by geography over time? (Is purple trending up in China?)
14
Query Example
Rich Queries
• Find Paul's cars
• Find everybody in London with a car built between 1970 and 1980
db.cars.find({
  first_name: 'Paul'
})
db.cars.find({
  city: 'London',
  "cars.year": { $gte: 1970, $lte: 1980 }
})
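To make the second query above concrete, here is a minimal sketch that applies the same conditions to an in-memory array in plain JavaScript, with no MongoDB involved. The `people` array, the owners other than Paul, and the `hasCarBuiltBetween` helper are hypothetical. The helper requires a single car to fall inside the range, which in MongoDB corresponds to wrapping the range in `$elemMatch`; the dotted form on the slide can also match an owner whose cars satisfy the two bounds separately.

```javascript
// Hypothetical in-memory stand-in for the cars collection.
const people = [
  { first_name: "Paul", city: "London",
    cars: [{ model: "Bentley", year: 1973 }, { model: "Rolls Royce", year: 1965 }] },
  // Hypothetical owner: one car before 1970, one after 1980, none in between.
  { first_name: "Anna", city: "London",
    cars: [{ model: "Mini", year: 1962 }, { model: "Golf", year: 1985 }] },
  // Hypothetical owner outside London.
  { first_name: "Marc", city: "Paris",
    cars: [{ model: "DS", year: 1972 }] },
];

// True when at least one single car was built in [from, to].
function hasCarBuiltBetween(person, from, to) {
  return person.cars.some((car) => car.year >= from && car.year <= to);
}

const londonOwners = people
  .filter((p) => p.city === "London" && hasCarBuiltBetween(p, 1970, 1980))
  .map((p) => p.first_name);

console.log(londonOwners); // only Paul qualifies
```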
15
Geo Spatial Example
Geospatial
• Find all of the car owners within 5km of Trafalgar Sq.
db.cars.find({
  location: {
    $near: {
      $geometry: { type: 'Point', coordinates: [-0.128, 51.507] }
    },
    $maxDistance: 5000
  }
})
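The 5 km test in the query above can be sketched outside the database with a great-circle distance check. This is only an illustration of the idea, not how MongoDB evaluates `$near` (which relies on a 2dsphere index on `location`). The `haversineMeters` function, the `owners` array, and the second owner are hypothetical; coordinates are in GeoJSON order, longitude first.

```javascript
const EARTH_RADIUS_M = 6371000; // mean Earth radius, an approximation

// Great-circle distance in metres between two [longitude, latitude] pairs.
function haversineMeters([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

const trafalgarSq = [-0.128, 51.507]; // coordinates used in the slide

const owners = [
  { first_name: "Paul", location: { type: "Point", coordinates: [-0.128, 51.507] } },
  // Hypothetical owner 0.1 degrees of latitude further north (roughly 11 km away).
  { first_name: "Sam", location: { type: "Point", coordinates: [-0.128, 51.607] } },
];

// Keep owners whose location lies within 5000 m of Trafalgar Sq.
const nearby = owners
  .filter((o) => haversineMeters(o.location.coordinates, trafalgarSq) <= 5000)
  .map((o) => o.first_name);

console.log(nearby);
```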
16
Aggregation Framework Example
Aggregation
• Calculate the average value of Paul's car collection
db.cars.aggregate([
  { $match: { "first_name": "Paul" } },
  { $project: { "first_name": 1, "cars": 1 } },
  { $unwind: "$cars" },
  { $group: { _id: "$first_name", average: { $avg: "$cars.value" } } }
])
{ "_id" : "Paul", "average" : 215000 }
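As a rough sketch of what each pipeline stage contributes, the same calculation can be walked through on an in-memory array in plain JavaScript. The `people` array and the intermediate variable names are hypothetical; the two car values come from the sample document, and the elided fields are left out.

```javascript
// Sample document from the slides, reduced to the fields the pipeline uses.
const people = [
  { first_name: "Paul",
    cars: [{ model: "Bentley", value: 100000 }, { model: "Rolls Royce", value: 330000 }] },
];

// $match: keep only Paul's document.
const matched = people.filter((p) => p.first_name === "Paul");

// $unwind: emit one document per element of the cars array.
const unwound = matched.flatMap((p) =>
  p.cars.map((car) => ({ first_name: p.first_name, cars: car }))
);

// $group with $avg: accumulate sum and count per first_name, then divide.
const groups = new Map();
for (const doc of unwound) {
  const g = groups.get(doc.first_name) ?? { sum: 0, count: 0 };
  g.sum += doc.cars.value;
  g.count += 1;
  groups.set(doc.first_name, g);
}
const result = [...groups].map(([id, g]) => ({ _id: id, average: g.sum / g.count }));

console.log(result); // [ { _id: 'Paul', average: 215000 } ]
```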
Scalability
18
Automatic Sharding
• Three types of sharding: hash-based, range-based, tag-aware
• Increase or decrease capacity as you go
• Automatic balancing
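The difference between hash-based and range-based placement can be sketched in a few lines. This is a simplified illustration only: the shard names, the `toyHash` function, and the range boundaries are all hypothetical, and MongoDB's own hashing and chunk management work differently.

```javascript
const SHARDS = ["shard0", "shard1", "shard2"]; // hypothetical shard names

// Toy string hash for illustration; not the hash function MongoDB uses.
function toyHash(key) {
  let h = 0;
  for (const ch of String(key)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep it an unsigned 32-bit value
  }
  return h;
}

// Hash-based: the hash of the key decides the shard, spreading nearby keys apart.
function hashShard(key) {
  return SHARDS[toyHash(key) % SHARDS.length];
}

// Range-based: each shard owns a contiguous key range (upper bound exclusive).
const RANGES = [
  { below: "h", shard: "shard0" },
  { below: "q", shard: "shard1" },
  { below: null, shard: "shard2" }, // null means no upper bound
];

function rangeShard(key) {
  return RANGES.find((r) => r.below === null || key < r.below).shard;
}

console.log(rangeShard("berlin"), rangeShard("london"), rangeShard("tokyo"));
console.log(hashShard("london"));
```

With range-based placement, neighbouring keys such as "london" and "paris" land on the same shard, which suits range scans; hashing trades that locality for a more even spread.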
19
Query Routing
• Multiple query optimization models
• Each sharding option appropriate for different apps
Availability
21
Replica Sets
• Replica Set – two or more copies
• “Self-healing” shard
• Addresses many concerns:
- High Availability
- Disaster Recovery
- Maintenance
22
Replica Set Benefits
Business Needs        Replica Set Benefits
High Availability     Automated failover
Disaster Recovery     Hot backups offsite
Maintenance           Rolling upgrades
Low Latency           Locate data near users
Workload Isolation    Read from non-primary replicas
Data Privacy          Restrict data to physical location
Data Consistency      Tunable consistency
Performance
24
Performance
• Better Data Locality
• In-Memory Caching
• In-Place Updates
25
Disk Seeks and Data Locality
Seek = 5+ ms
Read = really really fast
[Diagram: User, Comment, Article]
26
Disk Seeks and Data Locality
[Diagram: Article, User, five Comments]
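A back-of-the-envelope calculation shows why locality matters. It assumes the two diagrams contrast an article stored apart from its user and five comments with the same data embedded in one document, and that each separately stored record costs one disk seek. Both are assumptions, as are the names `SEEK_MS` and `seekCostMs`; the 5 ms figure is the lower bound quoted above.

```javascript
const SEEK_MS = 5; // lower bound per seek, from "Seek = 5+ ms"

// Total seek time if every separately stored record needs its own seek.
function seekCostMs(separateReads) {
  return separateReads * SEEK_MS;
}

// Assumed layout 1: article, user and five comments stored in separate places.
const separateMs = seekCostMs(1 + 1 + 5);

// Assumed layout 2: user and comments embedded in the article document.
const embeddedMs = seekCostMs(1);

console.log(separateMs, embeddedMs); // 35 5
```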
Use Cases
28
MongoDB Use Cases
Big Data
Product & Asset Catalogs
Security & Fraud
Internet of Things
Database-as-a-Service
Mobile Apps
Customer Data Management
Data Hub
Social & Collaboration
Content Management
Intelligence Agencies
Top Investment and Retail Banks
Top US Retailer
Top Global Shipping Company
Top Industrial Equipment Manufacturer
Top Media Company
Top Investment and Retail Banks
29
Summary
• Document Model
  – Simplify development
  – Simplify scale out
  – Improve performance
• MongoDB
  – Leading NoSQL database
  – Rich general purpose database
  – Fully featured
  – Built-in HA (High Availability) and automated failover
  – Built-in horizontal scale-out
30
Getting Started
http://www.mongodb.org/downloads
31
Online Documentation
http://docs.mongodb.org
32
MongoDB University
http://university.mongodb.com
33
For More Information
Resource               Location
MongoDB Downloads      mongodb.com/download
Free Online Training   education.mongodb.com
Webinars and Events    mongodb.com/events
White Papers           mongodb.com/white-papers
Case Studies           mongodb.com/customers
Presentations          mongodb.com/presentations
Documentation          docs.mongodb.org
Additional Info        [email protected]