Webinar: Common Uses of MongoDB
When to use MongoDB
Part 1 of a series
Real-Time Analytics with MongoDB, April 12th
Content Management with MongoDB, May 17th
@forjared
Emerging NoSQL Space
• The beginning: RDBMS
• Last 10 years: RDBMS, Data Warehouse
• Today: RDBMS, Data Warehouse, NoSQL
Qualities of NoSQL Workloads
Flexible data models
• Lists, Nested Objects
• Sparse schemas
• Semi-structured data
• Agile Development
High Throughput
• Lots of reads
• Lots of writes
Large Data Sizes
• Aggregate data size
• Number of objects
Low Latency
• Both reads and writes
• Millisecond latency
Cloud Computing
• Run anywhere
• No assumptions about hardware
• No / Few Knobs
Commodity Hardware
• Ethernet
• Local disks
MongoDB was designed for this
Flexible data models
• JSON based object model
• Dynamic schemas
High Throughput
• Replica Sets to scale reads
• Sharding to scale writes
Large Data Sizes
• 1000s of shards in a single DB
• Partitioning of data
Low Latency
• In-memory cache
• Scale-out working set
Cloud Computing
• Scale-out to overcome hardware limitations
Commodity Hardware
• Designed for "typical" OS and local file system
Example customers, by use case
• User Data Management
• High Volume Data Feeds
• Content Management
• Operational Intelligence
• Product Data Management
USE CASES THAT LEVERAGE NOSQL
High Volume Data Feeds
Machine Generated Data
• More machines, more sensors, more data
• Variably structured
Stock Market Data
• High frequency trading
Social Media Firehose
• Multiple sources of data
• Each changes its format constantly
High Volume Data Feed
[Diagram: multiple Data Sources]
• Asynchronous writes
• Flexible document model can adapt to changes in sensor format
• Write to memory with periodic disk flush
• Scale writes over multiple shards
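The two write-path ideas on this slide, buffering writes in memory with a periodic flush and spreading writes over several shards, can be sketched in plain JavaScript. This is only an in-memory model under stated assumptions, not MongoDB's implementation: the `ShardedBuffer` class, the `hashKey` function, and the `sensor_id` shard key are all hypothetical names chosen for illustration.

```javascript
// Illustrative only: an in-memory model of buffered, sharded writes.
// Nothing here calls MongoDB; every name is hypothetical.

// Simple deterministic string hash (djb2 variant), used to pick a shard.
function hashKey(key) {
  let h = 5381;
  for (const ch of String(key)) {
    h = ((h * 33) ^ ch.charCodeAt(0)) >>> 0; // keep as unsigned 32-bit
  }
  return h;
}

class ShardedBuffer {
  constructor(shardCount) {
    // Pending writes, one buffer per shard.
    this.buffers = Array.from({ length: shardCount }, () => []);
    // Stands in for each shard's on-disk data.
    this.flushed = Array.from({ length: shardCount }, () => []);
  }

  // Route a document to a shard by hashing its (assumed) shard key.
  shardFor(doc) {
    return hashKey(doc.sensor_id) % this.buffers.length;
  }

  // Write to memory only and return immediately.
  write(doc) {
    this.buffers[this.shardFor(doc)].push(doc);
  }

  // Periodic flush: move everything buffered to the durable store.
  flush() {
    this.buffers.forEach((buf, i) => {
      this.flushed[i].push(...buf);
      buf.length = 0;
    });
  }
}

const store = new ShardedBuffer(4);
// Documents from different sensors need not share a format.
store.write({ sensor_id: "s1", temp: 21.5 });
store.write({ sensor_id: "s2", humidity: 40, unit: "%" });
store.write({ sensor_id: "s1", temp: 21.7 });
store.flush();
```

Because the shard is derived from the key alone, all readings from one sensor land on the same shard, while different sensors spread across the cluster.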
Operational Intelligence
Ad Targeting
• Large volume of state about users
• Very strict latency requirements
Customer Facing Dashboards
• Expose report data to millions of customers
• Report on large volumes of data
• Reports that update in real time
Social Media Monitoring
• Need to join the conversation _now_
Operational Intelligence
[Diagram: Dashboards, API]
• Low latency reads
• Parallelize queries across replicas and shards
• In database aggregation
• Flexible schema adapts to changing input data
• Can use same cluster to collect, store, and report on data
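The idea of running a query on every shard and combining the partial results can be sketched as a scatter-gather over plain arrays. This is a hedged illustration, not MongoDB's query router: `shards`, `countByAd`, `mergeCounts`, and `report` are hypothetical names, the event documents are made up, and the per-shard step runs sequentially here where a real cluster would run it concurrently.

```javascript
// Illustrative scatter-gather: each "shard" is just an array of event documents.
// All names and sample data are hypothetical.

const shards = [
  [{ ad: "ad1", type: "impression" }, { ad: "ad2", type: "click" }],
  [{ ad: "ad1", type: "impression" }, { ad: "ad1", type: "click" }],
  [{ ad: "ad2", type: "impression" }],
];

// Per-shard step: count events by ad, the kind of work an
// in-database aggregation would do next to the data.
function countByAd(events) {
  const counts = {};
  for (const e of events) {
    counts[e.ad] = (counts[e.ad] || 0) + 1;
  }
  return counts;
}

// Merge step: combine the partial counts returned by every shard.
function mergeCounts(partials) {
  const total = {};
  for (const partial of partials) {
    for (const [ad, n] of Object.entries(partial)) {
      total[ad] = (total[ad] || 0) + n;
    }
  }
  return total;
}

// Scatter the query to all shards, then gather and merge the results.
function report() {
  const partials = shards.map((events) => countByAd(events));
  return mergeCounts(partials);
}
```

Only the small per-shard summaries travel to the merge step, which is what keeps dashboard reads cheap as the raw event volume grows.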
Behavioral Profiles
1. See Ad
2. See Ad
3. Click
4. Convert
{
  cookie_id: "1234512413243",
  advertiser: {
    apple: {
      actions: [
        { impression: "ad1", time: 123 },
        { impression: "ad2", time: 232 },
        { click: "ad2", time: 235 },
        { add_to_cart: "laptop", sku: "asdf23f", time: 254 },
        { purchase: "laptop", time: 354 }
      ]
    }
  }
}
• Rich profiles collecting multiple complex actions
• Scale out to support high throughput of activities tracked
• Indexing and querying to support matching, frequency capping
• Dynamic schemas make it easy to track vendor specific attributes
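Frequency capping, mentioned above, means limiting how often one user sees the same ad. A minimal sketch over a profile shaped like the slide's example might look as follows. The `canShow` function, its parameters, and the cap and window values are assumptions made for illustration; the slide does not say how capping is actually computed or what unit `time` is in.

```javascript
// Illustrative frequency capping over a profile shaped like the slide's example.
// canShow, cap and windowSize are hypothetical; time units are unspecified.

const profile = {
  cookie_id: "1234512413243",
  advertiser: {
    apple: {
      actions: [
        { impression: "ad1", time: 123 },
        { impression: "ad2", time: 232 },
        { click: "ad2", time: 235 },
        { add_to_cart: "laptop", sku: "asdf23f", time: 254 },
        { purchase: "laptop", time: 354 },
      ],
    },
  },
};

// Return true if `ad` may be shown again: fewer than `cap` impressions
// of it were recorded in the window (now - windowSize, now].
function canShow(profile, advertiser, ad, now, cap, windowSize) {
  const actions = (profile.advertiser[advertiser] || {}).actions || [];
  const recent = actions.filter(
    (a) => a.impression === ad && a.time > now - windowSize && a.time <= now
  );
  return recent.length < cap;
}

// At time 240 with a cap of 1 per 100 time units:
const ad1Allowed = canShow(profile, "apple", "ad1", 240, 1, 100); // ad1 was last seen at 123
const ad2Allowed = canShow(profile, "apple", "ad2", 240, 1, 100); // ad2 was seen at 232
```

Since every action for a cookie lives in one document, the check needs a single lookup rather than a join across event tables.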
Product Data
E-Commerce Product Catalog
• Diverse product portfolio
• Complex querying and filtering
Flash Sales
• Scale for short bursts of high volume traffic
• Scalable, but consistent view of inventory
Product Data
{ sku: "00e8da9b", type: "MP3",
  details: { artist: "John Coltrane", title: "A love supreme", length: 123 } }
{ sku: "00a9f3a", type: "Book",
  details: { author: "David Eggers", title: "You shall know our velocity", isbn: "0-9703355-5-5" } }
• Flexible data model for similar, but different objects
• Indexing and rich query API for easy searching and sorting
db.products.find({ "details.author": "David Eggers" }).sort({ "details.title": -1 });
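To make the dotted-path matching concrete, here is an in-memory emulation of a find-and-sort of this kind over the two product documents. It is a sketch of the semantics only and says nothing about how MongoDB executes or indexes the query; `getPath` and `findSorted` are hypothetical helpers, not part of any MongoDB API.

```javascript
// In-memory emulation of a dotted-path find with a descending sort.
// getPath and findSorted are hypothetical helpers for illustration.

const products = [
  { sku: "00e8da9b", type: "MP3",
    details: { artist: "John Coltrane", title: "A love supreme", length: 123 } },
  { sku: "00a9f3a", type: "Book",
    details: { author: "David Eggers", title: "You shall know our velocity",
               isbn: "0-9703355-5-5" } },
];

// Resolve a dotted path such as "details.author"; missing fields give undefined.
function getPath(doc, path) {
  return path
    .split(".")
    .reduce((value, key) => (value == null ? undefined : value[key]), doc);
}

// Keep documents whose field at `path` equals `value`,
// then sort them in descending order of the field at `sortPath`.
function findSorted(docs, path, value, sortPath) {
  return docs
    .filter((d) => getPath(d, path) === value)
    .sort((a, b) => {
      const x = getPath(a, sortPath);
      const y = getPath(b, sortPath);
      return x < y ? 1 : x > y ? -1 : 0;
    });
}

const books = findSorted(products, "details.author", "David Eggers", "details.title");
```

The MP3 document has no `details.author` field at all, so it simply fails the match; no placeholder column or null value is needed for it.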
Content Management
News Site
• Comments and user generated content
• Personalization of content, layout
Multi-Device rendering
• Generate layout on the fly for each device that connects
• No need to cache static pages
Sharing
• Store large objects
• Simple modeling of metadata
Content Management
{ camera: "Nikon d4", location: [ -122.418333, 37.775 ] }
{ camera: "Canon 5d mkII", people: [ "Jim", "Carol" ],
  taken_on: ISODate("2012-03-07T18:32:35.002Z") }
{ origin: "facebook.com/photos/xwdf23fsdf", license: "Creative Commons CC0",
  size: { dimensions: [ 124, 52 ], units: "pixels" } }
• Flexible data model for similar, but different objects
• Horizontal scalability for large data sets
• Geo spatial indexing for location based searches
• GridFS for large object storage
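A location based search over documents like these can be illustrated with a plain linear scan. This sketch assumes the `location` array is ordered [longitude, latitude] and uses the haversine formula with a mean Earth radius of 6371 km; a real deployment would rely on a geospatial index rather than scanning. The `distanceKm` and `near` helpers and the third sample document are hypothetical.

```javascript
// Illustrative proximity search over [longitude, latitude] pairs.
// This is a linear scan, not a geospatial index; helper names are hypothetical.

const photos = [
  { camera: "Nikon d4", location: [-122.418333, 37.775] },
  { camera: "Canon 5d mkII", people: ["Jim", "Carol"] }, // no location stored
  { camera: "made-up example", location: [-73.9857, 40.7484] }, // hypothetical document
];

// Great-circle distance in kilometres (haversine formula).
function distanceKm([lng1, lat1], [lng2, lat2]) {
  const rad = (deg) => (deg * Math.PI) / 180;
  const dLat = rad(lat2 - lat1);
  const dLng = rad(lng2 - lng1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(rad(lat1)) * Math.cos(rad(lat2)) * Math.sin(dLng / 2) ** 2;
  return 2 * 6371 * Math.asin(Math.sqrt(a));
}

// Return the documents that have a location within `maxKm` of `center`.
function near(docs, center, maxKm) {
  return docs.filter(
    (d) => Array.isArray(d.location) && distanceKm(d.location, center) <= maxKm
  );
}

// Photos taken within 10 km of a point close to the first document.
const nearby = near(photos, [-122.4, 37.78], 10);
```

Documents without a `location` field are skipped rather than treated as errors, which mirrors the sparse, per-document metadata shown on the slide.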
User Data Management
Video Games
• User state and session management
Social Graphs
• Scale out to large graphs
• Easy to search and process
Identity Management
• Authentication, Authorization and Accounting
User Game State
• Flexible documents support new game features without schema migration
• Sharding enables whole data set to be in memory, ensuring low latency
• JSON data model maps well to HTML5/JS & Flash based clients
• Easy to store entire player state in a single document
Social Graphs
• Documents enable disk locality of all profile data for a user
• Sharding partitions user profiles across available servers
• Native support for Arrays makes it easy to store connections inside user profile
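Storing connections as an array inside each profile can be sketched with plain objects. The field names `_id` and `friends`, the sample users, and the two helper functions are assumptions for illustration only; the slides do not show the actual document layout.

```javascript
// Illustrative user profiles with connections stored in an array.
// Field names, sample users and helpers are hypothetical.

const users = new Map([
  ["alice", { _id: "alice", friends: ["bob", "carol"] }],
  ["bob", { _id: "bob", friends: ["alice", "carol", "dave"] }],
  ["carol", { _id: "carol", friends: ["alice", "bob"] }],
  ["dave", { _id: "dave", friends: ["bob"] }],
]);

// Mutual connections: one profile lookup per user, then an array intersection.
function mutualFriends(a, b) {
  const theirs = new Set(users.get(b).friends);
  return users.get(a).friends.filter((f) => theirs.has(f));
}

// Friends of friends that `id` is not yet connected to.
function suggestions(id) {
  const mine = new Set(users.get(id).friends);
  const found = new Set();
  for (const friend of mine) {
    for (const fof of users.get(friend).friends) {
      if (fof !== id && !mine.has(fof)) found.add(fof);
    }
  }
  return [...found];
}
```

Each profile read returns the user's whole connection list at once, which is the locality benefit the slide describes; the trade-off is that very large connection lists make individual documents large.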
IS MY USE CASE A GOOD FIT FOR MONGODB?
Good fits for MongoDB
Application Characteristic | Why MongoDB might be a good fit
Large number of objects to store | Sharding lets you split objects across multiple servers
High write or read throughput | Sharding + Replication lets you scale read and write traffic across multiple servers
Low Latency Access | Memory Mapped storage engine caches documents in RAM, enabling in-memory performance. Data locality of documents can significantly improve latency over join based approaches
Variable data in objects | Dynamic schema and JSON data model enable flexible data storage without sparse tables or complex joins
Cloud based deployment | Sharding and replication let you work around hardware limitations in clouds
Thanks!
Real-Time Analytics, April 12th
MongoDB and AWS CloudFormation, April 25th
New Aggregation Framework, May 10th
Content Management, May 17th