Webinar: MongoDB and Hadoop - Working Together to provide Business Insights
![Page 1: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/1.jpg)
MongoDB & Hadoop: Providing Business Insights
Thomas Boyd, Senior Solutions Architect, MongoDB
![Page 2: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/2.jpg)
2
What is MongoDB?
The leading NoSQL database
Document Database
Open-Source
General Purpose
![Page 3: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/3.jpg)
3
RDBMS
MongoDB Document Model
MongoDB
{
_id : ObjectId("4c4ba5e5e8aabf3"),
employee_name: "Dunham, Justin",
department : "Marketing",
title : "Product Manager, Web",
report_up: "Neray, Graham",
pay_band: "C",
benefits : [
{ type : "Health",
plan : "PPO Plus" },
{ type : "Dental",
plan : "Standard" }
]
}
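The document above can be mirrored as a nested Python dictionary, which shows how embedded arrays keep related data (here, benefits) inside the same record instead of a separate joined table. This is a minimal sketch using only the standard library; the helper `plans_by_type` is hypothetical, and `_id` is kept as a plain string rather than a driver ObjectId.

```python
import json

# The employee document from the slide, as a plain Python dict.
# Assumption: _id is a string here; a real driver would use an ObjectId type.
employee = {
    "_id": "4c4ba5e5e8aabf3",
    "employee_name": "Dunham, Justin",
    "department": "Marketing",
    "title": "Product Manager, Web",
    "report_up": "Neray, Graham",
    "pay_band": "C",
    "benefits": [
        {"type": "Health", "plan": "PPO Plus"},
        {"type": "Dental", "plan": "Standard"},
    ],
}


def plans_by_type(doc):
    """Hypothetical helper: map each embedded benefit type to its plan."""
    return {b["type"]: b["plan"] for b in doc.get("benefits", [])}


if __name__ == "__main__":
    # No join is needed: the benefits travel with the employee record.
    print(plans_by_type(employee))
    print(json.dumps(employee, indent=2))
```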
![Page 4: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/4.jpg)
4
What is Hadoop?
“The Apache Hadoop software library is a framework that allows for the distributed processing of large data sets across clusters of computers using simple programming models.”*
*source: hadoop.apache.org
• Large datasets
• Analytics
• Batch
• Map-Reduce
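The Map-Reduce model named on this slide can be illustrated with a small in-memory sketch. It is not Hadoop code: it only mimics the map, shuffle, and reduce phases on one machine, and the click-log records and function names are hypothetical.

```python
from collections import defaultdict


def map_phase(records, mapper):
    """Apply the mapper to every record and collect (key, value) pairs."""
    pairs = []
    for record in records:
        pairs.extend(mapper(record))
    return pairs


def shuffle(pairs):
    """Group values by key, as a framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped, reducer):
    """Apply the reducer to each key and its list of values."""
    return {key: reducer(key, values) for key, values in grouped.items()}


# Hypothetical example: count page views per user from a tiny click log.
def view_mapper(click):
    return [(click["user"], 1)]


def sum_reducer(_key, values):
    return sum(values)


clicks = [
    {"user": "alice", "page": "/home"},
    {"user": "bob", "page": "/cart"},
    {"user": "alice", "page": "/cart"},
]

view_counts = reduce_phase(shuffle(map_phase(clicks, view_mapper)), sum_reducer)

if __name__ == "__main__":
    print(view_counts)
```

In a real cluster the map and reduce functions run in parallel on different nodes, and the framework handles the shuffle step across the network.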
![Page 5: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/5.jpg)
5
Enterprise IT Stack
[Diagram: enterprise IT stack]
Applications: CRM, ERP, Collaboration, Mobile, BI
Data Management: Online Data, Offline Data (RDBMS, RDBMS, EDW, Hadoop)
Infrastructure: OS & Virtualization, Compute, Storage, Network
Cross-cutting: Management & Monitoring; Security & Auditing
![Page 6: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/6.jpg)
6
Consideration: Online vs. Offline
Online
• Real-time
• Low-latency
• High availability
vs.
Offline
• Long-running
• High-latency
• Availability is lower priority
![Page 7: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/7.jpg)
7
Consideration: Online vs. Offline
Online vs. Offline
![Page 8: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/8.jpg)
8
Hadoop is good for…
Risk Modeling
Churn Analysis
Recommendation Engine
Ad Targeting
Transaction Analysis
Trade Surveillance
Network Failure Prediction
Search Quality
Data Lake
![Page 9: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/9.jpg)
9
MongoDB is good for…
360 Degree View of the Customer
Mobile & Social Apps
Fraud Detection
User Data Management
Content Management & Delivery
Reference Data
Product Catalogs
Machine to Machine Apps
Data Hub
![Page 10: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/10.jpg)
10
MongoDB and Hadoop: Complementary
Hadoop
• “Data Lake”
• In-depth analytics
MongoDB
• Real-time systems
• Light-weight analytical workloads
![Page 11: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/11.jpg)
11
Use MongoDB+Hadoop Together
E-Commerce
• Products & Inventory
• Real-time recommendations
• Customer profile
• Session management
• Customer clickstream
• Fraud detection
Analysis
• Transaction history
• Clickstream history
• Recommendation model
• Fraud modeling
MongoDB Connector for Hadoop
![Page 12: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/12.jpg)
12
Example – Fraud Detection
Payments
• Online payments processing
Nightly Analysis
• Fraud modeling
MongoDB Connector for Hadoop
Results Cache
3rd Party Data Sources
Fraud Detection
query only
query only
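The flow on this slide (a nightly batch job builds a fraud model, and the online fraud check only reads the cached results) can be sketched in memory as follows. Everything here is an assumption made for illustration: the payment records, the averaging "model", and the threshold rule are hypothetical and are not the method used in the webinar.

```python
from collections import defaultdict

# Hypothetical payment records, standing in for the online payments store.
payments = [
    {"account": "A1", "amount": 20.0},
    {"account": "A1", "amount": 25.0},
    {"account": "B7", "amount": 900.0},
]


def nightly_analysis(history):
    """Batch step: compute each account's average payment amount."""
    totals = defaultdict(lambda: [0.0, 0])
    for p in history:
        totals[p["account"]][0] += p["amount"]
        totals[p["account"]][1] += 1
    return {acct: total / count for acct, (total, count) in totals.items()}


def looks_fraudulent(payment, results_cache, factor=5.0):
    """Online step: read-only lookup against the cached batch results.

    Assumption: a payment is flagged when it exceeds `factor` times the
    account's historical average; unknown accounts are flagged for review.
    """
    average = results_cache.get(payment["account"])
    if average is None:
        return True
    return payment["amount"] > factor * average


# The cache is rebuilt by the batch job and only queried by the online path.
results_cache = nightly_analysis(payments)

if __name__ == "__main__":
    print(looks_fraudulent({"account": "A1", "amount": 200.0}, results_cache))
```

Keeping the online path read-only means the expensive modeling work never competes with payment processing for write capacity.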
![Page 13: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/13.jpg)
13
Customer example – Global Travel Firm
Travel
• Flights, hotels and cars
• Real-time offers
• User profiles, reviews
• User metadata (previous purchases, clicks, views)
Algorithms
• User segmentation
• Offer recommendation engine
• Ad serving engine
• Bundling engine
MongoDB Connector for Hadoop
![Page 14: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/14.jpg)
14
Customer example – MetLife
Insurance
• Insurance policies
• Demographic data
• Customer web data
• Call center data
• Real-time churn detection
Churn Analysis
• Customer action analysis
• Churn prediction algorithms
MongoDB Connector for Hadoop
![Page 15: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/15.jpg)
15
Customer example – Criteo
Ad-Serving
• Catalogs and products
• User profiles
• Clicks
• Views
• Transactions
Algorithms
• User segmentation
• Recommendation engine
• Prediction engine
MongoDB Connector for Hadoop
![Page 16: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/16.jpg)
16
What is MongoDB-Hadoop Connector?
• Java Map-Reduce, Stream Map-Reduce, Pig, & Hive access to MongoDB
– MongoDB as input
• mongo.job.input.format=com.mongodb.hadoop.MongoInputFormat
• mongo.input.uri=mongodb://my-db:27017/db1.collection1
– MongoDB as output
• mongo.job.output.format=com.mongodb.hadoop.MongoOutputFormat
• mongo.output.uri=mongodb://my-db:27017/db1.collection2
– Using MongoDB backup files
• mongo.job.output.format=com.mongodb.hadoop.BSONFileOutputFormat
• mapred.output.dir=file:///results.bson
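As a rough illustration of how the input and output settings on this slide fit together, the sketch below assembles them into a `hadoop jar` argument list using Hadoop's `-D key=value` generic options. It assumes the format classes live in the connector's `com.mongodb.hadoop` package and that the output URI property is `mongo.output.uri`; both should be checked against the connector version in use. The jar name and main class are hypothetical.

```python
def connector_properties(input_uri, output_uri):
    """Return the job properties for MongoDB as both input and output."""
    return {
        "mongo.job.input.format": "com.mongodb.hadoop.MongoInputFormat",
        "mongo.input.uri": input_uri,
        "mongo.job.output.format": "com.mongodb.hadoop.MongoOutputFormat",
        "mongo.output.uri": output_uri,
    }


def hadoop_command(jar, main_class, properties):
    """Build an argument list for `hadoop jar` using -D generic options.

    Assumption: the job's main class runs through ToolRunner or
    GenericOptionsParser, which is what makes Hadoop honour -D arguments.
    """
    args = ["hadoop", "jar", jar, main_class]
    for key, value in properties.items():
        args += ["-D", f"{key}={value}"]
    return args


if __name__ == "__main__":
    props = connector_properties(
        "mongodb://my-db:27017/db1.collection1",
        "mongodb://my-db:27017/db1.collection2",
    )
    # "my-job.jar" and "com.example.MyJob" are placeholders, not real artifacts.
    print(" ".join(hadoop_command("my-job.jar", "com.example.MyJob", props)))
```

Building the command as a list rather than a single string avoids shell-quoting problems if it is later passed to `subprocess.run`.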
![Page 17: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/17.jpg)
17
Enhancing MongoDB-Hadoop Connector
• Version 1.1.0, July 2013
– Pig support
– Hive support
– Streaming support
– Read/Write MongoDB backups
– Update writes
– Much more…
• Version 1.2.0, December 2013
– Apache Hadoop 2.2 support
– Multiple collections as M-R source
– Multiple mongos support
– Custom splitting support
– Performance improvements
![Page 18: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/18.jpg)
18
MongoDB Native Analytics
• Rich query language
• Native secondary indexes
• Geospatial indexes & search
• Text indexes & search
• Aggregation framework
• JavaScript Map-Reduce
• Client-side analytics
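To make the aggregation framework item concrete, here is a small sketch of a pipeline expressed as data, alongside an in-memory function that computes the same grouping so the example runs without a database. The `$match`, `$group`, and `$sum` operators are standard; the field names and sample employees are hypothetical.

```python
from collections import defaultdict

# An aggregation pipeline expressed as data: filter one department,
# then count employees per pay band.
pipeline = [
    {"$match": {"department": "Marketing"}},
    {"$group": {"_id": "$pay_band", "headcount": {"$sum": 1}}},
]


def run_equivalent(docs):
    """In-memory equivalent of the pipeline above, for illustration only.

    Results are sorted by pay band here for determinism; the pipeline
    itself has no $sort stage and so guarantees no particular order.
    """
    counts = defaultdict(int)
    for doc in docs:
        if doc.get("department") == "Marketing":  # $match stage
            counts[doc.get("pay_band")] += 1  # $group with $sum: 1
    return [{"_id": band, "headcount": n} for band, n in sorted(counts.items())]


employees = [
    {"employee_name": "Dunham, Justin", "department": "Marketing", "pay_band": "C"},
    {"employee_name": "Example, Ann", "department": "Marketing", "pay_band": "C"},
    {"employee_name": "Example, Bo", "department": "Marketing", "pay_band": "B"},
    {"employee_name": "Example, Cy", "department": "Sales", "pay_band": "A"},
]

if __name__ == "__main__":
    print(run_equivalent(employees))
```

With a driver such as PyMongo, the same `pipeline` list would typically be passed to a collection's `aggregate` method, so the grouping runs inside the database rather than in client code.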
![Page 19: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/19.jpg)
19
Resources
White paper: Big Data: Examples and Guidelines for the Enterprise Decision Maker
http://www.mongodb.com/lp/whitepaper/big-data-nosql
Recorded Webinar Series: Thrive with Big Data
http://www.mongodb.com/lp/big-data-series
Recorded Webinar: What’s New with MongoDB Hadoop Integration
http://www.mongodb.com/presentations/webinar-whats-new-mongodb-hadoop-integration
Documentation: MongoDB Connector for Hadoop
http://docs.mongodb.org/ecosystem/tools/hadoop/
Trouble Tickets
http://jira.mongodb.org (project = Hadoop Integration)
Subscriptions, support, consulting, training
https://www.mongodb.com/products/how-to-buy
![Page 20: Webinar: MongoDB and Hadoop - Working Together to provide Business Insights](https://reader036.fdocuments.net/reader036/viewer/2022081602/54c63f9b4a7959b07d8b4613/html5/thumbnails/20.jpg)