Real time ecommerce analytics with MongoDB at Gilt Groupe (Michael Bryzek & Michael Nutt)
1
www.gilt.com/invite/michael
Real Time Ecommerce Analytics at Gilt Groupe
Michael Bryzek, CTO & Founder
Michael Nutt, Senior Engineer
Mongo SF - April 30, 2010
We’re hiring: [email protected]
2
What is Gilt Groupe?
The world’s best brands at up to 70% off
Sales start every day at noon
Simple, luxurious online experience
Relentless focus on the customer
. . .
A fast growing young company
3
4
What does noon look like in Tech?
5
What does noon look like?
6
MongoDB at Gilt Groupe
Real time analytics is a sweet spot for MongoDB
Two production examples we’ll share today at Gilt Groupe:
1. Selecting product to sell based on real time data
2. Hummingbird: Real time visualization of site traffic
7
Using MongoDB for Real Time Analytics
Goal: Improve conversion of our gifts section (www.gilt.com/gifts) by ensuring good products are being promoted at the right time
Challenge: High traffic makes it hard to collect and analyze data in a scalable and fast way
Approach:
1. Capture data in real time in MongoDB
2. Analyze with Map Reduce
3. Update transactional systems
4. Repeat
8
Step 1: Data Capture
• Java server speaks JSON/HTTP, writes to MongoDB
• For each page view, the server receives a list of every item on the page and its position via AJAX
• Purchase data sent by background job post purchase
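As a rough illustration of the capture step, the sketch below expands one page-view payload into one record per item on the page. The payload shape (`items`, `saleId`, `userGuid`, and so on) and the function name `toListingVisitRecords` are assumptions made for this example; the talk does not show the JSON the page actually sends. Only the stored field names are taken from the deck's Java snippet.

```javascript
// Hypothetical helper: turns one page-view payload into per-item records.
// The input shape is an assumption; only the output field names come from the talk.
function toListingVisitRecords(pageView) {
  return pageView.items.map(function (item, index) {
    return {
      gift_product_look_guid: item.giftProductLookGuid,
      product_look_guid: item.productLookGuid,
      sale_id: pageView.saleId,
      user_guid: pageView.userGuid,
      subsite_id: pageView.subsiteId,
      created_at: pageView.createdAt,
      position: index + 1 // assumption: positions are 1-based, in page order
    };
  });
}

// Example payload with made-up identifiers.
const records = toListingVisitRecords({
  saleId: 101,
  userGuid: "user-1",
  subsiteId: 1,
  createdAt: new Date(0),
  items: [
    { giftProductLookGuid: "gift-a", productLookGuid: "look-a" },
    { giftProductLookGuid: "gift-b", productLookGuid: "look-b" }
  ]
});
```

Each element of `records` could then be saved as one document, which keeps the per-item position available for later analysis.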
9
Step 1: Data Capture
_db = new Mongo().getDB("gifts");
_listing_visits = _db.getCollection("listing_visits");
--------------------------------------------------
BasicDBObject record = new BasicDBObject();
record.put("gift_product_look_guid", de.giftProductLookGuid);
record.put("product_look_guid", de.productLookGuid);
record.put("sale_id", info.saleId);
record.put("user_guid", info.userGuid);
record.put("subsite_id", info.subsiteId);
record.put("created_at", info.createdAt);
record.put("position", position);
_listing_visits.save(record);
Storing Data in Java
10
Step 2: Map
Calculate a score for each item based on page views, conversion, inventory, and merchandising input
m = function(){
  [snip]
  if ( hourly.visits > 0 && this.quantity_sold > 0 ) {
    var rate = this.quantity_sold / hourly.visits;
    points = parseInt(100*rate);
    v += points;
    explanation += "Conversion rate of " + points + "% ";
  } else if ( hourly.visits == null || hourly.visits == 0 ) {
    v += 500;
    explanation += "Product has never been seen (500 points). ";
  }
  [snip]
  emit( { gift_product_look_guid : this._id }, { score : v, explanation : explanation } );
}
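The conversion branch of the map function can be tried outside MongoDB. The following is a minimal sketch, not the production scoring code: `scoreItem` is a hypothetical name, the snipped inventory and merchandising inputs are left out, and `Math.floor` stands in for the slide's `parseInt`.

```javascript
// Simplified stand-in for the two scoring branches shown on the slide.
// Inventory and merchandising inputs are snipped in the talk and omitted here.
function scoreItem(item, hourly) {
  let score = 0;
  let explanation = "";
  if (hourly.visits > 0 && item.quantity_sold > 0) {
    const rate = item.quantity_sold / hourly.visits;
    const points = Math.floor(100 * rate); // whole percentage points
    score += points;
    explanation += "Conversion rate of " + points + "% ";
  } else if (hourly.visits == null || hourly.visits === 0) {
    // Unseen products get a large bonus so they are shown at least once.
    score += 500;
    explanation += "Product has never been seen (500 points). ";
  }
  return { score: score, explanation: explanation };
}

const seen = scoreItem({ quantity_sold: 5 }, { visits: 200 });
const unseen = scoreItem({ quantity_sold: 0 }, { visits: 0 });
const seenNotSold = scoreItem({ quantity_sold: 0 }, { visits: 10 });
```

With these inputs, a product that sold 5 units over 200 views earns 2 points, an unseen product earns 500, and a product that was viewed but never sold earns nothing from these two branches.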
11
Step 2: Reduce
Reduce is a passthrough
r = function( pid , values ){
  return values[0];
}
Map Reduce runs every 15 minutes via cron – results are stored in a collection named “scores”
res = db.gift_product_looks.mapReduce( m , r , { out : "scores" } );
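A passthrough reduce is enough here because the map step emits exactly one value per product, keyed by its `_id`, so no key ever has several values to combine. The sketch below imitates that behaviour in memory. It is an assumption-laden stand-in, not MongoDB's implementation: `runMapReduce` is a hypothetical helper, `emit` is passed in as an argument rather than being a global, and the sample documents carry a precomputed `score` field for brevity.

```javascript
// In-memory imitation of map/reduce grouping; for illustration only.
function runMapReduce(docs, map, reduce) {
  const groups = new Map();
  const emit = function (key, value) {
    const k = JSON.stringify(key); // group by the serialized key
    if (!groups.has(k)) groups.set(k, { key: key, values: [] });
    groups.get(k).values.push(value);
  };
  docs.forEach(function (doc) { map.call(doc, emit); });
  return Array.from(groups.values()).map(function (g) {
    // As in MongoDB, reduce is skipped when a key has a single value.
    const value = g.values.length === 1 ? g.values[0] : reduce(g.key, g.values);
    return { _id: g.key, value: value };
  });
}

const passthrough = function (pid, values) { return values[0]; };
const mapFn = function (emit) {
  emit({ gift_product_look_guid: this._id }, { score: this.score });
};

// Made-up documents standing in for gift_product_looks.
const scores = runMapReduce(
  [{ _id: "a", score: 12 }, { _id: "b", score: 500 }],
  mapFn,
  passthrough
);
```

Because every key is unique, `scores` holds one entry per product and the reduce function is never asked to merge anything.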
12
Step 3: Update Transactional Systems
• MongoDB and the Java server run on EC2
• Send “scores” collection back to our primary data center, storing latest scores in our primary relational database
• Gift items are always sorted by score – transactional system only needed an “order by score desc” clause
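The ordering rule is simple enough to show in a few lines. This sketch sorts made-up score rows in memory as a stand-in for the relational "order by score desc" clause; the row shape and the name `orderByScoreDesc` are assumptions.

```javascript
// In-memory equivalent of "order by score desc"; rows are invented for the example.
function orderByScoreDesc(rows) {
  // slice() copies the array so the caller's data is left untouched
  return rows.slice().sort(function (a, b) { return b.score - a.score; });
}

const sorted = orderByScoreDesc([
  { gift_product_look_guid: "a", score: 12 },
  { gift_product_look_guid: "b", score: 500 },
  { gift_product_look_guid: "c", score: 40 }
]);
```

Keeping the ranking logic in the scoring job means the page query itself never needs to change when the scoring rules do.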
13
Questions before we move on to Hummingbird?
14
/tracking.gif?events=&prop1=women&server=www.gilt.com&products=&pageName=sales%3A+women&channel=sale&prop4=sale+category+page&u=http%3A%2F%2Fwww.gilt.com%2Fsale%2Fwomen&guid=418237ca-2bc6-932e-84c2-d4f02d9fd5bf&gen=f&uid=25423567&cb=443460396
Tracking Pixels
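A tracking-pixel request carries all of its data in the query string, so the server only has to decode it. The sketch below uses Node's built-in `URL` class on a shortened version of the request above; the function name `parseTrackingRequest` and the choice to drop empty parameters are assumptions, not something the talk specifies.

```javascript
// Decodes the query string of a tracking-pixel request into a plain object.
function parseTrackingRequest(path) {
  // A base URL is required only because the request path is relative.
  const url = new URL(path, "http://www.gilt.com");
  const env = {};
  url.searchParams.forEach(function (value, key) {
    if (value !== "") env[key] = value; // assumption: skip empty parameters such as events=
  });
  return env;
}

const env = parseTrackingRequest(
  "/tracking.gif?events=&prop1=women&pageName=sales%3A+women" +
  "&u=http%3A%2F%2Fwww.gilt.com%2Fsale%2Fwomen&gen=f"
);
```

Here `env.pageName` decodes to `sales: women` and `env.u` to `http://www.gilt.com/sale/women`, giving an object that is ready to be stored as one visit.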
15
GILT
Omniture
DataWarehouse Users
16
GILT
Omniture
DataWarehouse Users
24 hours later...
17
Node.js
Asynchronous, evented web framework
http://nodejs.org
18
var mongo = require('lib/mongodb');
var db = new mongo.Db('hummingbird', new mongo.Server('localhost', 27017, {}), {});
db.createCollection('visits', function(err, collection) {
  db.collection('visits', function(err, collection) {
    collection.insert(env);
  });
});
19
DEMO
20