Social Data and Log Analysis Using MongoDB

Social Data and Log Analysis Using MongoDB 2011/03/01(Tue) #mongotokyo doryokujin

Transcript of Social Data and Log Analysis Using MongoDB

Page 1: Social Data and Log Analysis Using MongoDB

Social Data and Log Analysis Using MongoDB 2011/03/01(Tue) #mongotokyo

doryokujin

Page 2: Social Data and Log Analysis Using MongoDB

Self-Introduction

• doryokujin (Takahiro Inoue), Age: 25

• Education: Keio University

• Master of Mathematics, March 2011 ( Maybe... )

• Major: Randomized Algorithms and Probabilistic Analysis

• Company: Geisha Tokyo Entertainment (GTE)

• Data Mining Engineer (only me, part-time)

• Communities I Organize: MongoDB JP, Tokyo Web Mining

Page 3: Social Data and Log Analysis Using MongoDB

My Job

• I’m a Fledgling Data Scientist

• Development of analytical systems for social data

• Development of recommendation systems for social data

• My Interest: Big Data Analysis

• How to collect logs scattered across many servers

• How to store and access the data

• How to analyze and visualize billions of records

Page 4: Social Data and Log Analysis Using MongoDB

Agenda

• My Company’s Analytic Architecture

• How to Handle Access Logs

• How to Handle User Trace Logs

• How to Collaborate with Front Analytic Tools

• My Future Analytic Architecture

Page 5: Social Data and Log Analysis Using MongoDB

Agenda

• My Company’s Analytic Architecture

• How to Handle Access Logs

• How to Handle User Trace Logs

• How to Collaborate with Front Analytic Tools

• My Future Analytic Architecture

Of Course Everything With

Hadoop, Mongo Map Reduce

Hadoop, Schema Free

REST Interface, JSON

Capped Collection, Modifier Operation

Page 6: Social Data and Log Analysis Using MongoDB

My Company’s Analytic Architecture

Page 7: Social Data and Log Analysis Using MongoDB

Social Game (Mobile): Omiseyasan

• Players enjoy arranging their own shop (and avatar)

• Communicate with other users through shopping, part-time jobs, ...

• Buy seeds of items to display in their own shop

Page 8: Social Data and Log Analysis Using MongoDB

Data Flow

Flash Compose Server

User Game Save Data

Access Logs

User Registration / Charge

User Trace Logs

Access

Page 9: Social Data and Log Analysis Using MongoDB

Back-end Architecture

User Registration / Charge

User Trace Logs

Access Logs

User Game Save Data

Pretreatment: Trimming, Validation, Filtering, ...

As a Central Data Server

Back Up To S3

PyMongo

Dumbo (Hadoop Streaming)

Page 10: Social Data and Log Analysis Using MongoDB

Front-end Architecture

Social Data Analysis Data Analysis

Web UI

sleepy.mongoose(REST Interface)

PyMongo

Page 11: Social Data and Log Analysis Using MongoDB

Environment

• MongoDB: 1.6.4

• PyMongo: 1.9

• Hadoop: CDH2 (updating to CDH3 soon)

• Dumbo: Simple Python Module for Hadoop Streaming

• Cassandra: 0.6.11

• R, Neo4j, jQuery, Munin, ...

• [Data Size (a rough estimate)]

• Access Log 15GB / day ( gzip ) - 2,000M PV

• User Trace Log 5GB / day ( gzip )

Page 12: Social Data and Log Analysis Using MongoDB

How to Handle Access Logs

Page 13: Social Data and Log Analysis Using MongoDB

How to Handle Access Logs

User Registration / Charge

User Trace Logs

Access Logs

User Game Save Data

Pretreatment: Trimming, Validation, Filtering, ...

As a Data Server

Back Up To S3

Page 14: Social Data and Log Analysis Using MongoDB

Access Data Flow

user_access

user_pageview

daily_pageview

agent_pageview

hourly_pageview

Access Logs

Pretreatment

1st Map Reduce

2nd Map Reduce

Group by

Caution: need MongoDB >= 1.7.4

Page 15: Social Data and Log Analysis Using MongoDB

Hadoop

• Using Hadoop: Pretreatment of Raw Records

• [Map / Reduce]

• Read all records

• Split each record by '\s' (whitespace)

• Filter unnecessary records (such as *.swf)

• Check whether each record is valid

• Insert (save) records to MongoDB

※ write operations won’t yet fully utilize all cores
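The pretreatment step above can be sketched in plain Python. This is a minimal sketch, not the deck's actual Dumbo job: the regex assumes the Apache combined log format shown on the next slide, and the output field names follow the user_trace document on Page 17.

```python
import re

# Combined-log pattern (an assumption; the deck only shows sample lines).
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<proto>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)

def pretreat(line):
    """Validate, filter, and trim one raw record; None means 'drop it'."""
    m = LOG_RE.match(line)
    if m is None:
        return None                                      # validation failed
    path = m.group('path').split(';')[0].split('?')[0]   # drop ;jsessionid / query
    if path.endswith('.swf'):
        return None                                      # filter static assets
    return {
        'ipaddr': m.group('ip'),
        'requestTimeStr': m.group('time'),
        'splittedPath': path,
        'statusCode': m.group('status'),
        'responseBodySize': 0 if m.group('size') == '-' else int(m.group('size')),
    }

line = ('110.44.178.25 - - [19/Nov/2010:04:40:40 +0900] '
        '"GET /playshop.4ce13800/battle/BattleSelectAssetPage.html'
        ';jsessionid=XYZ?collection=12 HTTP/1.0" 200 6773 "-" "DoCoMo/2.0 ***"')
doc = pretreat(line)
```

In the real flow each surviving `doc` would then be inserted into MongoDB with PyMongo.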

Page 16: Social Data and Log Analysis Using MongoDB

110.44.178.25 - - [19/Nov/2010:04:40:40 +0900] "GET /playshop.4ce13800/battle/

BattleSelectAssetPage.html;jsessionid=9587B0309581914AB7438A34B1E51125-n15.at3?collec\

tion=12&opensocial_app_id=00000&opensocial_owner_id=00000 HTTP/1.0" 200 6773 "-"

"DoCoMo/2.0 ***"

110.44.178.26 - - [19/Nov/2010:04:40:40 +0900] "GET /playshop.4ce13800/shopping/battle/

ShoppingBattleTopPage.html;jsessionid=D901918E3CAE46E6B928A316D1938C3A-n11.a\

p1?opensocial_app_id=00000&opensocial_owner_id=11111 HTTP/1.0" 200 15254 "-"

"DoCoMo/2.0 ***"

110.44.178.27 - - [19/Nov/2010:04:40:40 +0900] "GET /playshop.4ce13800/battle/

BattleSelectAssetDetailPage;jsessionid=202571F97B444370ECB495C2BCC6A1D5-n14.at11?asse\

t=53&collection=9&opensocial_app_id=00000&opensocial_owner_id=22222 HTTP/1.0" 200

11616 "-" "SoftBank/***"

...(many records)

Access Logs

Page 17: Social Data and Log Analysis Using MongoDB

> db.user_trace.find({userId: "7777", date: "2011-02-12"}).limit(1)

.forEach(printjson)

{

"_id" : "2011-02-12+05:39:31+7777+18343+Access",

"lastUpdate" : "2011-02-19",

"ipaddr" : "202.32.107.166",

"requestTimeStr" : "12/Feb/2011:05:39:31 +0900",

"date" : "2011-02-12",

"time" : "05:39:31",

"responseBodySize" : 18343,

"userAgent" : "DoCoMo/2.0 SH07A3(c500;TB;W24H14)",

"statusCode" : "200",

"splittedPath" : "/avatar2-gree/MyPage,

"userId" : "7777",

"resource" : "/avatar2-gree/MyPage;jsessionid=...?

battlecardfreegacha=1&feed=...&opensocial_app_id=...&opensocial_viewer_id=...&

opensocial_owner_id=..."

}

Collection: user_trace

Page 18: Social Data and Log Analysis Using MongoDB

1st Map Reduce

• [Aggregation]

• Group by url, date, userId

• Group by url, date, userAgent

• Group by url, date, time

• Group by url, date, statusCode

• Map Reduce operations run in parallel on all shards
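What this 1st map-reduce computes can be simulated in a few lines of Python: emitting 1 per record under a compound key and count-reducing is just a Counter over (path, userId, date). The records below are made up but follow the shapes from the slides.

```python
from collections import Counter

# Records as they sit in user_access after pretreatment.
records = [
    {'splittedPath': '/avatar2-gree/MyPage', 'userId': '7777', 'date': '2011-02-12'},
    {'splittedPath': '/avatar2-gree/MyPage', 'userId': '7777', 'date': '2011-02-12'},
    {'splittedPath': '/avatar2-gree/MyPage', 'userId': '8888', 'date': '2011-02-12'},
]

# emit({path, userId, date}, 1) + count-reduce == Counter over the compound key
pageview = Counter(
    (r['splittedPath'], r['userId'], r['date']) for r in records
)
```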

Page 19: Social Data and Log Analysis Using MongoDB

map = Code("""

function(){

emit({

path:this.splittedPath,

userId:this.userId,

date:this.date

},1)}

""")

reduce = Code("""

function(key, values){

var count = 0;

values.forEach(function(v) {

count += (v.count || v); // v is 1 from map, or {count: ...} on re-reduce

});

return {"count": count, "lastUpdate": today};

}

""")

• this.userId

• this.userAgent

• this.timeRange

• this.statusCode

1st Map Reduce with PyMongo

Page 20: Social Data and Log Analysis Using MongoDB

# ( mongodb >= 1.7.4 )

result = db.user_access.map_reduce(map,

reduce,

out="user_pageview",

merge_output=True,

full_response=True,

query={"date": date})

• About the output collection, there are 4 options (MongoDB >= 1.7.4):

• out : overwrite the collection if it already exists

• merge_output : merge new data into the old output collection

• reduce_output : a reduce operation is performed on the two values (the same key in the new result and the old collection) and the result is written to the output collection

• full_response (=False) : if True, return full stats on the operation rather than just the result collection. For inline output, no collection is created and the whole map-reduce happens in RAM; the result set must fit within the 8MB/doc limit (16MB/doc in 1.8?).

Page 21: Social Data and Log Analysis Using MongoDB

Map Reduce (>=1.7.4): out Option in JavaScript

• "collectionName" : If you pass a string indicating the name of a collection, then the output will replace any existing output collection with the same name.

• { merge : "collectionName" } : This option will merge new data into the old output collection. In other words, if the same key exists in both the result set and the old collection, the new key will overwrite the old one.

• { reduce : "collectionName" } : If documents exists for a given key in the result set and in the old collection, then a reduce operation (using the specified reduce function) will be performed on the two values and the result will be written to the output collection. If a finalize function was provided, this will be run after the reduce as well.

• { inline : 1} : With this option, no collection will be created, and the whole map-reduce operation will happen in RAM. Also, the results of the map-reduce will be returned within the result object. Note that this option is possible only when the result set fits within the 8MB limit.

http://www.mongodb.org/display/DOCS/MapReduce
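The effect of the four out modes can be simulated on plain Python dicts keyed by _id (a sketch; the values are illustrative counts, and the "reduce" here is a simple sum):

```python
old = {'a': 1, 'b': 2}   # existing output collection
new = {'b': 20, 'c': 3}  # fresh map-reduce result

replace_out = dict(new)                           # "collectionName": old is dropped
merge_out   = {**old, **new}                      # {merge: ...}: new keys overwrite
reduce_out  = {k: old.get(k, 0) + new.get(k, 0)   # {reduce: ...}: collisions re-reduced
               for k in old.keys() | new.keys()}
inline_out  = dict(new)                           # {inline: 1}: returned in RAM only
```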

Page 22: Social Data and Log Analysis Using MongoDB

> db.user_pageview.find({

"_id.userId": "7777",

"_id.path": /.*MyPage$/,

"_id.date": {$lte: "2011-02-12"}

}).limit(1).forEach(printjson)

#####

{

"_id" : {

"date" : "2011-02-12",

"path" : "/avatar2-gree/MyPage",

"userId" : "7777",

},

"value" : {

"count" : 10,

"lastUpdate" : "2011-02-19"

}

}

• Regular Expression

• <, >, <=, >=

Collection: user_pageview

Page 23: Social Data and Log Analysis Using MongoDB

map = Code("""

function(){

emit({

"path" : this._id.path,

"date": this._id.date,

},{

"pv": this.value.count,

"uu": 1

});

}

""")

reduce = Code("""

function(key, values){

var pv = 0;

var uu = 0;

values.forEach(function(v){

pv += v.pv;

uu += v.uu;

});

return {"pv": pv, "uu": uu};

}

""")

2nd Map Reduce with PyMongo
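The pv/uu roll-up the JavaScript above performs can be sketched in Python: each 1st-stage document adds its count to pv and exactly 1 to uu for its (path, date) pair. The sample documents are made up but follow the user_pageview shape from Page 22.

```python
from collections import defaultdict

first_stage = [
    {'_id': {'path': '/MyPage', 'userId': '7777', 'date': '2011-02-12'},
     'value': {'count': 10}},
    {'_id': {'path': '/MyPage', 'userId': '8888', 'date': '2011-02-12'},
     'value': {'count': 3}},
]

daily = defaultdict(lambda: {'pv': 0, 'uu': 0})
for doc in first_stage:
    key = (doc['_id']['path'], doc['_id']['date'])
    daily[key]['pv'] += doc['value']['count']  # page views: sum the counts
    daily[key]['uu'] += 1                      # unique users: one per document
```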

Page 24: Social Data and Log Analysis Using MongoDB

map = Code("""

function(){

emit({

"path" : this._id.path,

"date": this._id.date,

},{

"pv": this.value.count,

"uu": 1

});

}

""")

reduce = Code("""

function(key, values){

var pv = 0;

var uu = 0;

values.forEach(function(v){

pv += v.pv;

uu += v.uu;

});

return {"pv": pv, "uu": uu};

}

""")

2nd Map Reduce with PyMongo

Must be the same key ({"pv": NaN} if not)
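The warning exists because MongoDB may feed reduce its own output (a re-reduce), so the reduce return value must have the same keys as the emitted values, or the second pass reads undefined fields and produces {"pv": NaN}. A plain-Python sketch of a shape-preserving reduce:

```python
def reduce_fn(values):
    # Returns the same {"pv": ..., "uu": ...} shape it consumes, so it can
    # safely be fed its own output.
    out = {'pv': 0, 'uu': 0}
    for v in values:
        out['pv'] += v['pv']
        out['uu'] += v['uu']
    return out

emitted = [{'pv': 10, 'uu': 1}, {'pv': 3, 'uu': 1}, {'pv': 5, 'uu': 1}]

# MongoDB may reduce in stages; re-reducing a partial result must give the
# same answer as reducing everything at once.
partial = reduce_fn(emitted[:2])
total   = reduce_fn([partial, emitted[2]])
```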

Page 25: Social Data and Log Analysis Using MongoDB

# ( mongodb >= 1.7.4 )

result = db.user_pageview.map_reduce(map,

reduce,

out="daily_pageview",

merge_output=True,

full_response=True,

query={"date": date})

Page 26: Social Data and Log Analysis Using MongoDB

> db.daily_pageview.find({

"_id.date": "2011-02-12",

"_id.path": /.*MyPage$/

}).limit(1).forEach(printjson)

{

"_id" : {

"date" : "2011-02-12",

"path" : "/avatar2-gree/MyPage",

},

"value" : {

"uu" : 53536,

"pv" : 539467

}

}

Collection: daily_pageview

Page 27: Social Data and Log Analysis Using MongoDB

Current Map Reduce is Imperfect

• [Single Thread per Node]

• Doesn't scale map-reduce across multiple threads

• [Overwrites the Output Collection]

• Overwrites the old collection ( no other options like "merge" or "reduce" )

# mapreduce code to merge output (MongoDB < 1.7.4)

result = db.user_access.map_reduce(map,

reduce,

full_response=True,

out="temp_collection",

query={"date": date})

[db.user_pageview.save(doc) for doc in db.temp_collection.find()]
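The merge semantics of this workaround come from save(), which upserts by _id: keys already present in user_pageview are overwritten, everything else is kept. Sketched with plain dicts keyed by _id (values are illustrative):

```python
user_pageview = {'a': {'count': 5}, 'b': {'count': 7}}    # old output
temp_collection = {'b': {'count': 9}, 'c': {'count': 1}}  # new result

for _id, doc in temp_collection.items():
    user_pageview[_id] = doc   # stands in for db.user_pageview.save(doc)
```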

Page 29: Social Data and Log Analysis Using MongoDB

How to Handle User Trace Logs

Page 30: Social Data and Log Analysis Using MongoDB

How to Handle User Trace Logs

User Registration / Charge

User Trace Logs

Access Logs

Game Save Data

Pretreatment: Trimming,Validation, Filtering, ...

As a Data Server

Back Up To S3

Page 31: Social Data and Log Analysis Using MongoDB

User Trace / Charge Data Flow

user_trace

user_charge

daily_charge

daily_trace

User Trace Logs

Pretreatment

User Registration / Charge

Page 32: Social Data and Log Analysis Using MongoDB

User Trace Log

Page 33: Social Data and Log Analysis Using MongoDB

Hadoop

• Using Hadoop: Pretreatment of Raw Records

• [Map / Reduce]

• Split each record by '\s' (whitespace)

• Filter Unnecessary Records

• Check whether records show dishonest user behavior

• Unify the format so records can be summed up ( raw records are written in a free format )

• Sum up records grouped by "userId" and "actionType"

• Insert (save) records to MongoDB

※ write operations won’t yet fully utilize all cores
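The sum-up step above can be sketched in Python; the tuples stand in for already-parsed ActionLogger lines (item names are taken from the slides).

```python
from collections import defaultdict

lines = [
    ('7777', 'a{Make}', 'make item ksutera'),
    ('7777', 'a{Make}', 'make item ksutera'),
    ('7777', 'a{Make}', 'make item makaron'),
]

grouped = defaultdict(lambda: defaultdict(int))
for user_id, action_type, detail in lines:
    grouped[(user_id, action_type)][detail] += 1   # group by userId, actionType

# One document per (userId, actionType), shaped like collection user_trace:
doc = {'userId': '7777', 'actionType': 'a{Make}',
       'actionDetail': dict(grouped[('7777', 'a{Make}')])}
```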

Page 34: Social Data and Log Analysis Using MongoDB

An Example of User Trace Log

UserId ActionType ActionDetail

Page 35: Social Data and Log Analysis Using MongoDB

An Example of User Trace Log

-----Change-----
ActionLogger a{ChangeP} (Point,1371,1383)
ActionLogger a{ChangeP} (Point,2373,2423)

-----Get-----
ActionLogger a{GetMaterial} (syouhinnomoto,0,-1)
ActionLogger a{GetMaterial} usesyouhinnomoto
ActionLogger a{GetMaterial} (omotyanomotoPRO,1,6)

-----Trade-----
ActionLogger a{Trade} buy 3 itigoke-kis from gree.jp:00000 # seen from the other side, a sale

-----Make-----
ActionLogger a{Make} make item kuronekono_n
ActionLogger a{MakeSelect} make item syouhinnomoto
ActionLogger a{MakeSelect} (syouhinnomoto,0,1)

-----PutOn/Off-----
ActionLogger a{PutOff} put off 1 ksuteras
ActionLogger a{PutOn} put 1 burokkus @2500

-----Clear/Clean-----
ActionLogger a{ClearLuckyStar} Clear LuckyItem_1 4 times

-----Gacha-----
ActionLogger a{Gacha} Play gacha with first free play: わくわくおみせ服ガチャ
ActionLogger a{Gacha} Play gacha: わくわくおみせ服ガチャ

The value of "actionDetail" must be in a unified format

Page 36: Social Data and Log Analysis Using MongoDB

> db.user_trace.find({date: "2011-02-12",

actionType: "a{Make}",

userId: "7777"}).forEach(printjson)

{

"_id" : "2011-02-12+7777+a{Make}",

"date" : "2011-02-12"

"lastUpdate" : "2011-02-19",

"userId" : ”7777",

"actionType" : "a{Make}",

"actionDetail" : {

"make item ksutera" : 3,

"make item makaron" : 1,

"make item huwahuwamimiate" : 1,

     …

   }

}

Collection: user_trace

Sum up values grouped by "userId" and "actionType"

Page 37: Social Data and Log Analysis Using MongoDB

> db.daily_trace.find({

date: {$gte: "2011-02-12", $lte: "2011-02-19"},

actionType: "a{Make}"}).forEach(printjson)

{

"_id" : "2011-02-12+group+a{Make}",

"date" : "2011-02-12",

"lastUpdate" : "2011-02-19",

"actionType" : "a{Make}",

"actionDetail" : {

"make item kinnokarakuridokei" : 615,

"make item banjo-" : 377,

"make item itigoke-ki" : 135904,

...

},

...

}...

Collection: daily_trace

Page 38: Social Data and Log Analysis Using MongoDB

User Charge Log

Page 39: Social Data and Log Analysis Using MongoDB

// Top 10 users by total charge on 2011-02-12

> db.user_charge.find({date:"2011-02-12"})

.sort({totalCharge:-1}).limit(10).forEach(printjson)

{

"_id" : "2011-02-12+7777+Charge",

"date" : "2011-02-12",

"lastUpdate" : "2011-02-19",

"totalCharge" : 10000,

"userId" : ”7777",

"actionType" : "Charge",

"boughtItem" : {

"アクセサリーの素EX" : 13,

"コネルギー+6000" : 3,

"アクセサリーの素PRO" : 20

}

}

{…

Collection: user_charge

Sum up values grouped by "userId" and "actionType"

Page 40: Social Data and Log Analysis Using MongoDB

> db.daily_charge.find({date:"2011-02-12",T:"all"})

.limit(10).forEach(printjson)

{

"_id" : "2011-02-12+group+Charge+all+all",

"date" : "2011-02-12",

"total" : 100000,

"UU" : 2000,

"group" : {

  "わくわくポイント" : 1000000,

  "アクセサリー" : 1000000, ...

},

"boughtItemNum" : {

"料理の素EX" : 8,

"アクセサリーの素" : 730, ...

},

"boughtItem" : {

"料理の素EX" : 10000,

"アクセサリーの素" : 100000, ...

}

}

Collection: daily_charge
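Rolling user_charge documents up into a daily_charge-style summary is another group-and-sum pass. A sketch on made-up documents; itemA/itemB are placeholder item names, not the real ones, and only the total/UU/boughtItemNum fields are shown.

```python
user_charge = [
    {'userId': '7777', 'totalCharge': 10000, 'boughtItem': {'itemA': 13, 'itemB': 3}},
    {'userId': '8888', 'totalCharge': 3000,  'boughtItem': {'itemA': 2}},
]

daily = {'total': 0, 'UU': 0, 'boughtItemNum': {}}
for doc in user_charge:
    daily['total'] += doc['totalCharge']   # total charge for the day
    daily['UU'] += 1                       # one charging user per document
    for item, n in doc['boughtItem'].items():
        daily['boughtItemNum'][item] = daily['boughtItemNum'].get(item, 0) + n
```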

Page 41: Social Data and Log Analysis Using MongoDB

Categorize Users

Page 42: Social Data and Log Analysis Using MongoDB

user_registration

user_category

• [Categorize Users]

• by play term

• by total amount of charge

• by registration date

• [ Take a Snapshot of Each Category's Stats per Week ]

Attribution

Categorize Users

user_trace

user_charge

user_savedata

user_pageview

Page 43: Social Data and Log Analysis Using MongoDB

> db.user_registration.find({userId: "7777"}).forEach(printjson)

{

"_id" : "2010-06-29+7777+Registration",

"userId" : ”7777"

"actionType" : "Registration",

"category" : {

“R1” : “True”, # categorize whether resign or not

“T” : “ll” # categorize play term

     …

  },

  “firstCharge” : “2010-07-07”, # date when first charge

  “lastLogin” : “2010-09-30”, # date when last access

  “playTerm” : 94,

  “totalCumlativeCharge” : 50000, # total amount of accounting

  “totalMonthCharge” : 10000, # total amount of accounting recent a month

  …

}

Collection: user_registration

Tagging User

Page 44: Social Data and Log Analysis Using MongoDB

> var cross = new Cross() // user-defined class
> MCResign = cross.calc("2011-02-12", "MC", 1)

// each value is the number of users

// Charge (yen) / Term (day)

0(z) ~¥1k(s) ~¥10k(m) ¥100k~(l) total

~1day(z) 50000 10 5 0 50015

~1week(s) 50000 100 50 3 50153

~1month(m) 100000 200 100 1 100301

~3month(l) 100000 300 50 6 100356

3month~(ll) 0 0 0 0 0

Collection: user_category
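Bucketing a user into a cell of this matrix might look like the following. The thresholds are an assumption read off the row and column labels (the boundary between the ¥10k and ¥100k columns in particular is not stated on the slide).

```python
def charge_bucket(yen):
    # column labels: 0(z)  ~¥1k(s)  ~¥10k(m)  ¥100k~(l)
    if yen == 0:
        return 'z'
    if yen <= 1000:
        return 's'
    if yen <= 10000:
        return 'm'
    return 'l'

def term_bucket(days):
    # row labels: ~1day(z)  ~1week(s)  ~1month(m)  ~3month(l)  3month~(ll)
    if days <= 1:
        return 'z'
    if days <= 7:
        return 's'
    if days <= 30:
        return 'm'
    if days <= 90:
        return 'l'
    return 'll'

# user 7777 from user_registration: playTerm 94, totalCumlativeCharge 50000
cell = (term_bucket(94), charge_bucket(50000))
```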

Page 45: Social Data and Log Analysis Using MongoDB

How to Collaborate WithFront Analytic Tools

Page 46: Social Data and Log Analysis Using MongoDB

Front-end Architecture

Social Data Analysis Data Analysis

Web UI

sleepy.mongoose(REST Interface)

PyMongo

Page 47: Social Data and Log Analysis Using MongoDB

Web UI and Mongo

Page 48: Social Data and Log Analysis Using MongoDB

Data Table: jQuery.DataTables

• Want to Share Daily Summary

• Want to See Data from Many Viewpoints

• Want to Implement Easily

• jQuery.DataTables

1 Variable length pagination

2 On-the-fly filtering

3 Multi-column sorting with data type detection

4 Smart handling of column widths

5 Scrolling options for table viewport

6 ...

Page 49: Social Data and Log Analysis Using MongoDB

Graph: jQuery.HighCharts

• Want to Visualize Data

• Handle Time Series Data Mainly

• Want to Implement Easily

• jQuery.HighCharts

1. Numerous Chart Types

2. Simple Configuration Syntax

3. Multiple Axes

4. Tooltip Labels

5. Zooming

6. ...

Page 50: Social Data and Log Analysis Using MongoDB

sleepy.mongoose

• [REST Interface + Mongo]

• Get Data by HTTP GET/POST Request

• sleepy.mongoose

‣ request as "/db_name/collection_name/_command"

‣ made by a 10gen engineer: @kchodorow

‣ Sleepy.Mongoose: A MongoDB REST Interface

Page 51: Social Data and Log Analysis Using MongoDB

//start server

> python httpd.py

…listening for connections on http://localhost:27080

//connect to MongoDB

> curl --data server=localhost:27017 'http://localhost:27080/_connect'

//request example

> http://localhost:27080/playshop/daily_charge/_find?criteria={}&limit=10&batch_size=10

{"ok": 1, "results": [{“_id": “…”, ”date":… },{“_id”:…}], "id": 0}}

sleepy.mongoose
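Reading that reply from Python is straightforward, since sleepy.mongoose speaks plain JSON over HTTP. A sketch using only the standard library; the URL shape is from the slide, and the raw response string is the slide's example abridged with "..." placeholders.

```python
import json
from urllib.parse import urlencode

# Build a _find request against the playshop/daily_charge collection.
base = 'http://localhost:27080/playshop/daily_charge/_find'
url = base + '?' + urlencode({'criteria': '{}', 'limit': 10, 'batch_size': 10})

# In real use this string would come from urlopen(url).read().
raw = '{"ok": 1, "results": [{"_id": "...", "date": "..."}], "id": 0}'
reply = json.loads(raw)
docs = reply['results'] if reply['ok'] else []
```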

Page 52: Social Data and Log Analysis Using MongoDB

JSON: Mongo <---> Ajax

JSON / GET

• jQuery library and MongoDB are compatible

• It is not necessary to write HTML tags (such as &lt;table&gt;) by hand

sleepy.mongoose(REST Interface)

Page 53: Social Data and Log Analysis Using MongoDB

Example: Web UI

Page 54: Social Data and Log Analysis Using MongoDB
Page 55: Social Data and Log Analysis Using MongoDB
Page 56: Social Data and Log Analysis Using MongoDB

R and Mongo

Page 57: Social Data and Log Analysis Using MongoDB

> db.user_registration.find({userId: "7777"}).forEach(printjson)

{

"_id" : "2010-06-29+7777+Registration",

"userId" : ”7777"

"actionType" : "Registration",

"category" : {

“R1” : “True”, # categorize whether resign or not

“T” : “ll” # categorize play term

     …

  },

  “firstCharge” : “2010-07-07”, # date when first charge

  “lastLogin” : “2010-09-30”, # date when last access

  “playTerm” : 94,

  “totalCumlativeCharge” : 50000, # total amount of accounting

  “totalMonthCharge” : 10000, # total amount of accounting recent a month

  …

}

Collection: user_registration

Want to know the relation between user attributes

Page 58: Social Data and Log Analysis Using MongoDB

##### LOAD LIBRARY #####

library(RCurl)

library(rjson)

##### CONF #####

today.str <- format(Sys.time(), "%Y-%m-%d")

url.base <- "http://localhost:27080"

mongo.db <- "playshop"

mongo.col <- "user_registration"

mongo.base <- paste(url.base, mongo.db, mongo.col, sep="/")

mongo.sort <- ""

mongo.limit <- "limit=100000"

mongo.batch <- "batch_size=100000"

R Code: Access MongoDBUsing sleepy.mongoose

Page 59: Social Data and Log Analysis Using MongoDB

##### FUNCTION #####

find <- function(url){

mongo <- fromJSON(getURL(url))

docs <- mongo$results

makeTable(docs) # My Function

}

# Example

# Using sleepy.mongoose https://github.com/kchodorow/sleepy.mongoose

mongo.criteria <- "_find?criteria={\"totalCumlativeCharge\":{\"$gt\":0,\"$lte\":1000}}"

mongo.query <- paste(mongo.criteria, mongo.sort, mongo.limit, mongo.batch, sep="&")

url <- paste(mongo.base, mongo.query, sep="/")

user.charge.low <- find(url)

R Code: Access MongoDBUsing sleepy.mongoose

Page 60: Social Data and Log Analysis Using MongoDB

# Result: 10th Document

[[10]]$playTerm
[1] 31

[[10]]$lastUpdate
[1] "2011-02-24"

[[10]]$userId
[1] "7777"

[[10]]$totalCumlativeCharge
[1] 10000

[[10]]$lastLogin
[1] "2011-02-21"

[[10]]$date
[1] "2011-01-22"

[[10]]$`_id`
[1] "2011-02-12+18790376+Registration"

...

The Result

Page 61: Social Data and Log Analysis Using MongoDB

# Result: Translate Document to Table

      playTerm totalWinRate totalCumlativeCharge totalCommitNum totalWinNum
 [1,]       56           42                 1000            533         224
 [2,]       57           33                 1000            127          42
 [3,]       57           35                 1000            654         229
 [4,]       18           31                 1000             49          15
 [5,]       77           35                 1000            982         345
 [6,]       77           45                 1000            339         153
 [7,]       31           44                 1000             70          31
 [8,]       76           39                 1000            229          89
 [9,]       40           21                 1000            430          92
[10,]       26           40                 1000             25          10
...

Make a Data Table from The Result

Page 62: Social Data and Log Analysis Using MongoDB

Scatter Plot / Matrix

Each Category

(User Attribution)

# Run as a batch command
$ R --vanilla --quiet < mongo2R.R

Page 63: Social Data and Log Analysis Using MongoDB

Munin and MongoDB

Page 65: Social Data and Log Analysis Using MongoDB

My FutureAnalytic Architecture

Page 66: Social Data and Log Analysis Using MongoDB

user_access

user_trace

User Trace Logs

Access Logs

capped collection (per hour)

Trimming / Filtering / Sum Up

RealTime (hourly)

Flume

daily/hourly_access

daily/hourly_trace

capped collection (per hour)

MapReduce / Modifier / Sum Up

RealTime (hourly)

Realtime Analysis with MongoDB

Page 67: Social Data and Log Analysis Using MongoDB

Flume

Server A

Server B

Server C

Server D

Server E

Server F

Collector MongoDB

Access Log / User Trace Log

Hourly / Realtime

Flume Plugin

Page 68: Social Data and Log Analysis Using MongoDB

> db.flume_capped_21.find().limit(1).forEach(printjson)

{

"_id" : ObjectId("4d658187de9bd9f24323e1b6"),

"timestamp" : "Wed Feb 23 2011 21:52:06 GMT+0000 (UTC)",

"nanoseconds" : NumberLong("562387389278959"),

"hostname" : "ip-10-131-27-115.ap-southeast-1.compute.internal",

"priority" : "INFO",

"message" : "202.32.107.42 - - [14/Feb/2011:04:30:32 +0900] "GET /avatar2-gree.4d537100/res/swf/avatar/18051727/5/useravatar1582476746.swf?opensocial_app_id=472&opensocial_viewer_id=36858644&o

pensocial_owner_id=36858644 HTTP/1.1" 200 33640 "-" "DoCoMo/2.0 SH01C(c500;TB;W24H16)"",

"metadata" : {}

}

An Output From the Mongo-Flume Plugin

Mongo Flume Plugin: https://github.com/mongodb/mongo-hadoop/tree/master/flume_plugin

Page 69: Social Data and Log Analysis Using MongoDB

Summary

Page 70: Social Data and Log Analysis Using MongoDB

Summary

• Almighty as an Analytic Data Server

• schema-free: social game data are changeable

• rich queries: important for analyzing from many points of view

• powerful aggregation: map reduce

• mongo shell: analyzing from the mongo shell is speedy and handy

• More...

• Scalability: replication and sharding are very easy to set up

• Node.js: enables server-side scripting with Mongo

Page 71: Social Data and Log Analysis Using MongoDB

My Presentations

・"Log Analysis of Social Apps Using MongoDB" ~ making full use of MongoDB, from building the analysis infrastructure to the front-end UI ~ :

http://www.slideshare.net/doryokujin/mongodb-uimongodb

・"Building an Analytics Front End with MongoDB and Ajax & Social Data Analysis Using a GraphDB":

http://www.slideshare.net/doryokujin/mongodbajaxgraphdb-5774546

・"Log Analysis of Social Apps Using Hadoop and MongoDB":

http://www.slideshare.net/doryokujin/hadoopmongodb

・"A Thorough Introduction to GraphDB" ~ from structure and internals to use cases and comparisons of various GraphDBs ~ :

http://www.slideshare.net/doryokujin/graphdbgraphdb

Page 72: Social Data and Log Analysis Using MongoDB

I ♥ MongoDB JP

• continue to be an organizer of MongoDB JP

• continue to propose many use cases of MongoDB

• ex: Social Data, Log Data, Medical Data, ...

• support MongoDB users

• by document translation, user-group, IRC, blog, book, twitter,...

• boosting services and products using MongoDB

Page 73: Social Data and Log Analysis Using MongoDB

[Contact me]

twitter: doryokujin

skype: doryokujin

mail: [email protected]

blog: http://d.hatena.ne.jp/doryokujin/

MongoDB JP: https://groups.google.com/group/mongodb-jp?hl=ja

Thank you for coming to Mongo Tokyo!!