Browsemap: Collaborative Filtering at LinkedIn
1
Browsemap: Collaborative Filtering At LinkedIn
Lili Wu, Sam Shah, Sean Choi, Mitul Tiwari, Christian Posse
RSWeb 2014 with RecSys
2
Agenda
• Motivation
• Architecture
• Applications
• Lessons Learned
3
Collaborative filtering for member profiles
Profile Browsemap: “People who viewed this profile also viewed…”
• Count co-views
4
Collaborative filtering for job pages
Job Browsemap: “People who viewed this job also viewed…”
• Count co-views
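The co-view counting behind "People who viewed this … also viewed…" can be sketched in a few lines. This is a minimal illustration, not LinkedIn's implementation: the function names `count_coviews` and `also_viewed`, the per-member input shape, and the sample IDs are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import permutations

def count_coviews(views_by_member):
    """Count how often two entities were viewed by the same member.

    views_by_member maps a member ID to the entity IDs (profiles, jobs, ...)
    that member viewed. This input shape is a hypothetical choice.
    """
    coviews = defaultdict(Counter)
    for viewed in views_by_member.values():
        # Deduplicate so repeat views by one member count once per pair.
        for a, b in permutations(sorted(set(viewed)), 2):
            coviews[a][b] += 1
    return coviews

def also_viewed(coviews, entity, k=3):
    """Return up to k entities most often co-viewed with `entity`."""
    # Sort by descending count, then by ID, so ties are deterministic.
    ranked = sorted(coviews[entity].items(), key=lambda kv: (-kv[1], kv[0]))
    return [other for other, _ in ranked[:k]]

# Hypothetical sample data: three members and the jobs they viewed.
views = {
    "m1": ["job1", "job2", "job3"],
    "m2": ["job1", "job2"],
    "m3": ["job2", "job3"],
}
coviews = count_coviews(views)
print(also_viewed(coviews, "job1"))
```

The same counting applies to any entity type (profile, job, company), which is what makes a shared platform attractive.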
5
Company, group, portfolio, … many CF-based recommenders
6
Challenges
• Many different entities
• Similar problems with different requirements
• Fast product development cycle
• Hybrid recommender systems
• Handle LinkedIn data volume and traffic
7
Challenges
→ Horizontal Platform
8
Browsemap
Collaborative Filtering Platform at LinkedIn
9
Browsemap Platform
• Scalability
  - Online/offline architecture
  - Hundreds of millions of entities, billions of monthly page views
• Browsemap Domain Specific Language (DSL)
  - Code reuse through modular components
  - Flexible computation workflow construction
• Data are used by hybrid recommenders
10-12
Browsemap Architecture
[Diagram, shown in three builds. Components: User Activity Data, HDFS, Hadoop (Browsemap Engine, Browsemap DSL), Voldemort (key-value storage), Online Query API, Frontend Services (Queries / Results). The second build adds the label "High Throughput" and the third adds "Low Latency".]
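The offline/online split in the architecture can be illustrated with a small sketch. It assumes that "High Throughput" refers to the offline Hadoop path and "Low Latency" to the online query path; the slides do not show any code, so the names `offline_batch_job` and `online_query`, the tuple-based activity log, and the plain dict standing in for the key-value store are all hypothetical.

```python
from collections import Counter, defaultdict

# Stand-in for the key-value store; a plain dict is used for illustration only.
kv_store = {}

def offline_batch_job(activity_log, store, top_k=2):
    """Offline side: scan all activity and precompute results per entity."""
    viewed_by = defaultdict(set)
    for member, entity in activity_log:
        viewed_by[member].add(entity)
    counts = defaultdict(Counter)
    for entities in viewed_by.values():
        for a in entities:
            for b in entities:
                if a != b:
                    counts[a][b] += 1
    for entity, counter in counts.items():
        # Rank by descending count, then by ID for deterministic ties.
        ranked = sorted(counter.items(), key=lambda kv: (-kv[1], kv[0]))
        store[entity] = [other for other, _ in ranked[:top_k]]

def online_query(store, entity):
    """Online side: a single key lookup with no computation at request time."""
    return store.get(entity, [])

# Hypothetical (member, viewed entity) events.
log = [("m1", "p1"), ("m1", "p2"), ("m2", "p1"), ("m2", "p2"), ("m2", "p3")]
offline_batch_job(log, kv_store)
print(online_query(kv_store, "p1"))
```

Precomputing in batch keeps the request path to one lookup, at the cost of results being only as fresh as the last batch run.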
13
Browsemap Domain Specific Language (DSL)
[Diagram. Module collection: co-view counting, spam user filtering, expired job filtering, cold-start techniques, … Workflows are assembled from these modules:
• Job browsemap: spam user filtering, co-view counting, expired job filtering, cold-start techniques, …
• Company browsemap: spam user filtering, co-view counting, cold-start techniques, …]
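The idea of building per-entity workflows from a shared module collection can be sketched with ordinary function composition. The actual Browsemap DSL syntax is not shown in the slides, so everything below (the `workflow` helper, the module factories, the event dictionaries) is a hypothetical Python stand-in rather than the DSL itself.

```python
def spam_user_filter(spam_members):
    """Module: drop view events from members flagged as spam."""
    def module(events):
        return [e for e in events if e["member"] not in spam_members]
    return module

def expired_job_filter(expired_jobs):
    """Module: drop view events that point at expired jobs."""
    def module(events):
        return [e for e in events if e["entity"] not in expired_jobs]
    return module

def workflow(*modules):
    """Compose modules left to right into one callable workflow."""
    def run(events):
        for module in modules:
            events = module(events)
        return events
    return run

# Two workflows reuse the same spam filter; only the job one filters expiry.
job_workflow = workflow(spam_user_filter({"bot"}), expired_job_filter({"job9"}))
company_workflow = workflow(spam_user_filter({"bot"}))

events = [
    {"member": "m1", "entity": "job1"},
    {"member": "bot", "entity": "job1"},
    {"member": "m2", "entity": "job9"},
]
print(job_workflow(events))
print(company_workflow(events))
```

A new entity type then only needs a new list of modules, not a new pipeline.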
14
Recap
• Support all entity types
• Adjust to each product requirement
• Scale
15
Agenda
✓ Motivation
✓ Architecture
• Applications
• Lessons Learned
16
Applications – CF-based recommenders
• Profile Browsemap
• Portfolio Browsemap
• Job Browsemap
• Group Browsemap
• Hiring Browsemap
• Company Browsemap
• Influencer Browsemap
17
Applications – Hybrid Recommender Systems
Suggested Profile Update
18
Applications – Hybrid Recommender Systems
Suggested Profile Update
Goal: for each member, find companies they may want to follow
19
Applications – Hybrid Recommender Systems
Companies the member follows: Google, Cisco, …
Companies the member may be interested in: LinkedIn, Facebook, Juniper, Arista, …
Member info:
• Content-based features: title, industry, location, …
• Collaborative filtering feature: co-follow Browsemaps (“People who follow this company also follow these companies”)
20
Applications – Hybrid Recommender Systems
Question: For a company C, will member M like it?
Approach: Logistic regression
Features:
• Member location = company location? 1 if yes, 0 if no
• Company is in the list of the co-follow Browsemaps? 1 if yes, 0 if no
• …
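The logistic regression setup on this slide can be sketched as follows. Only the two binary features come from the slide; the weights, bias, record layout, and sample companies are hypothetical values chosen for illustration, and a real model would learn its weights from follow data.

```python
import math

def features(member, company, cofollow_list):
    """Build the two binary features named on the slide."""
    same_location = 1.0 if member["location"] == company["location"] else 0.0
    in_cofollow = 1.0 if company["name"] in cofollow_list else 0.0
    return [same_location, in_cofollow]

def predict(weights, bias, x):
    """Logistic regression: sigmoid of the weighted feature sum."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights and bias, not learned from any real data.
WEIGHTS = [0.8, 1.5]
BIAS = -1.0

member = {"location": "Bay Area"}
# Co-follows of the companies the member already follows (assumed data).
cofollow = {"Juniper", "Arista"}
juniper = {"name": "Juniper", "location": "Bay Area"}
other = {"name": "Acme", "location": "Elsewhere"}

p_juniper = predict(WEIGHTS, BIAS, features(member, juniper, cofollow))
p_other = predict(WEIGHTS, BIAS, features(member, other, cofollow))
print(round(p_juniper, 3), round(p_other, 3))
```

Here the collaborative filtering signal enters the model as just another feature next to the content-based ones, which is what makes the recommender hybrid.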
21
Applications – Hybrid Recommender Systems
Collaborative filtering is important:
• Surfaces implicit connections between companies
• Based on members’ preferences
22
Agenda
✓ Motivation
✓ Architecture
✓ Applications
• Lessons Learned
Lesson 1: Tall oaks grow from little acorns
23-26
A generic horizontal platform is essential
Lesson 2: One hand washes the other
27
• Job Browsemap (collaborative filtering): “Follower audience”
• Similar Jobs (content-based): “Leader audience”
Lesson 3: You can’t get blood out of a stone
28
Need to handle the cold-start problem
[Diagram: Job 1, Job 2 and Job 3 (new) along a view-time axis, with a "merge" step]
• Leverage browsing history
• Personalized backfill
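One way to picture the "merge" step for a new item with no co-views is to top up a short collaborative filtering list from a fallback list. The slide does not say where the backfill comes from, so using content-similar jobs as the backfill source is an assumption, as are the function name `merge_with_backfill` and the sample data.

```python
def merge_with_backfill(cf_results, backfill, k=3):
    """Keep CF results first, then fill remaining slots from the backfill list."""
    merged = list(cf_results[:k])
    for item in backfill:
        if len(merged) >= k:
            break
        if item not in merged:  # avoid recommending the same item twice
            merged.append(item)
    return merged

# Hypothetical data: job3 is new, so it has no co-view results yet.
cf = {"job1": ["job2", "job4"], "job3": []}
content_similar = {"job1": ["job2", "job5"], "job3": ["job1", "job2"]}

print(merge_with_backfill(cf["job3"], content_similar["job3"]))
print(merge_with_backfill(cf["job1"], content_similar["job1"]))
```

Established items keep their co-view results at the top, while new items still show a full list.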
Lesson 4: A chain is only as strong as its weakest link
29
CF relies solely on user activities, so good data is crucial
• Mistakes can be hard to detect / debug
• Simple mistakes can have a big impact, e.g. “jobid” → “id”
• Need prevention mechanisms
  - Improve tracking
  - Unit tests
  - Browsemap platform data check: input volume, coverage/metrics analysis
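A data check of the kind listed above (input volume and coverage) might look like the sketch below. The thresholds, the record layout, and the function name `check_input` are assumptions for illustration; the slides do not describe how the platform's check is implemented.

```python
def check_input(records, required_field, min_records, min_coverage):
    """Return a list of problems found in a batch of tracking records."""
    problems = []
    # Volume check: a sudden drop in input size often signals broken tracking.
    if len(records) < min_records:
        problems.append(f"input volume too low: {len(records)} < {min_records}")
    # Coverage check: share of records that carry the expected field.
    present = sum(1 for r in records if r.get(required_field) is not None)
    coverage = present / len(records) if records else 0.0
    if coverage < min_coverage:
        problems.append(f"coverage of '{required_field}' too low: {coverage:.0%}")
    return problems

good = [{"jobid": 1}, {"jobid": 2}, {"jobid": 3}]
renamed = [{"id": 1}, {"id": 2}, {"id": 3}]  # the "jobid" -> "id" mistake

print(check_input(good, "jobid", min_records=3, min_coverage=0.9))
print(check_input(renamed, "jobid", min_records=3, min_coverage=0.9))
```

A renamed field leaves the volume unchanged, so only the coverage check catches it, which is why both checks are useful together.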
Lesson 5: User experience matters
30
• 50% CTR
• 500% more applications
• Put recommendations in the user’s flow
31
Conclusion
• Collaborative filtering is important for LinkedIn
• Browsemap has been in production for 3+ years
• A horizontal platform is crucial
32
Questions?
Thank you!