Mendeley Suggest: Engineering a Personalised Article Recommender System
![Page 1: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/1.jpg)
Mendeley Suggest: Engineering a Personalised Article Recommender System
Kris Jack, PhD
Chief Data Scientist
https://twitter.com/_krisjack
![Page 2: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/2.jpg)
Overview
➔ What's Mendeley?
➔ What's Mendeley Suggest?
➔ Computation Layer
➔ Serving Layer
  ➔ Architecture
  ➔ Technologies
  ➔ Deployment
➔ Conclusions
![Page 3: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/3.jpg)
What's Mendeley?
![Page 4: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/4.jpg)
➔ Mendeley is a platform that connects researchers, research data and apps
Mendeley Open API
![Page 5: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/5.jpg)
➔ Mendeley is a platform that connects researchers, research data and apps
➔ Startup company with ~20 R&D engineers
Mendeley Open API
![Page 6: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/6.jpg)
What's Mendeley Suggest?
![Page 7: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/7.jpg)
Use Case
➔ Good researchers are on top of their game
➔ Difficult with the amount being produced
➔ There must be a technology that can help
➔ Help researchers by recommending relevant research
![Page 8: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/8.jpg)
Mendeley Suggest
![Page 9: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/9.jpg)
Computation Layer
![Page 10: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/10.jpg)
Mendeley Suggest
![Page 11: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/11.jpg)
Mendeley Suggest
![Page 12: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/12.jpg)
Mendeley Suggest
![Page 13: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/13.jpg)
Running on Amazon's Elastic MapReduce
On-demand use and easy to cost
![Page 14: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/14.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis) vs. No. Good Recommendations/10 (x-axis)]
![Page 15: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/15.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis) vs. No. Good Recommendations/10 (x-axis)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
![Page 16: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/16.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis) vs. No. Good Recommendations/10 (x-axis)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
![Page 17: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/17.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis) vs. No. Good Recommendations/10 (x-axis)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
![Page 18: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/18.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis, 0 to 7K) vs. No. Good Recommendations/10 (x-axis, 0 to 3)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
![Page 19: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/19.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis, 0 to 7K) vs. No. Good Recommendations/10 (x-axis, 0 to 3)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
Points (Normalised Amazon Hours, No. Good Recommendations/10):
➔ Orig. item-based: 6.5K, 1.5
![Page 20: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/20.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis, 0 to 7K) vs. No. Good Recommendations/10 (x-axis, 0 to 3)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
Points (Normalised Amazon Hours, No. Good Recommendations/10):
➔ Orig. item-based: 6.5K, 1.5
➔ Cust. item-based: 2.4K, 1.5
![Page 21: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/21.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis, 0 to 7K) vs. No. Good Recommendations/10 (x-axis, 0 to 3)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
Points (Normalised Amazon Hours, No. Good Recommendations/10):
➔ Orig. item-based: 6.5K, 1.5
➔ Cust. item-based: 2.4K, 1.5
Annotations: -4.1K (63%); Partitioners, MR allocation
![Page 22: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/22.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis, 0 to 7K) vs. No. Good Recommendations/10 (x-axis, 0 to 3)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
Points (Normalised Amazon Hours, No. Good Recommendations/10):
➔ Orig. item-based: 6.5K, 1.5
➔ Cust. item-based: 2.4K, 1.5
![Page 23: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/23.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis, 0 to 7K) vs. No. Good Recommendations/10 (x-axis, 0 to 3)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
Points (Normalised Amazon Hours, No. Good Recommendations/10):
➔ Orig. item-based: 6.5K, 1.5
➔ Cust. item-based: 2.4K, 1.5
➔ Orig. user-based: 1K, 2.5
![Page 24: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/24.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis, 0 to 7K) vs. No. Good Recommendations/10 (x-axis, 0 to 3)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
Points (Normalised Amazon Hours, No. Good Recommendations/10):
➔ Orig. item-based: 6.5K, 1.5
➔ Cust. item-based: 2.4K, 1.5
➔ Orig. user-based: 1K, 2.5
Annotations: -1.4K (58%); +1 (67%)
![Page 25: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/25.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis, 0 to 7K) vs. No. Good Recommendations/10 (x-axis, 0 to 3)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
Points (Normalised Amazon Hours, No. Good Recommendations/10):
➔ Orig. item-based: 6.5K, 1.5
➔ Cust. item-based: 2.4K, 1.5
➔ Orig. user-based: 1K, 2.5
➔ Cust. user-based: 0.3K, 2.5
![Page 26: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/26.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis, 0 to 7K) vs. No. Good Recommendations/10 (x-axis, 0 to 3)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
Points (Normalised Amazon Hours, No. Good Recommendations/10):
➔ Orig. item-based: 6.5K, 1.5
➔ Cust. item-based: 2.4K, 1.5
➔ Orig. user-based: 1K, 2.5
➔ Cust. user-based: 0.3K, 2.5
Annotations: -0.7K (70%); -4.1K (63%)
![Page 27: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/27.jpg)
Mahout's Performance
Computation Layer: 1.5M Users, 50M Articles
[Chart: Normalised Amazon Hours (y-axis, 0 to 7K) vs. No. Good Recommendations/10 (x-axis, 0 to 3)]
Quadrants: Costly & Bad, Costly & Good, Cheap & Bad, Cheap & Good
Points (Normalised Amazon Hours, No. Good Recommendations/10):
➔ Orig. item-based: 6.5K, 1.5
➔ Cust. item-based: 2.4K, 1.5
➔ Orig. user-based: 1K, 2.5
➔ Cust. user-based: 0.3K, 2.5
Annotations: -6.2K (95%); +1 (67%)
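The percentage annotations on the performance chart can be recomputed from the plotted points. The sketch below does that arithmetic; the point values are taken from the slides, while the dictionary keys and the helper names `percent_change` and `compare` are hypothetical and exist only for illustration.

```python
# Points as plotted on the chart:
# (normalised Amazon hours, good recommendations out of 10).
# Keys are illustrative labels, not names used by Mahout or Mendeley.
points = {
    "orig_item": (6500, 1.5),
    "cust_item": (2400, 1.5),
    "orig_user": (1000, 2.5),
    "cust_user": (300, 2.5),
}


def percent_change(before, after):
    """Relative change from `before` to `after`, rounded to a whole percent."""
    return round(100 * (after - before) / before)


def compare(a, b):
    """Difference between two plotted points, in absolute and relative terms."""
    (hours_a, good_a), (hours_b, good_b) = points[a], points[b]
    return {
        "hours_delta": hours_b - hours_a,
        "hours_pct": percent_change(hours_a, hours_b),
        "good_delta": good_b - good_a,
        "good_pct": percent_change(good_a, good_b),
    }


# Original item-based vs. customised user-based: the overall saving.
overall = compare("orig_item", "cust_user")
print(overall["hours_delta"], overall["hours_pct"], overall["good_pct"])
```

Comparing the original item-based run with the customised user-based run gives the -6.2K hours (95%) and +1 good recommendation (67%) shown on the final build of the chart.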
![Page 28: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/28.jpg)
Mahout as the Computation Layer
➔ Out of the box, didn't work so well for us
➔ Needed to understand Hadoop better
➔ Contributed patch back to community (user-user)
➔ Next step, the serving layer...
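For readers unfamiliar with the user-user approach mentioned above, here is a minimal in-memory sketch of user-based collaborative filtering over user libraries. It is not Mahout's distributed implementation and not Mendeley's customised version: the Jaccard similarity, the neighbourhood size `k`, and all function and variable names are assumptions chosen for illustration, since the slides do not state these details.

```python
from collections import defaultdict


def jaccard(a, b):
    """Overlap between two sets of article ids (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(libraries, user, k=2, n=3):
    """Recommend up to n unseen articles from the k most similar users."""
    own = libraries[user]
    # Rank every other user by similarity to this user's library.
    neighbours = sorted(
        ((jaccard(own, lib), other)
         for other, lib in libraries.items() if other != user),
        reverse=True,
    )[:k]
    # Score each unseen article by the summed similarity of its owners.
    scores = defaultdict(float)
    for similarity, other in neighbours:
        if similarity <= 0:
            continue
        for article in libraries[other] - own:
            scores[article] += similarity
    # Highest score first; ties broken by article id for determinism.
    return sorted(scores, key=lambda a: (-scores[a], a))[:n]


# Toy data: user id -> set of article ids in that user's library.
libraries = {
    "alice": {"a1", "a2", "a3"},
    "bob": {"a1", "a2", "a4"},
    "carol": {"a2", "a3", "a5"},
    "dave": {"a9"},
}
print(recommend(libraries, "alice"))
```

At the scale quoted in the slides (1.5M users, 50M articles) the pairwise similarity step is what makes a distributed job necessary; this toy version only shows the logic.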
![Page 29: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/29.jpg)
Serving Layer
![Page 30: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/30.jpg)
Architecture
[Diagram labels: Computation Layer, Mendeley Hadoop Cluster, User Libraries, Cascading]
![Page 31: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/31.jpg)
Architecture
[Diagram labels: Computation Layer, Serving Layer, Mendeley Hadoop Cluster, User Libraries, Map Reduce, AWS, DynamoDB, ElasticBeanstalk (shown three times)]
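One plausible reading of the architecture diagram is that the computation layer writes precomputed recommendation lists into a key-value store, and the serving layer answers each request with a single lookup. The sketch below illustrates that split with an in-memory dictionary standing in for DynamoDB. The data flow is an assumption inferred from the diagram labels, and every class and function name here is hypothetical.

```python
class InMemoryRecommendationStore:
    """Dictionary-backed stand-in for the key-value store (assumption)."""

    def __init__(self):
        self._items = {}

    def put(self, user_id, article_ids):
        self._items[user_id] = list(article_ids)

    def get(self, user_id):
        # Unknown users get an empty list rather than an error.
        return list(self._items.get(user_id, []))


def load_batch_output(store, batch_rows):
    """Computation-layer hand-off: store each user's precomputed list."""
    for user_id, article_ids in batch_rows:
        store.put(user_id, article_ids)


def serve_recommendations(store, user_id, limit=10):
    """Serving-layer handler: one key lookup, no computation at request time."""
    return store.get(user_id)[:limit]


store = InMemoryRecommendationStore()
load_batch_output(store, [("u1", ["a1", "a2", "a3"]), ("u2", ["a7"])])
print(serve_recommendations(store, "u1", limit=2))
```

Keeping request handling to a lookup is what lets the expensive batch job run periodically and independently of serving traffic.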
![Page 32: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/32.jpg)
Technologies
➔ Spring dependency injection framework
  ➔ Context-wide integration testing is easy, including pre-loading of test data
  ➔ Allows other Spring features (cache, security, messaging)
➔ Spring MVC 3.2.M1
  ➔ Annotated controllers, type conversion 'for free'
  ➔ Asynchronous Servlet 3.0 supports thread 'parking'
➔ AlternatorDB
  ➔ In-memory DynamoDB implementation for testing
![Page 33: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/33.jpg)
Technologies
Recommendation<K>
LongRecommendation, UuidRecommendation
DocumentRecommendation, GroupRecommendation, PersonRecommendation
➔ Build once, employ in several use cases
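The type names on this slide suggest a generic recommendation type specialised by key type and then by use case. The sketch below mirrors that idea in Python. The original is presumably Java, and the slide text does not show which class extends which, so the inheritance mapping, the `item_id` and `score` fields, and the `top_n` helper are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Generic, TypeVar
from uuid import UUID

K = TypeVar("K")


@dataclass(frozen=True)
class Recommendation(Generic[K]):
    """A recommended item identified by a key of type K (fields assumed)."""
    item_id: K
    score: float


@dataclass(frozen=True)
class LongRecommendation(Recommendation[int]):
    """Recommendation keyed by an integer id."""


@dataclass(frozen=True)
class UuidRecommendation(Recommendation[UUID]):
    """Recommendation keyed by a UUID."""


# Which use case uses which key type is a guess, not taken from the slides.
@dataclass(frozen=True)
class DocumentRecommendation(UuidRecommendation):
    pass


@dataclass(frozen=True)
class GroupRecommendation(LongRecommendation):
    pass


@dataclass(frozen=True)
class PersonRecommendation(LongRecommendation):
    pass


def top_n(recommendations, n):
    """Shared logic written once against the base type."""
    return sorted(recommendations, key=lambda r: r.score, reverse=True)[:n]


docs = [
    DocumentRecommendation(UUID(int=1), 0.2),
    DocumentRecommendation(UUID(int=2), 0.9),
]
people = [PersonRecommendation(42, 0.5), PersonRecommendation(7, 0.8)]
print(top_n(docs, 1), top_n(people, 1))
```

Because `top_n` depends only on the base type, the same ranking code serves documents, groups and people, which is the "build once" point of the slide.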
![Page 34: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/34.jpg)
Deployment
➔ AWS ElasticBeanstalk
  ➔ Managed, auto-scaling, health-checking .war container
➔ Jenkins continuous integration (CI) server
➔ Maven build tool (useful dependency management)
➔ beanstalk-maven-plugin (push a button to deploy)
  ➔ Deploys to ElasticBeanstalk
  ➔ Replaces existing application version if required
  ➔ 'Zero downtime' updates (tested at ~300ms)
  ➔ Triggered by Jenkins
![Page 35: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/35.jpg)
Putting it all together... $$$
➔ Real-time article recommendations for 2 million users
➔ 20 requests per second
➔ $65.84/month
  ➔ $34.24 ElasticBeanstalk
  ➔ $28.17 DynamoDB
  ➔ $2.76 bandwidth
➔ $30 to update the computation layer periodically
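The cost figures above can be put in per-request terms with a little arithmetic. The sketch below sums the itemised costs and derives a cost per million requests. It assumes a 30-day month and that the 20 requests per second is a sustained average, neither of which the slide states, and the variable names are illustrative only.

```python
from decimal import Decimal

# Itemised monthly serving costs as listed on the slide (USD).
itemised = {
    "ElasticBeanstalk": Decimal("34.24"),
    "DynamoDB": Decimal("28.17"),
    "bandwidth": Decimal("2.76"),
}
stated_total = Decimal("65.84")

itemised_total = sum(itemised.values(), Decimal("0"))
# Portion of the stated total that the slide does not itemise.
unitemised = stated_total - itemised_total

# Assumption: 20 requests/second sustained over a 30-day month.
requests_per_second = 20
seconds_per_month = 60 * 60 * 24 * 30
requests_per_month = requests_per_second * seconds_per_month

cost_per_million = stated_total / (Decimal(requests_per_month) / Decimal(1_000_000))

print(itemised_total, unitemised, requests_per_month, round(cost_per_million, 2))
```

The three listed items sum to $65.17, so $0.67 of the stated $65.84 is not itemised on the slide. Under the assumptions above, serving works out to roughly $1.27 per million requests, before the $30 for each computation-layer update.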
![Page 36: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/36.jpg)
Conclusions
![Page 37: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/37.jpg)
Conclusions
➔ Mendeley Suggest is a personalised article recommender
➔ Built by small team for big data
➔ Uses Mahout as computation layer
  ➔ Needs some love out of the box
➔ Serves from AWS
  ➔ Reduces maintenance costs and is reliable
➔ Intend to release Mendeley Suggest to all users this year
![Page 38: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/38.jpg)
We're Hiring!
➔ Data Scientist
  ➔ apply recommender technologies to Mendeley's data
  ➔ work on improving the quality of Mendeley's research catalogue
  ➔ starting in first quarter of 2013
  ➔ 6-month secondment at the KNOW Center, TU Graz, Austria as part of the EC FP7 TEAM project (http://team-project.tugraz.at/)
➔ http://www.mendeley.com/careers/
![Page 39: Mendeley Suggest: Engineering a Personalised Article Recommender System](https://reader034.fdocuments.net/reader034/viewer/2022052303/554f0d9ab4c90577698b55a4/html5/thumbnails/39.jpg)
www.mendeley.com