NoTube: User Profiling (Beancounter)

WP 3: User profiling and Recommendation (Part 1). BBC, Pro-netics, VUA. Wednesday, March 28, 2012


Transcript of NoTube: User Profiling (Beancounter)

Page 1: NoTube: User Profiling (Beancounter)

WP 3: User profiling and Recommendation (Part 1)

BBC, Pro-netics, VUA


Page 2: NoTube: User Profiling (Beancounter)

Contents

26-27 March 2012, NoTube 3rd Review

Overview

User profiling
• General goal & approach
• From activity streams to profile
• Issues
• Analytics
• Beancounter

Recommendations
• General goal & approach
• Semantic recommendation
• Statistical recommendation
• Hybrid recommendation

Exploitation

Conclusions


Page 3: NoTube: User Profiling (Beancounter)

Overview


[Diagram. Labels: TV Program Enrichment; RDF Graph TV Programs; Semantic Content Patterns for TV Programs; Semantic Pattern-based Recommendation Strategy; Statistical Similarity-based Recommendation Strategy; Hybrid Recommendation Strategy; Similarity Clusters of Programs; User Data Analysis; User Ratings & Demographics (BBC EPG Data); EPG Metadata (BBC); Recommendation Service; End-Users]


Page 4: NoTube: User Profiling (Beancounter)

Overview


[Same diagram as on the previous page, with the additional label BEANCOUNTER]


Page 5: NoTube: User Profiling (Beancounter)

User profiling approach


users’ interests and behaviours could be inferred from their activities on the Social Web

• from tweets
• liked Facebook resources
• songs listened to
• ...

interests in topics are represented using Linked Data web identifiers

• to access a wealth of open and machine-readable data
• to publish profiles in compliance with the LOD paradigm
• to leverage the graph-based model of such data sets


Page 6: NoTube: User Profiling (Beancounter)

User profiling: Challenge


main challenge: extracting meaningful data from different sources of user activities

to produce LOD identifiers from activities:

• “follow-your-nose”, record-linkage-based approach
• semantic-annotation-based approach, NLP techniques on raw text

interests are weighted to represent their descriptiveness

user profiles are syndicated using JSON, JSON-P and RDF
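As an illustration of weighted interests syndicated as JSON and JSON-P, here is a minimal sketch. The field names (`username`, `interests`, `resource`, `weight`), the weight range and the callback name are assumptions made for this example; the slides do not specify the actual Beancounter profile schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class Interest:
    resource: str  # Linked Data identifier of the topic
    weight: float  # descriptiveness of the interest (assumed to lie in [0, 1])


def profile_to_json(username, interests):
    """Serialise a profile as JSON, heaviest interest first (hypothetical schema)."""
    ordered = sorted(interests, key=lambda i: i.weight, reverse=True)
    return json.dumps({"username": username,
                       "interests": [asdict(i) for i in ordered]})


def profile_to_jsonp(username, interests, callback="callback"):
    """Wrap the same JSON payload in a JSON-P callback invocation."""
    return f"{callback}({profile_to_json(username, interests)});"


if __name__ == "__main__":
    demo = [Interest("http://dbpedia.org/resource/David_Walliams", 0.4),
            Interest("http://dbpedia.org/resource/Matt_Lucas", 0.6)]
    print(profile_to_jsonp("alice", demo, callback="showProfile"))
```

JSON-P only adds the function-call wrapper around the JSON document, so a single serialiser can back both formats.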


Page 7: NoTube: User Profiling (Beancounter)


User profiling: Follow-your-nose

facebook.com/pages/Shoeshine/ dbpedia.org/resource/

“follow-your-nose”, record-linkage based

record linkage is “the problem of recognising those records in two files which represent identical persons, objects or events (said to be matched).”

we adopted a text-retrieval version: incremental, constrained, multiple text searches
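One possible reading of "incremental, constrained, multiple text searches" is sketched below: several searches are run in turn, from the most to the least constrained, and a link is accepted only when a search returns exactly one candidate. The slides do not describe the actual algorithm, so the search order, the `Candidate` fields and the in-memory index with `example.org` URIs are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    uri: str       # identifier of the target record
    label: str     # human-readable name
    category: str  # coarse type used as a search constraint


def normalise(text):
    """Lower-case and collapse whitespace so labels compare reliably."""
    return " ".join(text.lower().split())


def link_record(name, category, candidates):
    """Return the URI of the single matching candidate, or None if ambiguous."""
    name_n = normalise(name)
    searches = [
        # 1. exact label and matching category (most constrained)
        lambda c: normalise(c.label) == name_n and c.category == category,
        # 2. exact label only
        lambda c: normalise(c.label) == name_n,
        # 3. label contains the name, category must match
        lambda c: name_n in normalise(c.label) and c.category == category,
    ]
    for matches in searches:
        hits = [c for c in candidates if matches(c)]
        if len(hits) == 1:  # accept only an unambiguous result
            return hits[0].uri
    return None


# Toy index with hypothetical identifiers.
INDEX = [
    Candidate("http://example.org/resource/Shoeshine_(film)", "Shoeshine", "film"),
    Candidate("http://example.org/resource/Shoeshine_(band)", "Shoeshine", "band"),
    Candidate("http://example.org/resource/Little_Britain", "Little Britain", "tv"),
]

if __name__ == "__main__":
    print(link_record("Shoeshine", "film", INDEX))
```

Returning `None` for ambiguous results is a deliberate choice in this sketch: a wrong link adds noise to the profile, whereas a missing one does not.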


Page 8: NoTube: User Profiling (Beancounter)


User profiling: Semantic Annotation

for some activities the “follow-your-nose” approach is not suitable

Tweets and other text resources need Natural Language Processing techniques

• semantic annotation using LUpedia (WP4)

look up LOD identifiers from:

• tweet text
• #hashtag definitions
• linked Web pages



Page 12: NoTube: User Profiling (Beancounter)


User profiling: Semantic Annotation

Bubbles Devere is the best thing ever. #littlebritain

Brilliant british humor by Matt Lucas & David Walliams - whole range of facinating characters portraying diversity of british society

http://dbpedia.org/resource/Matt_Lucas
http://dbpedia.org/resource/David_Walliams

WP4 Enrichment
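The example above can be traced in code: the tweet is split into plain text, hashtags and links, the hashtag is expanded to its definition, and identifiers are looked up in each text. This is only a sketch. The dictionary-based `annotate` function is a stand-in for the LUpedia service (whose API is not shown in the slides), and `GAZETTEER` and `HASHTAG_DEFINITIONS` are toy data built from the slide content.

```python
import re

HASHTAG = re.compile(r"#(\w+)")
URL = re.compile(r"https?://\S+")

# Toy data taken from the slide; real lookups would call external services.
GAZETTEER = {
    "Matt Lucas": "http://dbpedia.org/resource/Matt_Lucas",
    "David Walliams": "http://dbpedia.org/resource/David_Walliams",
}
HASHTAG_DEFINITIONS = {
    "littlebritain": "Brilliant british humor by Matt Lucas & David Walliams",
}


def lookup_sources(tweet):
    """Split a tweet into plain text, hashtags and links."""
    links = URL.findall(tweet)
    without_links = URL.sub(" ", tweet)  # strip URLs first so '#fragment' is not a hashtag
    hashtags = HASHTAG.findall(without_links)
    plain = HASHTAG.sub(" ", without_links)
    return {"text": " ".join(plain.split()), "hashtags": hashtags, "links": links}


def annotate(text, gazetteer):
    """Naive annotation: return identifiers whose label occurs in the text."""
    lowered = text.lower()
    return [uri for label, uri in gazetteer.items() if label.lower() in lowered]


def identifiers_for(tweet):
    """Look up identifiers in the tweet text and in its hashtag definitions."""
    sources = lookup_sources(tweet)
    texts = [sources["text"]]
    texts += [HASHTAG_DEFINITIONS[h] for h in sources["hashtags"]
              if h in HASHTAG_DEFINITIONS]
    found = []
    for text in texts:
        for uri in annotate(text, GAZETTEER):
            if uri not in found:  # keep first occurrence, preserve order
                found.append(uri)
    return found


if __name__ == "__main__":
    print(identifiers_for("Bubbles Devere is the best thing ever. #littlebritain"))
```

Linked Web pages are collected by `lookup_sources` but not fetched here, to keep the sketch free of network access.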


Page 13: NoTube: User Profiling (Beancounter)


User profiling: Issues

non-deterministic record-linkage and semantic annotation could introduce noise

• noisy data leads to misleading profiles
• recommendations could be affected

hence, we introduced interest weights

• to minimise the effect of potential noise: poorly descriptive interests are eliminated by giving them lower weights

• to represent the evolution of a single interest: interests that recur over time gain more weight
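The two roles of interest weights can be sketched with simple relative frequencies: an interest that recurs across activities gains weight, and interests below a cut-off are dropped as likely noise. The slides do not give the weighting formula, so the frequency-based weight and the `min_weight` threshold are assumptions chosen for illustration.

```python
from collections import Counter


def weigh_interests(observations, min_weight=0.1):
    """Weight each interest by its share of all observations.

    observations: one identifier per time an interest was extracted
    from a user activity. Interests whose weight falls below
    min_weight are treated as noise and left out of the result.
    """
    counts = Counter(observations)
    total = sum(counts.values())
    weights = {uri: n / total for uri, n in counts.items()}
    return {uri: w for uri, w in weights.items() if w >= min_weight}


if __name__ == "__main__":
    seen = (["http://dbpedia.org/resource/Matt_Lucas"] * 6
            + ["http://dbpedia.org/resource/David_Walliams"] * 3
            + ["http://example.org/resource/Spurious_Match"])  # hypothetical noisy link
    print(weigh_interests(seen, min_weight=0.2))
```

With this scheme a single spurious link from record linkage or annotation carries little weight, while an interest seen repeatedly dominates the profile.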


Page 14: NoTube: User Profiling (Beancounter)


Analytics

“people are usually interested in information about themselves”

from Doppler annual report


Page 15: NoTube: User Profiling (Beancounter)


NoTube Beancounter

The user profiling and analytics component has been lovingly called “Beancounter” since the early days

built on top of experience and experiments made during the 3 years of the project

a scalable, activity-streams-oriented set of processes

• filtering, slicing, fast key lookups
• many analyses are really just “counting the beans”
• analysis deserves a high-performance architecture
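To show what "counting the beans" can look like on an activity stream, here is a small in-memory sketch of filtering, slicing by date and keyed lookup. The activity fields (`user`, `verb`, `service`, `day`) and the sample values are assumptions for this example; the real Beancounter data model and storage layer are not described in the slides.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical activity stream; field names are illustrative only.
ACTIVITIES = [
    {"user": "alice", "verb": "tweet",  "service": "twitter",  "day": date(2012, 3, 1)},
    {"user": "alice", "verb": "like",   "service": "facebook", "day": date(2012, 3, 2)},
    {"user": "alice", "verb": "tweet",  "service": "twitter",  "day": date(2012, 3, 20)},
    {"user": "bob",   "verb": "listen", "service": "music",    "day": date(2012, 3, 21)},
]


def count_by(activities, key, since=None, verb=None):
    """Filter by verb, slice by date, then count activities per value of key."""
    selected = (a for a in activities
                if (since is None or a["day"] >= since)
                and (verb is None or a["verb"] == verb))
    return Counter(a[key] for a in selected)


def index_by_user(activities):
    """Build a key -> activities index so per-user lookups avoid a full scan."""
    index = defaultdict(list)
    for activity in activities:
        index[activity["user"]].append(activity)
    return index


if __name__ == "__main__":
    print(count_by(ACTIVITIES, "service"))
    print(count_by(ACTIVITIES, "user", since=date(2012, 3, 15)))
    print(len(index_by_user(ACTIVITIES)["alice"]))
```

Each analysis here is a single pass plus a counter, which is why a key-value layout with fast lookups suits this kind of workload.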


Page 16: NoTube: User Profiling (Beancounter)


NoTube Beancounter

[Architecture diagram. Labels: REST platform; crawler; analysis engine; profiler; key value {activities, analysis, profiles}]


Page 17: NoTube: User Profiling (Beancounter)

Acknowledgements
