Relevance based ranking of video comments on YouTube
![Page 1: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/1.jpg)
Authors
University Politehnica of Bucharest
Relevance-Based Ranking of Video Comments on YouTube
Andrei Șerbănoiu, Traian Rebedea [email protected]
![Page 2: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/2.jpg)
Overview
• Introduction
• Motivation
• System architecture
• Classification of relevant comments
• Ranking of relevant comments
• Results
• Conclusions
12.04.23 Bachelor's Thesis Session - July 2012 2
![Page 3: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/3.jpg)
Introduction
• Text classification and ranking for comments on YouTube videos
  – First: classify whether the comment is relevant to the given video
  – Second: rank the relevant comments
• Focus on identifying relevant information
• Comments have a very small number of words: sometimes fewer than 10, and on the order of tens on average
• Relevance is evaluated with respect to information collected from other online sources about the video
12.04.23 CSCS 2013 – Bucharest, Romania 3
![Page 4: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/4.jpg)
Existing research
• We have not been able to identify any previous research on identifying relevant comments
• YouTube research
  – Identify the relevant features of community acceptance (comments with many “likes”)
  – Extract the sentiment orientation
  – Differentiate between clean and noisy comments
• Other research
  – Ranking Comments on the Social Web (uses Digg)
![Page 5: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/5.jpg)
Motivation
• The Police – Every Breath You Take
![Page 6: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/6.jpg)
Motivation
• Most commented video
• “10 questions that every intelligent Christian must answer”
• 1,429,425 comments on 30th May 2013 (early morning)
• How many of these comments are spam?
• Which ones would be most relevant to the video?
![Page 7: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/7.jpg)
Solution
• Ranking of the comments according to relevance
• Steps:
  1. Automatically link the video with other online sources relevant to it
  2. Filter comments to remove noisy comments
  3. Rank the remaining comments according to relevance computed using NLP techniques
• Our solution works for music videos
![Page 8: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/8.jpg)
System architecture
![Page 9: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/9.jpg)
Processing pipeline
comments = fetchYouTubeComments();
comments = filterComments(comments);
commentTopics = createCommentTopics(comments);
resources = getResources(wikipedia, allmusic, lyrics);
for (int i = 0; i < commentTopics.length; i++) {
    computeRelevance(commentTopics[i], resources);
}
![Page 10: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/10.jpg)
Preprocessing
• Comments retrieved with the YouTube Data API
  – Only the last 100 comments per video were used
• Comments not written in English are filtered out using JLangDetect
• The main topics of each comment are extracted using Mallet => 5 topics per comment
• The topics are expanded with synonyms and hypernyms from WordNet
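The topic expansion step can be illustrated with a minimal Java sketch. It assumes a small hand-written map in place of real WordNet lookups; the class name `TopicExpander` and the map contents are hypothetical and only show the shape of the step, not the authors' implementation.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: expands comment topics with related words.
// The RELATED map is a hand-written stand-in for WordNet lookups.
class TopicExpander {
    static final Map<String, List<String>> RELATED = Map.of(
        "song", List.of("tune", "music"),    // a synonym and a hypernym
        "guitar", List.of("instrument"));    // a hypernym

    // Returns each topic (lower-cased) plus its related words, without duplicates.
    static Set<String> expand(List<String> topics) {
        Set<String> expanded = new LinkedHashSet<>();
        for (String topic : topics) {
            String key = topic.toLowerCase();
            expanded.add(key);
            expanded.addAll(RELATED.getOrDefault(key, List.of()));
        }
        return expanded;
    }
}
```

Topics with no entry in the map are kept unchanged, so the expansion can only add candidate words for matching and never removes a topic.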
![Page 11: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/11.jpg)
Pre-classification of comments
• Objective: reduce the number of comments considered for ranking by identifying noise
• Classification based on a neural network using a set of simple linguistic features
• Multilayer perceptron implemented in Weka
• Features
  – Number of non-ASCII characters
  – Number of capital letters
  – Number of newlines
  – Number of digits
  – Number of trivial and swear words
  – Number of words in comment
  – Average word size
  – Number of punctuation marks
  – Common text spam count
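As a rough illustration, the features listed above could be computed along the following lines. This is a sketch in plain Java under stated assumptions: the class name `CommentFeatures`, the trivial-word list, the spam-phrase list, and the set of punctuation marks are all hypothetical placeholders, since the slides do not give the lists that were actually used.

```java
import java.util.Set;

// Hypothetical feature extractor for the noise classifier. The word and
// phrase lists are illustrative placeholders, not the authors' lists.
class CommentFeatures {
    static final Set<String> TRIVIAL_WORDS = Set.of("lol", "omg", "wtf");
    static final Set<String> SPAM_PHRASES =
        Set.of("check out my channel", "thumbs up", "subscribe");

    // Returns the nine features in the order they are listed on the slide.
    static double[] extract(String comment) {
        int nonAscii = 0, capitals = 0, newlines = 0, digits = 0, punctuation = 0;
        for (int i = 0; i < comment.length(); i++) {
            char c = comment.charAt(i);
            if (c > 127) nonAscii++;
            if (Character.isUpperCase(c)) capitals++;
            if (c == '\n') newlines++;
            if (Character.isDigit(c)) digits++;
            if (",.;:!?'\"()-".indexOf(c) >= 0) punctuation++;
        }

        String trimmed = comment.trim();
        String[] words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        int trivial = 0, totalChars = 0;
        for (String word : words) {
            // Word size counts ASCII letters and digits only.
            String bare = word.toLowerCase().replaceAll("[^a-z0-9]", "");
            totalChars += bare.length();
            if (TRIVIAL_WORDS.contains(bare)) trivial++;
        }
        double avgWordSize = words.length == 0 ? 0.0 : (double) totalChars / words.length;

        // Spam count: how many of the known spam phrases occur in the comment.
        String lower = comment.toLowerCase();
        int spam = 0;
        for (String phrase : SPAM_PHRASES) {
            if (lower.contains(phrase)) spam++;
        }

        return new double[] {nonAscii, capitals, newlines, digits, trivial,
                             words.length, avgWordSize, punctuation, spam};
    }
}
```

The resulting vector would then be passed to the classifier as one training or test instance per comment.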
![Page 12: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/12.jpg)
Pre-classification of comments
• Trained on a small corpus with 100 relevant comments and 100 noisy comments
• Examples of noisy comments:
  – "Step 1: Pause this video Step 2: Google 'Rainymood' Step 3: Click the first link Step 4: Unpause this video Step 5: Thumbs up this comment, enjoy and thank me later"
  – "Those 3,175 haters listen to 'Techno'."
  – "IF YOU LIKE DIRTY DIANA SONG THE SINGER '' STEFANO GIORGINI '' DID A GREAT REMAKE STEFANO IS A VERY GOOD SINGER SONGWRITER I THINK YOU WILL LIKE HIS VERSION JUST LOOK FOR '' STEFANO GIORGINI '' DIRTY DIANA"
![Page 13: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/13.jpg)
Pre-classification of comments
• Results of pre-classification stage
| Type of instances | No. of instances | % |
| --- | --- | --- |
| Correctly classified instances | 174 | 87.46 |
| Incorrectly classified instances | 26 | 12.54 |
| Total number of instances | 200 | - |
![Page 14: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/14.jpg)
Relevance scoring stage
• Initial approach
• Extract topics from comments as previously mentioned (Mallet + WordNet)
• Fetch Wikipedia articles for artist and song name
• Score computed based on the number of appearances of the comment topics in the articles
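A minimal sketch of this counting score, assuming whole-word, case-insensitive matching (the slides do not say how occurrences were matched). The class name `OccurrenceScorer` is hypothetical, and the article text is passed in as a plain string rather than fetched from Wikipedia.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical scorer for the first approach: counts how often each comment
// topic occurs as a whole word in the reference article text.
class OccurrenceScorer {
    static int score(List<String> commentTopics, String articleText) {
        String text = articleText.toLowerCase();
        int total = 0;
        for (String topic : commentTopics) {
            // Pattern.quote keeps any regex metacharacters in the topic literal.
            Pattern wholeWord =
                Pattern.compile("\\b" + Pattern.quote(topic.toLowerCase()) + "\\b");
            Matcher matcher = wholeWord.matcher(text);
            while (matcher.find()) {
                total++;
            }
        }
        return total;
    }
}
```

With this scheme, a comment whose topics appear many times in the artist and song articles scores higher than one whose topics never appear there.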
![Page 15: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/15.jpg)
Relevance scoring stage
• Second approach: topic-based scoring
• Similar to the previous one, but topics are also extracted from the Wikipedia articles with Mallet
• Scoring is done based on:
  – Number of topics extracted from each comment
  – Wikipedia topic matches for each comment
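The slide names the two inputs to this score but not how they are combined. The sketch below assumes one plausible combination, the fraction of a comment's topics that also occur among the article topics; this ratio, and the class name `TopicMatchScorer`, are assumptions made for illustration only.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical topic-match scorer. The slide names the two inputs (comment
// topic count, topic matches) but not the formula; the ratio is an assumption.
class TopicMatchScorer {
    // Number of comment topics that also appear among the article topics.
    static int matches(Set<String> commentTopics, Set<String> articleTopics) {
        Set<String> shared = new HashSet<>(commentTopics);
        shared.retainAll(articleTopics);
        return shared.size();
    }

    // Fraction of the comment's topics matched by the article, in [0, 1].
    static double score(Set<String> commentTopics, Set<String> articleTopics) {
        if (commentTopics.isEmpty()) {
            return 0.0;
        }
        return (double) matches(commentTopics, articleTopics) / commentTopics.size();
    }
}
```

Dividing by the number of comment topics keeps long comments from scoring higher simply because they contain more words; whether the original system normalised in this way is not stated.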
![Page 16: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/16.jpg)
Relevance scoring stage
• Third approach
• Multiple-source topic-based scoring
• Additional sources added to the Wikipedia articles
  – Information from the allmusic.com website on artists and songs
  – Information from song lyrics
• Topics matched between comments and Wikipedia + Allmusic articles, plus exact match of lyrics
• Final relevance score is a weighted sum of the previous factors
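The weighted sum can be sketched as follows. The weights are illustrative assumptions (the slides do not give the values used), and the class name `RelevanceCombiner` and the lyric-matching rule (a lyric line appearing verbatim, ignoring case, in the comment) are likewise hypothetical.

```java
import java.util.List;

// Hypothetical combiner for the third approach. The weights are illustrative
// assumptions; the source does not state the values used.
class RelevanceCombiner {
    static final double W_WIKIPEDIA = 1.0;
    static final double W_ALLMUSIC = 1.0;
    static final double W_LYRICS = 2.0;

    // Counts lyric lines that appear verbatim (case-insensitive) in the comment.
    static int lyricMatches(String comment, List<String> lyricLines) {
        String lower = comment.toLowerCase();
        int count = 0;
        for (String line : lyricLines) {
            String needle = line.trim().toLowerCase();
            if (!needle.isEmpty() && lower.contains(needle)) {
                count++;
            }
        }
        return count;
    }

    // Weighted sum of Wikipedia topic matches, Allmusic topic matches,
    // and exact lyric matches for one comment.
    static double relevance(int wikipediaMatches, int allmusicMatches, int lyricMatches) {
        return W_WIKIPEDIA * wikipediaMatches
             + W_ALLMUSIC * allmusicMatches
             + W_LYRICS * lyricMatches;
    }
}
```

Keeping the weights as named constants makes it easy to tune how much each source contributes once annotated data is available.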
![Page 17: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/17.jpg)
Results
| Comment | Relevance |
| --- | --- |
| maybe your friend should know that being english, have a picture in abbey road and "sing" all you need is love" won't make one direction a group like the beatles... | 662 |
| my mom said she doesn't like the beatles and she said that john was only good to look at not to hear. my dad said, " haha so true!." i'm an orphan now. | 968 |
| you shouldn't be listening to the beatles since these seem to turn your friends into enemies! beatles are all about peace! you are not getting their message! | 983 |
| please read this ! hey i know u just wanna listen to the song but i still have to write this hoping someone will see it and that someone will care .i'm a young musician from croatia so this spam is my only chance to get noticed.please check out my channel and i promise u won't be sorry.i appreciate your time because music means everything to me, thank you! | 1309 |
| i didn't mean fight other places. i meant focus on the hurt people in your own country first, then expand to the others. if people don't agree with peace that's an opinion. not a fact, and people often take offense to opinions. there isn't anything to take offense to, they say something that's all it is. they said it, don't put meaning to it. world peace - i meant the whole world having peace there | 1639 |
![Page 18: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/18.jpg)
Results
• Difficult to assess the impact of the relevance measure
• Interpreting the comments is subjective
  – Need human annotators
• The order of the comments is completely different from the one presented now on YouTube (correlation lower than 0.031 for the first 100 comments)
• Method 1 is also not correlated with the other two methods
• Methods 2 and 3 have a higher correlation: 0.124
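The slides do not name the correlation coefficient behind these figures. As one way to compare two orderings of the same comments, the sketch below assumes Spearman's rank correlation for rankings without ties, rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between a comment's positions in the two rankings. The class name `RankCorrelation` is hypothetical.

```java
import java.util.List;

// Hypothetical helper: Spearman rank correlation between two orderings of
// the same comments (no ties). The slides do not name the coefficient used.
class RankCorrelation {
    // Each list holds the same comment IDs, ordered from the top rank down.
    static double spearman(List<String> rankingA, List<String> rankingB) {
        int n = rankingA.size();
        if (n != rankingB.size() || n < 2) {
            throw new IllegalArgumentException("rankings must have equal size of at least 2");
        }
        double sumSquaredDiff = 0.0;
        for (int rankA = 0; rankA < n; rankA++) {
            int rankB = rankingB.indexOf(rankingA.get(rankA));
            if (rankB < 0) {
                throw new IllegalArgumentException("rankings must contain the same items");
            }
            double d = rankA - rankB;
            sumSquaredDiff += d * d;
        }
        return 1.0 - 6.0 * sumSquaredDiff / (n * ((double) n * n - 1.0));
    }
}
```

A value near 1 means the two methods order the comments almost identically, a value near 0 means the orderings are essentially unrelated, and a value near -1 means one ordering is close to the reverse of the other.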
![Page 19: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/19.jpg)
Conclusions
• 2-stage method for ranking comments on YouTube
• The first stage removes noisy comments
• The second stage tries to link the comments with information from other web pages relevant to the video
• Relevance is computed based on topic modeling with Mallet
• Results are encouraging, but a more rigorous method of assessing them is needed
• Results are better than the usual results provided by YouTube; however, the processing time for each video should not be neglected
![Page 20: Relevance based ranking of video comments on YouTube](https://reader033.fdocuments.net/reader033/viewer/2022060111/5562f90dd8b42a275f8b4892/html5/thumbnails/20.jpg)
Thank you!
• Questions?
• Discussion