Lak12 - Leeds - Deriving Group Profiles from Social Media
Deriving Group Profiles from Social Media to Facilitate the Design of Simulated Environments for Learning
Ahmad Ammari, Lydia Lau, Vania Dimitrova
The University of Leeds, UK
at Learning Analytics and Knowledge 2012, Vancouver, Canada
In this presentation …
• Vision of ImREAL as motivation
• Potential of semantics in smart social spaces for learning applications
• Experimental study on combining semantics and machine learning for group profiling of digital traces
• Lessons learned
• Future challenges
Vision
In a simulator for learning
In the real world
Forethought Reflection
Immersive Reflective Experience based Adaptive Learning (ImREAL)
Consortium (2010-13)
University of Leeds, UK - Project Coordinator/Scientific Coordinator
Trinity College Dublin, Ireland
Graz University of Technology, Austria
University of Erlangen-Nuremberg, Germany
Delft University of Technology, The Netherlands
Imaginary Srl, Italy
EmpowerTheUser Ltd, Ireland
Smart Social Spaces – semantic underpinning
Sensors & collectors
Ontologies
Noise filtration
Group profiling
Semantic data browsers
Semantic augmentation service
Smart social spaces
Semantic query service
Open social spaces
Closed social spaces
Viewpoint Semantic service
1. Sensors & collectors
Ontologies
2. Noise filtration
3. Group profiling
Smart social spaces
Interpersonal skills for Job interview
This talk …
Controlled YouTube-like environment
+ supervised machine learning
+ unsupervised machine learning
Noise Filtration Service
• Input: social media content (e.g. YouTube comments)
• Filters the noise from social media content by removing content that is not useful for generating social profiles
• Output: clean social media content, author IDs
Support service to social profiling services. Clean content reflects the authors' awareness of domain aspects (e.g. Job Interview concepts)
The Social Noise Filtration Service: Methodology
Experimentally Controlled Comments
Public Comments
On YouTube
Analyze
Pre-Process
Term – Comment Matrix
(Training Corpus)
SCORE
SCORES
Train / Test Classification Models
Predict & Filter Noisy Social Data Content
Noise
Clean
Select Target Videos
Semantically Enriched Bag of Words (BoW)
Ground Truth Corpus
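The pre-processing, semantic enrichment, and term–comment matrix steps named in the methodology can be sketched roughly as follows. This is a minimal illustration, not the service's actual implementation: the `CONCEPTS` mapping, the stopword list, and the function names are all hypothetical, since the slides do not list the ontologies or the tokenisation rules that were used.

```python
import re
from collections import Counter

# Hypothetical mini-ontology: surface phrase -> domain concept token.
# The real ontologies used by the service are not shown on the slides.
CONCEPTS = {
    "eye contact": "eye_contact",
    "body language": "body_language",
    "job interview": "job_interview",
}
# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"a", "an", "the", "is", "it", "to", "of", "and", "in", "on", "i"}


def enriched_tokens(comment):
    """Lower-case, replace known phrases by concept tokens, drop stopwords."""
    text = comment.lower()
    for phrase, concept in CONCEPTS.items():
        text = text.replace(phrase, concept)
    tokens = re.findall(r"[a-z_]+", text)
    return [t for t in tokens if t not in STOPWORDS]


def term_comment_matrix(comments):
    """Return (vocabulary, rows); rows[i][j] counts term j in comment i."""
    bags = [Counter(enriched_tokens(c)) for c in comments]
    vocabulary = sorted(set().union(*bags)) if bags else []
    rows = [[bag[term] for term in vocabulary] for bag in bags]
    return vocabulary, rows


vocabulary, rows = term_comment_matrix(
    ["Strong eye contact is good", "Eye contact and body language"]
)
print(vocabulary)
print(rows)
```

Each row of the matrix is one comment and each column one (possibly concept-level) term, which is the shape a classification model would be trained on.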
Example Comments
Comment: I think trying to decipher gestures as to have a general meaning is a bit too vague. You have to put the background, education, personality, and the culture of the individual into consideration. Gestures are often misunderstood and not the clearest form of communication. For example…
Score: 8.0
Comment: …I will comment that most of us have grown up with being told that strong eye contact (without looking psychotic) is good … However, I agree that you notice if someone is not used to it and seems intimidated. At this point it is a good to look away periodically.
Score: 7.7
Comment: Interview on Wednesday, hope it goes well
Score: 0.68
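Given scores like these, the final "Predict & Filter" step amounts to splitting comments at a cut-off. The sketch below assumes a simple threshold rule; the threshold value, the author IDs, and the abbreviated comment texts are illustrative assumptions, and only the three scores come from the slide.

```python
from typing import NamedTuple


class ScoredComment(NamedTuple):
    author_id: str  # hypothetical IDs; the slide does not show authors
    text: str       # abbreviated here for readability
    score: float    # score as shown on the slide


def filter_noise(comments, threshold):
    """Split scored comments into (clean, noise) by a score threshold."""
    clean = [c for c in comments if c.score >= threshold]
    noise = [c for c in comments if c.score < threshold]
    return clean, noise


comments = [
    ScoredComment("author_a", "I think trying to decipher gestures ...", 8.0),
    ScoredComment("author_b", "... strong eye contact ... is good ...", 7.7),
    ScoredComment("author_c", "Interview on Wednesday, hope it goes well", 0.68),
]

# The threshold of 1.0 is an assumption chosen only for this example.
clean, noise = filter_noise(comments, threshold=1.0)
print([c.author_id for c in clean])  # authors passed on to group profiling
print([c.author_id for c in noise])  # authors whose comments are discarded
```

The clean comments and their author IDs correspond to the output the Noise Filtration Service hands to the profiling services.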
Group Profiling
Noise / Relevant
Clustering–based Group Profiles
Groups of comment authors are derived based on content similarity in their comments. Each group profile shows:
1. Important Job Interview Terms / Concepts used by group
2. The Locations, Gender, and Age Groups of authors in group
3. Sample Comments written by authors in group
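Grouping authors by content similarity could look roughly like the sketch below. The slides only say that unsupervised machine learning was used and do not name the algorithm, so a simple greedy "leader" clustering over cosine similarity is swapped in here purely for illustration; the similarity threshold, author IDs, and term counts are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_authors(author_terms, min_similarity=0.5):
    """Greedy leader clustering: join the first group with a similar leader."""
    groups = []  # each group is a list of author IDs; group[0] is its leader
    for author, terms in author_terms.items():
        for group in groups:
            if cosine(terms, author_terms[group[0]]) >= min_similarity:
                group.append(author)
                break
        else:
            groups.append([author])
    return groups


def group_profile(group, author_terms, top_n=3):
    """Most frequent terms across all authors in one group."""
    total = Counter()
    for author in group:
        total.update(author_terms[author])
    return [term for term, _ in total.most_common(top_n)]


# Hypothetical per-author term counts taken from their clean comments.
author_terms = {
    "a1": Counter({"eye_contact": 2, "interviewer": 1}),
    "a2": Counter({"eye_contact": 1, "interviewer": 1}),
    "a3": Counter({"money": 2, "pay": 1}),
}
groups = cluster_authors(author_terms)
for group in groups:
    print(group, group_profile(group, author_terms))
```

A full profile would add the authors' demographics and sample comments to the term list for each group.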
Demographic–based Group Profiles
Groups of comment authors can also be customized based on user-predefined demographics.
Example: What are the important Job Interview Concepts for Adult, Female authors living in USA & UK?
P1
P2
Usernames of the authors of relevant comments and their demographic characteristics (Gender, Age Group, and Location) are mined from the YouTube User Profiles
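A demographic-based profile such as the example question above can be sketched as a filter followed by a frequency count. The author records, field names, and the age bounds used for "Adult" are assumptions made for this illustration; the slides do not define them.

```python
from collections import Counter

# Hypothetical author records; the fields mirror the demographics
# (Gender, Age Group, Location) mined from YouTube user profiles.
authors = [
    {"id": "a1", "gender": "F", "age": 29, "location": "US",
     "concepts": ["eye_contact", "interviewer"]},
    {"id": "a2", "gender": "F", "age": 34, "location": "GB",
     "concepts": ["eye_contact", "nervous"]},
    {"id": "a3", "gender": "M", "age": 41, "location": "US",
     "concepts": ["money"]},
]


def demographic_profile(authors, genders, locations, min_age, max_age, top_n=3):
    """Select authors matching the demographics and count their concepts."""
    selected = [
        a for a in authors
        if a["gender"] in genders
        and a["location"] in locations
        and min_age <= a["age"] <= max_age
    ]
    counts = Counter(c for a in selected for c in a["concepts"])
    return [a["id"] for a in selected], counts.most_common(top_n)


# "Adult" is taken as 20 to 40 years here, which is an assumption.
ids, top_concepts = demographic_profile(
    authors, genders={"F"}, locations={"US", "GB"}, min_age=20, max_age=40
)
print(ids)
print(top_concepts)
```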
Exploration experiment
Purpose is to answer the following:
Q1: Can we generate useful group profiles to aid training professionals in identifying learning needs?
Q2: Can we derive learning domain concepts to augment learner models?
Dataset used
Data Property: Value
Number of Job Interview-related YouTube Videos: 17
Number of Comments Retrieved: 1465
Number of Remaining Comments after Noise Filtration: 471 (32%)
Number of Unique Comment Authors: 393
Comment to Author Ratio: 1.20
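The two derived figures in the dataset summary can be reproduced from the raw counts. The sketch assumes that the comment to author ratio is computed over the comments remaining after noise filtration; the slide does not state this, but it is the reading that matches the reported 1.20.

```python
# Raw counts as reported in the dataset summary.
comments_retrieved = 1465
comments_after_filtering = 471
unique_authors = 393

# Share of comments that survive noise filtration.
retained_share = comments_after_filtering / comments_retrieved

# Assumption: the ratio uses the filtered comments, not all retrieved ones.
comments_per_author = comments_after_filtering / unique_authors

print(f"Retained after noise filtration: {retained_share:.0%}")  # 32%
print(f"Comment to author ratio: {comments_per_author:.2f}")     # 1.20
```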
Sample Output Clustering–based Group Profiles
Third largest group – Size: 36 Authors, 9% of population
Sample Output Demographic–based Group Profiles
Location: GB – Age: From 20 To 40 years
Location: US – Age: From 20 To 40 years
Frequent Job Interview Concepts
Interview_good, eye_contact, eyes, interviewer, hope, helpful
Frequent Job Interview Concepts
Good_Interview, people, company, interviewer, time, girl, experience,
answer, money, questions, nervous, education, fingers, hands
Location: Asia – Age: From 20 To 40 years
Frequent Job Interview Concepts
questions, answers, candidate, interview_guide, money, pay,
job_guide, watch
Lessons Learned
• On noise filtration
– Choice of threshold for noise filtration?
– What is “inappropriate” content?
– Can “promotional” content be detected?
• On potential of group profiles to aid training professionals and learner model augmentation
– Authentic comments were liked
– Would be useful to know more about the viewpoints within a group
Future work
• Increase use of semantics (e.g. for viewpoints extraction)
• Improve quality of group profiling (e.g. by understanding the impact of clusters sorted by age)
• Obtain more accurate demographic data (e.g. ‘Place’ from YouTube was not reliable)
http://www.imreal-project.eu/