WeKnowIt: Emerging, Collective Intelligence for personal, organisational and social use
The WeKnowIt Project
Emerging, Collective Intelligence for Personal, Organisational and Social Use
Symeon Papadopoulos, Yiannis Kompatsiaris (CERTH-ITI)
www.weknowit.eu
Trento, April 20
ICMR 2011
overview
motivation & concept
approach
intelligence layers
results
conclusions
motivation
• Users upload, tag, share, connect & search, which makes massive amounts of user-generated content and data available
• Existing applications are limited to simple user data management or shallow analysis
• Potential for much more if we mine the data and exploit them in the right context
collective intelligence
…a form of intelligence emerging from online user activities
Collective Intelligence >> sum of individuals’ intelligences
an example
one of my photos @ flickr
my tags: wki experiment bcn (…pretty uninformative)
my location: N/A
others’ photos @ flickr
tags
location
alternative views / trends / facts
my friend’s photos @ flickr
what did he visit next?
related Linked Data
collective intelligence @ weknowit
personal intelligence
media intelligence
mass intelligence
social intelligence
organisational intelligence
personal intelligence
media intelligence
Visual Exploration
mass intelligence
Tag Clustering
social intelligence
Visualise Communities
Community Browser
organisational intelligence
Event-based Knowledge Sharing
Distributed Group Management
architecture / integration
Service Integration
Knowledge and Content Storage
Scenario-driven Service Composition
use case: emergency response
Personal Intelligence
>> Login, Upload
>> Spam detection
>> Personalized Access
Social Intelligence
>> ER Alert Service
>> Reputation Service
Mass Intelligence
>> Clustering
>> Enrichment from additional sources
Media Intelligence
Photo arrives at ER control centre
>> Automatic localisation of photo
>> Photo & speech auto-tagging
Organisational Intelligence
>> Log Merging & Viewing
>> Incident Information Access
use case: travel
Personal Intelligence
>> Personal Recommendations
Mass Intelligence
>> Landmark & Event detection
>> Ranked facet lists of POIs
>> Hybrid Image Clustering
Social Intelligence
>> Group profiling & recommendations
>> Friends position, alert
Travel Preparation
Media Intelligence
>> Image Localisation
>> Tag suggestions
Mobile Guidance
Post Travel
case: community detection in social media (1/2)
• Structural similarity + Local expansion (highly efficient and scalable approach)
• Not necessary to know the number of clusters
• Noise resilient (not all nodes need to be part of a community)
• Generic approach adaptable to many applications (depending on node – edge representation)
S. Papadopoulos, Y. Kompatsiaris, A. Vakali. “A Graph-based Clustering Scheme for Identifying Related Tags in Folksonomies”. In Proceedings of DaWaK'10, Springer-Verlag, 65-76
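The two ingredients named on this slide, structural similarity and local expansion, can be illustrated with a small sketch. It is a simplified stand-in and not the authors' implementation: the similarity is the SCAN-style overlap of closed neighbourhoods, the expansion rule (add a node when most of its edges point into the community) is an assumption, and the function names, thresholds and toy graph are hypothetical.

```python
from math import sqrt


def build_graph(edges):
    """Build an undirected adjacency map {node: set(neighbours)}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def structural_similarity(adj, u, v):
    """Overlap of the closed neighbourhoods of u and v (SCAN-style)."""
    nu, nv = adj[u] | {u}, adj[v] | {v}
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def seed_communities(adj, eps, mu):
    """Seed sets: cores linked through edges with similarity >= eps."""
    strong = {u: {v for v in adj[u] if structural_similarity(adj, u, v) >= eps}
              for u in adj}
    cores = {u for u in adj if len(strong[u]) >= mu}
    assigned, seeds = set(), []
    for start in sorted(cores):
        if start in assigned:
            continue
        community, stack = set(), [start]
        while stack:
            u = stack.pop()
            if u in community:
                continue
            community.add(u)
            if u in cores:  # only cores propagate membership
                stack.extend(strong[u] - community)
        assigned |= community
        seeds.append(community)
    return seeds


def expand(adj, seed, min_inside):
    """Local expansion (assumed rule): absorb frontier nodes whose share
    of edges pointing into the community exceeds min_inside."""
    community = set(seed)
    changed = True
    while changed:
        changed = False
        frontier = {v for u in community for v in adj[u]} - community
        for v in sorted(frontier):
            if len(adj[v] & community) / len(adj[v]) > min_inside:
                community.add(v)
                changed = True
    return community


def detect_communities(edges, eps=0.78, mu=2, min_inside=0.5):
    adj = build_graph(edges)
    return [expand(adj, s, min_inside) for s in seed_communities(adj, eps, mu)]


# Toy graph: two dense groups joined by d-e, a satellite p, a noise node z.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("e", "f"), ("e", "g"), ("e", "h"), ("f", "g"), ("f", "h"), ("g", "h"),
         ("d", "e"), ("c", "p"), ("d", "p"), ("a", "z"), ("h", "z")]
communities = detect_communities(edges)
print(communities)
```

On this toy graph the number of communities is not given in advance, the satellite node p is picked up only in the expansion step, and z ends up in no community, which mirrors the "noise resilient" property listed above. In a folksonomy setting the nodes would be tags and the edges tag co-occurrences.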
case: community detection in social media (2/2)
tags: sagrada familia, cathedral, barcelona
taken: 12 May 2009 lat: 41.4036, lon: 2.1743
PHOTOS & METADATA
SPATIAL CLUSTERING + TEMPORAL ANALYSIS
COMMUNITY DETECTION
CLASSIFICATION TO LANDMARKS/EVENTS
VISUAL
TAG
HYBRID
S. Papadopoulos, C. Zigkolis, Y. Kompatsiaris, A. Vakali. “Cluster-based Landmark and Event Detection on Tagged Photo Collections”. In IEEE Multimedia Magazine 18(1), pp. 52-63, 2011
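A rough sketch of how photo metadata such as the example above (date taken, latitude, longitude) could feed spatial grouping and a landmark/event decision. This is a simplification under stated assumptions, not the method of the cited paper: the greedy radius-based grouping replaces the community detection step, and the rule "few owners over a short period means event" with its thresholds, the `Photo` fields and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Photo:
    owner: str
    taken: date
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def spatial_clusters(photos, radius_km=0.2):
    """Greedy grouping: a photo joins the first cluster whose first
    photo lies within radius_km, otherwise it starts a new cluster."""
    clusters = []
    for p in photos:
        for c in clusters:
            if haversine_km(p.lat, p.lon, c[0].lat, c[0].lon) <= radius_km:
                c.append(p)
                break
        else:
            clusters.append([p])
    return clusters


def classify(cluster, max_event_days=2, max_event_owners=3):
    """Assumed heuristic: short time span and few owners -> event."""
    days = (max(p.taken for p in cluster) - min(p.taken for p in cluster)).days + 1
    owners = len({p.owner for p in cluster})
    if days <= max_event_days and owners <= max_event_owners:
        return "event"
    return "landmark"


photos = [
    # Around the Sagrada Familia coordinates from the slide, spread over months.
    Photo("u1", date(2009, 5, 12), 41.4036, 2.1743),
    Photo("u2", date(2009, 7, 3), 41.4040, 2.1745),
    Photo("u3", date(2009, 8, 20), 41.4033, 2.1740),
    Photo("u4", date(2009, 11, 1), 41.4036, 2.1744),
    # A made-up one-day gathering elsewhere in the city.
    Photo("u5", date(2009, 6, 2), 41.3809, 2.1228),
    Photo("u5", date(2009, 6, 2), 41.3810, 2.1229),
    Photo("u6", date(2009, 6, 2), 41.3808, 2.1227),
]
clusters = spatial_clusters(photos)
labels = [classify(c) for c in clusters]
print(labels)
```

The temporal spread and the number of distinct contributors are used here because a landmark tends to be photographed by many people all year round, while an event is concentrated in time; visual and tag similarity, which the slide lists as VISUAL / TAG / HYBRID, are left out of the sketch.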
results: research
• User modeling & interaction (CURIO, attention streams)
• Media understanding (photo/text localization, photo/speech auto-tagging)
• Media organization (graph-based clustering, faceted search, event detection)
• Community analysis & management (administration, browsing, reputation, notification)
• Knowledge representation & management (Event Model F, dgFOAF)
results: applications
Integrated Prototypes
• ER (desktop & mobile)
• Travel (trip planning, mobile guidance, post-travel photo management)
Stand-alone applications
• WKI image recognizer
• VIRAL (visual search and automatic localization)
• ClustTour (city exploration by use of photo clusters)
• Semaplorer++
• STEVIE (mobile POI management)
http://www.weknowit.eu/tr
results: exploitation
VIRAL evaluation by Vodafone 360
http://mklab.iti.gr/wki-apps
results: public APIs
conclusions
...so far
• CI emerges from massive online activities
• it is hard to extract and manage
• ...but is definitely worth the effort.
in the future...
• other domains: news, finance, e-gov
• real-time CI
• CI Linked Data
thank you!
Presentation online @ http://www.weknowit.eu > news
Additional Slides
content in weknowit
massive Web 2.0
WKI user-contributed
standard training data
offline model creation, training
online user profiling, method invocation
Standard annotated corpora used for training.
• Single-modality: text (Brown corpus), speech (TIMIT database), image (Corel database)
• Single-source: prepared by a single person/organization
• Consistent quality: absence of spam, malicious or erroneous data
• Small-moderate volume: manually produced
Massive user-generated content and feedback from Web 2.0 applications.
• Multi-modality: e.g. image + tags, image + geo-location + time
• Multi-source: may be generated by different applications and user communities, e.g. Flickr, Panoramio, PhotoBucket
• Inconsistent quality: noise, spam, ambiguity
• Huge volume: massively produced and disseminated
Online content and user actions by WeKnowIt users. It is mainly used for triggering WeKnowIt services and for providing context to them, e.g. user profile, input content to be used as example for querying, etc.
Statistical approaches
• Probabilistic models (pLSA, Bag-Of-Words)
• Graph-based approaches (SNA, community detection)
Content analysis
• Text models (n-gram, LDA, CRF)
• Image processing (visual feature extraction)
• Speech modeling (spectral analysis, HMM)
Knowledge Based
• Lookup (WordNet)
• Thesaurus Lookup (GeoPlanet)
• Concept detection (Wikipedia, domain ontologies)
Variety of approaches depending on content-metadata input.
standard training data
WKI user-contributed
massive Web 2.0 – unstructured
massive Web 2.0 – semi-structured
standard
technical approach
massive Web 2.0 WKI user-contributed standard training data
[Diagram: WeKnowIt services and components, grouped by function and annotated with the contributing work packages (WP1-WP7)]
Group labels: Locations, Topics, Social connections, Entities, Events, Representation, Storage, Access, GUI, System Integration
Services and components: Visual analysis, Text annotation, Get recommendations, POI recommendation, POI clustering, Search place POI, csxPOIs, Tag normalization, Text classification, ClustTour, Local tag community detector, Tag processing, Emergency alert service, Community analysis tool, Entity facet extraction - ranking (WP3), Speech search, Semantic photo query, Log merger, Semaplorer(++), Event model F + M3O, CURIO, VERACITY, Common data model, Speech Indexing (WP1), Data Storage, Account Manager, Login, Community administration platform, Group Management, Manage Item Comment, Tag, Search Knowledge Base, Lexical Spam Detector, Users messaging (WP1), Mobile app, Desktop proto, Post ER tool, ER, Travel preparation, Mobile guidance, Post-travel logging, CSG, Named entity detection
weknowit work structure
research development
WP1: Personal Intelligence
WP2: Media Intelligence
WP3: Mass Intelligence
WP4: Social Intelligence
WP5: Organisational Intelligence
WP6: Architecture / Integration
WP7.I Use Case: ER
WP7.II Use Case: Travel
dissemination & exploitation
WP8: Dissemination & Exploitation
management
WP9: Management
Causality Pattern in Event-Model-F
• Event (cause) implies other event (effect)
• Causal relationship holds under some justification
• Causes and effects are events, and only events
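The three statements of the causality pattern can be mirrored in a small data structure. This is only an illustrative sketch: Event-Model-F itself is an ontology, and the class names, field names and the emergency-response example below are assumptions made for the illustration, not part of the model's vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    label: str


@dataclass(frozen=True)
class Justification:
    description: str


@dataclass(frozen=True)
class Causality:
    """A cause event implies an effect event under some justification."""
    cause: Event
    effect: Event
    justification: Justification

    def __post_init__(self):
        # Causes and effects are events, and only events.
        if not isinstance(self.cause, Event) or not isinstance(self.effect, Event):
            raise TypeError("cause and effect must both be Event instances")
        if not isinstance(self.justification, Justification):
            raise TypeError("a causal relationship needs a Justification")


# Hypothetical emergency-response example.
rain = Event("heavy rainfall")
flood = Event("river flooding")
link = Causality(rain, flood, Justification("assessment by the control centre"))
print(link.cause.label, "->", link.effect.label)
```

Making the justification a required field reflects the second bullet: in this reading, a causal link is never asserted without stating the grounds on which it holds.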
OntoMDE
• Specification of MoOn using eCore and OAM as UML2 class diagram
• Transformation steps implemented
• Evaluation with ontologies of different complexity
Content vs. Structure Concepts
results: dissemination
Activities
• Collective Intelligence Workshops and Special Session
• Summer schools
Publications
• 8 journal publications (Trans. on MultiMedia, IEEE MultiMedia, J. of Web Semantics, MTAP, etc.)
• 59 conference papers (ACM MultiMedia, SIGIR, CVPR, ESWC, WWW, ICIP, WSDM, etc.)
• 2 CI book chapters + 1 CI White Paper
• 3 patent applications