The WeKnowIt Project

Emerging, Collective Intelligence for Personal, Organisational and Social Use Symeon Papadopoulos, Yiannis Kompatsiaris (CERTH-ITI) www.weknowit.eu Trento, April 20 ICMR 2011

description

A summary presentation of the WeKnowIt European research project that was given at the ICMR2011 conference.

Transcript of The WeKnowIt Project

Page 1: The WeKnowIt Project

Emerging, Collective Intelligence for Personal, Organisational and Social Use

Symeon Papadopoulos, Yiannis Kompatsiaris (CERTH-ITI)

www.weknowit.eu

Trento, April 20

ICMR 2011

Page 2: The WeKnowIt Project

overview

motivation & concept

approach

conclusions

intelligence layers

results

Page 3: The WeKnowIt Project

motivation

• Users upload, tag, share, connect & search

availability of massive amounts of user-generated content and data

• Existing applications are limited to simple user data management or shallow analysis

• Potential for much more if we mine the data and exploit them in the right context

Page 4: The WeKnowIt Project

collective intelligence

…a form of intelligence emerging from online user activities

Collective Intelligence >> sum of individuals’ intelligences

Page 5: The WeKnowIt Project

an example

one of my photos @ flickr

my tags: wki experiment bcn (…pretty uninformative)

my location: N/A

Page 6: The WeKnowIt Project

one of my photos @ flickr

an example

others’ photos @ flickr

tags

location

Page 7: The WeKnowIt Project

one of my photos @ flickr

an example

alternative views / trends / facts

Page 8: The WeKnowIt Project

one of my photos @ flickr

an example

my friend’s photos @ flickr

what did he visit next?

Page 9: The WeKnowIt Project

one of my photos @ flickr

an example

related Linked Data

Page 10: The WeKnowIt Project

collective intelligence @ weknowit

personal intelligence

media intelligence

mass intelligence

social intelligence

organisational intelligence

Page 11: The WeKnowIt Project

motivation & concept

approach

conclusions

results

overview

intelligence layers

Page 12: The WeKnowIt Project

personal intelligence

Page 13: The WeKnowIt Project

personal intelligence

Page 14: The WeKnowIt Project

media intelligence

Visual Exploration

Page 15: The WeKnowIt Project

media intelligence

Page 16: The WeKnowIt Project

mass intelligence

Page 17: The WeKnowIt Project

mass intelligence

Tag Clustering

Page 18: The WeKnowIt Project

social intelligence

Visualise Communities

Page 19: The WeKnowIt Project

social intelligence

Community Browser

Page 20: The WeKnowIt Project

organisational intelligence

Event-based Knowledge Sharing

Page 21: The WeKnowIt Project

organisational intelligence

Distributed Group Management

Page 22: The WeKnowIt Project

architecture / integration

Service Integration

Knowledge and Content Storage

Scenario-driven Service Composition

Page 23: The WeKnowIt Project

use case: emergency response

Personal Intelligence

>> Login, Upload

>> Spam detection

>> Personalized Access

Social Intelligence

>> ER Alert Service

>> Reputation Service

Mass Intelligence

>> Clustering

>> Enrichment from additional sources

Media Intelligence

Photo arrives at ER control centre

>> Automatic localisation of photo

>> Photo & speech auto-tagging

Organisational Intelligence

>> Log Merging & Viewing

>> Incident Information Access

Page 24: The WeKnowIt Project

use case: travel

Personal Intelligence

>> Personal Recommendations

Mass Intelligence

>> Landmark & Event detection

>> Ranked facet lists of POIs

>> Hybrid Image Clustering

Social Intelligence

>> Group profiling & recommendations

>> Friends position, alert

Travel Preparation

Media Intelligence

>> Image Localisation

>> Tag suggestions

Mobile Guidance

Post Travel

Page 25: The WeKnowIt Project

case: community detection in social media (1/2)

• Structural similarity + Local expansion (highly efficient and scalable approach)

• Not necessary to know the number of clusters

• Noise resilient (not all nodes need to be part of a community)

• Generic approach adaptable to many applications (depending on node – edge representation)


S. Papadopoulos, Y. Kompatsiaris, A. Vakali. “A Graph-based Clustering Scheme for Identifying Related Tags in Folksonomies”. In Proceedings of DaWaK'10, Springer-Verlag, 65-76
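The structural-similarity idea on this slide can be illustrated with a minimal sketch. It assumes the SCAN-style definition of structural similarity (size of the shared closed neighbourhood, normalised by the geometric mean of the two neighbourhood sizes) and replaces the local expansion step with plain reachability from core nodes. The tag graph, the thresholds `eps` and `mu`, and all function names are illustrative assumptions, not taken from the cited paper.

```python
from math import sqrt

# Toy undirected tag co-occurrence graph (hypothetical data).
EDGES = [
    ("barcelona", "sagrada"), ("barcelona", "gaudi"), ("barcelona", "cathedral"),
    ("sagrada", "gaudi"), ("sagrada", "cathedral"), ("gaudi", "cathedral"),
    ("beach", "sea"), ("beach", "sand"), ("sea", "sand"),
    ("barcelona", "beach"),  # weak bridge between the two groups
    ("barcelona", "wki"),    # uninformative tag attached to a single node
]


def build_graph(edges):
    """Return an adjacency map {node: set of neighbours}."""
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)
    return graph


def structural_similarity(graph, u, v):
    """Overlap of the closed neighbourhoods of u and v (assumed SCAN-style)."""
    nu = graph[u] | {u}
    nv = graph[v] | {v}
    return len(nu & nv) / sqrt(len(nu) * len(nv))


def detect_communities(graph, eps=0.7, mu=3):
    """Group nodes around cores; nodes reached by no core are left as noise."""
    # For every node, keep only the neighbours it is structurally similar to.
    strong = {
        u: {v for v in graph[u] | {u} if structural_similarity(graph, u, v) >= eps}
        for u in graph
    }
    # A core has at least mu structurally similar neighbours (itself included).
    cores = {u for u, nbrs in strong.items() if len(nbrs) >= mu}

    assigned, communities = set(), []
    for seed in sorted(cores):
        if seed in assigned:
            continue
        members, frontier = set(), [seed]
        while frontier:
            u = frontier.pop()
            if u in assigned:
                continue
            assigned.add(u)
            members.add(u)
            if u in cores:  # only cores keep growing the community
                frontier.extend(strong[u] - assigned)
        communities.append(members)
    return communities, set(graph) - assigned


graph = build_graph(EDGES)
communities, noise = detect_communities(graph)
print(communities)
print(noise)
```

In this sketch the number of communities is never supplied as a parameter, and a node that is not structurally similar to any core ends up in `noise` rather than being forced into a community, which mirrors the two properties listed on the slide.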

Page 26: The WeKnowIt Project

case: community detection in social media (2/2)

tags: sagrada familia, cathedral, barcelona

taken: 12 May 2009 lat: 41.4036, lon: 2.1743

PHOTOS & METADATA

SPATIAL CLUSTERING + TEMPORAL ANALYSIS

COMMUNITY DETECTION

CLASSIFICATION TO LANDMARKS/EVENTS

VISUAL

TAG

HYBRID

S. Papadopoulos, C. Zigkolis, Y. Kompatsiaris, A. Vakali. “Cluster-based Landmark and Event Detection on Tagged Photo Collections”. In IEEE Multimedia Magazine 18(1), pp. 52-63, 2011
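As a rough illustration of how the photo metadata on this slide (owner, capture time, latitude, longitude) could feed the spatial and temporal steps, here is a minimal sketch. The haversine distance is the standard great-circle formula; the landmark/event rule, its thresholds, the `Photo` type and the sample photos are hypothetical stand-ins and not the classifier or data from the cited paper.

```python
from dataclasses import dataclass
from datetime import datetime
from math import asin, cos, radians, sin, sqrt


@dataclass(frozen=True)
class Photo:
    owner: str
    taken: datetime
    lat: float
    lon: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km, using a mean Earth radius of 6371 km."""
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = sin(dlat / 2) ** 2 + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2
    return 2 * 6371.0 * asin(sqrt(a))


def spatial_spread_km(photos):
    """Largest distance of any photo from the cluster's mean position."""
    lat0 = sum(p.lat for p in photos) / len(photos)
    lon0 = sum(p.lon for p in photos) / len(photos)
    return max(haversine_km(lat0, lon0, p.lat, p.lon) for p in photos)


def classify_cluster(photos, max_event_days=2, min_landmark_owners=3):
    """Assumed rule: many owners over a long period suggests a landmark."""
    span_days = (max(p.taken for p in photos) - min(p.taken for p in photos)).days
    owners = len({p.owner for p in photos})
    if owners >= min_landmark_owners and span_days > max_event_days:
        return "landmark"
    return "event"


# Hypothetical clusters: one around the coordinates shown on the slide,
# one made of photos taken by two people on a single evening.
landmark_cluster = [
    Photo("alice", datetime(2008, 7, 3), 41.4036, 2.1743),
    Photo("bob", datetime(2009, 5, 12), 41.4037, 2.1745),
    Photo("carol", datetime(2010, 1, 20), 41.4035, 2.1741),
]
event_cluster = [
    Photo("dave", datetime(2009, 5, 12, 20, 0), 41.3809, 2.1228),
    Photo("erin", datetime(2009, 5, 12, 22, 30), 41.3810, 2.1229),
]

print(classify_cluster(landmark_cluster), spatial_spread_km(landmark_cluster))
print(classify_cluster(event_cluster), spatial_spread_km(event_cluster))
```

The rule is deliberately simple: a real system would more plausibly learn the distinction from cluster features such as duration, number of contributors and tags.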

Page 27: The WeKnowIt Project

intelligence layers

motivation & concept

approach

conclusions

overview

results

Page 28: The WeKnowIt Project

• User modeling & interaction (CURIO, attention streams)

• Media understanding (photo/text localization, photo/speech auto-tagging)

• Media organization (graph-based clustering, faceted search, event detection)

• Community analysis & management (administration, browsing, reputation, notification)

• Knowledge representation & management (Event Model F, dgFOAF)

results: research

Page 29: The WeKnowIt Project

Integrated Prototypes

• ER (desktop & mobile)

• Travel (trip planning, mobile guidance, post-travel photo management)

Stand-alone applications

• WKI image recognizer

• VIRAL (visual search and automatic localization)

• ClustTour (city exploration by use of photo clusters)

• Semaplorer++

• STEVIE (mobile POI management)

results: applications

http://www.weknowit.eu/tr

Page 30: The WeKnowIt Project

results: exploitation

VIRAL evaluation by Vodafone 360

Page 31: The WeKnowIt Project

http://mklab.iti.gr/wki-apps

results: public APIs

Page 32: The WeKnowIt Project

...so far

• CI emerges from massive online activities

• it is hard to extract and manage

• ...but is definitely worth the effort.

in the future...

• other domains: news, finance, e-gov

• real-time CI

• CI Linked Data

conclusions

Page 33: The WeKnowIt Project

thank you!

Presentation online @ http://www.weknowit.eu > news

Page 34: The WeKnowIt Project

Additional Slides

Page 35: The WeKnowIt Project

content in weknowit

massive Web 2.0

WKI user-contributed

standard training data

offline model creation, training

online user profiling, method invocation

Standard annotated corpora used for training.

• Single-modality: text (Brown corpus), speech (TIMIT database), image (Corel database)

• Single-source: prepared by a single person/organization

• Consistent quality: absence of spam, malicious or erroneous data

• Small-moderate volume: Manually produced

Massive user-generated content and feedback from Web 2.0 applications.

• Multi-modality: e.g. image + tags, image + geo-location + time

• Multi-source: may be generated by different applications, user communities, e.g. Flickr, Panoramio, PhotoBucket

• Inconsistent quality: noise, spam, ambiguity

• Huge volume: Massively produced and disseminated

Online content and user actions by WeKnowIt users. It is mainly used for triggering WeKnowIt services and for providing context to them, e.g. user profile, input content to be used as example for querying, etc.

Page 36: The WeKnowIt Project

Statistical approaches

• Probabilistic models (pLSA, Bag-Of-Words)

• Graph-based approaches (SNA, community detection)

Content analysis

• Text models (n-gram, LDA, CRF)

• Image processing (visual feature extraction)

• Speech modeling (spectral analysis, HMM)

Knowledge Based

• Lookup (WordNet)

• Thesaurus Lookup (GeoPlanet)

• Concept detection (Wikipedia, domain ontologies)

Variety of approaches depending on content-metadata input.

standard training data

WKI user-contributed

massive Web 2.0 – unstructured

massive Web 2.0 – semi-structured

standard

technical approach

Page 37: The WeKnowIt Project

[Diagram: WeKnowIt services and the work packages (WP1 to WP7) that contribute to them. Legend: massive Web 2.0, WKI user-contributed, standard training data. Labels in the diagram: Locations, Topics, Social connections, Entities, Events, Representation, Storage, Access, GUI, ER, CSG, System Integration. Services and components shown: Visual analysis, Text annotation, Get recommendations, POI recommendation, POI clustering, Search place POI, csxPOIs, Tag normalization, Text classification, ClustTour, Local tag community detector, Tag processing, Emergency alert service, Community analysis tool, Entity facet extraction - ranking, Speech search, Semantic photo query, Log merger, Semaplorer(++), Event model F + M3O, CURIO, VERACITY, Common data model, Speech Indexing, Data Storage, Account Manager, Login, Community administration platform, Group Management, Manage Item Comment, Tag, Search Knowledge Base Lexical Spam Detector, Users messaging, Mobile app, Desktop proto, Post ER tool, Travel preparation, Mobile guidance, Post-travel logging, Named entity detection.]

Page 38: The WeKnowIt Project

weknowit work structure

WP1: Personal Intelligence

WP2: Media Intelligence

WP3: Mass Intelligence

WP4: Social Intelligence

WP5: Organisational Intelligence

WP6: Architecture / Integration

WP7.I Use Case: ER

WP7.II Use Case: Travel

WP8: Dissemination & Exploitation

WP9: Management

research

development

dissemination & exploitation

management

Page 39: The WeKnowIt Project

Causality Pattern in Event-Model-F

• Event (cause) implies other event (effect)

• Causal relationship holds under some justification

• Causes and effects are events, and only events
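The three bullets above can be read as constraints on a small data structure. The sketch below is one possible encoding, assuming plain Python dataclasses; the class names, fields and the example events are hypothetical and do not reproduce the Event-Model-F ontology itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    label: str


@dataclass(frozen=True)
class Justification:
    description: str


@dataclass(frozen=True)
class CausalRelation:
    """A cause event implies an effect event under some justification."""
    cause: Event
    effect: Event
    justification: Justification

    def __post_init__(self):
        # Causes and effects must be events, and only events.
        for role in ("cause", "effect"):
            if not isinstance(getattr(self, role), Event):
                raise TypeError(f"{role} must be an Event")
        # The relationship only holds under an explicit justification.
        if not isinstance(self.justification, Justification):
            raise TypeError("justification must be a Justification")


# Illustrative emergency-response example (made-up events).
rain = Event("heavy rainfall")
flood = Event("street flooding")
relation = CausalRelation(rain, flood, Justification("assessment by the control centre"))
print(relation)
```

Validating the roles in `__post_init__` means a relation with a non-event cause or effect cannot be constructed at all, which is one simple way to honour the "only events" constraint.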

Page 40: The WeKnowIt Project

OntoMDE

• Specification of MoOn using eCore and OAM as UML2 class diagram

• Transformation steps implemented

• Evaluation with ontologies of different complexity

Page 41: The WeKnowIt Project

Content vs. Structure Concepts

Page 42: The WeKnowIt Project

Activities

• Collective Intelligence Workshops and Special Session

• Summer schools

Publications

• 8 journal publications (Trans. on MultiMedia, IEEE MultiMedia, J. of Web Semantics, MTAP, etc.)

• 59 conference papers (ACM MultiMedia, SIGIR, CVPR, ESWC, WWW, ICIP, WSDM, etc.)

• 2 CI book chapters + 1 CI White Paper

• 3 patent applications

results: dissemination