Creating and Sharing Structured Semantic Web Contents through the Social Web

(Main Evaluation)
Aman Shakya
Advisor: Prof. Hideaki Takeda
Sub-advisors: Assoc. Prof. Nigel Collier, Assoc. Prof. Kenro Aihara


Page 1: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Creating and Sharing Structured Semantic Web Contents through the Social Web
(Main Evaluation)
Aman Shakya
Advisor: Prof. Hideaki Takeda
Sub-advisors: Assoc. Prof. Nigel Collier, Assoc. Prof. Kenro Aihara

Page 2: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Outline
Introduction
◦ Social Semantic Web
◦ State of the art and problems
Proposed approach
◦ The StYLiD system
◦ Concept consolidation
◦ Concept grouping
Evaluation
Practical applications
Conclusions

7/27/2009 main evaluation 2

Page 3: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Introduction


Page 4: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Background
Information sharing
◦ Information publishing
◦ Understandable semantics
◦ Information dissemination
Shared information
◦ Better utilization, increased value
Shared information put together
◦ Valuable knowledge

Page 5: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Social Web and Web 2.0
◦ Easy to publish, understand and use
◦ Information sharing platform
◦ User-generated contents
◦ Connecting people
◦ Collaboration
◦ Mass participation – power of people
◦ Wisdom of the crowds

Page 6: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Current Limitations and Needs
Data processing and automation
◦ Unstructured data only for humans
Interoperability
◦ Sharing data across different applications
Integration
◦ Combining data from different applications

Page 7: Creating and Sharing  Structured Semantic Web Contents through the Social Web

The Semantic Web
Web of structured data
Machine-understandable semantics
Ontologies
◦ Represent conceptualizations of things
◦ Consensus and common formats
Enables
◦ Automated processing
◦ Interoperation and integration
◦ Effective search and browsing

Page 8: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Challenges
Difficult to publish on the Semantic Web
Wide variety of data to share
◦ Long tail of information domains (Huynh et al., 2007)
Not enough ontologies
Ontology creation is a difficult process
Goal: to enable people to easily share a wide variety of semantically structured data

Page 9: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Social Semantic Web
Social software + Semantic Web
Web 3.0
[Figure: the Social Semantic Web, with an axis labelled information connectivity; adapted from (Decker, 2005)]

Page 10: Creating and Sharing  Structured Semantic Web Contents through the Social Web

State of the Art: Social Semantic Web
[Diagram: structured content creation on the Social Semantic Web, divided into direct structured contents and derived structured contents. Node labels: instance data creation; ontology + instance data creation; semantification of social data; semantics from text; semantics of tags; semantic blogging; semantic bookmarking; semantic desktop; semantic wikis; collaborative ontology creation; semantic annotation; data exporters; scrapers; emergent semantics]

Page 11: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Collaborative Knowledge Base Creation
[Diagram labels: users; collaborative knowledge base]
Knowledge base = ontology + instance data

Page 12: Creating and Sharing  Structured Semantic Web Contents through the Social Web

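Collaborative Knowledge Base Creation Systems
Criteria compared: ease of use, expressiveness, constraints, multiplicity, consensus
Semantic wikis (SMW, IkeWiki, etc.)
◦ Ease of use: complex (extended wiki syntax, some training needed)
◦ Expressiveness: moderate (mainly instances, concept schemas possible)
◦ Constraints: strict type constraints
◦ Multiplicity: no
◦ Consensus: needed (wiki way)
Freebase (Metaweb Inc.)
◦ Ease of use: moderate (interactive but elaborate interface)
◦ Expressiveness: moderate (concept schemas, instances)
◦ Constraints: strict type constraints
◦ Multiplicity: allowed, but concepts not related
◦ Consensus: mostly needed (wiki way, by admin)
myOntology (Siorpaes & Hepp, 2007)
◦ Ease of use: complex (understanding of ontology needed)
◦ Expressiveness: moderate (concepts, relations, instances)
◦ Constraints: strict logical constraints
◦ Multiplicity: no
◦ Consensus: needed (wiki way)
Ontology Maturing (Braun et al., 2007)
◦ Ease of use: fairly easy (need to build taxonomy)
◦ Expressiveness: low (concept hierarchy)
◦ Constraints: free tagging
◦ Multiplicity: no
◦ Consensus: needed (by interaction)
Desired solution
◦ Ease of use: easy
◦ Expressiveness: moderate
◦ Constraints: minimum
◦ Multiplicity: yes
◦ Consensus: optional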

Page 13: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Problems
1. Complexity and learning curve
◦ Powerful collaborative systems difficult for ordinary people
2. Difficult to create perfect concept definitions and ontologies
◦ Difficult to accommodate all requirements
◦ Strict constraints can make the model rigid
3. Existence of multiple conceptualizations
◦ Different perspectives or contexts
4. Difficulty of collaboration and consensus

Page 14: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Proposed Approach


Page 15: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Proposed Collaborative Knowledge Base Creation
[Diagram labels: users; three local KBs; collaborative knowledge base]

Page 16: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Overview of Proposed Approach
[Diagram labels: user community; social platform for structured data authoring; concepts; instances; structured data collection; concept consolidation (schema alignment); concept grouping (grouped concepts); structured linked data; browsing, searching, services; emerging lightweight ontologies]

Page 17: Creating and Sharing  Structured Semantic Web Contents through the Social Web

StYLiD
Structure Your own Linked Data
http://www.stylid.org
Social software for sharing a wide variety of structured data
Users freely define their own concepts
Easy for ordinary people
Consolidate multiple concept schemas
Group and organize similar concepts
Popular, evolving concept definitions

Page 18: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Creating a New Concept
[Screenshot: concept creation form for a “Hotel” concept. Callouts: list of attributes; description; suggested value range; or reuse / modify an existing concept]

Page 19: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Instance Data
[Screenshot: instance entry for “Shinjuku Prince Hotel”. Callouts: literal value; pick value from suggested range; resource URI; external URI; multiple values]

Page 20: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Concept Consolidation
Hotel 1: Name, Amenities, Capacity, Contact, Price, Access, Rating
Hotel 2: Name, Facilities, No. of rooms, Phone-number, Single room price, Double room price, Nearest station, Category, Address
Hotel 3: Name, Price, Rating, City, Country, Near-by attractions
Hotel 4: Name, Phone-number, Zip-code, Latitude, Longitude, No. of stories
Relations between attributes of the different schemas: same; synonymous / different labels; many-to-one; different contexts / perspectives; complementary

Page 21: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Hotel (consolidated concept): Name, Facilities, Capacity, Contact, Single room price, Double room price, Access, Rating, Address, Zip-code, Latitude, Longitude, Near-by attractions, No. of stories

Page 22: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Concept Consolidation
A concept consolidation is defined as a triple $\langle \bar{C}, S, A \rangle$ where
◦ $\bar{C}$ is the consolidated concept
◦ $S$ is the set of constituent concepts $\{C_1, C_2, \ldots, C_n\}$
◦ $A$ is the attribute alignment between $\bar{C}$ and $S$
Based on the Global-as-View (GAV) approach for data integration (Lenzerini, 2002)
◦ Global schema defined as views on source schemas
Consolidated concept with consolidated attributes
◦ Aligned to source concept attributes as views

Page 23: Creating and Sharing  Structured Semantic Web Contents through the Social Web

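Concept Consolidation
[Diagram: the consolidated concept $\bar{C}$, with attributes $\bar{a}_1, \bar{a}_2, \ldots, \bar{a}_m$, is a view over the constituent concepts $C_1, \ldots, C_n$. Each $C_i$, with attributes $a_{i1}, a_{i2}, \ldots, a_{in_i}$, is linked to $\bar{C}$ by a mapping $M_i$ made of relations $\mathrm{aligned}(\bar{a}_j, a_{ik})$.]
$A = \{M_1, M_2, \ldots, M_n\}$, giving the consolidation $\langle \bar{C}, S, A \rangle$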
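One way to picture the consolidation triple and the translation of an instance into the consolidated view is sketched below. This is a minimal Python illustration, not the StYLiD implementation: the class name `Consolidation`, its methods, the restriction to one-to-one alignments, the pairing of Amenities with Facilities, and the value "restaurant" are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Consolidation:
    """Illustrative container for a concept consolidation <C-bar, S, A>."""
    consolidated_attrs: List[str]          # attributes of the consolidated concept
    sources: Dict[str, List[str]]          # S: source concept -> its attributes
    # A: one mapping M_i per source, consolidated attribute -> source attribute
    alignment: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def align(self, source: str, consolidated_attr: str, source_attr: str) -> None:
        """Record aligned(consolidated_attr, source_attr) in the mapping of `source`."""
        if consolidated_attr not in self.consolidated_attrs:
            raise KeyError(consolidated_attr)
        if source_attr not in self.sources[source]:
            raise KeyError(source_attr)
        self.alignment.setdefault(source, {})[consolidated_attr] = source_attr

    def to_consolidated(self, source: str, instance: Dict[str, str]) -> Dict[str, str]:
        """Express one source instance using the consolidated attribute names."""
        mapping = self.alignment.get(source, {})
        return {c: instance[s] for c, s in mapping.items() if s in instance}


hotel = Consolidation(
    consolidated_attrs=["Name", "Facilities"],
    sources={"Hotel 1": ["Name", "Amenities"], "Hotel 2": ["Name", "Facilities"]},
)
hotel.align("Hotel 1", "Name", "Name")
hotel.align("Hotel 1", "Facilities", "Amenities")  # assumed synonymous labels
hotel.align("Hotel 2", "Name", "Name")
hotel.align("Hotel 2", "Facilities", "Facilities")

# Hypothetical instance data, viewed through the consolidated concept.
row = hotel.to_consolidated(
    "Hotel 1", {"Name": "Shinjuku Prince Hotel", "Amenities": "restaurant"}
)
print(row)
```

Many-to-one alignments (for example one price attribute against separate single and double room prices) would need a richer mapping than the plain dictionary used here.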

Page 24: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Concept Consolidation
Consolidated view of instances
Translation of instances
◦ From one conceptualization to another
Query unfolding (advantage of GAV over LAV)
◦ Queries over $\bar{C}$ (in terms of its attributes) are translated to queries over $\{C_1, C_2, \ldots, C_n\}$
◦ Using alignment $A$
◦ Union of results: $Q(\bar{C}) = Q_1(C_1) \cup Q_2(C_2) \cup \ldots \cup Q_n(C_n)$
Translation of queries
◦ [Equation garbled in the transcript: a query condition on a consolidated attribute $\bar{a}_j$ is rewritten onto the aligned source attribute]
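The query unfolding step can be sketched as follows. This is a hedged illustration under simple assumptions: queries are conjunctions of attribute = value conditions, alignments are one-to-one, and the function names `unfold` and `run_consolidated_query`, the source identifiers and the hotel instances are all hypothetical. The location/address and type/category pairs follow the consolidated search example in these slides.

```python
from typing import Dict, List, Tuple

Query = Dict[str, str]                   # attribute -> required value
Alignment = Dict[str, Dict[str, str]]    # source -> {consolidated attr: source attr}


def unfold(query: Query, alignment: Alignment) -> Dict[str, Query]:
    """Rewrite a query over the consolidated concept into one query per source."""
    unfolded = {}
    for source, mapping in alignment.items():
        # A source can answer only if every queried attribute is aligned to it.
        if all(attr in mapping for attr in query):
            unfolded[source] = {mapping[attr]: value for attr, value in query.items()}
    return unfolded


def run_consolidated_query(
    query: Query, alignment: Alignment, data: Dict[str, List[Dict[str, str]]]
) -> List[Tuple[str, Dict[str, str]]]:
    """Evaluate each unfolded query on its source and return the union of results."""
    results = []
    for source, source_query in unfold(query, alignment).items():
        for instance in data.get(source, []):
            if all(instance.get(a) == v for a, v in source_query.items()):
                results.append((source, instance))
    return results


alignment = {
    "Hotel 1": {"location": "location", "type": "type"},
    "Hotel 2": {"location": "address", "type": "category"},
}
data = {  # hypothetical instances
    "Hotel 1": [{"name": "A", "location": "Tokyo", "type": "luxury"},
                {"name": "B", "location": "Kyoto", "type": "luxury"}],
    "Hotel 2": [{"name": "C", "address": "Tokyo", "category": "luxury"}],
}
query = {"location": "Tokyo", "type": "luxury"}
hits = run_consolidated_query(query, alignment, data)
print([inst["name"] for _, inst in hits])
```

Because the consolidated concept is defined as views over the sources, unfolding is a direct substitution of attribute names, which is the advantage of GAV that the slide refers to.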

Page 25: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Concept Cloud
[Screenshot: concept cloud, with a sub-cloud and a consolidated concept marked]

Page 26: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Experiment on Conceptualization
Hypothesis
◦ Multiple conceptualizations by different people for the same thing can be consolidated
Methodology
◦ Participants given short text passages (6 participants)
◦ List down facts structured as an (attribute, value) table
◦ All concept schemas aligned manually
Example concept schema (attribute: value)
◦ name: Kiyomizu
◦ location: Kyoto
◦ …

Page 27: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Observations
[Charts: types of alignment relations found; attribute label similarity]

Page 28: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Remarks
People can express their conceptualizations in terms of schema
Different people have different conceptualizations
◦ No one covers all possible attributes
Conceptualizations overlap significantly
Most parts can be aligned
Most have simple alignment relations
Multiple conceptualizations can be consolidated

Page 29: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Alignment of Concept Schemas
Attribute alignments suggested automatically
◦ Alignment API implementation (with WordNet extension) (Euzenat, 2004)
Community-supported alignment
◦ Human intelligence + machine intelligence
Alignments are represented and saved
◦ Alignment ontology (Hughes and Ashpole, 2004)
◦ Alignment API alignment specification language (Euzenat et al., 2004)
◦ Other formats: C-OWL, SWRL, OWL axioms, XSLT, SEKT-ML and SKOS
◦ Incremental alignment (maintained collaboratively)
A unified view
◦ Consolidated concept with consolidated attributes
◦ Homogeneous table of data

Page 30: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Semi-automatic Schema Alignment
[Screenshot: two Hotel concepts aligned into consolidated attributes]

Page 31: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Consolidated Structured Search
Search on consolidated concept
◦ Find all hotels with location “Tokyo” and type “luxury”
Aligned attributes (Hotel 1 / Hotel 2)
◦ location / address
◦ type / category

Page 32: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Concept Grouping
Concept similarity
◦ ConceptSim(C1, C2) = w1*NameSim(N1, N2) + w2*SchemaSim(S1, S2)
NameSim
◦ WordNet-based similarity: Lin’s algorithm (1998)
◦ Levenshtein distance
SchemaSim
◦ Average similarity of best matching pairs of attributes
Calculate ConceptSim between all pairs of concepts
Group similar concepts above threshold

Page 33: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Schema Similarity
Calculate NameSim for all pairs of attributes to create an n1 × n2 matrix M = [NameSim(a1, a2)] over A1 × A2, where n1 = |A1| and n2 = |A2|
Find best matching pairs using the Hungarian algorithm on M (Kuhn, 1955; Munkres, 1957)
Calculate matching average
◦ SchemaSim(S1, S2) = 2 × (total similarity of best matching pairs) / (|A1| + |A2|)
Adapted from semantic similarity between sentences (Simpson and Dao, 2005)
[Diagram: attribute sets A1 and A2 of schemas S1 and S2]
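The ConceptSim and SchemaSim definitions can be put together in a short sketch. Several parts are stand-ins rather than the method used in the thesis: NameSim here uses only a normalized Levenshtein distance (the WordNet-based part is left out), the best matching is found by brute force over permutations instead of the Hungarian algorithm, and grouping by connected components is an assumption, since the slides do not say how groups are formed from similar pairs. The defaults w1 = 0.7, w2 = 0.3 and threshold 0.8 are the settings named on the evaluation slide; the concepts are hypothetical.

```python
from itertools import permutations


def levenshtein(a, b):
    """Classic dynamic-programming edit distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def name_sim(a, b):
    """Edit-distance similarity in [0, 1] (stand-in for the full NameSim)."""
    a, b = a.lower(), b.lower()
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest


def schema_sim(attrs1, attrs2):
    """2 * (total similarity of best matching pairs) / (|A1| + |A2|)."""
    if not attrs1 or not attrs2:
        return 0.0
    small, large = sorted((list(attrs1), list(attrs2)), key=len)
    # Brute-force stand-in for the Hungarian algorithm: try every one-to-one
    # assignment of the smaller attribute list into the larger one.
    best = max(
        sum(name_sim(s, l) for s, l in zip(small, perm))
        for perm in permutations(large, len(small))
    )
    return 2.0 * best / (len(attrs1) + len(attrs2))


def concept_sim(c1, c2, w1=0.7, w2=0.3):
    """Weighted sum of name similarity and schema similarity."""
    return (w1 * name_sim(c1["name"], c2["name"])
            + w2 * schema_sim(c1["attrs"], c2["attrs"]))


def group_concepts(concepts, threshold=0.8):
    """Link pairs with ConceptSim >= threshold; groups are connected components."""
    parent = list(range(len(concepts)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(concepts)):
        for j in range(i + 1, len(concepts)):
            if concept_sim(concepts[i], concepts[j]) >= threshold:
                parent[find(i)] = find(j)
    groups = {}
    for i, concept in enumerate(concepts):
        groups.setdefault(find(i), []).append(concept["name"])
    return sorted(groups.values())


concepts = [  # hypothetical concept schemas
    {"name": "Recipe", "attrs": ["ingredient", "steps"]},
    {"name": "Recipes", "attrs": ["ingredients", "steps"]},
    {"name": "Hotel", "attrs": ["name", "price"]},
]
print(group_concepts(concepts))
```

The permutation search grows factorially with schema size, so it is only suitable for small examples; an assignment solver such as the Hungarian algorithm named on the slide would be the practical choice.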

Page 34: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Visualization of Concept Grouping
[Screenshot: concept groups visualized with Cytoscape]

Page 35: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Experiments on Freebase Data
Purpose
◦ Evaluate automatic schema alignment
◦ Evaluate proposed concept grouping method
◦ Observations about user-defined concepts
Freebase: community-driven database of the world’s information
User-defined types (concept schemas)
◦ Queried out (May 20, 2008)
Cleaning
◦ Filter out test types, stop-words, types without instances

Page 36: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Observations
After cleaning
◦ 1,412 concepts
◦ 500 users who defined concepts
People want to share a wide variety of data
People define their own concept schemas
Most people only define a few concepts (1-5)
◦ Long tail of information types

Page 37: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Freebase Concept Consolidation
Concepts with same name, synonyms, morphological variants
◦ 57 consolidated concepts formed
Multiple versions of a concept by different users
◦ Up to 6 versions of the same concept
◦ Same user also defines multiple versions
Alignments suggested automatically
◦ 51 alignment relations (44 aligned attribute sets)
◦ Human judgement
◦ Precision 88.24%
◦ Recall 67.16%
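For reference, precision and recall of suggested alignments are computed as below. Only the 51 suggested relations and the two percentages come from the slide. The counts of 45 correct suggestions and 67 reference relations are inferred here because they reproduce the reported figures; they are assumptions, not values stated in the presentation.

```python
def precision(correct: int, suggested: int) -> float:
    """Share of suggested alignment relations judged correct."""
    return correct / suggested


def recall(correct: int, reference: int) -> float:
    """Share of reference (human-judged) alignment relations that were suggested."""
    return correct / reference


suggested = 51   # from the slide
correct = 45     # assumed: inferred from the reported precision
reference = 67   # assumed: inferred from the reported recall

print(f"precision = {precision(correct, suggested):.2%}")
print(f"recall    = {recall(correct, reference):.2%}")
```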

Page 38: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Concept Consolidation Example
{Recipe (user1), Recipe (user2), Recipes (user3), …}, labelled r1, r2, r3
Consolidated concept: Recipe
Consolidated attributes (aligned attribute sets shown in braces)
◦ {r1#ingredient, r2#ingredients, r3#materials}
◦ {r1#steps, r2#instructions}
◦ r3#directions
◦ r2#tools_required
◦ r3#taste
◦ r3#author
◦ …
(adapted from Freebase)

Page 39: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Evaluation of Concept Grouping
ConceptSim(C1, C2) = w1*NameSim(N1, N2) + w2*SchemaSim(S1, S2)
[Chart: concept grouping with different thresholds (w1 = 0.7, w2 = 0.3)]
[Chart: concept grouping with different weights (threshold = 0.8)]

Page 40: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Emergence of Lightweight Ontologies
Concepts contributed by community
Concept consolidation
Concept grouping
Popularity of concepts (as in tag clouds)
Common vocabulary for structured information sharing
Conceptual schemas (class/property)
Informal organization by similarity

Page 41: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Informal Lightweight Ontology
[Figure; source: Schaffert et al. (2005), p. 7]

Page 42: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Evaluation


Page 43: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Evaluation of Usability
Hypothesis
◦ StYLiD is more usable than Freebase (for given tasks)
Methodology
◦ Tasks performed with StYLiD and Freebase
  Task 1 - Structured data authoring
  Task 2 - Concept schema creation
  Task 3, 4 - Modifying and reusing concepts
  Task 5 - Structured concepts and instances authoring
  Task 6 - Searching
◦ Observations: questionnaires, screen logs, comments, etc.

Page 44: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Example (Task 1)
Input: Band – The Beatles

Page 45: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Participants
Total 15 participants
◦ Including 6 without IT background
◦ Different backgrounds: public policy, international relations, psychology, telecommunication, networks, hotel staff, etc.
◦ From 10 countries
◦ Age: 22 – 43 (avg. 28.3)
◦ Most did not know the systems before

Page 46: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Results
System Usability Scale (SUS) (Digital Equipment Corp.)
◦ Average scores: StYLiD – 69.7%, Freebase – 39.3%
◦ Enhanced Semantic MediaWiki – 54.8% (Pfisterer et al., 2008)
[Chart: aggregated results from the tasks (score: 0-4)]
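As background for the SUS figures, the standard SUS scoring rule can be sketched as follows: ten questionnaire items answered on a 1 to 5 scale, odd-numbered items contributing (answer − 1), even-numbered items contributing (5 − answer), and the sum multiplied by 2.5 to give a score from 0 to 100. The function name and the sample answers are hypothetical; the slides report only the averages.

```python
def sus_score(responses):
    """Standard SUS score (0-100) from ten answers on a 1-5 scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for item, answer in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


def mean_sus(all_responses):
    """Average SUS score over several participants."""
    scores = [sus_score(r) for r in all_responses]
    return sum(scores) / len(scores)


# Hypothetical answers from two participants.
participants = [
    [4, 2, 4, 2, 4, 2, 4, 2, 4, 2],
    [3, 3, 3, 3, 3, 3, 3, 3, 3, 3],
]
print(mean_sus(participants))
```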

Page 47: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Results for non-IT participants
6 participants
SUS scores
◦ StYLiD (71.67%), Freebase (50.42%)

Page 48: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Observations
StYLiD quite usable without any training, knowledge or help
Most users preferred StYLiD to Freebase
Specifying attribute value range not easy
Strict data type constraints can cause problems
Many people modify and reuse concepts
People try to input all data in minimum steps
Data entry can be made easier and quicker
◦ Auto-complete mechanisms would be helpful

Page 49: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Comparison with some systems
Values are listed in the order StYLiD | Freebase | Semantic MediaWiki
• Concept creation: UI supported | UI supported | Template markup
• Instance creation: Form-based | Form-based | Extended wiki syntax + forms
• Data authoring: Blogging / social bookmarking | Structured wiki | Wiki text annotation
• Data import: Wrappers | Bulk import facility | Not possible
• Constraints: Flexible | Strict type constraints | Strict type constraints
• Multiplicity: Allowed | Partly | No
• Consolidation: Schema-level | Some instances | No
• Organization: Concept grouping | Bases | Categories

Page 50: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Practical Applications


Page 51: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Application Scenarios
Social Site for Structured Information Sharing
[Diagram labels: users; information sharing social semantic website; StYLiD CMS; concept schemas; structured data; integration / schema alignment; external data resources]

Page 52: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Application Scenarios
Integrated Semantic Portal
[Diagram labels: information sources IS1, IS2, IS3 with Wrapper1, Wrapper2, Wrapper3; external data resources; StYLiD data backend; concept schemas; structured data; integration / schema alignment; integrated semantic portal; users; admin]

Page 53: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Adapting to different scenarios
Variable aspects
◦ Data and concepts acquisition
◦ Community and motivation
◦ Functionalities and constraints
◦ Data quality
Ways of adaptation
◦ Use of wrappers, etc.
◦ Delegate functionalities/constraints
◦ Extensible and customizable open source
◦ Customized queries and views

Page 54: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Real practical applications
Integration of research staff directories
◦ Osaka University and Nagoya University
◦ Data scraped from the websites
A musical community website in Tokyo International Exchange Center
Social data bookmarking site StYLiD.org
A document management system in AIT

Page 55: Creating and Sharing  Structured Semantic Web Contents through the Social Web

University Directory Integration
• 10 alignments automatically suggested
• All correct
• Total 19 alignments

Page 56: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Integrated interface


Page 57: Creating and Sharing  Structured Semantic Web Contents through the Social Web

TIEC Musical Community website

Page 58: Creating and Sharing  Structured Semantic Web Contents through the Social Web


StYLiD.org Data Bookmarking

Page 59: Creating and Sharing  Structured Semantic Web Contents through the Social Web


Document Management system

Page 60: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Structured Information Dissemination in Decentralized Communities
[Diagram labels: several SocioBiblog systems, each with publishing and aggregation; Web; extended RSS; social network links]

Page 61: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Conclusions


Page 62: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Conclusions
Social web application for sharing structured Semantic Web contents
◦ StYLiD
◦ Free contribution, no strict constraints
◦ Usable (even without training)
Concept consolidation
◦ Multiple conceptualizations exist
◦ Overlap significantly and can be consolidated
◦ Automatic alignments with good precision and recall
◦ A loose collaborative approach for creating shared concept definitions

Page 63: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Conclusions (contd.)
Concept grouping by similarity
◦ Informal organization
◦ Good precision can be obtained
◦ Parameters can be tuned for appropriate coverage and precision
Emergent lightweight informal ontologies
◦ Ontology as by-product of information sharing and integration
Practical applications

Page 64: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Future Directions
Computing concept relations
◦ Hierarchical and non-hierarchical
Better schema alignment techniques
Consolidation of data instances
Using existing vocabularies
Mash-ups / plugins to utilize structured data
Scrapers to collect data from the web
…

Page 65: Creating and Sharing  Structured Semantic Web Contents through the Social Web

Thank You!
Questions
Suggestions