<is web> Information Systems & Semantic Web
University of Koblenz ▪ Landau, Germany

Semantic Search examples: Swoogle and Watson

Steffen Staab, Maciej Janik

credit: Tim Finin (Swoogle), Mathieu d’Aquin (Watson) and their groups

Semantic Web

2009-07-17

<is web>

Maciej Janik, janik@uni-koblenz.de

Semantic Web, 2 of 52

ISWeb - Information Systems & Semantic Web

Types of semantic search engines

Semantically-enhanced Search: Yahoo! SearchMonkey, Google Squared, …

NLP-based Search: MetaWeb Freebase, Powerset, …

Semantic-NLP-based Search: hakia, Cognition, …

Computational-NLP-based Search: True Knowledge, Wolfram Alpha, …

Semantic Web Search: Swoogle, Sindice, SWSE, Falcon-S, Watson, Shoe, …

Semantic Data Vs. Application needs

[Figure: Data → ? → Applications]

Applications:
- Find and rank ontologies
- Answer NLP question
- Discover how entities are related
- Query multiple knowledge bases
- Browse ontologies

Gateway to Semantic Data

Dynamically retrieving, exploiting and combining relevant semantic resources

Semantic Search? Finding Ontologies?

For reuse
- To build upon what exists
- To adopt what is used in practice
- Not to re-invent the wheel
- Because it is simpler than building from scratch

For applications
- Because semantic applications need knowledge
- Because knowledge is hard to acquire
- Because some scenarios require gathering this knowledge at run-time
- Because in some scenarios, the more there is, the better

Swoogle – first Semantic Web Gateway

[Figure: Swoogle as the gateway between Data and Applications]

Applications:
- Find and rank ontologies
- Answer NLP question
- Discover how entities are related
- Query multiple knowledge bases
- Browse ontologies

Swoogle

- Created in 2004
- Crawls and discovers documents in RDF, OWL
- Indexing and retrieval system
- Search for Semantic Web Documents (SWD)
  - Ontologies
  - Instance data

http://swoogle.umbc.edu/

Swoogle uses four kinds of crawlers to discover semantic web documents and several analysis agents to compute metadata and relations among documents and ontologies. Metadata is stored in a relational DBMS. Services are provided to people and agents.

SWD Rank: A SWD’s rank is a function of its type (SWO/SWI) and the rank and types of the documents to which it’s related.
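
The slide does not give the ranking formula. Purely as an illustration, the sketch below assumes a PageRank-style iteration in which every link between documents is weighted by the kind of relation it expresses. The function name, document names, link kinds and weights are hypothetical and are not Swoogle's actual values.

```python
def swd_rank(links, weights, damping=0.85, iterations=50):
    """PageRank-style rank where each link's share depends on its kind.

    links   -- list of (source, target, kind) tuples between documents
    weights -- weight per link kind (hypothetical values, not Swoogle's)
    """
    docs = sorted({d for s, t, _ in links for d in (s, t)})
    n = len(docs)
    rank = {d: 1.0 / n for d in docs}
    # Total outgoing weight per document, used to split its rank.
    out = {d: sum(weights[k] for s, _, k in links if s == d) for d in docs}
    for _ in range(iterations):
        # Rank held by documents without outgoing links is spread evenly.
        dangling = sum(rank[d] for d in docs if out[d] == 0)
        new = {d: (1 - damping) / n + damping * dangling / n for d in docs}
        for s, t, kind in links:
            new[t] += damping * rank[s] * weights[kind] / out[s]
        rank = new
    return rank


# Hypothetical documents: two instance files using one ontology,
# which in turn imports a second ontology.
weights = {"uses-term": 1.0, "imports": 3.0}
links = [
    ("alice.rdf", "people-ontology", "uses-term"),
    ("bob.rdf", "people-ontology", "uses-term"),
    ("people-ontology", "core-ontology", "imports"),
]
ranks = swd_rank(links, weights)
for doc, value in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{doc:16s} {value:.3f}")
```

In this toy graph the ontologies end up ranked above the instance files that refer to them, which is the kind of behaviour the slide describes.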

SWD IR Engine: Swoogle puts documents into a character n-gram based IR engine to compute document similarity and do retrieval from queries.
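
To make "character n-gram based" concrete, here is a minimal sketch that compares two strings by the cosine similarity of their character trigram counts. It is a generic illustration of the technique, not Swoogle's engine; the function names and sample strings are assumptions.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n=3):
    """Count the overlapping character n-grams of a lower-cased string."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine_similarity(a, b, n=3):
    """Cosine similarity between the n-gram count vectors of two strings."""
    va, vb = char_ngrams(a, n), char_ngrams(b, n)
    dot = sum(va[g] * vb[g] for g in va.keys() & vb.keys())
    norm = sqrt(sum(c * c for c in va.values())) * sqrt(sum(c * c for c in vb.values()))
    return dot / norm if norm else 0.0


# Hypothetical document text and two queries.
doc = "Person hasName foafName knows Person"
close = cosine_similarity(doc, "person hasname name knows")
far = cosine_similarity(doc, "wine region grape vintage")
print(close > far)
```

Character n-grams need no tokenizer, which is convenient for identifiers such as `hasName` that ordinary word-based indexing would treat as a single opaque token.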

[Figure: Swoogle 2 architecture. Phases: discovery, digest, analysis, service. Components: the Web, candidate URLs, Web crawler, SWD reader, SWD cache, SWD metadata, IR analyzer, SWD analyzer, SWD IR engine, Web server, Web service, human users, intelligent agents. Services: Swoogle Search, Swoogle Statistics, Ontology Dictionary.]

Some statistics …
- Over 10,000 ontologies
- 705,078,866 triples
- 2,978,838 SWDs

How can Swoogle be used?

Find ontologies
- Containing keywords, terms, concepts
- Similar to ‘http://myontology.org/…’
- Used to describe document X (directly or indirectly)

Find SW documents
- Containing keywords, terms …
- Used or created by a specific institution

Browse
- Ontologies using a specific topic hierarchy
- Ontology metadata
- Entities and navigate between them

Swoogle architecture

- Uses Google APIs to discover URIs
- Crawls using these URIs as seeds
- Allows users to submit URIs
- Offers multiple interfaces
- Generates metadata
  - Used for ranking
  - Basic RDF statistics and ontology annotation
  - RDF statistics (determine SWD or SWO)
  - Ontology annotation

Limitations of Swoogle

No quality control mechanisms
- Many ontologies are duplicated
- No quality information provided

Limited query/search mechanisms

No support for relations between ontologies
- Duplication, incompatibility (contradiction), modularization, versioning, etc.

Watson

Semantic Web gateway

[Figure: Watson as the gateway between Data and Applications]

Applications:
- Find and rank ontologies
- Answer NLP question
- Discover how entities are related
- Query multiple knowledge bases
- Browse ontologies

Watson design principles

Information on ontology metrics
– Provides information on structure-based ontology metrics (e.g. richness of concepts / individuals, ontology population)
– Provides more semantic information for semantic applications (e.g. ontology topic relevance) to help discover, select, exploit and combine semantic resources

Provides a variety of query and access mechanisms
– For both humans (web interface) and machines (web services, API)
– To fit applications having different purposes and requirements
– Ranging from keyword search to ontology exploration and formal queries (SPARQL)

Support for relations between ontologies
– Detecting redundancy, duplication, incompatibility (contradiction), modularization, versioning, etc.

Watson architecture – functional overview

Watson: GUIs and APIs

GUIs
- Several web-based interfaces for people

APIs
- Important for Semantic Web application developers
- Lightweight web services
- Offer the same or extended functionality as the GUIs

Web Interface

Advanced Keyword Search

Web Interface

Ontology Exploration

Web Interface

Ontology Metadata

Web Interface

Querying

Applications of Watson

Search for paths in ontologies
- Discover how entities are related

Knowledge selection and modularization
- Find and use only the knowledge you need

Natural language question answering
- Use Semantic Web data to analyze a question and get an answer

Folksonomy enrichment
- Discover relationships between tags using ontologies

Semantic query expansion
- Generalize or specialize your queries

Relation discovery
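
Discovering how two entities are related can be read as a path search over the triples of an ontology. The sketch below assumes the simplest approach, a breadth-first search that follows triples in both directions. The function name and sample triples are hypothetical, and nothing here claims to be how the Watson-based tool does it.

```python
from collections import deque


def find_relation_path(triples, start, goal):
    """Breadth-first search for a shortest chain of triples linking two entities.

    Triples are followed in both directions, so the path shows how the
    entities are connected regardless of edge orientation.
    """
    neighbours = {}
    for s, p, o in triples:
        neighbours.setdefault(s, []).append((o, (s, p, o)))
        neighbours.setdefault(o, []).append((s, (s, p, o)))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for nxt, triple in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [triple]))
    return None  # the entities are not connected


# Hypothetical example data.
triples = [
    ("Alice", "authorOf", "Paper1"),
    ("Paper1", "publishedAt", "ConferenceX"),
    ("Bob", "attended", "ConferenceX"),
]
path = find_relation_path(triples, "Alice", "Bob")
for s, p, o in path:
    print(s, p, o)
```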

Knowledge Selection and Modularization

[Figure: search terms t1, t2, t3, t4, t5, …, tn → Ontology Selection → Web; one ontology contains t1, t2, t3, t4, t5]

The ideal world (Web)

Ideal ontology includes
- All search terms
- Some additional knowledge

Knowledge Selection and Modularization

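[Figure: search terms t1, t2, t3, t4, t5, …, tn → Ontology Selection → Web; the terms are spread over several ontologies: {t1, t2, tn}, {t1, t3}, {t3, t4}, {t5}]

The real world (Web)

Ontology search results:
- Multiple ontologies
- Containing parts of the interesting terms
- Much more additional knowledge (not so interesting for us)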

Knowledge Selection and Modularization

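[Figure: Knowledge Selection. Search terms t1, t2, t3, t4, t5, …, tn → Ontology Selection → Web, returning ontologies that cover {t1, t2, tn}, {t1, t3}, {t3, t4}, {t5}. Ontology Modularization is then applied to each selected ontology to extract the part containing the terms.]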

Knowledge Selection and Modularization

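[Figure: Knowledge Selection. Search terms t1, t2, t3, t4, t5, …, tn → Ontology Selection → Web, returning ontologies that cover {t1, t2, tn}, {t1, t3}, {t3, t4}, {t5}. Ontology Modularization extracts a module from each selected ontology; Ontology Merging combines the modules into one ontology containing t1, t2, t3, t4, t5, tn.]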

Modularization and integration

[Figure: Query terms → Found large ontology containing results and more → Interesting ontology modules → Ontology fragment interesting for us]
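
The pipeline above can be illustrated with a deliberately simple notion of "module": the query terms plus everything reachable upwards through subclass links. Real modularization techniques are more involved. The function name, the dictionary representation and the sample ontology are assumptions made for this sketch.

```python
def extract_module(subclass_of, terms):
    """Return the subclass edges reachable upwards from the query terms.

    subclass_of -- dict mapping a class to the list of its direct superclasses
    terms       -- classes the application is interested in
    """
    module, visited, todo = set(), set(), list(terms)
    while todo:
        cls = todo.pop()
        if cls in visited:
            continue
        visited.add(cls)
        for parent in subclass_of.get(cls, []):
            module.add((cls, parent))
            todo.append(parent)
    return module


# Hypothetical "large ontology" containing the results and more.
ontology = {
    "Dolphin": ["MarineMammal"],
    "MarineMammal": ["Mammal"],
    "Tiger": ["Mammal"],
    "Mammal": ["Animal"],
    "Sparrow": ["Bird"],
    "Bird": ["Animal"],
}
module = extract_module(ontology, ["Dolphin"])
print(sorted(module))
```

Only the chain from `Dolphin` up to `Animal` is kept; the `Tiger`, `Sparrow` and `Bird` parts of the ontology are left out of the fragment.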

PowerAqua

Bridge the gap between the user and the Semantic Web:
- Provide the user with the capability to query the SW using natural language.
- Dynamically select and combine information drawn from the vast amount of heterogeneous semantic data to answer a user’s query.

PowerAqua

1. NL question
2. Linguistic interpretation
3. Ontology-based interpretation
4. Answer

PowerAqua algorithm

FLOR: FoLksonomy Ontology enRichment

Can the Semantic Web provide the structure needed to improve search and navigation of tagged spaces?

FLOR methodology

Search in Tag Spaces

Let’s find photos of “animals which live in the water”

Query: Animal Water

[Figure: 24 result photos, 19 of them labelled Dog (4), Bird (7), Tiger (4), Cat (1), Landscape (3)]

5/24 ≈ 21% precision

Use ontologies for answering queries

[Figure: ontology fragment. Classes: Animal, Mammal, Marine Mammal, Dolphin, Seal, Whale, Sea Elephant, Fish, FreshwaterFish, SaltwaterFish, Body of Water, Sea, Ocean. Relation: livesIn.]

Animal Water

<Animal livesIn Water>

<Dolphin> or <Seal> or <“Sea Elephant”> or <Whale>
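
The rewriting from "Animal Water" to the four concrete tags can be sketched with a toy ontology. The sketch encodes only the marine mammal branch of the figure and assumes that the keyword "Water" maps to the class "Body of Water". The dictionaries and function names are hypothetical and are not the FLOR implementation.

```python
# Toy ontology: only the marine mammal branch of the figure is modelled.
SUBCLASS_OF = {
    "Dolphin": "Marine Mammal",
    "Seal": "Marine Mammal",
    "Whale": "Marine Mammal",
    "Sea Elephant": "Marine Mammal",
    "Marine Mammal": "Mammal",
    "Mammal": "Animal",
    "Sea": "Body of Water",
    "Ocean": "Body of Water",
}
LIVES_IN = {"Marine Mammal": "Sea"}


def ancestors(cls):
    """Yield cls followed by all of its superclasses."""
    while cls is not None:
        yield cls
        cls = SUBCLASS_OF.get(cls)


def expand(subject, place):
    """Leaf classes under `subject` that inherit a livesIn link to a kind of `place`."""
    parents = set(SUBCLASS_OF.values())
    leaves = [c for c in SUBCLASS_OF if c not in parents]
    hits = []
    for leaf in leaves:
        chain = list(ancestors(leaf))
        if subject not in chain:
            continue
        homes = [LIVES_IN[c] for c in chain if c in LIVES_IN]
        if any(place in ancestors(home) for home in homes):
            hits.append(leaf)
    return sorted(hits)


expanded = expand("Animal", "Body of Water")
print(" or ".join(f"<{tag}>" for tag in expanded))
```

The expanded tags can then be sent to the tag search in place of the original two keywords.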

Results

[Figure: result photos for the tags dolphin, seal, whale, sea elephant]

18/24 = 75% precision
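
The two precision figures follow directly from the counts on the slides: the number of relevant photos divided by the number of photos returned. A small check, with variable names chosen for this sketch:

```python
def precision(relevant_retrieved, retrieved):
    """Fraction of the retrieved items that are relevant."""
    return relevant_retrieved / retrieved


tag_search = precision(5, 24)       # query "Animal Water" on the raw tags
ontology_search = precision(18, 24)  # query expanded with ontology terms
print(f"{tag_search:.0%} -> {ontology_search:.0%}")
```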

Watson

Web Documents

Expand query for web documents with relevant ontology terms

Semantic query extensions

An extension of a web search engine that suggests ways to extend a query thanks to online ontologies

Example with the query “researcher”
- Suggests “academic staff”, “Person”, etc. as terms to generalize the query, and “professor”, “PhD student” as terms to specialize the query
- Without having to give the system any knowledge: everything comes from the Web!

Versions:
- Gowgle (http://watson.kmi.open.ac.uk/gowgle): uses the Google SOAP API and the Watson SOAP API
- Wahoo (http://watson.kmi.open.ac.uk/wahoo): uses the Yahoo! REST API and the Watson REST API

[Screenshot of Wahoo (REST based): query, results from Yahoo!, term suggestions, Add/Replace]

How does it work?

Find ontologies containing the keyword “researcher”
http://watson.kmi.open.ac.uk/API/semanticcontent/keywords?q=researcher

… exactly “researcher” in the label or id of a class
http://watson.kmi.open.ac.uk/API/semanticcontent/keywords?q=researcher&scope=LN+Label&ent=Class&match=Exact

Find entities corresponding to “researcher” in an ontology
http://watson.kmi.open.ac.uk/API/entity/keyword?q=researcher&uri=http://calo.sri.com/core-plus-office&scope=LN+Label&ent=Class&match=Exact

Find subclasses and superclasses of an entity
http://watson.kmi.open.ac.uk/API/entity/subclasses?ent=http://calo.sri.com/core-plus-office#Researcher&uri=http://calo.sri.com/core-plus-office

The rest is interface stuff and a call to Yahoo!
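
The four request URLs can be assembled programmatically. The sketch below only builds the URLs from the paths and parameters shown on the slide and sends nothing, since the service may no longer be reachable. The helper name `watson_url` is an assumption, and embedded URIs are percent-encoded here, which the slide's hand-written URLs do not do.

```python
from urllib.parse import urlencode

WATSON_API = "http://watson.kmi.open.ac.uk/API"
ONTOLOGY = "http://calo.sri.com/core-plus-office"


def watson_url(path, **params):
    """Build a request URL for the given API path (hypothetical helper)."""
    return f"{WATSON_API}/{path}?{urlencode(params)}"


# Restrict matching to the local name or label of classes, exact match only.
exact_class = {"scope": "LN Label", "ent": "Class", "match": "Exact"}

steps = [
    # 1. Ontologies containing the keyword.
    watson_url("semanticcontent/keywords", q="researcher"),
    # 2. Same, but exactly "researcher" in the label or id of a class.
    watson_url("semanticcontent/keywords", q="researcher", **exact_class),
    # 3. Entities corresponding to "researcher" in one ontology.
    watson_url("entity/keyword", q="researcher", uri=ONTOLOGY, **exact_class),
    # 4. Subclass lookup for the entity found in step 3.
    watson_url("entity/subclasses", ent=ONTOLOGY + "#Researcher", uri=ONTOLOGY),
]
for url in steps:
    print(url)
```

Encoding the `#` in the entity URI matters in practice: left unencoded, everything after it would be treated as a URL fragment and never sent to the server.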

Faceted browsing and ranking

Faceted browsing / search – great idea!

Limitations?
- Huge number of entities / facets, but limited display capabilities
- What to show? What is important?

Ranking with TripleRank

Rank resources in the ontology w.r.t.
- Link structure
- Authority modeling (HITS algorithm)
- Type of relationships between resources
  • Additional dimension

Based on tensor analysis and decomposition
- like an “extension of matrix analysis and SVD”

Thomas Franz, Antje Schultz, Sergej Sizov, and Steffen Staab. “TripleRank: Ranking Semantic Web Data By Tensor Decomposition”

HITS by example

Traditional ranking with HITS algorithm

HITS by example

Hyperlink-Induced Topic Search finds:
- Authorities (a): pages with good content about a topic, linked to by many hubs
- Hubs (h): pages that link to many good authority pages on a topic (directories)

HITS is an iterative process to calculate hubs and authorities

[Equations shown as an image on the slide]

Traditional ranking with HITS algorithm
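
The iteration can be written down in a few lines. In the standard formulation, a page's authority score is the sum of the hub scores of the pages linking to it, its hub score is the sum of the authority scores of the pages it links to, and both vectors are normalized after each round. The sketch below follows that formulation; the graph and node names are made up for illustration.

```python
def hits(edges, iterations=50):
    """Plain HITS on a list of (source, target) links.

    authority(p) = sum of hub(q) over all q linking to p
    hub(p)       = sum of authority(q) over all q that p links to
    Both score vectors are rescaled to unit Euclidean length each round.
    """
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    def normalize(scores):
        norm = sum(v * v for v in scores.values()) ** 0.5 or 1.0
        return {n: v / norm for n, v in scores.items()}

    for _ in range(iterations):
        auth = normalize({n: sum(hub[s] for s, t in edges if t == n) for n in nodes})
        hub = normalize({n: sum(auth[t] for s, t in edges if s == n) for n in nodes})
    return hub, auth


# Hypothetical link graph: three directory-like pages pointing at two content pages.
edges = [("h1", "a1"), ("h1", "a2"), ("h2", "a1"), ("h3", "a1")]
hub, auth = hits(edges)
print(max(auth, key=auth.get), max(hub, key=hub.get))
```

Here `a1` comes out as the strongest authority because every hub links to it, and `h1` as the strongest hub because it links to both authorities.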

TripleRank

The same graph represented as tensor in TripleRank

[Figure: tensor built from one adjacency matrix for each single property; the third axis is labelled “properties”]
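
A minimal sketch of that representation: each property gets its own adjacency matrix, and stacking the matrices gives a three-way tensor. The index order (property, subject, object), the function names and the sample triples are assumptions made for illustration; the decomposition step itself is not shown.

```python
def build_tensor(triples):
    """Return (entities, properties, tensor) where tensor[p][i][j] == 1
    when entity i is linked to entity j by property p."""
    entities = sorted({e for s, _, o in triples for e in (s, o)})
    properties = sorted({p for _, p, _ in triples})
    index = {e: i for i, e in enumerate(entities)}
    n = len(entities)
    tensor = [[[0] * n for _ in range(n)] for _ in properties]
    for s, p, o in triples:
        tensor[properties.index(p)][index[s]][index[o]] = 1
    return entities, properties, tensor


def collapse(tensor):
    """Sum the property slices into the single matrix that plain HITS works on."""
    n = len(tensor[0])
    return [[sum(sl[i][j] for sl in tensor) for j in range(n)] for i in range(n)]


# Hypothetical RDF-like triples.
triples = [
    ("alice", "knows", "bob"),
    ("alice", "worksWith", "bob"),
    ("bob", "knows", "carol"),
]
entities, properties, tensor = build_tensor(triples)
adjacency = collapse(tensor)
print(properties, adjacency)
```

Collapsing the tensor loses the information that alice and bob are linked by two different properties, which is exactly the extra dimension the tensor view keeps.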

TripleRank tensor analysis

Results of the tensor analysis:

TripleRank results

General linkage ranking

Property-specific ranking

TripleRank - ranked properties

Possible use for faceted browsing using properties

Evaluation interface

Conclusions

Search is one of the most useful services
- The Semantic Web needs such services too!
- Find relevant ontologies
- Reuse knowledge

Semantic search is much more focused than traditional (syntactic) search
- Different types of results and appropriate presentation
- Richer metadata
- Query understanding (NLP) and reasoning

The number of semantic search engines is growing
- Need both a “google” for semantics …
- … and very specialized services (e.g. medicine)