<is web> Information Systems & Semantic Web, University of Koblenz ▪ Landau, Germany
Semantic Search
examples: Swoogle and Watson
Steffen Staab, Maciej Janik
credit: Tim Finin (Swoogle), Mathieu d’Aquin (Watson) and their groups
Semantic Web
2009-07-17
<is web>
Maciej [email protected]
Semantic Web2 of 52
ISWeb - Information Systems & Semantic Web
Types of semantic search engines
Semantically-enhanced Search: Yahoo! SearchMonkey, Google Squared, …
NLP-based Search: MetaWeb Freebase, Powerset, …
Semantic-NLP-based Search: hakia, Cognition, …
Computational-NLP-based Search: True Knowledge, Wolfram Alpha, …
Semantic Web Search: Swoogle, Sindice, SWSE, Falcon-S, Watson, Shoe, …
Semantic Data vs. Application Needs
[Figure: Data on one side, Applications on the other, with a “?” between them]
Application needs:
• Find and rank ontologies
• Answer NLP question
• Discover how entities are related
• Query multiple knowledge bases
• Browse ontologies
Gateway to Semantic Data
Dynamically retrieving, exploiting and combining relevant semantic resources
Semantic Search? Finding Ontologies?
For reuse
– To build upon what exists
– To adopt what is used in practice
– Not to re-invent the wheel
– Because it is simpler than building from scratch
For applications
– Because semantic applications need knowledge
– Because knowledge is hard to acquire
– Because some scenarios require gathering this knowledge at run-time
– Because in some scenarios, the more there is, the better
Swoogle – first Semantic Web Gateway
[Figure: Swoogle placed between Data and Applications]
• Find and rank ontologies
• Answer NLP question
• Discover how entities are related
• Query multiple knowledge bases
• Browse ontologies
Swoogle
Created in 2004
Crawls and discovers documents in RDF, OWL
Indexing and retrieval system
Search for Semantic Web Documents (SWD)
– Ontologies
– Instance data
http://swoogle.umbc.edu/
Swoogle uses four kinds of crawlers to discover Semantic Web documents and several analysis agents to compute metadata and relations among documents and ontologies. Metadata is stored in a relational DBMS. Services are provided to people and agents.
SWD Rank: an SWD’s rank is a function of its type (SWO/SWI) and the rank and types of the documents to which it is related.
Swoogle puts documents into a character n-gram based IR engine to compute document similarity and to retrieve documents for queries.
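To make the character n-gram idea concrete, here is a minimal sketch of n-gram based similarity. It is not Swoogle's implementation: the trigram size, the cosine measure and the function names `char_ngrams` and `cosine` are assumptions chosen for illustration.

```python
import math
from collections import Counter


def char_ngrams(text, n=3):
    """Count the overlapping character n-grams of a lower-cased string."""
    text = text.lower()
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def cosine(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Strings that share many trigrams score higher than unrelated ones.
close = cosine(char_ngrams("ontology"), char_ngrams("ontologies"))
far = cosine(char_ngrams("ontology"), char_ngrams("wheel"))
print(close > far)
```

Character n-grams need no tokenizer or stemmer, which is one plausible reason to use them on documents whose terms are mostly identifiers and URI fragments.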
[Figure: Swoogle 2 architecture. Stages: discovery, digest, analysis, service. Components: The Web, candidate URLs, web crawler, SWD reader, SWD cache, SWD metadata, IR analyzer, SWD analyzer, SWD IR engine, web server, web service. Users: human users, intelligent agents. Services: Ontology Dictionary, Swoogle Search, Swoogle Statistics.]
Some statistics …
– Over 10,000 ontologies
– 705,078,866 triples
– 2,978,838 SWDs
How can Swoogle be used?
Find ontologies
– Containing keywords, terms, concepts
– Similar to ‘http://myontology.org/…’
– Used to describe document X (directly or indirectly)
Find SW documents
– Containing keywords, terms …
– Used or created by a specific institution
Browse
– Ontologies using a specific topic hierarchy
– Ontology metadata
– Entities, and navigate between them
Swoogle architecture
Uses Google APIs to discover URIs
Crawls using these URIs as seeds
Allows users to submit URIs
Offers multiple interfaces
Generates metadata
– Used for ranking
– Basic RDF statistics and ontology annotation
– RDF statistics (determine SWD or SWO)
– Ontology annotation
Limitations of Swoogle
No quality control mechanisms
– Many ontologies are duplicated
– No quality information provided
Limited query/search mechanisms
No support for relations between ontologies
– Duplication, incompatibility (contradiction), modularization, versioning, etc.
Watson
Semantic Web gateway
[Figure: Watson placed between Data and Applications]
• Find and rank ontologies
• Answer NLP question
• Discover how entities are related
• Query multiple knowledge bases
• Browse ontologies
Watson design principles
Information on ontology metrics
– Provides information on structure-based ontology metrics (e.g. richness of concepts / individuals, ontology population)
– Provides more semantic information for semantic applications (e.g. ontology topic relevance) to help discover, select, exploit and combine semantic resources
Provides a variety of query and access mechanisms
– For both humans (web interface) and machines (web services, API)
– To fit applications having different purposes and requirements
– Ranging from keyword search to ontology exploration and formal queries (SPARQL)
Support for relations between ontologies
– Detecting redundancy, duplication, incompatibility (contradiction), modularization, versioning, etc.
Watson architecture – functional overview
Watson: GUIs and APIs
GUIs
– Several web-based interfaces for people
APIs
– Important for Semantic Web application developers
– Lightweight web services
– Offer the same functionality as the GUIs, or an extended version of it
Web Interface
Advanced Keyword Search
Web Interface
Ontology Exploration
Web Interface
Ontology Metadata
Web Interface
Querying
Applications of Watson
Search for paths in ontologies
– Discover how entities are related
Knowledge selection and modularization
– Find and use only the knowledge you need
Natural language question answering
– Use Semantic Web data to analyze the question and get the answer
Folksonomy enrichment
– Discover relationships between tags using an ontology
Semantic query expansion
– Generalize or specialize your queries
Relation discovery
Knowledge Selection and Modularization
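The ideal world (Web)
[Figure: the search terms t1, t2, t3, t4, t5, …, tn go through Ontology Selection over the Web, which returns a single ontology containing t1, t2, t3, t4 and t5]
Ideal ontology includes
- All search terms
- Some additional knowledge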
Knowledge Selection and Modularization
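The real world (Web)
[Figure: the search terms t1, t2, t3, t4, t5, …, tn go through Ontology Selection over the Web, which returns several ontologies, each covering only some of the terms]
Ontology search results:
• Multiple ontologies
• Each containing only some of the interesting terms
• Much more additional knowledge (not so interesting for us)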
Knowledge Selection and Modularization
Knowledge Selection
[Figure: Ontology Selection over the Web for the search terms t1, t2, t3, t4, t5, …, tn, followed by Ontology Modularization of each selected ontology; the resulting modules contain the terms (t1, tn, t2), (t5, t4, t3) and (t3, t1)]
Knowledge Selection and Modularization
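Knowledge Selection
[Figure: Ontology Selection over the Web for the search terms t1, t2, t3, t4, t5, …, tn, followed by Ontology Modularization of each selected ontology into modules containing the terms (t1, tn, t2), (t5, t4, t3) and (t3, t1); Ontology Merging then combines the modules into one ontology containing t1, t2, t3, t4, t5 and tn]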
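The selection, modularization and merging steps on these slides can be sketched with ontologies reduced to sets of triples. This is only an illustration under strong assumptions: the toy ontologies, the `relatedTo` predicate and the functions `select`, `modularize` and `merge` are all hypothetical, and real modularization techniques keep more context than the triples that touch a query term.

```python
def select(ontologies, terms):
    """Ontology selection: keep ontologies that mention at least one term."""
    return {
        name: triples
        for name, triples in ontologies.items()
        if any(s in terms or o in terms for s, _, o in triples)
    }


def modularize(triples, terms):
    """Ontology modularization: keep only triples that touch a query term."""
    return {(s, p, o) for s, p, o in triples if s in terms or o in terms}


def merge(modules):
    """Ontology merging: union of the extracted modules."""
    merged = set()
    for module in modules:
        merged |= module
    return merged


# Hypothetical toy data: three ontologies, two of which mention query terms.
ontologies = {
    "A": {("t1", "relatedTo", "t2"), ("t2", "relatedTo", "tn"), ("x", "relatedTo", "y")},
    "B": {("t3", "relatedTo", "t4"), ("t4", "relatedTo", "t5"), ("u", "relatedTo", "v")},
    "C": {("p", "relatedTo", "q")},
}
terms = {"t1", "t2", "t3", "t4", "t5", "tn"}

selected = select(ontologies, terms)
merged = merge(modularize(triples, terms) for triples in selected.values())
covered = {x for s, _, o in merged for x in (s, o)} & terms
print(sorted(selected), len(merged), covered == terms)
```

The merged result covers every query term while leaving out the triples that were "not so interesting for us".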
Modularization and integration
[Figure labels: query terms; large ontology found, containing the results and more; interesting ontology modules; ontology fragment interesting for us]
PowerAqua
Bridge the gap between the user and the Semantic Web:
- Provide the user with the capability to query the SW using natural language.
- Dynamically select and combine information drawn from the vast amount of heterogeneous semantic data to answer a user’s query.
PowerAqua
1. NL Question
2. Linguistic interpretation
3. Ontology-based interpretation
4. Answer
PowerAqua algorithm
FLOR: FoLksonomy Ontology enRichment
Can the Semantic Web provide the structure needed to improve search and navigation of tagged spaces?
FLOR methodology
Search in Tag Spaces
Let’s find photos of “animals which live in the water”
Query: Animal Water
[Figure: grid of result photos labelled Dog, Bird, Tiger, Cat and Landscape]
5/24 ≈ 21% precision
Use ontologies for answering queries
Dolphin Seal
Marine Mammal
Mammal
Sea
livesIn
Whale
Body of Water
Ocean
Sea Elephant
FishlivesIn
Animal
FreshwaterFish SaltwaterFish livesIn
Animal Water
<Animal livesIn Water>
<Dolphin>or<Seal>or<“Sea Elephant”>or<Whale>
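A minimal sketch of this kind of ontology-based query expansion follows. The toy ontology below is hypothetical: it reuses class names from the slide, but the exact subclass and livesIn links of the original figure are not recoverable here, and the `Dog` class is added only to show a class being filtered out. The function names `ancestors` and `expand` are likewise assumptions.

```python
# Hypothetical toy ontology: child class -> parent class.
SUBCLASS_OF = {
    "Mammal": "Animal",
    "Marine Mammal": "Mammal",
    "Dolphin": "Marine Mammal",
    "Seal": "Marine Mammal",
    "Whale": "Marine Mammal",
    "Sea Elephant": "Marine Mammal",
    "Dog": "Mammal",
    "Sea": "Body of Water",
    "Ocean": "Body of Water",
}
# Hypothetical livesIn assertions, inherited by subclasses.
LIVES_IN = {"Marine Mammal": "Sea"}


def ancestors(cls):
    """Return cls followed by all of its superclasses."""
    chain = [cls]
    while chain[-1] in SUBCLASS_OF:
        chain.append(SUBCLASS_OF[chain[-1]])
    return chain


def expand(subject, place):
    """Leaf subclasses of `subject` that live in `place` or one of its subclasses."""
    leaves = set(SUBCLASS_OF) - set(SUBCLASS_OF.values())
    matches = []
    for cls in leaves:
        chain = ancestors(cls)
        if subject not in chain:
            continue
        habitats = [LIVES_IN[c] for c in chain if c in LIVES_IN]
        if any(place in ancestors(h) for h in habitats):
            matches.append(cls)
    return sorted(matches)


print(expand("Animal", "Body of Water"))
# prints ['Dolphin', 'Sea Elephant', 'Seal', 'Whale']
```

The expanded class names can then be used as tags in place of the original two keywords, which is what lifts precision in the tag search.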
Results
dolphin
seal
whale
sea elephant
18/24 ≈ 75% precision
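The two precision figures, 5/24 for the plain tag search and 18/24 after ontology-based expansion, follow from the usual definition of precision as relevant results divided by retrieved results. A small sketch of that arithmetic, with `precision` as an illustrative helper name:

```python
def precision(relevant_retrieved, retrieved):
    """Fraction of the retrieved items that are relevant."""
    return relevant_retrieved / retrieved


tag_search = precision(5, 24)        # plain tag search for "Animal Water"
ontology_search = precision(18, 24)  # after ontology-based expansion
print(f"{tag_search:.0%} -> {ontology_search:.0%}")
# prints 21% -> 75%
```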
Watson
Web Documents
Expand query for web documents with relevant ontology terms
Semantic query extensions
An extension of a web search engine that uses online ontologies to suggest ways to extend a query
Example with the query “researcher”
– Suggests “academic staff”, “Person”, etc. as terms to generalize the query, and “professor”, “PhD student” as terms to specialize the query
– Without having to give the system any knowledge: everything comes from the Web!
Versions:
– Gowgle (http://watson.kmi.open.ac.uk/gowgle): uses the Google SOAP API and the Watson SOAP API
– Wahoo (http://watson.kmi.open.ac.uk/wahoo): uses the Yahoo! REST API and the Watson REST API
Screenshot of Wahoo (REST based)
[Screenshot labels: Query; Result from Yahoo!; Term suggestions; Add/Replace]
How does it work?
Find ontologies containing the keyword “researcher”
http://watson.kmi.open.ac.uk/API/semanticcontent/keywords?q=researcher
… exactly “researcher” in the label or id of a class
http://watson.kmi.open.ac.uk/API/semanticcontent/keywords?q=researcher&scope=LN+Label&ent=Class&match=Exact
Find entities corresponding to “researcher” in an ontology
http://watson.kmi.open.ac.uk/API/entity/keyword?q=researcher&uri=http://calo.sri.com/core-plus-office&scope=LN+Label&ent=Class&match=Exact
Find subclasses and superclasses of an entity
http://watson.kmi.open.ac.uk/API/entity/subclasses?ent=http://calo.sri.com/core-plus-office#Researcher&uri=http://calo.sri.com/core-plus-office
The rest is interface stuff and a call to Yahoo!
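The REST calls above differ only in their path and query parameters, so they can be assembled with a small helper. The sketch below only builds URLs and sends nothing, since the service may no longer be reachable. The base URL, paths and parameter names are copied from the slide; the helper name `watson_url` is hypothetical. `urlencode` percent-encodes the `uri` value, which is equivalent to the unencoded form shown on the slide.

```python
from urllib.parse import urlencode

WATSON_API = "http://watson.kmi.open.ac.uk/API"  # base URL as shown on the slide


def watson_url(path, **params):
    """Build a Watson REST URL from a path and query parameters (no request is sent)."""
    return f"{WATSON_API}/{path}?{urlencode(params)}"


# Step 1: ontologies containing the keyword "researcher".
find_ontologies = watson_url("semanticcontent/keywords", q="researcher")

# Step 2: restrict to an exact match in the label or id of a class.
find_exact = watson_url(
    "semanticcontent/keywords",
    q="researcher", scope="LN Label", ent="Class", match="Exact",
)

# Step 3: matching entities inside one ontology.
find_entities = watson_url(
    "entity/keyword",
    q="researcher", uri="http://calo.sri.com/core-plus-office",
    scope="LN Label", ent="Class", match="Exact",
)

print(find_exact)
```

The space in `LN Label` is encoded as `+`, matching the `scope=LN+Label` form on the slide.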
Faceted browsing and ranking
Faceted browsing / search – great idea!
Limitations?
– Huge number of entities / facets, but limited display capabilities
– What to show? What is important?
Ranking with TripleRank
Rank resources in the ontology w.r.t.
– Link structure
– Authority modeling (HITS algorithm)
– Type of relationships between resources
  • Additional dimension
Based on tensor analysis and decomposition
– like: “extension of matrix analysis and SVD”
Thomas Franz, Antje Schultz, Sergej Sizov, and Steffen Staab. “TripleRank: Ranking Semantic Web Data By Tensor Decomposition”
HITS by example
Traditional ranking with HITS algorithm
HITS by example
Hyperlink-Induced Topic Search finds:
– Authorities (a): pages with good content about a topic, linked to by many hubs
– Hubs (h): pages that link to many good authority pages on a topic (directories)
HITS is an iterative process to calculate hubs and authorities
[Equations]
Traditional ranking with HITS algorithm
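The equations from this slide did not survive in the transcript. In the standard formulation of HITS, the authority score a(p) is the sum of h(q) over all pages q that link to p, the hub score h(p) is the sum of a(q) over all pages q that p links to, and both vectors are normalized after each round. The sketch below implements that standard formulation; the example graph and the fixed iteration count are assumptions, not taken from the slides.

```python
import math


def normalize(scores):
    """Scale a score dictionary to unit Euclidean length."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {node: v / norm for node, v in scores.items()}


def hits(links, iterations=50):
    """Return (hub, authority) scores for a graph given as {source: [targets]}."""
    nodes = set(links) | {t for targets in links.values() for t in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # a(p): sum of hub scores of the pages linking to p
        auth = normalize(
            {n: sum(hub[s] for s, ts in links.items() if n in ts) for n in nodes}
        )
        # h(p): sum of authority scores of the pages p links to
        hub = normalize({n: sum(auth[t] for t in links.get(n, ())) for n in nodes})
    return hub, auth


# Hypothetical graph: h1 links to a1 and a2, h2 links to a1 only.
hub, auth = hits({"h1": ["a1", "a2"], "h2": ["a1"]})
print(max(auth, key=auth.get), max(hub, key=hub.get))
# prints a1 h1
```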
HITS by example
Traditional ranking with HITS algorithm
TripleRank
The same graph represented as tensor in TripleRank
[Figure: the graph as a tensor; each slice is the adjacency matrix for a single property, and the slices are stacked along the properties axis]
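The tensor representation can be sketched as one adjacency matrix per property. The code below only builds that structure from a list of triples and does not perform the decomposition step that TripleRank applies to it; the example triples and the function name `build_tensor` are hypothetical.

```python
def build_tensor(triples):
    """Build a property x subject x object tensor of 0/1 entries from triples."""
    resources = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
    properties = sorted({p for _, p, _ in triples})
    r_index = {r: i for i, r in enumerate(resources)}
    p_index = {p: i for i, p in enumerate(properties)}
    n = len(resources)
    # One n x n adjacency matrix (slice) per property.
    tensor = [[[0] * n for _ in range(n)] for _ in properties]
    for s, p, o in triples:
        tensor[p_index[p]][r_index[s]][r_index[o]] = 1
    return resources, properties, tensor


# Hypothetical example data.
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksWith", "carol"),
]
resources, properties, tensor = build_tensor(triples)

# Summing the slices collapses the tensor back to a plain adjacency matrix,
# i.e. the property-blind view that a link-only ranking works on.
n = len(resources)
flat = [[sum(tensor[k][i][j] for k in range(len(properties))) for j in range(n)]
        for i in range(n)]
print(flat)
# prints [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
```

Keeping the slices separate is what preserves the "type of relationship" dimension that a single adjacency matrix loses.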
TripleRank tensor analysis
Results of the tensor analysis:
TripleRank results
General linkage ranking
Property-specific ranking
TripleRank - ranked properties
Possible use for faceted browsing using properties
Evaluation interface
Conclusions
Search is one of the most useful services
– The Semantic Web needs such services too!
Find relevant ontologies
– Reuse knowledge
Semantic search is much more focused than traditional (syntactic) search
– Different types of results and appropriate presentation
– Richer metadata
– Query understanding (NLP) and reasoning
The number of semantic search engines is growing
– Need both a “Google” for semantics …
– … and very specialized services (e.g. medicine)