Linkator: enriching web pages by automatically adding dereferenceable semantic annotations
![Page 1: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/1.jpg)
Delft University of Technology
Linkator: enriching web pages by automatically adding dereferenceable semantic annotations
Samur Araujo, Geert-Jan Houben, Daniel Schwabe
Web Information Systems
Delft University of Technology, the Netherlands
![Page 2: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/2.jpg)
2Enriching web pages with dereferenceable semantic annotations
Summary – dereferencing semantic annotations
• What is dereferencing semantic annotations about?
  • Automatically linking web pages.
• Summary
  1. Overview of the problem and motivation.
  2. Our approach to solving the problem.
  3. An example of use.
![Page 3: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/3.jpg)
Motivation
• Links between HTML pages are the main mechanism for navigating between web pages.
• However, a lot of pages are unlinked or poorly linked.
• Terms on pages have meaning and are intrinsically associated with concepts or entities that the user is interested in.
• These terms can be interpreted by machines and automatically linked to relevant resources on the web.
![Page 4: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/4.jpg)
![Page 5: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/5.jpg)
Problem Statement
The problem of automatic linking can be divided into three sub-problems:
1. How to identify candidate terms (anchors) for adding links?
  • A candidate term denotes a concept in which the user is interested.
2. Which concept does a candidate term represent?
  • Disambiguate the candidate term.
3. How to identify a web resource to be the link target?
  • How to select a data source for finding the destination of the link?
![Page 6: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/6.jpg)
State-of-the-Art in Automatic Linking
• Candidate terms:
  • Focused on term disambiguation using an auxiliary knowledge base or dictionaries (e.g. Wikipedia and WordNet).
• Link target:
  • It is selected from a specific knowledge base [1] or from a collection of target documents [2].
• Limitations:
  • Poor support for users interested in a broader range of domains.
[1] Mihalcea, R. and Csomai, A. Wikify!: Linking Documents to Encyclopedic Knowledge. In Proceedings of the 16th ACM Conference on Information and Knowledge Management (CIKM 07), Lisbon, Portugal, pp. 233-242, 2007.
[2] Gardner, J. J., Krowne, A. and Xiong, L. NNexus: An Automatic Linker for Collaborative Web-Based Corpora. IEEE Trans. Knowl. Data Eng. 21(6), pp. 829-839, 2009.
![Page 7: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/7.jpg)
Linkator Approach
Steps:
1. Extract terms from web pages.
2. Associate terms with concepts.
3. Find resources that represent these concepts.
Linkator components: Information Extraction Engine, Semantic Annotator, Core Linkator.
![Page 8: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/8.jpg)
[Flowchart: Linkator workflow]
Page accessed:
1. Page is accessed.
2. Terms are extracted.
3. Page is semantically annotated.
4. Semantic links are created (annotated page).
Link clicked:
1. Annotation is extracted.
2. Endpoint is chosen.
3. Query is formulated.
4. Search for a resource (with an "if not found" branch).
![Page 9: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/9.jpg)
Linkator Approach
[Architecture diagram]
• Linkator Client (Firefox plugin): Web Browser, Annotator, RDFa Annotator, Information Extraction Engine.
• Linkator Server (reached over HTTP): Query Formulation (SPARQL), Endpoint Resolution.
• Linked Data.
![Page 10: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/10.jpg)
Semantic Link – Definition
• A semantic link is an HTML anchor (<a>) element that is semantically annotated with RDFa.
• It has RDF triples associated with it.
• Clicking a semantic link triggers a query over Linked Data.
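To make the definition concrete, here is a minimal Python sketch that reads the triples out of an RDFa-annotated anchor. It is an illustration under stated assumptions, not Linkator's actual annotator: the sample markup, the example.org subject URI, and the class and function names are hypothetical, and only the `about`, `typeof` and `property` attributes are handled, which is a small subset of RDFa.

```python
from html.parser import HTMLParser

# Hypothetical semantic link; the subject URI and values are illustrative only.
SEMANTIC_LINK = (
    '<a about="http://example.org/person/1" '
    'typeof="http://ontoware.org/swrc/swrc_v0.3.owl#Author" '
    'property="http://xmlns.com/foaf/0.1/name">Geert-Jan Houben</a>'
)


class SemanticLinkParser(HTMLParser):
    """Collects (subject, predicate, object) triples from annotated <a> tags."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._attrs = None  # attributes of the <a> tag currently open, if any
        self._text = []     # text fragments seen inside that tag

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._attrs = dict(attrs)
            self._text = []

    def handle_data(self, data):
        if self._attrs is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag != "a" or self._attrs is None:
            return
        subject = self._attrs.get("about")
        if subject and self._attrs.get("typeof"):
            self.triples.append((subject, "rdf:type", self._attrs["typeof"]))
        if subject and self._attrs.get("property"):
            literal = "".join(self._text).strip()
            self.triples.append((subject, self._attrs["property"], literal))
        self._attrs = None


def triples_of(html):
    """Return the triples found in the semantic links of an HTML fragment."""
    parser = SemanticLinkParser()
    parser.feed(html)
    parser.close()
    return parser.triples
```

Calling `triples_of(SEMANTIC_LINK)` yields one type triple and one name triple for the hypothetical subject. A production annotator would rely on a full RDFa processor rather than this attribute subset.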
![Page 11: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/11.jpg)
Semantic Links
RDF Triples associated to the Semantic Link
![Page 12: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/12.jpg)
Dereferencing Semantic Links
• Linkator uses the Linked Data cloud for discovering a destination for the semantic link as opposed to querying search engines or a fixed knowledge base.
• Algorithm for Endpoint Resolution
• Algorithm for Query Formulation
![Page 13: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/13.jpg)
Endpoint Resolution
• Task: Find endpoints that contain a specific concept.
• Linkator selects available endpoints based on the vocabularies used in the semantic links, using their voiD (Vocabulary of Interlinked Datasets) descriptions.
![Page 14: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/14.jpg)
Endpoint Resolution
1. Select the vocabularies of all RDF types associated with the annotation.
2. If no endpoint is found for them, select the vocabularies of all predicates associated with the annotation.
![Page 15: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/15.jpg)
Endpoint Resolution
1. The SelectEndpoint function finds the resource: http://ontoware.org/swrc/swrc_v0.3.owl#Author
2. It extracts the vocabulary associated with this resource: http://ontoware.org/swrc/swrc_v0.3.owl
3. It queries the voiD descriptor of the available SPARQL endpoints, looking for such a vocabulary.
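The three steps above can be sketched in Python as follows. This is a sketch under assumptions, not the Linkator implementation: the function names are hypothetical, the rule for slash-terminated namespaces is a guess (the slide only shows a hash URI), and the query assumes the endpoints are described with the voiD properties `void:vocabulary` and `void:sparqlEndpoint`.

```python
from urllib.parse import urldefrag

VOID = "http://rdfs.org/ns/void#"


def extract_vocabulary(resource_uri):
    """Return the vocabulary (namespace) URI of a resource.

    Assumption: hash URIs drop the fragment; slash URIs drop the last
    path segment and keep the trailing slash.
    """
    base, fragment = urldefrag(resource_uri)
    if fragment:
        return base
    return resource_uri.rsplit("/", 1)[0] + "/"


def void_endpoint_query(vocabulary_uri):
    """Build a SPARQL query that looks up endpoints declaring a vocabulary."""
    return (
        f"PREFIX void: <{VOID}>\n"
        "SELECT DISTINCT ?endpoint WHERE {\n"
        f"  ?dataset void:vocabulary <{vocabulary_uri}> ;\n"
        "           void:sparqlEndpoint ?endpoint .\n"
        "}"
    )


vocabulary = extract_vocabulary("http://ontoware.org/swrc/swrc_v0.3.owl#Author")
query = void_endpoint_query(vocabulary)
```

Whether a lookup succeeds depends on how each dataset spells its vocabulary URI in its voiD description (for example with or without a trailing `#`), so a real resolver would probably need to normalise both sides before comparing.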
![Page 16: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/16.jpg)
Query Formulation
1. The query is based on the object of the triple.
2. It tries to find a human-readable representation of the resource, i.e., it tries to match predicates such as: foaf:homepage, akt:has-web-address, rdfs:seeAlso.
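One way to read these two steps is as a SPARQL query that first matches the triple's object and then asks for a page through one of the listed predicates. The Python sketch below builds such a query string. It is an assumption-laden illustration rather than Linkator's own query: the function names are hypothetical, the object is assumed to be a plain literal, and the full IRI used for akt:has-web-address assumes the AKT portal ontology namespace.

```python
# Predicates that usually point at a human-readable page for a resource.
LINK_PREDICATES = [
    "http://xmlns.com/foaf/0.1/homepage",                     # foaf:homepage
    "http://www.aktors.org/ontology/portal#has-web-address",  # akt:has-web-address (namespace assumed)
    "http://www.w3.org/2000/01/rdf-schema#seeAlso",           # rdfs:seeAlso
]


def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def formulate_query(object_literal):
    """Build a query for pages of any resource that has the given object value."""
    alternatives = "\n    UNION\n".join(
        f"    {{ ?resource <{predicate}> ?page }}" for predicate in LINK_PREDICATES
    )
    return (
        "SELECT DISTINCT ?page WHERE {\n"
        f'  ?resource ?anyPredicate "{escape_literal(object_literal)}" .\n'
        f"{alternatives}\n"
        "}"
    )


query = formulate_query("Geert-Jan Houben")
```

An exact literal match is a simplification: endpoints that store the value with a language tag or a datatype would not match, so a practical query would likely need a more tolerant comparison.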
![Page 17: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/17.jpg)
Proof of Concept
• Semantic links for pages that contain bibliographic citations.
• Extended version of FreeCite parsing engine.
• Example of bibliographic citation:
Kees van der Sluijs, Geert-Jan Houben, Erwin Leonardi, Jan Hidders. Hera:
Engineering Web Applications Using Semantic Web-Based Models. Book
chapter: Semantic Web Information Management: A Model-Based
Perspective, De Virgilio, Roberto; Giunchiglia, Fausto; Tanca, Letizia (Eds.),
Chapter 22, 2010, Springer.
![Page 18: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/18.jpg)
[Pipeline diagram: proof of concept]
• Extract terms from web pages (Information Extraction Engine, here the FreeCite Extraction Engine): HTML page, markup removed, plain text, entity extraction.
• Associate terms with concepts (Semantic Annotator): semantic annotation, text semantically annotated, annotations inserted on the page, HTML page semantically annotated.
• Find resources that represent these concepts (Core Linkator), when a semantic link is clicked: SPARQL endpoint discovery and selection, endpoint querying, URL generation.
![Page 19: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/19.jpg)
Example – HTML Page without Links
![Page 20: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/20.jpg)
![Page 21: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/21.jpg)
Example – Page annotated with RDFa
![Page 22: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/22.jpg)
Example – Page with Semantic Links
![Page 23: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/23.jpg)
![Page 24: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/24.jpg)
![Page 25: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/25.jpg)
Conclusion and Future Work
• For the specific scenario of linking bibliographic citations, Linkator provides a reasonable solution.
• A composition of Semantic Web technologies can provide a reasonable solution to the problem of automatic linking.
• Linkator is a concrete application that uses Semantic Web technologies.
• Future work:
  • Use Linkator in a broader scenario.
  • Enhance the Linkator algorithms.
  • Evaluate the precision and recall of the linking.
![Page 26: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/26.jpg)
Questions?
You can download Linkator at: http://www.wis.ewi.tudelft.nl/
Samur Araujo
Thank you for your attention!
![Page 27: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/27.jpg)
[Diagram: overview]
1. HTML page: the page is annotated (RDF), producing an annotated HTML page.
2. Link is clicked: the annotations on the page are used to find the link destination.
![Page 28: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/28.jpg)
State-of-the-Art in Automatic Linking
• Examples:
  • Wikify! [1] focuses on linking keywords on web pages to Wikipedia articles.
  • NNexus [2] focuses on linking keywords obtained from an index extracted from target documents.
[1] Mihalcea, R. and Csomai, A. Wikify!: Linking Documents to Encyclopedic Knowledge. In Proceedings of the 16th ACM Conference on Information and Knowledge Management (CIKM 07), Lisbon, Portugal, pp. 233-242, 2007.
[2] Gardner, J. J., Krowne, A. and Xiong, L. NNexus: An Automatic Linker for Collaborative Web-Based Corpora. IEEE Trans. Knowl. Data Eng. 21(6), pp. 829-839, 2009.
![Page 29: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/29.jpg)
Endpoint Resolution
FUNCTION SelectEndpoint
  E := Array
  R := select all rdf:type objects associated to the semantic link
  T := ExtractVocabulary(R)
  FOR EACH vocabulary in T DO {
    E.add(select endpoints that contain this vocabulary)
  }
  IF E = Empty {
    R := select all predicates associated to the semantic link
    T := ExtractVocabulary(R)
    FOR EACH vocabulary in T DO {
      E.add(select endpoints that contain this vocabulary)
    }
  }
  RETURN E

FUNCTION ExtractVocabulary(R)
  V := Array
  FOR EACH resource in R DO {
    V.add(extract the vocabulary from the resource)
  }
  RETURN V
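The SelectEndpoint pseudocode on this slide can be turned into a small runnable Python sketch. It rests on several assumptions that are not in the slides: the semantic link is given as a list of (subject, predicate, object) tuples, the voiD descriptions have already been loaded into a plain dictionary from vocabulary URI to endpoint URLs (`void_index`, a hypothetical name), and a vocabulary is obtained by stripping the fragment or the last path segment of a URI.

```python
from urllib.parse import urldefrag

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def extract_vocabulary(resources):
    """Return the distinct vocabulary URIs of the given resource URIs."""
    vocabularies = []
    for uri in resources:
        base, fragment = urldefrag(uri)
        # Assumption: hash URIs drop the fragment, slash URIs the last segment.
        vocabulary = base if fragment else uri.rsplit("/", 1)[0] + "/"
        if vocabulary not in vocabularies:
            vocabularies.append(vocabulary)
    return vocabularies


def endpoints_for(vocabularies, void_index):
    """Collect the endpoints that declare any of the vocabularies."""
    endpoints = []
    for vocabulary in vocabularies:
        for endpoint in void_index.get(vocabulary, []):
            if endpoint not in endpoints:
                endpoints.append(endpoint)
    return endpoints


def select_endpoint(triples, void_index):
    """Try the rdf:type vocabularies first, then fall back to the predicates."""
    types = [obj for (_, pred, obj) in triples if pred == RDF_TYPE]
    endpoints = endpoints_for(extract_vocabulary(types), void_index)
    if not endpoints:
        predicates = [pred for (_, pred, _) in triples]
        endpoints = endpoints_for(extract_vocabulary(predicates), void_index)
    return endpoints
```

For instance, if only the FOAF namespace is registered in `void_index`, a link typed with the SWRC Author class but carrying a foaf:name predicate finds no endpoint through its type and falls back to the endpoints registered for FOAF. In a live setting the dictionary lookup would presumably be replaced by a query against the voiD descriptions themselves.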
![Page 30: Linkator: enriching web pages by automatically adding dereferenceable semantic annotations](https://reader035.fdocuments.net/reader035/viewer/2022062319/5550830bb4c905235b8b47e9/html5/thumbnails/30.jpg)
Semantic Link – Example
• Triples associated with the semantic link.