Research in Semantic Web and Information Retrieval: Trust
Wright State University CORE Scholar
Kno.e.sis Publications The Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis)
12-10-2009
Research in Semantic Web and Information Retrieval: Trust, Sensors, and Search
Krishnaprasad Thirunarayan
Wright State University - Main Campus, [email protected]
Follow this and additional works at: https://corescholar.libraries.wright.edu/knoesis
Part of the Bioinformatics Commons, Communication Technology and New Media Commons, Databases and Information Systems Commons, OS and Networks Commons, and the Science and Technology Studies Commons
This Article is brought to you for free and open access by The Ohio Center of Excellence in Knowledge-Enabled Computing (Kno.e.sis) at CORE Scholar. It has been accepted for inclusion in Kno.e.sis Publications by an authorized administrator of CORE Scholar. For more information, please contact [email protected], [email protected].
Repository Citation
Thirunarayan, K. (2009). Research in Semantic Web and Information Retrieval: Trust, Sensors, and Search. https://corescholar.libraries.wright.edu/knoesis/77
Research in Semantic Web and Information Retrieval:
Trust, Sensors, and Search
T. K. Prasad (Krishnaprasad Thirunarayan)
Professor Kno.e.sis Center
Department of Computer Science and Engineering Wright State University, Dayton, OH-45435, USA
Knowledge Enabled Information and Services Science
http://knoesis.wright.edu/
Information Retrieval
Information retrieval (IR) is finding material (usually documents) of an unstructured nature (usually text) that satisfies an information need from within large collections (usually stored on computers).
Evolution of the Web
Semantic Web
The Semantic Web is a standards-based extension of the WWW in which the semantics of information and services on the web is defined, so as to satisfy the information needs of people and enable machines to use web content.
Machine-comprehensible structured data
Tim Berners-Lee’s Semantic Web Layer Cake
Updated Semantic Web Cake
Trust Issues in Social Media and Sensor Networks
T. K. Prasad, Cory Henson, Amit Sheth, and Pramod Anantharam
Kno.e.sis Center
Department of Computer Science and Engineering Wright State University, Dayton, OH-45435, USA
Goal
Study semantic issues relevant to trust in:
• Social media: data and networks
• Sensors: data and networks
Generic examples involving trust:
• Analyzing online ratings/reviews of TV models before making a purchasing decision on amazon.com
• Seeking recommendations on a handyman, car mechanic, etc. from neighbors
Generic Approach
• Propose models of trust/trust metrics to formalize trust aggregation and trust propagation, in order to deal with indirect trust
• Develop techniques and tools to glean trust information from:
  • social media data (streams) and networks
  • sensor data (streams) and networks
Trust in Social Media Networks
Previous Work: Structure of Trust
Trust between a pair of users is modelled as a real number in the closed interval [0,1] or [-1,1].
Pros:
• Facilitates propagation and computation of aggregated trust
Cons:
• Too fine-grained, total order
• Inherent difficulties in initializing, understanding, and justifying computed trust values
Quote
Guha et al.: While continuous-valued trusts are mathematically clean, from the standpoint of usability, most real-world systems will in fact use discrete values at which one user can rate another.
E.g., Epinions, eBay, Amazon, Facebook, etc. all use small sets for (dis)trust/rating values.
Trust-aware Recommender Systems
Collaborative filtering systems exploit user similarity to make recommendations, but suffer from the data sparsity problem.
Adding trust links:
• improves the quality of recommendations
• benefits cold-start users, who need it most
• is robust w.r.t. spamming via engineered profiles (shilling attacks)
Our Research
Propose a model of trust based on:
• Partially ordered discrete values (with emphasis on relative magnitude)
• Local but realistic semantics
  • Distinguishes functional and referral trust
  • Distinguishes direct and inferred trust
  • Prefers direct information over conflicting inferred information
  • Represents ambiguity explicitly
HOLY GRAIL: Direct Semantics in favor of Indirect Translations
Essential Concepts
• Trust Scope: context, action, …
• Functional Trust: agent a1 trusts agent a2’s ability in some context or for doing something
• Referral Trust: agent a1 trusts agent a2’s ability to recommend another agent in some context or for doing something
• Trust is a relationship among agents/users, while belief is a relationship between agents/users and statements
Semantics: Interpretation
• Four-valued logic: {inconsistent information, true, false, no information}
• Trust / Distrust: 4-valued “binary” function among users
• Belief / Disbelief: 4-valued “binary” function on users and statements
Example: Trust Network - Different Trust Links and Local Ordering on Trust Links
• Alice trusts Bob for recommending a good car mechanic.
• Bob trusts Dick to be a good car mechanic.
• Charlie does not trust Dick to be a good car mechanic.
• Alice trusts Bob more than Charlie w.r.t. the car mechanic context.
• Alice trusts Charlie more than Bob w.r.t. the babysitter context.
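The example network can be written down as data, which makes the roles of scoped links and local orderings concrete. The following is a minimal Python sketch, not the formalization from the talk: the `Link` record, the `infer` function, the one-step referral rule, and the referral link from Alice to Charlie are all assumptions introduced for illustration. The four possible results mirror the four-valued interpretation (trust, distrust, ambiguous, unknown).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    source: str
    target: str
    kind: str       # "functional" or "referral"
    scope: str      # trust scope, e.g. "car mechanic"
    positive: bool  # True = trust, False = distrust


LINKS = [
    Link("Alice", "Bob", "referral", "car mechanic", True),
    # Assumption: the example does not state the kind of Alice's link to Charlie.
    Link("Alice", "Charlie", "referral", "car mechanic", True),
    Link("Bob", "Dick", "functional", "car mechanic", True),
    Link("Charlie", "Dick", "functional", "car mechanic", False),
]

# Local ordering: for each (agent, scope), neighbours from most to least trusted.
LOCAL_ORDER = {
    ("Alice", "car mechanic"): ["Bob", "Charlie"],
    ("Alice", "babysitter"): ["Charlie", "Bob"],
}


def infer(source, target, scope, links=LINKS, order=LOCAL_ORDER):
    """Return 'trust', 'distrust', 'ambiguous', or 'unknown'."""
    # Direct functional information is preferred over inferred information.
    for link in links:
        if (link.source, link.target, link.kind, link.scope) == (
            source, target, "functional", scope
        ):
            return "trust" if link.positive else "distrust"

    # One-step inference: ask every neighbour the source trusts as a referrer.
    opinions = {}
    for ref in links:
        if (ref.source == source and ref.kind == "referral"
                and ref.scope == scope and ref.positive):
            for fn in links:
                if (fn.source == ref.target and fn.target == target
                        and fn.kind == "functional" and fn.scope == scope):
                    opinions[ref.target] = fn.positive

    if not opinions:
        return "unknown"
    if len(set(opinions.values())) == 1:
        return "trust" if next(iter(opinions.values())) else "distrust"

    # Conflicting referrals: use the local ordering if it ranks every referrer.
    ranking = order.get((source, scope), [])
    if all(name in ranking for name in opinions):
        best = min(opinions, key=ranking.index)
        return "trust" if opinions[best] else "distrust"
    return "ambiguous"


print(infer("Alice", "Dick", "car mechanic"))            # trust
print(infer("Alice", "Dick", "car mechanic", order={}))  # ambiguous
```

With the local ordering, Bob's positive opinion outranks Charlie's negative one; without it, the conflict is reported explicitly instead of being averaged away.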
Formalization of Semantics : Basis for Trust Computation Algorithm
Formalization Approach
Given a trust network (nodes, edges with trust scopes, local orderings), specify when a source agent can trust, distrust, or be ambiguous about a target agent, reflecting:
• Functional and referral trust links
• Direct and inferred trust
• Locality
Similarly for Evidence in support of Negative Functional Trust.
Quote summarizing potential bug
The whole problem with the world is that fools and fanatics are always so certain of themselves, but wiser people so full of doubts.
--- Bertrand Russell
Possible Future Extensions
• Trust links with trust-scoped exceptions
  • Straddles two extremes involving just trust links and just trust-scoped links
• Trust values annotated with trust path length, target neighborhood summary, etc.
• Other forms of trust links formalized using an upper ontology
Trust in Sensor Networks
Sensor Networks
Approaches to Trust
• Reputation-based Trust: based on past behavior
• Policy-based Trust: based on explicitly stated constraints
• Evidence-based Trust: based on seeking/verifying evidence
Probabilistic basis for reputation-based trust in a Sensor Node
• Sensor reputation and sensor observation credibility are determined using an outlier detection algorithm, aggregating results over time
• Homogeneous sensor networks can exploit spatio-temporal locality and redundancy for this purpose
• Heterogeneous sensor networks require complex domain models for this purpose
(cont’d)
Trust/reputation in a sensor node can be modeled as a beta probability distribution function with parameters (α,β), gleaned from the total number of correct (α−1) and erroneous (β−1) observations so far.
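The bookkeeping this implies is small enough to sketch in Python. The class name `BetaReputation`, its method names, and the use of the distribution mean α/(α+β) as a single trust score are assumptions made for illustration; the outlier check itself is treated as given.

```python
from dataclasses import dataclass


@dataclass
class BetaReputation:
    """Reputation of one sensor node as the parameters of a beta distribution."""
    alpha: float = 1.0  # 1 + number of correct observations so far
    beta: float = 1.0   # 1 + number of erroneous observations so far

    def update(self, is_outlier: bool) -> None:
        # Incremental update once an observation is classified as normal or outlier.
        if is_outlier:
            self.beta += 1
        else:
            self.alpha += 1

    def expected_trust(self) -> float:
        # Mean of Beta(alpha, beta); using it as the trust score is an assumption.
        return self.alpha / (self.alpha + self.beta)


rep = BetaReputation()  # (1, 1) is the flat PDF, so no initialization is needed
for is_outlier in [False] * 8 + [True] * 2:
    rep.update(is_outlier)
print(rep.alpha, rep.beta, rep.expected_trust())  # 9.0 3.0 0.75
```

Only two numbers are stored per sensor, and each new observation changes exactly one of them.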
Motivation for using Beta PDF
• Computational ease
  • Retain/manipulate just two values (α,β)
  • Incremental update after checking whether new data is normal or an outlier
• Intuitively satisfactory
  • Initialization not necessary (flat PDF)
  • PDF variation sufficiently expressive; that is, it assimilates updates and a large number of observations satisfactorily
The next few slides shed light on the beta probability distribution function:
(1) Mathematical formulation
(2) Graphs for intuitive understanding of its role
Role of Beta probability distribution function
x is a probability, so it ranges from 0 to 1.
If the prior distribution of p is uniform, then the beta distribution gives posterior distribution of p after observing α−1 occurrences of event with probability p and β−1 occurrences of the complementary event with probability (1-p).
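For reference, the standard form of the beta density in the notation used here, together with its mean, is given below. This is the textbook definition, stated with x as the probability variable; Γ is the gamma function.

```latex
\[
f(x;\alpha,\beta) \;=\; \frac{\Gamma(\alpha+\beta)}{\Gamma(\alpha)\,\Gamma(\beta)}\; x^{\alpha-1}\,(1-x)^{\beta-1},
\qquad 0 \le x \le 1,\quad \alpha,\beta > 0
\]
\[
\mathbb{E}[x] \;=\; \frac{\alpha}{\alpha+\beta}
\]
```

With α−1 correct and β−1 erroneous observations, the density is proportional to x^(α−1)(1−x)^(β−1), which is what a uniform prior yields after those observations.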
[Figure: beta PDFs for (α, β) = (1, 1), (2, 2), (5, 5), and (10, 10)]
α = β, so the PDFs are symmetric w.r.t. 0.5. Note that the graphs get narrower as (α+β) increases.
[Figure: beta PDFs for (α, β) = (5, 10), (5, 25), (10, 5), and (25, 5)]
α ≠ β, so the PDFs are asymmetric w.r.t. 0.5. Note that the graphs get narrower as (α+β) increases.
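The narrowing can be checked numerically from the closed-form mean α/(α+β) and variance αβ/((α+β)²(α+β+1)) of the beta distribution. The short Python sketch below tabulates both for the parameter pairs shown in the graphs; the helper names `beta_mean` and `beta_std` are illustrative.

```python
from math import sqrt


def beta_mean(alpha, beta):
    """Mean of Beta(alpha, beta)."""
    return alpha / (alpha + beta)


def beta_std(alpha, beta):
    """Standard deviation of Beta(alpha, beta)."""
    total = alpha + beta
    return sqrt(alpha * beta / (total ** 2 * (total + 1)))


PAIRS = [(1, 1), (2, 2), (5, 5), (10, 10), (5, 10), (5, 25), (10, 5), (25, 5)]
for a, b in PAIRS:
    print(f"alpha={a:2d} beta={b:2d} "
          f"mean={beta_mean(a, b):.3f} std={beta_std(a, b):.3f}")
```

In the symmetric cases the mean stays at 0.5 while the standard deviation shrinks as α+β grows; in the asymmetric cases the mean moves toward the side with more observations.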
Advantages: Robust w.r.t. attacks
• Bad-mouthing attack
  • E-commerce analogy: sellers collude with buyers to give bad ratings to others
• Ballot-stuffing attack
  • E-commerce analogy: sellers collude with buyers to give themselves unfairly good ratings
• Sleeper attack
  • An apparently trusted agent defects
Trust in Tweets
• Large network of people
• Large number of tweets
• Tweet: 140-character description of an event
Problem: How to organize tweets?
Exploiting trust information
Rank tweets according to trust information
Trust in the user who tweets
Belief (trust) in the tweet
Trust in the person who tweets
• Popularity of the user
  • Based on count of followers
• Reputation of the user
  • Based on history of making informed observations
• Enrich using a PageRank analogy?
  • Highly trusted followers count more than lowly trusted followers
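One way to read the PageRank analogy is as a power iteration over the follower graph, in which each user passes a share of their own score to the users they follow. The Python sketch below is an illustration of that idea under stated assumptions, not a method described in the talk: the function name `follower_rank`, the damping factor 0.85, the fixed iteration count, and the toy graph are all hypothetical.

```python
def follower_rank(follows, damping=0.85, iterations=50):
    """PageRank-style scores; `follows` maps each user to the users they follow."""
    users = set(follows)
    for followees in follows.values():
        users.update(followees)
    users = sorted(users)
    n = len(users)
    score = {u: 1.0 / n for u in users}

    for _ in range(iterations):
        # Users who follow nobody spread their score uniformly over everyone.
        dangling = sum(score[u] for u in users if not follows.get(u))
        new = {u: (1 - damping) / n + damping * dangling / n for u in users}
        for u in users:
            followees = follows.get(u, [])
            for v in followees:
                # A follower's endorsement is weighted by the follower's own score.
                new[v] += damping * score[u] / len(followees)
        score = new
    return score


# d and f each have exactly one follower, but d's follower (c) is itself followed.
graph = {"a": ["c"], "b": ["c"], "c": ["d"], "e": ["f"], "d": [], "f": []}
scores = follower_rank(graph)
print(scores["d"] > scores["f"])  # True
```

A plain follower count would rate d and f equally; weighting by the follower's score separates them.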
Belief (Trust) in a tweet
• Belief in a tweet depends on the trust in the user who generates it.
• Belief in a tweet depends on the content of “similar” tweets (originating from approximately the same location around the same time).
Trust in Linked Open Data
Linked Data
• The Linking Open Data (LOD) project is a community-led effort to create openly accessible and interlinked RDF data on the Web.
• RDF: Resource Description Framework, a graph-based representation language
Linked Data
Exploiting trust for access and standardization
• Trust in the creator of the data, and belief (trust) in the data
• How well connected is the data?
• Rank LOD according to trust information
Sensor Data on LOD
• MesoWest weather data in the US
• ~20,000+ sensor systems
• ~1 billion observational assertions
• Sensors linked with Geonames on LOD
http://wiki.knoesis.org/index.php/SSW
Trust in Active Perception
Active Perception
Perception is the process of observing, hypothesis generation, and verification
Evidence-based Trust
• Observations (and hypotheses) are more trusted if they can be verified through empirical evidence
• Sensors are more trusted if their observations are trusted
Evidence-based Trust
[Figure labels: Trust, Strengthened Trust]
Additional uses of active perception in sensors context
• Determining actionable intelligence by narrowing the set of explanations to one
• Enabling use of a “small” set of always-on sensors to bootstrap and selectively turn on additional sensors in a resource-constrained (e.g., power-constrained) environment
References
Krishnaprasad Thirunarayan, Dharan Althuru, Cory Henson, and Amit Sheth, “A Local Qualitative Approach to Referral and Functional Trust,” The 4th Indian International Conference on Artificial Intelligence (IICAI-09), December 2009.
Cory Henson, Joshua Pschorr, Amit Sheth, and Krishnaprasad Thirunarayan, “SemSOS: Semantic Sensor Observation Service,” International Symposium on Collaborative Technologies and Systems (CTS2009), Workshop on Sensor Web Enablement (SWE2009), Baltimore, Maryland, 2009.
Krishnaprasad Thirunarayan, Cory Henson, and Amit Sheth, “Situation Awareness via Abductive Reasoning from Semantic Sensor Data: A Preliminary Report,” International Symposium on Collaborative Technologies and Systems (CTS2009), Workshop on Collaborative Trusted Sensing, Baltimore, Maryland, 2009.
References
A. Sheth and M. Nagarajan, “Semantics-Empowered Social Computing,” IEEE Internet Computing, Jan/Feb 2009, 76-80.
Amit Sheth, Cory Henson, and Satya Sahoo, “Semantic Sensor Web,” IEEE Internet Computing, vol. 12, no. 4, July/August 2008, p. 78-83.
Machine and Citizen Sensor Data Demos
Illustrate semantic web and information retrieval techniques: spatio-temporal-thematic ontologies, mash-ups, machine and citizen sensor data analytics
Motivating Scenario: Spatio-temporal-thematic analytics
[Figure labels: High-level Sensor, Low-level Sensor]
How do we check if the three images depict …
• the same time and same place?
• the same entity?
• a serious threat?
Semantic Observation Service:
Overall Architecture and Details
SemSOS Demo: http://knoesis.wright.edu/research/semsci/application_domain/sem_sensor/cory/demos/ssos_demo/ssos_demo.htm
Twitris Demo: http://twitris.knoesis.org/
Situation Awareness: Analysis
Situation Awareness Components
• Physical World: Sensor Data
• Perception: Entity Metadata
• Comprehension: Relationship Metadata
Semantic Analysis
• How is the data represented? Sensor Web Enablement
• What are the sources of the data? Provenance Analysis
• What objects/events account for the data? Abductive Reasoning
• Where did the event occur? Spatial Analysis
• When did the event occur? Temporal Analysis
• What is the significance of the event? Thematic Analysis
• What are the reasons for inconsistency? Abductive Reasoning
A Unified Approach to Retrieving Web Documents and Semantic Web Data
Trivikram Immaneni* and Krishnaprasad Thirunarayan
Department of Computer Science and Engineering
Wright State University Dayton, OH-45435, USA
*Currently at: Technorati, San Francisco
Outline
Goal (What?)
Background and Motivation (Why?)
Unified Web Model (What?)
Query Language and Examples (What?)
Implementation Details (How?)
Evaluation and Applications (Why?)
Conclusions
Goal
• Integrate HTML Web and Semantic Web by establishing and exploiting connections between them => Unified Web Model
• Design and implement a language to retrieve data and documents from the Unified Web => Hybrid Query Language
• Implement the system using mature software components for indexing and search => SITAR
Background and Motivation
HTML Web
• Hyperlinked Web of documents
• Content human comprehensible
• Search engines and web browsers search, retrieve, navigate, and display information
• Keyword-based searches have low precision and high recall
Semantic Web
• Standards-based labeled graph of resources and binary properties (data)
• Content machine accessible
• Database techniques adapted to store and retrieve Semantic Web data
• Query formulation by lay users is difficult, but results are precise
• XML, RDF, SPARQL, Web Services, etc.
Shoehorning HTML Web into Semantic Web
• Document = data node + content as a string in the RDF graph
• Regular expressions in SPARQL are used to retrieve documents
• Drawbacks that IR tries to overcome:
  • Ease of query formulation: keyword-based
  • Dealing with large datasets: ranking
Formalizing HTML Web as Semantic Web
Manual (re)authoring of (legacy) documents using Semantic Web technologies is neither feasible nor advisable.
• State-of-the-art NLP and information extraction techniques are inadequate
• Informal description is indispensable for human comprehension
Escape route: traceability via superposition (e.g., RDFa)
Shoehorning Semantic Web into HTML Web
• Currently, Semantic Web documents live on the HTML Web, but their components are neither accessible nor reasoned with via keyword-based searches
• Swoogle attempts to rank Semantic Web documents
Unified Web Model (What?)
Aim
Unified Web integrates the two Webs to enable improved hybrid retrieval of data and documents.
• Unified Web Model
• Hybrid Query Language
Unified Web Model Graph
Node
• Abstract entity identified by its URI
  • Blank/literal node names automatically generated
• Home URI Section
  • URI index words
• Document Section (optional)
• Outgoing Links Section
• Triples Section
• …
(cont’d)
Relationships (Edges)
• hasDocument: relates a node to its content string
• hyperlinksTo: relates a node to another node to which the former node’s document contains a hyperlink
• asserts: relates a node to each RDF statement in the document
• linksTo: relates a node to another node such that the former node’s document contains a hyperlink to the latter node, or contains a triple with the latter node
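The node sections and the four relationships can be pictured as one record per URI. The Python sketch below is an assumed in-memory representation, not the SITAR implementation: the class `Node`, its field names, and the `mailto:someone@example.org` address are hypothetical, and treating all three components of a triple as linked nodes is one possible reading of "contains a triple with the latter node".

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), each named by a URI


@dataclass
class Node:
    uri: str                                               # Home URI section
    index_words: Set[str] = field(default_factory=set)     # URI index words
    document: Optional[str] = None                         # hasDocument (optional)
    hyperlinks_to: Set[str] = field(default_factory=set)   # hyperlinksTo
    asserts: List[Triple] = field(default_factory=list)    # asserts

    def links_to(self) -> Set[str]:
        """linksTo: hyperlink targets plus every node mentioned in an asserted triple."""
        linked = set(self.hyperlinks_to)
        for triple in self.asserts:
            linked.update(triple)
        return linked


page = Node(
    uri="http://www.abc.com/xyz.htm",
    index_words={"abc", "xyz"},
    document="... page text ...",
    hyperlinks_to={"mailto:someone@example.org"},  # hypothetical address
    asserts=[("http://www.abc.com/sw#Jaguar", "rdf:type", "owl:Class")],
)
print(sorted(page.links_to()))
```

Keeping `links_to` derived instead of stored means it cannot drift out of sync with the hyperlink and triple sections.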
Example of Unified Web Model
Document http://www.abc.com/xyz.htm contains the RDF fragment:
… <mailTo: [email protected]/> …
<rdf:RDF…>
<owl:Class rdf:ID=’http://www.abc.com/sw#Jaguar’/>
</rdf:RDF> …
Data Retrieval from Unified Web
• Unified Web Model can be specified using RDF
  • In terms of rdfs:Resource, rdf:Property, rdf:Statement, rdfs:Literal, rdf:subject, rdf:predicate, rdf:object, etc.
• Unified Web is a reified Semantic Web (user triples)
• SPARQL is usable as the query language
Information Retrieval from Unified Web
• A node can be indexed using URI index words
  • Based on name, content, label, triples, etc.
• A node can be ranked using its phrasal / URI-based annotations and its node neighborhood
Advantages
• Semantic Web nodes can be retrieved using (associated) keywords
• Legacy document recall is improved by interpreting hyperlinks as semantic markup for reasoning
  • Hyperlink: mailto:[email protected]
  • Triple: <mailto:[email protected] rdf:type univ:prof>
• Semantics-rich URIs (such as those from dictionary.com) in legacy documents can be incrementally equated with ontologies
  • Document: … <a href = http://dictionary.com/search?q=jaguar> Jaguar </a> God of the Underworld …
  • Ontology: … <http://dictionary.com/search?q=jaguar owl:sameAs http://www.animalOnto.com/Jaguar> ...
Query Language and Examples (What?)
Aim
• Store and retrieve Semantic Web data, and use information in documents to enhance data retrieval
  • Enable use of keywords to deal with lack of complete URI information
    Peter affiliated-with ?X
  • Enable use of partial information about data being searched
    Student :: Peter affiliated-with ?X
• Store and retrieve documents, and use information in the Semantic Web to enhance document retrieval
    Docsearch(<animal>::<jaguar> Maya God)
Sample Queries
Wordset queries:
  <peter haase> -> retrieves all URIs indexed by BOTH peter AND haase
  Includes documents and URIs
URIs are indexed by words. The words are obtained from the URIs themselves, from label literals, and from anchor text of the URIs.
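The wordset lookup described above behaves like an AND query over an inverted index. The sketch below illustrates the idea in plain Python rather than with Lucene; the `WordIndex` class, the `tokenize` helper, and the example.org URIs are all hypothetical.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split camelCase, lower-case, and split on non-alphanumeric characters."""
    spaced = re.sub(r"(?<=[a-z])(?=[A-Z])", " ", text)
    return {w for w in re.split(r"[^A-Za-z0-9]+", spaced.lower()) if w}


class WordIndex:
    def __init__(self):
        self.postings = defaultdict(set)  # word -> set of URIs

    def add(self, uri, *texts):
        # Index a URI by words from the URI itself plus any labels or anchor text.
        for text in (uri, *texts):
            for word in tokenize(text):
                self.postings[word].add(uri)

    def wordset(self, query):
        # A URI matches only if it is indexed by every word in the query.
        words = tokenize(query)
        if not words:
            return set()
        return set.intersection(*(self.postings.get(w, set()) for w in words))


index = WordIndex()
index.add("http://example.org/person/PeterHaase", "Peter Haase")      # label literal
index.add("http://example.org/person/PeterSmith", "Peter Smith")      # label literal
index.add("http://example.org/docs/home.html", "Peter Haase's page")  # anchor text

print(sorted(index.wordset("peter haase")))
```

Here the query for peter haase returns both the data node and the document, while the Peter Smith node is excluded because it is not indexed by haase.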
Wordset Pair queries:
  <phdstudent>::<peter> -> specifies that the user is looking for peter, the PhD student
  Transitive closure
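One plausible reading of the "Transitive closure" note is that the class wordset is matched against an instance's class and all of its superclasses. The sketch below assumes that reading; the function names, the class hierarchy, and the instance data are hypothetical.

```python
def ancestors(cls, subclass_of):
    """All classes reachable from cls via subclass links, including cls itself."""
    seen, stack = set(), [cls]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(subclass_of.get(current, ()))
    return seen


def wordset_pair(class_word, name_word, instances, subclass_of):
    """Instances whose name contains name_word and whose class, or any
    superclass of it, is class_word."""
    return {
        uri
        for uri, (name, cls) in instances.items()
        if name_word in name.lower().split()
        and class_word in ancestors(cls, subclass_of)
    }


# Hypothetical class hierarchy and instance data.
subclass_of = {
    "phdstudent": ["student"],
    "student": ["person"],
    "professor": ["person"],
}
instances = {
    "ex:peter1": ("Peter Haase", "phdstudent"),
    "ex:peter2": ("Peter Example", "professor"),
}

print(wordset_pair("student", "peter", instances, subclass_of))
```

Under this reading, <student>::<peter> still finds the PhD student because phdstudent is a subclass of student, while the professor named Peter is filtered out.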
More Queries
Get Peter the PhD student’s home page:
  getBindings ( [<phdstudent>::<peter> <homepage> ?x] )
Get Peter Haase’s publications that have “Semantic” in their title:
  getBindings( [<peter haase> <publication> ?x] [?x <title> <semantic>] )
Get group 1 element which is white in color:
  getBindings( [?x <group> <group 1>] [?x <color> <white>] )
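The multi-pattern getBindings queries above amount to a join on the shared variable ?x. The sketch below shows that join in simplified form: it matches terms by exact string equality instead of resolving wordsets through an index, and the function names and sample triples are hypothetical.

```python
def is_var(term):
    return term.startswith("?")


def match(pattern, triple, binding):
    """Extend binding so that pattern matches triple, or return None."""
    new = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # Bind the variable, or check it against its existing value.
            if new.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return new


def get_bindings(patterns, triples):
    """Evaluate the patterns left to right, joining on shared variables."""
    bindings = [{}]
    for pattern in patterns:
        extended = []
        for binding in bindings:
            for triple in triples:
                candidate = match(pattern, triple, binding)
                if candidate is not None:
                    extended.append(candidate)
        bindings = extended
    return bindings


# Hypothetical data in the spirit of the periodic table example.
triples = [
    ("ex:a", "group", "group 1"), ("ex:a", "color", "white"),
    ("ex:b", "group", "group 1"), ("ex:b", "color", "grey"),
    ("ex:c", "group", "group 2"), ("ex:c", "color", "white"),
]
query = [("?x", "group", "group 1"), ("?x", "color", "white")]
print(get_bindings(query, triples))
```

Only ex:a satisfies both patterns, so it is the single binding returned for ?x.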
Homepages of PhD students named Peter that “talk about” Semantic Grid:
  getDocsByBindingsAndContent ( [<phdstudent>::<peter> <homepage> ?x] “semantic grid” )
Other query forms:
  getLinkingNodes ( http://www.aifb.uni-karlsruhe.de/Personen/viewPerson?id_db=2023 )
  getAssertingNodes ( [<peter haase> <publication> ?x] )
  getDocsByIndexOrContent (peter haase)
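A hybrid query such as getDocsByBindingsAndContent can be read as two stages: the triple pattern yields candidate document URIs, and a content filter keeps those whose text contains the phrase. The sketch below covers only the second stage and is an assumption about the semantics, not the system's code; the function name, URLs, and page texts are hypothetical.

```python
def get_docs_by_bindings_and_content(bound_docs, phrase, doc_text):
    """Keep the bound document URLs whose text contains the phrase.

    bound_docs: URLs bound to ?x by the triple pattern (assumed to be given).
    doc_text:   mapping from URL to the document's plain text.
    """
    needle = " ".join(phrase.lower().split())
    return [
        url
        for url in bound_docs
        # Normalize case and whitespace before the phrase test.
        if needle in " ".join(doc_text.get(url, "").lower().split())
    ]


# Hypothetical homepages bound to ?x, with their text.
doc_text = {
    "http://example.org/~peter1/": "Research on the Semantic   Grid and ontologies",
    "http://example.org/~peter2/": "Semantic Web and grid computing",
}
hits = get_docs_by_bindings_and_content(list(doc_text), "semantic grid", doc_text)
print(hits)
```

The second page mentions both words but not as a phrase, so only the first homepage is returned.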
Implementation Details (How?)
SITAR : Semantic InformaTion Analysis and Retrieval system
Tools Used
Apache Lucene 2.0 APIs in Java
  A high-performance text search engine library with smart indexing strategies.
CyberNeko HTML Parser
Jena ARP RDF parser
Evaluation and Application (Why?)
Experiments
Datasets: AIFB SEAL data
The crawler collected 1665 files (English XHTML pages and RDF/OWL pages).
1455 (610 RDF files and 845 XHTML files) were successfully parsed and indexed
A total of 193520 triples were parsed and indexed
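As a quick consistency check on the crawl figures above, the short Python snippet below recomputes the parsed total and the parse success rate. The variable names are illustrative; the numbers are taken directly from the slide.

```python
crawled = 1665                      # files collected by the crawler
rdf_parsed, xhtml_parsed = 610, 845

parsed = rdf_parsed + xhtml_parsed  # matches the reported 1455
unparsed = crawled - parsed         # files that failed to parse
parse_rate = round(100 * parsed / crawled, 1)

print(parsed, unparsed, parse_rate)  # 1455 210 87.4
```

So roughly 87% of the crawled files were parsed and indexed, and 210 files were not.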
Datasets (cont’d)
TAP dataset
Periodic table
Lehigh University Benchmark (LUBM)
Conclusions
Developed a Hybrid Query language for data and document retrieval
  that is convenient because it is keyword-based
  that can be accurate and flexible because disambiguation information can be provided
  that is expressive because it can support inheritance reasoning
  that is pragmatic because it can work with legacy documents
FUTURE WORK: Robust Ranking Strategy
References
T. Immaneni and K. Thirunarayan, A Unified Approach to Retrieving Web Documents and Semantic Web Data, In: Proceedings of the 4th European Semantic Web Conference (ESWC 2007), LNCS 4519, pp. 579-593, June 2007.
T. Immaneni and K. Thirunarayan, Hybrid Retrieval from the Unified Web, In: Proceedings of the 22nd Annual ACM Symposium on Applied Computing (ACM SAC 2007), pp. 1376-1380, March 2007.
THANK YOU!
http://knoesis.wright.edu/tkprasad/