The Semantic Web for Spatial Data Search
Femke Reitsma
University of Maryland – College Park
Why Ontologies: Could the Semantic Web Meet Discovery Challenges?
“The semantic web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation” (Tim Berners-Lee et al. 2001)
The semantic web makes web pages machine “understandable” rather than just human understandable.
Semantic web components:
1. Basic components:
ontology + semantically marked-up web page = semantic web
2. Moving up the semantic web layers:
[Figure: Semantic Web layers presented by Tim Berners-Lee, with a "We are here" marker]
Semantic Web Languages
• Primary languages:
RDF (Resource Description Framework)
RDFS (Resource Description Framework Schema)
OWL (Web Ontology Language)
• Historical development:
XML provides the basic syntax
RDF and RDFS add some tags to XML
DAML+OIL add some tags to RDF
OWL extends and (almost) replaces DAML+OIL
Basic Structure
• Information is encoded as a triple: subject, predicate, and object
For example:
<Femke> <is a> <student>
<Zimbabwe> <is a part of> <Africa>
• All subjects and objects are identified with a Uniform Resource Identifier (URI):
e.g. http://www.daml.org/2001/02/geofile/geofile-ont.daml#GeographicLocation
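The triple structure can be sketched in plain Python. This is only an illustration: the `http://example.org/demo#` namespace and the `isA` / `isPartOf` names are hypothetical and do not come from any published ontology.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical namespace used only for this example.
EX = "http://example.org/demo#"


def uri(local_name: str) -> str:
    """Build a full URI from a local name in the example namespace."""
    return EX + local_name


# The two example statements from the slide, written as URI triples.
triples = [
    Triple(uri("Femke"), uri("isA"), uri("Student")),
    Triple(uri("Zimbabwe"), uri("isPartOf"), uri("Africa")),
]


def objects_of(subject: str, predicate: str, store: list) -> list:
    """Return every object that completes (subject, predicate, ?)."""
    return [t.obj for t in store
            if t.subject == subject and t.predicate == predicate]


print(objects_of(uri("Zimbabwe"), uri("isPartOf"), triples))
```

Because every term is a full URI, two data sets that use the same URI are talking about the same thing, which is what lets statements from different sources be merged.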
What is an ontology?
Big “O” Ontology vs little “o” ontology:
Ontology = metaphysics, the essence of being, reality
ontology = “a logical theory which gives an explicit, partial account of a conceptualization” (Guarino and Giaretta, 1995)
What does an ontology look like?
What does a semantic web page look like?
Ontology ↔ Semantic Content
Ontology: Dublin Core ontology
Semantic Web page: http://owl.mindswap.org
ontology + semantic web =
• Computer parsable
• Inference ability
State code > city code > address code
A computer agent could deduce that a Cornell University address, being in Ithaca, must be in New York State, which is in the U.S., and therefore must be formatted to U.S. standards.
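The Cornell deduction is a chain of "is located in" facts followed transitively. A minimal sketch, assuming a simple dictionary of containment facts (the `PART_OF` table and both function names are made up for illustration; a real agent would read these facts from an ontology):

```python
# Each place maps to the place that directly contains it.
PART_OF = {
    "Cornell University": "Ithaca",
    "Ithaca": "New York State",
    "New York State": "U.S.",
}


def containers(place: str, part_of: dict) -> list:
    """Follow containment links upward and return every enclosing place."""
    chain = []
    seen = {place}
    current = place
    while current in part_of:
        current = part_of[current]
        if current in seen:  # guard against accidental cycles in the data
            break
        seen.add(current)
        chain.append(current)
    return chain


def address_format(place: str, part_of: dict) -> str:
    """Pick an address format from the inferred enclosing country."""
    if "U.S." in containers(place, part_of):
        return "U.S. format"
    return "unknown format"


print(containers("Cornell University", PART_OF))
print(address_format("Cornell University", PART_OF))
```

Nothing in the data says directly that Cornell is in the U.S.; the agent derives it from three separate statements.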
Example of application:
Find me places to eat accessible via public transport?
Over here ….
Objective:
Explore the potential of the Semantic Web for distributing spatial data
Current GCMD Search
North America?
2950 records matched your query
North America?
North America [2950]
Limit search by:
- Spatial resolution
- Temporal resolution
- GCMD keywords
Explore results by:
Canada [1348]
USA [1602]
GCMD keywords
……
Key = ability to determine relationships between keywords without explicitly encoding them
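A minimal sketch of how such a drill-down could work, assuming each record is tagged only with its most specific location keyword and the ontology supplies the "part of" links. The record IDs, the `PART_OF` table, and the function names are hypothetical, not GCMD data.

```python
from collections import Counter

# Ontology: each location maps to the region that directly contains it.
PART_OF = {
    "Canada": "North America",
    "USA": "North America",
    "Ontario": "Canada",
}

# Hypothetical records, each tagged with one location keyword.
RECORDS = {
    "dif-001": "Canada",
    "dif-002": "USA",
    "dif-003": "Ontario",
    "dif-004": "Africa",
}


def ancestors(keyword: str) -> list:
    """Return all regions that contain the keyword, nearest first."""
    result = []
    while keyword in PART_OF:
        keyword = PART_OF[keyword]
        result.append(keyword)
    return result


def search(region: str) -> list:
    """Find records tagged with the region or with anything inside it."""
    return sorted(rec for rec, loc in RECORDS.items()
                  if loc == region or region in ancestors(loc))


def facet_counts(region: str) -> dict:
    """Count matching records per direct sub-region of the queried region."""
    counts = Counter()
    for rec in search(region):
        path = [RECORDS[rec]] + ancestors(RECORDS[rec])
        idx = path.index(region)
        if idx > 0:  # record sits below the region, not on it
            counts[path[idx - 1]] += 1
    return dict(counts)


print(search("North America"))
print(facet_counts("North America"))
```

No record is tagged "North America", yet the search finds three of them, and the Ontario record is counted under Canada without anyone having encoded that link on the record itself.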
Future GCMD Search
North America?
GCMD Database
Sesame Ontology
Java Application
Progressing Towards Level 1:
Sesame = Open Source RDF Schema-based Repository and Querying facility
Keywords → Ontologies
• Importance of careful specification of relationships for ontology.
CATEGORY > TOPIC > TERM > VARIABLE
e.g. Hydrosphere > Ground Water > Saltwater Intrusion
• For the purposes of the Semantic Web, the keyword structure may need modification, e.g. the Variable Fetch is a measurable property of the Term Ocean Waves; however, the Variable Fisheries is a sub-topic of the Term Agricultural Aquatic Sciences.
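The Fetch and Fisheries cases show that a single ">" in the keyword hierarchy hides at least two different relationships. A small sketch of making that explicit, with hypothetical relationship names (`measurablePropertyOf`, `subTopicOf`) chosen only for this example:

```python
# Each keyword link names its relationship instead of using a generic ">".
KEYWORD_LINKS = [
    ("Fetch", "measurablePropertyOf", "Ocean Waves"),
    ("Fisheries", "subTopicOf", "Agricultural Aquatic Sciences"),
]


def relation(variable: str, term: str):
    """Return the named relationship between a Variable and a Term, if any."""
    for subj, pred, obj in KEYWORD_LINKS:
        if subj == variable and obj == term:
            return pred
    return None


def measurable_properties(term: str) -> list:
    """List only the Variables that are measurable properties of a Term."""
    return [subj for subj, pred, obj in KEYWORD_LINKS
            if pred == "measurablePropertyOf" and obj == term]


print(relation("Fetch", "Ocean Waves"))
print(measurable_properties("Agricultural Aquatic Sciences"))
```

With the relationships named, a query for measurable properties no longer returns Fisheries by mistake.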
Keywords: Projects, Sensors, Sources, Locations, IDN Nodes, Data Centers, Science Keywords, Services Keywords, URL Content Types, Chronostratigraphic Units
DIF Schema
• XSLT style sheet to create DIF schema in Semantic Web language
• Mapping terms to ontology
• Avoiding a monolithic ontology by mapping terms to other ontologies
– e.g. Dublin Core
DIFs
• XSLT style sheet to convert DIFs to Semantic Web language
• Mapping terms to ontology and DIF Schema
• Recording keywords of finest granularity
• Avoiding a monolithic ontology by mapping terms to other ontologies
– e.g. Dublin Core, Cyc, DAML-time
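The slides describe this conversion as an XSLT style sheet. The sketch below shows the same idea in Python instead, using the standard library XML parser. The sample DIF fragment is simplified and made up, the `http://example.org/gcmd#` namespace and `hasKeyword` property are hypothetical, and only the Dublin Core `title` URI is a real published term.

```python
import xml.etree.ElementTree as ET

# Simplified, made-up DIF fragment (real DIFs carry many more fields).
DIF_XML = """<DIF>
  <Entry_ID>EXAMPLE_ENTRY_01</Entry_ID>
  <Entry_Title>Example ground water data set</Entry_Title>
  <Parameters>
    <Category>Earth Science</Category>
    <Topic>Hydrosphere</Topic>
    <Term>Ground Water</Term>
    <Variable>Saltwater Intrusion</Variable>
  </Parameters>
</DIF>"""

DC_TITLE = "http://purl.org/dc/elements/1.1/title"  # Dublin Core title
GCMD = "http://example.org/gcmd#"                   # hypothetical namespace


def dif_to_triples(xml_text: str) -> list:
    """Convert one DIF record into (subject, predicate, object) triples."""
    root = ET.fromstring(xml_text)
    entry = GCMD + root.findtext("Entry_ID").strip()

    # Map the title to an existing ontology rather than inventing a term.
    triples = [(entry, DC_TITLE, root.findtext("Entry_Title").strip())]

    for params in root.findall("Parameters"):
        levels = [params.findtext(tag)
                  for tag in ("Category", "Topic", "Term", "Variable")]
        levels = [lvl.strip() for lvl in levels if lvl and lvl.strip()]
        if levels:
            # Record only the finest-granularity keyword that is present.
            triples.append((entry, GCMD + "hasKeyword", levels[-1]))
    return triples


for triple in dif_to_triples(DIF_XML):
    print(triple)
```

Storing only the most specific keyword is enough because the broader levels can be recovered from the keyword ontology.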
Sesame:
• Middleware
• GUI or API
• Database: PostgreSQL or Oracle
[Architecture diagram: Sesame]
Client 1, Client 2, Client 3 (connecting over HTTP, HTTP, SOAP)
HTTP Protocol Handler, SOAP Protocol Handler
Request Router
Admin Module, Query Module, Export Module
Repository Abstraction Layer
GCMD Repository:
- RDF DIF files
- Ontologies
- DIF Schema
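The layering in the architecture diagram, where modules talk to a repository abstraction layer rather than to a specific database, can be sketched as follows. This is not Sesame's actual API (Sesame is a Java system); every class and method name here is hypothetical and the in-memory store merely stands in for PostgreSQL or Oracle.

```python
from abc import ABC, abstractmethod


class Repository(ABC):
    """Abstraction layer: modules depend on this, not on a database."""

    @abstractmethod
    def add(self, s: str, p: str, o: str) -> None:
        ...

    @abstractmethod
    def match(self, s=None, p=None, o=None) -> list:
        ...


class InMemoryRepository(Repository):
    """Stand-in backend; a database-backed class could replace it."""

    def __init__(self):
        self._triples = set()

    def add(self, s, p, o):
        self._triples.add((s, p, o))

    def match(self, s=None, p=None, o=None):
        # None acts as a wildcard for that position of the triple.
        pattern = (s, p, o)
        return sorted(t for t in self._triples
                      if all(q is None or q == v
                             for q, v in zip(pattern, t)))


class QueryModule:
    """Answers queries using only the abstract repository interface."""

    def __init__(self, repo: Repository):
        self._repo = repo

    def subjects_with(self, predicate: str, obj: str) -> list:
        return [t[0] for t in self._repo.match(p=predicate, o=obj)]


repo = InMemoryRepository()
repo.add("dif-001", "hasLocation", "Canada")
repo.add("dif-002", "hasLocation", "USA")
repo.add("Canada", "partOf", "North America")

query = QueryModule(repo)
print(query.subjects_with("hasLocation", "Canada"))
```

Because the query module sees only the `Repository` interface, the storage backend can change without touching the modules above it.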
Advantages for the GCMD
• Semantic Web presents database structure in a machine parsable format
• Ability to search for the semantic relationships among any DIF terms within the ontology
• Do not need to change the database structure when new classes and relationships are added
• Real advantages = when ontology is enriched
UWG Assistance
• Gene Major
– How do we handle the scalability issue with regard to populating the DIFs? We now have over 15,000 entries; updates require much work to (a) determine whether the data are still viable and (b) make revisions. Database revisions such as phone numbers are easy; content revisions are more labor intensive.
– How do we handle the same data sets being delivered from multiple systems (not data centers), like OPeNDAP, NOMADS, THREDDS, etc.? All may deliver the same data set, but how do we point to all those catalogs? How do we index the DIFs to do that? We could use Related_URL, but is that the right solution?
– How can we get interaction between data sets and publishers? In other words, what mechanisms can we use to link data sets with the current literature?
– How do you feel about potential privacy concerns over contact information within GCMD DIFs/SERFs?
• Stephanie Leicester
– Suggestions about how to encourage DIF Authors to review and update their records regularly
– Direction and guidance on developing a metadata standard for archived samples
• Heather Weir
– Suggestions about how to increase the number of SERFs
– Direction and guidance with the Learning Center and Astronomy keywords
• Scott Ritz
– To spread the word in their community (to data users and producers).
– Encourage data holders they meet to submit metadata.
– Closer interaction with Science Coordinators: new data notifications, contacts.
• Monica Holland
– Continue to spread the word about GCMD.
• Cheryl Solomon
– Suggest University sources for ecological datasets
– Suggest international sources for metadata
• Tyler Stevens
– What direction we should take with GIS in the GCMD.
– More GIS contacts to work with to increase GIS within the GCMD