Ontologies in Physical Science
-
Upload
petermurrayrust -
Category
Education
-
view
847 -
download
5
description
Transcript of Ontologies in Physical Science
![Page 1: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/1.jpg)
Ontologies in Physical Science
Onto Workshop, ed.ac.uk 2013-04-11
An #animalgarden production
Peter Murray-Rust, University of Cambridge
& Open Knowledge Foundation
![Page 2: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/2.jpg)
PMR and friends want us to help build
a computational chemistry ontology
Is it an important problem?
$1,000,000,000/yrfor compchem
They need OWL
Problem: How to build ontologies when people are uninterested or antagonisticeven though we have the technology
![Page 3: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/3.jpg)
Chemists don’t use ontologies
Perhaps the chemists could use OWL-DL
Top-down schemas like AniML haven’t (yet) taken off
![Page 4: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/4.jpg)
Are there any ontologies in physical
science that work?
Crystallo-graphers build
CIF dictionaries
The IUCr, right? Tell us about CIF
IUCr: International Union of Crystallography
![Page 5: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/5.jpg)
CIF Core defines 500 common
concepts
CIF: http://www.iucr.org/cif
Or the volume of the crystal cell
Like the wavelength of the radiation
used
![Page 6: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/6.jpg)
Core dictionary (coreCIF) version 2.4.3 _diffrn_ambient_temperatureDefinition: The mean temperature in kelvins at which the intensities were measured. Range: 0.0 -> infinity Type: numb
An example
?
ID
For humans
For machines:Constraint + type
http://www.iucr.org/__data/iucr/cifdic_html/1/cif_core.dic/Idiffrn_ambient_temperature.html
![Page 7: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/7.jpg)
Definition: The mean temperature in kelvins at which the intensities were measured.
So everyone converts
temperatures to use K?
Yes! today I swam at 273K
But chemists want to use all
sorts of different units
We MUST have a units
ontology
![Page 8: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/8.jpg)
OWL? Is CIF a proper
ontology? It’s not RDF…
…but we’ve global URIs, likecif:_diffrn_ambient_temperature
Because IUCr controls the namespace prefix: cif=http://www.iucr.org/cif
![Page 9: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/9.jpg)
CIF had 20 years of community involvement through IUCr
But most top-down chemistry projects don’t
work
So we’ll do this bottom-up.
![Page 10: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/10.jpg)
Every compchem program uses basically
the same scientific concepts
We think each should build its own dictionary so we understand the output
Won’t that just be a mess?
No. It’s the first step to interoperability.
![Page 11: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/11.jpg)
Chemical Markup Language PMR/Rzepa http://www.xml-cml.org
Hyperchem builds ITS dictionary
Each annotates their own
program output
NWChem builds ITS dictionary
The programs will use CML* for chemical output
![Page 12: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/12.jpg)
in a communal cml:compchem dictionary
that everyone uses
We agree they are the same so create
compchem:alphae
Alpha-electrons: Hyperchem uses hchem:e_alpha
NWChem has nwchem:_alpha_elec
![Page 13: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/13.jpg)
What if the data structure
or concepts don’t map
CML provides conventions so each group can define their data
structure
Data can then be machine validated
against each convention!
![Page 14: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/14.jpg)
But there are over 20 program codes.
We’ve prototyped with many before.
They’ll be encouraged
I think it’s going to work.
BUT TTT*
GULP, DPOLY,
CASTEP, SIESTA,
MOPAC …
TTT: Things Take Time (Piet Hein)
![Page 15: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/15.jpg)
Will it work? It depends on
people
National labs CSIRO/AU
and PNNL/US are committed
And we have companies like
Hyperchem and Kitware
I wish we had some
publishers
![Page 16: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/16.jpg)
We’ll need tools
We’ve got FoX* for FORTRAN output
JUMBOTemplates to parse logfiles
RDF for navigating dictionaries
FoX*: XML/FORTRAN Toby White, Andrew Walker
![Page 17: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/17.jpg)
Benefits of semantic dictionaries:• FORTRAN logfile can be made semantic• High degree of interoperability in chemistry• Semantic publication (HTML5, CML, MathML)• Interoperates with mainstream Web • Easily scalable to other phys sci.
Problems:• Closed code/minds is short-term market advantage• Non-trivial commitment (updates, code revision)• Getting top-down approval (e.g. IUPAC)
![Page 18: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/18.jpg)
Benefits of semantic dictionaries:• FORTRAN logfile can be made semantic• High degree of interoperability in chemistry• Semantic publication (HTML5, CML, MathML)• Interoperates with mainstream Web • Easily scalable to other phys sci.
Problems:• Closed code/minds is short-term market advantage• Non-trivial commitment (updates, code revision)• Getting top-down approval (e.g. IUPAC)
![Page 19: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/19.jpg)
Benefits of semantic dictionaries:• FORTRAN logfile can be made semantic• High degree of interoperability in chemistry• Semantic publication (HTML5, CML, MathML)• Interoperates with mainstream Web • Easily scalable to other phys sci.
Problems:• Closed code/minds is short-term market advantage• Non-trivial commitment (updates, code revision)• Getting top-down approval (e.g. IUPAC)
![Page 20: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/20.jpg)
Benefits of semantic dictionaries:• FORTRAN logfile can be made semantic• High degree of interoperability in chemistry• Semantic publication (HTML5, CML, MathML)• Interoperates with mainstream Web • Easily scalable to other phys sci.
Problems:• Closed code/minds is short-term market advantage• Non-trivial commitment (updates, code revision)• Getting top-down approval (e.g. IUPAC)
![Page 21: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/21.jpg)
Benefits of semantic dictionaries:• FORTRAN logfile can be made semantic• High degree of interoperability in chemistry• Semantic publication (HTML5, CML, MathML)• Interoperates with mainstream Web • Easily scalable to other phys sci.
Problems:• Closed code/minds is short-term market advantage• Non-trivial commitment (updates, code revision)• Getting top-down approval (e.g. IUPAC)
![Page 22: Ontologies in Physical Science](https://reader036.fdocuments.net/reader036/viewer/2022062418/5549602cb4c905e94e8b5796/html5/thumbnails/22.jpg)
Top-down schemas like AniML haven’t (yet) taken off
Chemists don’t use ANY
ontologies
Perhaps the chemists could use OWL-DL