A Framework for Improved Access to Museum Databases in the Semantic Web

25
A Framework for Improved Access to Museum Databases Access to Museum Databases in the Semantic Web Dana Dannélls, Mariana Damova, PhD, Ramona Enache, Milen Chechev

description

This paper presents a framework for processing Museum databases according to a set of interlinked ontologies, including CIDOC-CRM, and loading them in a reason-able view of the web of data, providing additional links to datasets from the LOD cloud. The infrastructure allows accessing the data via SPARQL queries and to verbalize the query results in natural language, the GF formalism, which allows access to 18 natural languages.

Transcript of A Framework for Improved Access to Museum Databases in the Semantic Web

Page 1: A Framework for Improved Access to Museum Databases in the Semantic Web

A Framework for Improved

Access to Museum DatabasesAccess to Museum Databases

in the Semantic Web

Dana Dannélls, Mariana Damova, PhD, Ramona Enache, Milen Chechev

Page 2: A Framework for Improved Access to Museum Databases in the Semantic Web

Introduction

Linked Open Data

combining facts and knowledge from different datasets is the

ultimate goal of the Semantic Web

Need for convincing real life use cases demonstrating the benefits

of these technologies

#2September 2011Museum Reason-able View

of these technologies

MacManus, the Founder and Editor-in-Chief of ReadWriteWeb

defined an exemplary test for the Semantic Web

cities around the world which have Modigliani art works

Page 3: A Framework for Improved Access to Museum Databases in the Semantic Web

FactForge of Ontotext solves the Modigliani query

#3September 2011Museum Reason-able View

The cultural heritage domain can become a useful usecase for the application of semantic technologies.

Page 4: A Framework for Improved Access to Museum Databases in the Semantic Web

Outline

• Linked Open Data – the Vision

• Reason-able View – Linked Open Data Management

• Museum Reason-able View – Data

• Museum Reason-able View – Environment

• Ontology based verbalization of triple results

#4September 2011Museum Reason-able View

• Ontology based verbalization of triple results

• Related Work

• Conclusion

Page 5: A Framework for Improved Access to Museum Databases in the Semantic Web

Linked Open Data – the Vision

Tim Berners-Lee

graphs published on the web and explorable across servers in a manner similar

to the way the HTML web is navigated

Design principles of Linked Open Data

– Use URIs to identify things.

– Use HTTP URIs so that these things can be referred to and looked up

("dereferenced") by people and user agents.

#5September 2011Museum Reason-able View

("dereferenced") by people and user agents.

– Provide useful information about the thing when its URI is dereferenced, using

standard formats such as RDF/XML.

– Include links to other, related URIs in the exposed data to improve discovery of

other related information on the Web.

Page 6: A Framework for Improved Access to Museum Databases in the Semantic Web

Linked Open Data Cloud

#6September 2011Museum Reason-able View

258 datasets as of september 2011

Page 7: A Framework for Improved Access to Museum Databases in the Semantic Web

Reason-able View – Linked Open Data Management

Using linked data for data management is considered to have great potential forthe transformation of the web of data into a giant global graph (Heath, & Bizer,2011). Still, there are several challenges that have to be overcome to make thispossible, namely:

• LOD are hard to comprehend;• Diversity comes at a price;

#7September 2011Museum Reason-able View

• Diversity comes at a price;• LOD is unreliable;• Dealing with data distributed on the web is slow;• No consistency is guaranteed.

Using reason-able views (Kiryakov et al., 2009a) – a solution to LOD management.

Page 8: A Framework for Improved Access to Museum Databases in the Semantic Web

Reason-able View – Linked Open Data Management

• An approach for reasoning with and managing linked data

- an assembly of independent datasets, which can be used as a single body of knowledge with respect to reasoning and query evaluation

- lowering the cost and the risks of using specific linked datasets for specific purposes

• The linkage between the data- at the schema level

#8September 2011Museum Reason-able View

- at the schema level- at the instance level

• Accessible via- SPARQL endpoint- keywords

• Queries with predicates from different datasets• “Federated” results from different datasets

Page 9: A Framework for Improved Access to Museum Databases in the Semantic Web

Museum Reason-able View – Data

• Requirements:

- the ability to handle generic knowledge, such as people, institutions, and locations- the ability to handle specific subject domains, such as the cultural heritage and

museums

#9September 2011Museum Reason-able View

• Datasets covering the Generic Knowledge of the Museum Reason-able View.

- DBpedia - the RDF-ized version of Wikipedia, describing more than 3.5 million thingsand covers 97 languages.

- Geonames - a geographic database that covers 6 million of the most significant geographical features on Earth.

- PROTON - an upper-level ontology, 542 entity classes and 183 properties.

Page 10: A Framework for Improved Access to Museum Databases in the Semantic Web

Museum Reason-able View – Museum Data Models

• CIDOC – CRMdeveloped by the International Council of Museum’s Committee for Documentation (ICOM-CIDOC)

an upper-level ontology for cultural and natural historyfor museum professionals to perform their work well

90 classes and 148 properties

#10September 2011Museum Reason-able View

- Entity, Temporal Entity, Time Span, Place, Dimension, - Production, Creation, Dissolution, Acquisition, Curation

Page 11: A Framework for Improved Access to Museum Databases in the Semantic Web

Museum Reason-able View – Museum Data Models

K-samsök, the Swedish Open Cultural Heritage (SOCH)

• a Web service for applications to retrieve data from cultural heritage institutions or associations with Cultural Heritage information.

• includes features which are divided in the following categories:

- Identification of the item in the collection- Internet address, and thumbnail address- Description of the item- Description of the presentation of the item, including a thumbnail

#11September 2011Museum Reason-able View

- Description of the presentation of the item, including a thumbnail- Geographic location coordinates- Museum information about the item- Context, when was it created, to which style it belongs, etc.- Item specification, e.g. size, and type of the item – painting, sculpture and the like

Page 12: A Framework for Improved Access to Museum Databases in the Semantic Web

Museum Reason-able View – Museum Data Models

Painting Ontology OWL 2182 classes and 92 properties

<owl:Class rdf:about="&painting;Painting"><owl:equivalentClass>

<owl:Class><owl:intersectionOf rdf:parseType="Collection">

<rdf:Description rdf:about="&ksasok;item"/>

#12September 2011Museum Reason-able View

<rdf:Description rdf:about="&ksasok;item"/><rdf:Description rdf:about="&milo;PaintedPicture"/>

</owl:intersectionOf></owl:Class>

</owl:equivalentClass><rdfs:subClassOf rdf:resource="&core;E22_Man-Made_Object"/>

</owl:Class>

Integrated with SOCH Time OntologyMERGE from SUMO Mid-Level-Ontology from SUMO

Page 13: A Framework for Improved Access to Museum Databases in the Semantic Web

Museum Reason-able View – Gothenburg City Museum Data

8900 museum objects in two museum collections – GSM and GIMGSM – Gothenburg Stads MuseumGIM – Gothenburg Industry Museum

39 properties describe each museum object in each collection

The Gothenburg City Museum data is integrated by using predicates from :- CIDOC-CRM

#13September 2011Museum Reason-able View

- CIDOC-CRM- PROTON- Painting Ontology- linkages to DBpedia

Page 14: A Framework for Improved Access to Museum Databases in the Semantic Web

Museum Reason-able View – Architecture

#14September 2011Museum Reason-able View

Page 15: A Framework for Improved Access to Museum Databases in the Semantic Web

Museum Reason-able View – Museum Data Triplification

#15September 2011Museum Reason-able View

Process of triplification and localization of Gothenburg City Museum data in English.

Page 16: A Framework for Improved Access to Museum Databases in the Semantic Web

Museum Reason-able View – Environment

• OWLIM

• Ontologies and data loaded with full materialization

– Dbpedia 3.6, Geonames 2.2.1, PROTON 3.0, CIDOC-CRM 1.0, GCM data

• 20% more retrievable statements than loaded explicit

statements

– 257,774,678 (explicit)

#16September 2011Museum Reason-able View

– 257,774,678 (explicit)

– 305,313,536 (retrievable)

• SPARQL endpoint– Museum artefacts preserved in the museum since 2005

– Paintings from the GSM collection

– Inventory numbers of the paintings from the GSM collection

– Location of the objects created by Anders Hafrin

– Paintings with length less than 1 m

– etc.

http://museum.ontotext.com

Page 17: A Framework for Improved Access to Museum Databases in the Semantic Web

Museum Reason-able View – Access and Querying

http://museum.ontotext.com/sparql

#17September 2011Museum Reason-able View

http://museum.ontotext.com/sparql

Page 18: A Framework for Improved Access to Museum Databases in the Semantic Web

Ontologies Verbalization

• The Grammatical Framework (GF)

- key feature is the division of a grammar in the abstract syntax which

acts as a semantic interlingua and the concrete syntaxes-

representing verbalizations in various target languages (natural or

formal).

#18September 2011Museum Reason-able View

formal).

- a resource library (Ranta, 2009), where the abstract syntax

describes the most common grammatical constructions allowing text

generation, which are further mapped to concrete syntaxes

corresponding to 18 languages.

Page 19: A Framework for Improved Access to Museum Databases in the Semantic Web

Museum Reason-able View Verbalization

• subclass relation rdfs:subClassOf from the ontology are

encoded as functions in the GF grammar

• Other information stated in the ontology, is encoded in GF as axioms

• The natural language generation is based on composeable templates

Example: OWL entry corresponding to the painting Big Garden:

#19September 2011Museum Reason-able View

Example: OWL entry corresponding to the painting Big Garden:

<owl:NamedIndividual rdf:about="&painting; BigGardenObj">

<rdf:type rdf:resource="&painting;Painting"/>

<isPaintedOn rdf:resource="&painting;Canvas"/>

<createdBy rdf:resource="&painting;CarlLarsson"/>

<hasCreationDate rdf:resource="&painting;Year1937"/>

</owl:NamedIndividual>

Page 20: A Framework for Improved Access to Museum Databases in the Semantic Web

Museum Reason-able View Verbalization

• fun BigGardenObj : Ind Painting ;

• A set of Axioms

– isPaintedOn (el BigGradenObj) (el Canvas)

– createdBy (el BigGardenObj)(el CarlLarsson)

– hasCreationDate (el BigGardenObj) (el (year 1937))

• Generated sentences

#20September 2011Museum Reason-able View

• Generated sentences

– Big Garden is a painting

– Big Garden is painted on canvas

– Big Garden is painted by Carl Larsson

– Big Garden was created in 1937

Page 21: A Framework for Improved Access to Museum Databases in the Semantic Web

GF Abstract Representation to English Syntax

• Abstract Representation for CreationDate

fun Painting_hasCreationDate : El Painting_Artwork

-> El Painting_TimePeriod -> Formula ;

• GF syntactic rule for CreationDate

#21September 2011Museum Reason-able View

• GF syntactic rule for CreationDate

lin Painting_hasCreationDate o1 o2 =

mkPolSentPast (S.mkCl o1 (S.mkVP (S.passiveVP create_V2) (S.mkAdv in_Prep o2))) ;

Page 22: A Framework for Improved Access to Museum Databases in the Semantic Web

Related Work

MAO – Finland

http://www.seco.tkk.fi/projects/finnonto/

Europeana – EU

http://www.europeana.eu/portal/

VUA – Amsterdam Museum with semantic technologies

within Europeana connect

British Museum – Research Space

just won tender funded by Melon foundation

#22September 2011Museum Reason-able View

just won tender funded by Melon foundation

Contribution of the paper:

First to link real museum data to LOD

First to use schema-level mapping to data integration in a specific domain like

cultural heritage

Ontology-based verbalization of query result triples

Page 23: A Framework for Improved Access to Museum Databases in the Semantic Web

Conclusion

A framework for integrating and accessing museum linked data

A method to present this data using natural language generation technology

A Museum Reason-able View

- a series of upper-level and domain specific ontologies used to transform

Gothenburg museum data from a relational database into RDF

- links to LOD cloud data

#23September 2011Museum Reason-able View

Templates automatically obtained in GF to generate the query results in natural language

Future work

Experiments with the Museum Reason-able view

Extension of the data

Increasing the coverage of the GF grammar

Fluent discourse generation from ontologies

MOLTO, FP7-ICT-247914

Page 24: A Framework for Improved Access to Museum Databases in the Semantic Web

References

• Tim Berners-Lee. 2004. OWL Web Ontology Language reference, February. W3C Recommendation. T.

Berners-Lee. 2006. Design issues: Linked data. Retrieved from

http://www.w3.org/DesignIssues/LinkedData.html.

• B. Bishop, A. Kiryakov, D. Ognyanoff, I. Peikov, Z. Tashev, and R. Velkov. 2011. Owlim: A family of

scalable semantic repositories. Semantic Web Journal, Special Issue: Real-World Applications of OWL.

• Nick Crofts, Martin Doerr, Tony Gill, Stephen Stead, and Matthew Stiff, 2008. Definition of the CIDOC

Conceptual Reference Model.

• Mariana Damova and Dana Dannells. 2011. Reasonable view of linked data for cultural heritage. In

#24September 2011Museum Reason-able View

• Mariana Damova and Dana Dannells. 2011. Reasonable view of linked data for cultural heritage. In

Proceedings of the third International Conference on Software, Services and Semantic Technologies

(S3T).

• Ramona Enache and Krasimir Angelov. 2010. Typeful ontologies with direct multilingual

verbalization. Workshop on Controlled Natural Languages (CNL) 2010.

• Bernhard Haslhofer and Antoine Isaac. 2011. data.europeana.eu the europeana linked open data

pilot. In Proceedings of the Intl. Conf. on Dublin Core and Metadata Applications.

• Aarne Ranta. 2009. The GF resource grammar library. Linguistic Issues in Language Technology, 2(2).

Page 25: A Framework for Improved Access to Museum Databases in the Semantic Web

Thank you for your attention!

Questions

#25September 2011Museum Reason-able View

Questions

[email protected]