Integrating archaeological data: The ARIADNE Infrastructure, Achille Felicetti PIN, Università...

ARIADNE is funded by the European Commission's Seventh Framework Programme

Integrating archaeological data: The ARIADNE infrastructure

Save the data - Workshop on digital repositories, Vienna, December 2nd 2014

Achille Felicetti, PIN, Università degli Studi di Firenze, Prato

What is ARIADNE

• ARIADNE is a Research Infrastructure aiming at the integration of archaeological datasets in Europe (and beyond)

• Four years’ duration
• Starting 1st February 2012
• 24 partners
• Coordinated by PIN-U. of Florence (IT)

The ARIADNE partnership

[Map of the partnership; legend: Coordinator, Partner, Associate]

Why ARIADNE

• A huge amount of archaeological data is available in digital format
• Large number of archaeological datasets that do not communicate with each other
• Growing interest in data sharing within the research community
• Social pressure to open data vaults

Project activities

• Networking activities
  – Community building: involving additional institutions in sharing data and establishing guidelines together
  – Standardization and good practices
• Trans-National Access to shared datasets and training in their creation, as well as to on-line repositories
  – Support for digitization and data organization
• Research activities
  – Knowledge organization
  – Data management
  – New or improved tools to extract information
  – Advances in methodology

Archaeology and the heterogeneity …

Digital documentation

• Museum information
• Excavation records
• Images
• 3D models

• RDBMS
• GIS
• XML
• CSV
• Excel
• Unstructured files

Textual documentation

Terminological tools

• ICCD thesaurus for archaeological finds (for movable archaeological items)

How to achieve integration

Data sharing requires

• Suitability of somebody else’s data for one’s purposes
• Interoperability of datasets
• Trust in data collected by others
• Guarantee of data “provenance”
• Common understanding of meanings

First step: the ARIADNE Registry

• Complete catalogue of archaeological datasets owned by partners
  – Used to identify common information on which to build data integration
• ACDM – ARIADNE Catalogue Data Model
  – Based on the W3C DCAT model (http://www.w3.org/TR/vocab-dcat/)
  – Formal description of archaeological datasets
• Detailed descriptions of information managed by partners
  – Data format
  – Number of stored records
  – Metadata schemas used
  – Standards and vocabularies used (if any)
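
A registry entry of the kind listed above could be sketched roughly as follows. This is only an illustration, not the actual ACDM: the class name `RegistryEntry`, the `ex:` namespace, and the properties `ex:recordCount` and `ex:usesVocabulary` are hypothetical, and the values are invented placeholders. Only `dcat:Dataset`, `dct:title`, `dct:format` and `dct:conformsTo` are real DCAT / Dublin Core terms, used here in a simplified way with plain string literals and no escaping.

```python
from dataclasses import dataclass, field

# Namespace prefixes; "ex" is a hypothetical namespace used only for this sketch.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "dct": "http://purl.org/dc/terms/",
    "ex": "http://example.org/registry#",
}


@dataclass
class RegistryEntry:
    """One dataset description, holding the fields named on the slide."""
    identifier: str
    title: str
    data_format: str
    record_count: int
    metadata_schemas: list = field(default_factory=list)
    vocabularies: list = field(default_factory=list)

    def to_turtle(self) -> str:
        """Serialize the entry as a small Turtle document (no string escaping)."""
        lines = [f"@prefix {p}: <{uri}> ." for p, uri in PREFIXES.items()]
        lines.append("")
        statements = [
            "a dcat:Dataset",
            f'dct:title "{self.title}"',
            f'dct:format "{self.data_format}"',
            f"ex:recordCount {self.record_count}",  # hypothetical property
        ]
        statements += [f'dct:conformsTo "{s}"' for s in self.metadata_schemas]
        statements += [f'ex:usesVocabulary "{v}"' for v in self.vocabularies]
        lines.append(f"ex:{self.identifier}\n    " + " ;\n    ".join(statements) + " .")
        return "\n".join(lines)


# Placeholder values, not taken from any real partner dataset.
entry = RegistryEntry(
    identifier="coins",
    title="Example coins database",
    data_format="RDBMS",
    record_count=1200,
    metadata_schemas=["ExampleSchema"],
    vocabularies=["ExampleThesaurus"],
)
print(entry.to_turtle())
```

Collecting the same few fields for every partner dataset is what makes it possible to spot the common information on which integration can be built.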

Data integration: capture the semantic meaning

Museum DB: Items Table

ID   Item      Room   Showcase
35   Amphora   3      2
24   Coin      8      15
18   ...       ...    ...

Excavation DB: Artifacts Table

ID     Artifact   SU
1020   Coin       12
1021   ...        ...
1022   Amphora    13

Different archives

Different data structures

Is integration possible?

Capture the semantic meaning ...

[Diagram: the Museum DB Items table and the Excavation DB Artifacts table, annotated with the classes Object and Place]

Implicit knowledge: semantic relations

[Diagram: the same two tables, with Objects linked to Places by the relations "Found In" and "Stored In"]

Temporal entities

ID     Artifact   SU    Date   Period
1020   Coin       12    1981   V B.C.
1021   ...        ...
1022   Amphora    13    1974   III B.C.

[Diagram: Objects linked to Time by the relations "Created" and "Found"]

Mapping and RDF Encoding ...

Source row (Museum DB, Items Table): ID 35, Item Amphora, Room 3, Showcase 2

<!-- Object 35 has type "Amphora" -->
<crm:E22.Man-Made_Object rdf:about="35">
  <crm:P2.has_type>
    <crm:E55.Type rdf:about="Amphora"/>
  </crm:P2.has_type>
</crm:E22.Man-Made_Object>

<!-- Object 35 is currently located in Place 2, which has type "Showcase" -->
<crm:E22.Man-Made_Object rdf:about="35">
  <crm:P55.has_current_location>
    <crm:E53.Place rdf:about="2">
      <crm:P2.has_type>
        <crm:E55.Type rdf:about="Showcase"/>
      </crm:P2.has_type>
    </crm:E53.Place>
  </crm:P55.has_current_location>
</crm:E22.Man-Made_Object>
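
To illustrate why a shared semantic layer answers the question "Is integration possible?", the following minimal sketch maps rows from both example tables to subject-predicate-object triples and then runs one query across the two sources. It is an assumption-laden simplification: the `museum:` and `excavation:` identifier prefixes and the predicate `ex:found_in` are hypothetical names invented for this example, and only `crm:P2.has_type` and `crm:P55.has_current_location` come from the encoding shown on the slide.

```python
# Rows copied from the two example tables on the slides.
MUSEUM_ITEMS = [
    {"ID": "35", "Item": "Amphora", "Room": "3", "Showcase": "2"},
    {"ID": "24", "Item": "Coin", "Room": "8", "Showcase": "15"},
]
EXCAVATION_ARTIFACTS = [
    {"ID": "1020", "Artifact": "Coin", "SU": "12"},
    {"ID": "1022", "Artifact": "Amphora", "SU": "13"},
]

HAS_TYPE = "crm:P2.has_type"
STORED_IN = "crm:P55.has_current_location"
FOUND_IN = "ex:found_in"  # hypothetical predicate standing in for "Found In"


def map_museum(rows):
    """Map Items rows to triples: each item is an Object stored in a Place."""
    triples = set()
    for row in rows:
        obj = f"museum:{row['ID']}"
        place = f"museum:showcase/{row['Showcase']}"
        triples.add((obj, "rdf:type", "crm:E22.Man-Made_Object"))
        triples.add((obj, HAS_TYPE, row["Item"]))
        triples.add((obj, STORED_IN, place))
        triples.add((place, "rdf:type", "crm:E53.Place"))
    return triples


def map_excavation(rows):
    """Map Artifacts rows to triples: each artifact is an Object found in a Place."""
    triples = set()
    for row in rows:
        obj = f"excavation:{row['ID']}"
        place = f"excavation:su/{row['SU']}"
        triples.add((obj, "rdf:type", "crm:E22.Man-Made_Object"))
        triples.add((obj, HAS_TYPE, row["Artifact"]))
        triples.add((obj, FOUND_IN, place))
        triples.add((place, "rdf:type", "crm:E53.Place"))
    return triples


def objects_of_type(graph, type_name):
    """Return every object of the given type, regardless of source archive."""
    return sorted(s for (s, p, o) in graph if p == HAS_TYPE and o == type_name)


# The integrated graph is simply the union of the two mapped sources.
graph = map_museum(MUSEUM_ITEMS) | map_excavation(EXCAVATION_ARTIFACTS)
print(objects_of_type(graph, "Coin"))
```

The query never mentions "Item" or "Artifact": once both columns are mapped to the same property, the difference between the source schemas disappears.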

Representing knowledge ...

SYNERGY Mapping Reference Model

• Designed by CIDOC CRM SIG
• Modular structure
  – Different tools communicating with each other
  – Individual tasks, so as to divide the mapping problem into subparts
• Schema visualizers
• Mapping interfaces
• Source analyzers and normalizer
• Mapping suggester
• Mapping memories storage
  – mapping histories of analogous cases collected from the user community
• Terminology mapper
  – to define equivalences between terms from different vocabularies
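
The terminology mapper idea, defining equivalences between terms from different vocabularies, can be sketched with a small union-find structure so that equivalences are transitive. This is not the SYNERGY or 3M implementation; the class `TerminologyMapper` and the vocabulary prefixes `vocab_a`, `vocab_b`, `vocab_c` are hypothetical and exist only for this example.

```python
class TerminologyMapper:
    """Groups terms from different vocabularies into equivalence classes."""

    def __init__(self):
        self._parent = {}  # term -> representative (or closer ancestor)

    def _find(self, term):
        """Return the representative of the term's equivalence class."""
        self._parent.setdefault(term, term)
        while self._parent[term] != term:
            # Path halving: point the term at its grandparent as we walk up.
            self._parent[term] = self._parent[self._parent[term]]
            term = self._parent[term]
        return term

    def declare_equivalent(self, term_a, term_b):
        """Record that two terms denote the same concept."""
        root_a, root_b = self._find(term_a), self._find(term_b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def are_equivalent(self, term_a, term_b):
        """True if the two terms were linked, directly or transitively."""
        return self._find(term_a) == self._find(term_b)


mapper = TerminologyMapper()
mapper.declare_equivalent("vocab_a:anfora", "vocab_b:amphora")
mapper.declare_equivalent("vocab_b:amphora", "vocab_c:amphore")

print(mapper.are_equivalent("vocab_a:anfora", "vocab_c:amphore"))
print(mapper.are_equivalent("vocab_a:anfora", "vocab_b:coin"))
```

Because the equivalences are transitive, linking each vocabulary to one neighbour is enough to make all of them mutually searchable.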

FORTH - 3M

• FORTH Data Model and Standards:
  – http://www.ics.forth.gr/isl/index_main.php?l=e&c=229
• FORTH Mapping Memory Manager
  – http://www.ics.forth.gr/isl/3M
• ICCD (Italian Central Institute for Cataloguing and Documentation) models for archaeological documentation
  – Monuments & sites
  – Artifacts
• OEAW coins database

OEAW coins database

OEAW coins database mapping

Knowledge extraction from texts

... he made a palace extending all the way from the Palatine to the Esquiline, which at first he called the House of Passage, but when it was burned shortly after its completion and rebuilt, the Golden House

... domum a Palatio Esquilias usque fecit, quam primo Transitoriam, mox incendio absumptam restitutamque Auream nominavit (Suetonius, Nero 31, 1)

Places: Palatine, Esquiline

Actors: Nero

Appellations: House of Passage, Golden House

Events: Palace burning

Objects: The palace

Activities: Palace extension, Palace rebuilding, Name assignments

Date: 64 A.D.

Classes

Knowledge Extraction: annotation tools

[Figure: annotation tool; relation labels shown: "Refers to", "Carries"]

From text to RDF

<cidoc:E39_Actor rdf:about="Nero">
  <cidoc:P14_performed rdf:resource="GH_Name_assignment"/>
  <cidoc:P14_performed rdf:resource="HP_Name_assignment"/>
  <cidoc:P14_performed rdf:resource="Palace_extension"/>
  <cidoc:P14_performed rdf:resource="Palace_rebuilding"/>
</cidoc:E39_Actor>

<cidoc:E79_Part_Addition rdf:about="Palace_extension">
  <cidoc:P7_took_place_at rdf:resource="Esquiline"/>
  <cidoc:P110_augmented rdf:resource="The_Palace"/>
  <cidoc:P31_has_modified rdf:resource="The_Palace"/>
</cidoc:E79_Part_Addition>

<cidoc:E11_Modification rdf:about="Palace_rebuilding">
  <cidoc:P7_took_place_at rdf:resource="Esquiline"/>
  <cidoc:P7_took_place_at rdf:resource="Palatine"/>
  <cidoc:P31_has_modified rdf:resource="The_Palace"/>
</cidoc:E11_Modification>
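
As a rough illustration of how such RDF/XML can be read back as triples, the sketch below wraps an excerpt of the encoding in an `rdf:RDF` root and walks it with Python's standard XML parser. It is not a full RDF/XML parser: it only handles the flat pattern used on the slide (typed node, property elements with `rdf:resource`) and does not resolve relative identifiers. The binding of the `cidoc:` prefix to `http://www.cidoc-crm.org/cidoc-crm/` is an assumption, since the slide does not show the namespace declarations.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
CRM_NS = "http://www.cidoc-crm.org/cidoc-crm/"  # assumed binding for "cidoc:"

# Excerpt of the slide's encoding, wrapped in a root element that declares
# the namespaces so that the document is well-formed XML.
DOCUMENT = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:cidoc="{CRM_NS}">
  <cidoc:E39_Actor rdf:about="Nero">
    <cidoc:P14_performed rdf:resource="Palace_extension"/>
    <cidoc:P14_performed rdf:resource="Palace_rebuilding"/>
  </cidoc:E39_Actor>
  <cidoc:E79_Part_Addition rdf:about="Palace_extension">
    <cidoc:P7_took_place_at rdf:resource="Esquiline"/>
    <cidoc:P110_augmented rdf:resource="The_Palace"/>
  </cidoc:E79_Part_Addition>
</rdf:RDF>"""


def local_name(tag):
    """Strip the '{namespace}' part that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_triples(xml_text):
    """Return (subject, predicate, object) tuples for the flat pattern above."""
    root = ET.fromstring(xml_text)
    triples = []
    for node in root:
        subject = node.get(f"{{{RDF_NS}}}about")
        # The element name of a typed node is its class.
        triples.append((subject, "rdf:type", local_name(node.tag)))
        for prop in node:
            target = prop.get(f"{{{RDF_NS}}}resource")
            triples.append((subject, local_name(prop.tag), target))
    return triples


for triple in extract_triples(DOCUMENT):
    print(triple)
```

Each printed tuple corresponds to one statement extracted from the Suetonius passage, for example that Nero performed the palace extension and that the extension took place at the Esquiline.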

[Diagram labels: Content Providers; Integration & Interoperability (XML, OAI-PMH, RDF); Metadata Repository (CIDOC CRM)]

Integration Layer

– Common semantic representation (mapping)

– Data transparency

– Data peculiarity preserved by the system

Preserving legacy archives

• Legacy database synchronization
  – ARIADNE system constantly updated to reflect modifications of legacy archives
• References to legacy archives always provided
  – Data provenance
  – URLs to information on original portals/web applications
• Users can navigate the original information
  – To perform custom searches tailored to specific needs

System architecture

• Triple store
  – scalable and able to handle a large number of RDF triples (i.e. the entire ARIADNE semantic graph) efficiently
• Communication modules
  – legacy archives
  – on-the-fly and on-demand population of the ARIADNE platform
• Internal consistency control modules
• Terminological services
• Query management system
• Semantic enrichment and exporting features

Triple stores and semantic data

Virtuoso
http://virtuoso.openlinksw.com/

3D-COFORM Repository
http://www.3d-coform.eu

Synchronisation Services

D2R Server: Publishing relational databases on the Semantic Web
http://d2rq.org/d2r-server
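
The general idea of exposing a relational table as triples on demand can be sketched as below. This is a conceptual stand-in only: it does not use D2R Server, its mapping language, or any of its APIs. The function `table_to_triples`, the in-memory SQLite table and the `http://example.org/` base URI are all hypothetical, and the table name is interpolated into the SQL, so it must come from trusted configuration.

```python
import sqlite3


def table_to_triples(conn, table, key_column, base_uri):
    """Turn every row of a table into (subject, predicate, object) triples.

    The primary key becomes part of the subject URI, and every other
    non-NULL column becomes one triple.
    """
    cursor = conn.execute(f"SELECT * FROM {table} ORDER BY {key_column}")
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{base_uri}{table}/{record[key_column]}"
        for column, value in record.items():
            if column != key_column and value is not None:
                predicate = f"{base_uri}{table}#{column}"
                triples.append((subject, predicate, str(value)))
    return triples


# A stand-in "legacy archive" holding rows from the excavation example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artifacts (id INTEGER PRIMARY KEY, artifact TEXT, su INTEGER)")
conn.executemany(
    "INSERT INTO artifacts VALUES (?, ?, ?)",
    [(1020, "Coin", 12), (1022, "Amphora", 13)],
)

triples = table_to_triples(conn, "artifacts", "id", "http://example.org/")
for triple in triples:
    print(triple)
```

Because the triples are generated from the live table each time the function runs, the legacy database stays the source of truth and the semantic view follows any change made to it.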

Accessing the architecture

• ARIADNE Portal
  – Unique access point to the whole system
  – Configuration and management module for user administration and data workflow control
  – Interfaces for interacting with the query management layers
• ARIADNE Services
  – Data access and management
  – Resource discovery
  – Data import/export
  – Data visualization and annotation

Query on map and timeline

Repository and Services Architecture

[Architecture diagram. Labels: Content Providers (Data + Metadata); Deposit Services (OAI-PMH, XML); Integration & Interoperability; Mapping Services; Metadata Enhancement; Vocabularies; Metadata Repository (CIDOC CRM); Registry (ACDM); Integrated Services; Resource Discovery; Preview; Preservation; Data Access (SPARQL, REST); Archive Discovery; Digital Asset Management; Dataset Discovery; Vocabularies Management; ARIADNE Portal; Configuration & Management; Browse/Query Interfaces; WEB; LOD]

Final considerations

• Very advanced stage of development

– End of the project

• ARIADNE main goal

– “Integration of existing archaeological research data infrastructure through new and powerful technologies” (ARIADNE DoW)

• “From differences results the most beautiful harmony” (Heraclitus of Ephesus)

ARIADNE is a project funded by the European Commission under the Community’s Seventh Framework Programme, contract no. FP7-INFRASTRUCTURES-2012-1-313193.

The views and opinions expressed in this presentation are the sole responsibility of the authors and do not necessarily reflect the views of the European Commission.

Thank you