Cigs lod rcahms_seneschal_pm_20131118
SENESCHAL: Semantic ENrichment Enabling Sustainability of arCHAeological Links
Peter McKeague (On behalf of project partners)
SENESCHAL
www.rcahms.gov.uk http://canmore.rcahms.gov.uk
SENESCHAL - Semantic ENrichment Enabling Sustainability of arCHAeological Links
Outline of talk
Part I: RCAHMS
  What we do
  What we hold
  Classifying
Part II: Drivers for Linked Data
Part III: SENESCHAL
  Project Partners
  The Project so far
  Prospects
RCAHMS Mission Statement
• Identifies, surveys and analyses the historic and built environment of Scotland
• Preserves, cares for and adds to the information and items in its national collection
• Promotes understanding, education and enjoyment through interpretation of the information it collects and the items it looks after
RCAHMS vocabularies
SC694685
Objects
SC656461
Maritime Craft
Monuments
SC1224403
Events
SC335945
Standards
MIDAS Heritage: http://www.english-heritage.org.uk/publications/midas-heritage/
CIDOC Conceptual Reference Model (CRM): http://www.cidoc-crm.org/
Monuments: Internal staff database
[Diagram labels: Thesauri (Monuments, Objects, Maritime Craft); Thesaurus: Events; Pick lists]
Information is published on Canmore
Thesauri: Monuments, Objects, Maritime Craft
http://canmore.rcahms.gov.uk
RCAHMS thesauri: text search
http://orapweb.rcahms.gov.uk/apex/f?p=210:1:
RCAHMS thesauri: term definition
http://orapweb.rcahms.gov.uk/apex/f?p=210:1:
RCAHMS thesauri: suggest a term
http://orapweb.rcahms.gov.uk/apex/f?p=210:1:
Part II: Drivers for Linked Data
We already publish our thesauri as key reference datasets for use by professional archaeologists in national organisations and in local authority Historic Environment Records, as well as by anyone interested in the historic environment.
BUT
Our vocabularies (and other data) are not visible
The thesaurus architecture limits the potential of the terminology
Terms lack the persistent URIs that would allow our resources to act as hubs for the Web of Data.
Interoperability
For heritage, the main exponents of Linked Data come from the research community, and in Scotland primarily from computer scientists.
Drivers for Linked Open Data
It is Government policy
Open Data White Paper, June 2012: http://data.gov.uk/sites/default/files/Open_data_White_Paper.pdf
Scotland’s Digital Future, April 2013: http://www.scotland.gov.uk/Resource/0042/00421478.pdf
Drivers for Linked Open Data
It is Government policy: Open Data White Paper, June 2012
• Public data policy and practice will be clearly driven by the public and businesses who want to use the data, including what data is released, when and in what form
• Public data will be published in reusable, machine-readable form
• Public data will be released under the same open licence which enables free reuse, including commercial reuse
• Public data will be published using open standards, and following relevant recommendations of the World Wide Web Consortium
• Public data from different departments about the same subject will be published in the same, standard formats and with the same definitions
• Public data underlying the Government’s own website will be published in re-usable form
• Release data quickly, and then work to make sure it is available in open standard formats, including Linked Data forms
... And a practical use
An online submission form to report fieldwork from contractors to curators
Part III: The partners
©University of Glamorgan
“the key to interoperability”
http://www.heritagedata.org/
Lineage
STAR: Semantic Technologies for Archaeological Resources, 2007-2010
AHRC funded project with English Heritage to apply semantic and knowledge-based technologies to the digital archaeological domain. STAR developed new methods for linking digital archive databases, vocabularies and the associated grey literature, exploiting the potential of a high level, core ontology and natural language processing techniques.
http://hypermedia.research.southwales.ac.uk/kos/star/
STELLAR: Semantic Technologies Enhancing Links and Linked data for Archaeological Resources, 2010-2011
AHRC funded project with the ADS and English Heritage. Building on the outcomes of STAR, STELLAR provided support for non-specialist users to map and extract datasets.
http://hypermedia.research.southwales.ac.uk/kos/stellar/
SENESCHAL: Semantic ENrichment Enabling Sustainability of arCHAeological Links, 2013-2014
AHRC funded project with the ADS, English Heritage, RCAHMS, RCAHMW and Wessex Archaeology.
http://hypermedia.research.southwales.ac.uk/kos/SENESCHAL/ and http://www.heritagedata.org
The SENESCHAL Project
seneschal n. Historical: the steward or major-domo of a medieval great house
12 month AHRC funded project, March 2013 - February 2014
Deliverables
• Controlled vocabularies online
  Linked data (SKOS)
  Downloadable files
• Web services
  term suggestion, term validation, legacy data alignment
• Tools to align data with controlled vocabularies
  Browser-based ‘widget’ controls
Interoperability
• “The terminology of a subject is the key to interoperability” (John F. Sowa)
• Interoperability requires more than just a common data model
• Data compatibility occurs on 2 levels: semantic and syntactic. Ontologies / data structures deal with the semantic but not necessarily the syntactic
• “The CRM relies on existing syntactic interoperability and is concerned only with adding semantic interoperability” (CIDOC CRM documentation)
You say potato, I say tomato…
• Multiple datasets, multiple organisations, multiple languages
• Unification of data structures is possible, BUT…
• Incompatible terminology hinders cross search and prevents greater interoperability
• Applications attempting to reuse data must all individually sort out the same old problems
E.g. Get all the iron age post holes…
  Feature                 Period
  Post-hole               IRON AGE
  Posthole                |ron age
  POST HOLE               Iron age?
  POSTHLOLE               EARLY IRON AGE
  POST HOLE (POSSIBLE)    250 BC
  POSTHOLES               C 500-200 B.C.
Solution: data cleansing and controlled vocabularies?
Typical interoperability issues encountered
• Simple spelling errors
  “POSTHLOLE”, “CESS PITT”, “FURRROWS”, “FLINT SCRAPPER”
• Alternate word forms
  “BOUNDARY”/“BOUNDARIES”, “GULLEY”/“GULLIES”
• Prefixes / suffixes
  “RED HILL (POSSIBLE)”, “TRACKWAY (COBBLED)”, “CROFT?”, “CAIRN (POSSIBLE)”, “PORTAL DOLMEN (RE-ERECTED)”
• Nested delimiters
  “POTTERY, CERAMIC TILE, IRON OBJECTS, GLASS”
• Terms not intended for indexing
  “NONE”, “UNIDENTIFIED OBJECT”, “N/A”, “NA”, “INCOHERENT”
• Terms that would not be in (any) thesauri
  “WOTSITS PACKET”, “CHARLES 2ND COIN”, “ROMAN STRUCTURE POSSIBLY A VILLA”, “ST GUTHLACS BENEDICTINE PRIORY”, “WORCESTER-BIRMINGHAM CANAL”, “KUNGLIGA SLOTTET”, “SUB-FOSSIL BEETLES”
• More specific phrases
  “SIDE WALL OF POT WITH LUG”, “BRICK-LINED INDUSTRIAL WELL OR MINE SHAFT”, “ALIGNMENT OF PLATFORMS AND STONES”
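Several of the issues listed above (spelling errors, qualifier suffixes, nested delimiters, non-indexing values) can be illustrated with a small cleansing routine. This is only a minimal sketch and not the project's alignment tooling: the vocabulary list, the function names and the 0.8 similarity cutoff are all assumptions made for the example, and plural forms such as “BOUNDARIES” would need stemming that is not attempted here.

```python
import difflib
import re

# Hypothetical mini-vocabulary; a real run would load a published thesaurus.
VOCABULARY = ["POST HOLE", "CESS PIT", "FURROW", "FLINT SCRAPER",
              "CAIRN", "CROFT", "POTTERY", "GLASS"]

# Values that were never intended for indexing (taken from the slide).
NOT_FOR_INDEXING = {"NONE", "N/A", "NA", "UNIDENTIFIED OBJECT", "INCOHERENT"}


def split_terms(raw):
    """Split a delimited field such as 'POTTERY, GLASS' into single terms."""
    return [part.strip() for part in re.split(r"[,;]", raw) if part.strip()]


def strip_qualifiers(term):
    """Remove a trailing '(POSSIBLE)'-style suffix and any trailing '?'."""
    term = re.sub(r"\s*\([^)]*\)\s*$", "", term)
    return term.rstrip("?").strip()


def align(raw, cutoff=0.8):
    """Return (cleaned term, closest vocabulary term or None) pairs."""
    results = []
    for term in split_terms(raw.upper()):
        cleaned = strip_qualifiers(term)
        if cleaned in NOT_FOR_INDEXING:
            continue  # skip placeholder values instead of trying to match them
        match = difflib.get_close_matches(cleaned, VOCABULARY, n=1, cutoff=cutoff)
        results.append((cleaned, match[0] if match else None))
    return results
```

Terms that find no match come back paired with None, so they can be queued for manual review or put forward as candidate terms.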
Solutions - SENESCHAL
• Controlled vocabularies (again)
  Commonly agreed concepts, terminology and identifiers
  Existing / new thesauri – community contributions?
• Openness and availability
  Licensing, web services, downloads, data formats
• Alignment of existing data
  Data cleansing tools
  Alignment techniques
• Alignment of new data
  Interactive embedded data entry tools
  Validation at point of data entry
  Rather than trying to solve this vocabulary problem, help to prevent it from happening in the first place
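Validation at the point of data entry could look something like the following. It is a rough sketch only: the vocabulary contents, the example.org URIs and the function names are hypothetical, and the real browser-based controls would query the project's web services instead of a local dictionary.

```python
# Hypothetical vocabulary: preferred label -> concept URI.
# The URIs are placeholders, not identifiers from the published schemes.
VOCABULARY = {
    "POST HOLE": "http://example.org/concepts/1",
    "POST MILL": "http://example.org/concepts/2",
    "CAIRN": "http://example.org/concepts/3",
}


def suggest(prefix, limit=10):
    """Term suggestion: labels starting with what the user has typed so far."""
    wanted = prefix.strip().upper()
    return sorted(label for label in VOCABULARY if label.startswith(wanted))[:limit]


def validate(entry):
    """Term validation: concept URI for an exact label match, else None."""
    return VOCABULARY.get(entry.strip().upper())
```

A data entry form would call suggest() as the user types and refuse to save, or flag the record, whenever validate() returns None.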
Vocabularies online as (SKOS) Linked Data
• Vocabularies from English Heritage
  Monument Types Thesaurus
  Objects Thesaurus
  Event Types Thesaurus
  Maritime Craft Thesaurus
  RCHME Cultural Periods List / MIDAS Archaeological Periods List
• Vocabularies from RCAHMS
  Monument Thesaurus (Scotland)
  Multilingual - includes Scottish Gaelic translations!
  Objects (Scotland)
  Maritime Craft (Scotland)
• Vocabularies from RCAHMW
  Monument Thesaurus (Wales)
  Event (Wales)
  Period (Wales)
• Moving from term based towards concept based indexing
• Start to create links between concepts… between vocabularies… between datasets… between sites… between countries
• Cross searching of (multilingual) cultural heritage resources
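(partial) SKOS model
[Diagram] Classes: skos:Concept, skos:ConceptScheme, skos:Collection
skos:Concept to skos:ConceptScheme: skos:inScheme, skos:topConceptOf
skos:ConceptScheme to skos:Concept: skos:hasTopConcept
skos:Concept to skos:Concept: skos:broader, skos:narrower, skos:related
skos:Collection to skos:Concept: skos:member
skos:Concept to [literal value]: skos:prefLabel, skos:altLabel, skos:notation, skos:scopeNote, skos:changeNote
skos:ConceptScheme to [literal value]: dc:title, dc:description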
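To make the model concrete, a single concept can be written out as RDF triples. The sketch below uses only the Python standard library and emits N-Triples. The concept URI, label and definition are those shown for TENEMENT (Scotland) elsewhere in this talk, but the scheme URI is inferred from the URL pattern, and recording the definition as skos:scopeNote is an assumption; none of this shows how the published data is actually generated.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def uri(value):
    """Format a URI in N-Triples syntax."""
    return f"<{value}>"


def literal(text, lang=None):
    """Format a plain or language-tagged literal in N-Triples syntax."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"@{lang}' if lang else f'"{escaped}"'


def concept_triples(concept, scheme, pref_label, scope_note, lang="en"):
    """Describe one skos:Concept with a small subset of the SKOS model."""
    return [
        (uri(concept), uri(RDF_TYPE), uri(SKOS + "Concept")),
        (uri(concept), uri(SKOS + "inScheme"), uri(scheme)),
        (uri(concept), uri(SKOS + "prefLabel"), literal(pref_label, lang)),
        (uri(concept), uri(SKOS + "scopeNote"), literal(scope_note, lang)),
    ]


def to_ntriples(triples):
    """Serialise (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


tenement = concept_triples(
    "http://purl.org/heritagedata/schemes/1/concepts/467",
    "http://purl.org/heritagedata/schemes/1",  # assumed scheme URI
    "TENEMENT",
    "A large building containing a number of rooms or flats, "
    "access to which is usually gained via a common stairway.",
)
```

Calling to_ntriples(tenement) gives four statements; skos:broader, skos:narrower and skos:related links would be added in the same way as further triples.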
Data licensing and attribution using CC REL
[Diagram: skos:Concept and skos:ConceptScheme are each annotated with cc:license, cc:attributionName, cc:attributionURL, dct:creator and dc:source; values are URIs or literal values]
Attribution back to original data providers
General System Architecture
[Diagram labels]
Inputs: Native vocabularies; STELLAR (SKOS) templates; SKOS RDF vocabularies (upload); Additional metadata
Store: SENESCHAL data store
Outputs: Linked Data REST API; SPARQL query endpoint; Web Services REST API; web controls & applications
Linked Data API (preliminary)
• The project will implement a Linked Data (RESTful) API
• The base URI may be http://www.heritagedata.org/ or http://purl.org/xxx/..
• Seneschal is a sub-project within the wider scope of ‘heritagedata.org’, so:
  http://www.heritagedata.org/seneschal - wiki/blog for project details, and
  <base uri>/schemes/123 (e.g.) for actual data API (see below)
Proposed REST API:
/schemes – return list of all SKOS concept schemes held
/schemes/search (with parameters) – search for schemes
/schemes/{id} – return details of specified SKOS concept scheme (current version)
/schemes/{id}.html, .n3, .rdf, .json – return different serializations of that data, obtained either by content negotiation or by direct request including extension
/schemes/{id}/concepts – return list of ALL SKOS concepts in specified scheme
/schemes/{id}/concepts/search – search for concepts in the specified scheme
/concepts – return list of all SKOS concepts in ALL schemes
/concepts/search (with parameters) – search for concepts in any scheme
/concepts/{id} – return details of specified SKOS concept (current version)
/concepts/{id}.html, .n3, .rdf, .json – return different serializations of the data, obtained either by content negotiation or by direct request including extension
/concepts/{id}/schemes – return list of all schemes referencing the specified concept
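The two ways of asking for a serialization, by file extension or by content negotiation, can be sketched from the client side as follows. The slide describes the base URI as still undecided, so BASE_URI is an assumption here, as are the helper names and the media types chosen for each extension. The code only builds requests; it does not contact any server.

```python
from urllib.request import Request

BASE_URI = "http://www.heritagedata.org"  # assumption: base URI was still preliminary

# Assumed mapping from the listed extensions to media types.
MEDIA_TYPES = {
    "html": "text/html",
    "rdf": "application/rdf+xml",
    "n3": "text/n3",
    "json": "application/json",
}


def resource_path(kind, resource_id=None, extension=None):
    """Build a path such as /schemes, /schemes/123 or /concepts/409.rdf."""
    if kind not in ("schemes", "concepts"):
        raise ValueError(f"unknown resource kind: {kind}")
    path = f"/{kind}"
    if resource_id is not None:
        path += f"/{resource_id}"
        if extension:
            path += f".{extension}"
    return path


def negotiated_request(kind, resource_id, fmt):
    """Request a resource by content negotiation (Accept header, no extension)."""
    return Request(
        BASE_URI + resource_path(kind, resource_id),
        headers={"Accept": MEDIA_TYPES[fmt]},
    )
```

Under these assumptions, resource_path("schemes", 123, "rdf") and negotiated_request("schemes", 123, "rdf") are two routes to the same RDF/XML representation.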
Project deliverables
http://www.heritagedata.org/blog/
Schema List
http://heritagedata.org/test/getAllSchemes.php
Scottish Monument types
http://heritagedata.org/test/schemes/1.html
Scottish Monument types: Top level
http://heritagedata.org/test/schemes/1/concepts/405.html
Scottish Monument types: concept
http://purl.org/heritagedata/schemes/1/concepts/409
http://heritagedata.org/test/searchForm.php
http://heritagedata.org/test/sparql.php
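A query against a SPARQL endpoint of this kind might be assembled as below. This is a sketch under assumptions: the scheme URI follows the pattern seen in the concept URIs, the helper names are invented, and whether the test page accepts a `query` parameter over HTTP GET (as the SPARQL protocol describes) has not been checked.

```python
from urllib.parse import urlencode


def labels_query(scheme_uri, lang="en", limit=20):
    """SPARQL text listing concepts in one scheme with their preferred labels."""
    return f"""PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT ?concept ?label WHERE {{
  ?concept skos:inScheme <{scheme_uri}> ;
           skos:prefLabel ?label .
  FILTER (lang(?label) = "{lang}")
}}
ORDER BY ?label
LIMIT {int(limit)}"""


def query_url(endpoint, query):
    """Encode the query as a GET request URL for the given endpoint."""
    return endpoint + "?" + urlencode({"query": query})


query = labels_query("http://purl.org/heritagedata/schemes/1", lang="en")
```

Changing the lang argument (for example to "gd") would ask for labels in another language, provided the scheme holds labels tagged with that language.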
Versioning (preliminary)
/schemes/{id} – returns current version of the specified scheme
/schemes/{id}/versions – returns all versions of the specified scheme
/schemes/{id}/versions/{id} – returns specified version of the specified scheme
/concepts/{id} – returns current version of the specified concept
/concepts/{id}/versions – returns all versions of the specified concept
/concepts/{id}/versions/{id} – returns specified version of the specified concept
[Diagram: data:schemes/123 [skos:ConceptScheme] is linked by dct:hasVersion (inverse dct:isVersionOf) to data:schemes/123/versions/20111005 [skos:ConceptScheme] and to data:schemes/123/versions/2013020301 [skos:ConceptScheme]]
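The version links in the diagram can be generated mechanically from a list of version identifiers. The sketch below reuses the `data:` prefix and the two identifiers shown on the slide; the function name and the tuple representation of triples are assumptions for illustration.

```python
def version_links(kind, resource_id, versions):
    """Return dct:hasVersion / dct:isVersionOf triples for each version."""
    current = f"data:{kind}/{resource_id}"
    triples = []
    for version in versions:
        versioned = f"{current}/versions/{version}"
        # The current resource points at each version, and each version points back.
        triples.append((current, "dct:hasVersion", versioned))
        triples.append((versioned, "dct:isVersionOf", current))
    return triples


links = version_links("schemes", 123, ["20111005", "2013020301"])
```

Because the unversioned URI always denotes the current version, existing links to data:schemes/123 keep working when a new version is added.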
Published vocabularies
Vocabulary England Scotland Wales
Monument type YES YES YES
Objects YES YES
Maritime craft YES YES
Period YES YES
Events (activities) YES ???
Archaeological Sciences YES ???
Components YES
Building materials YES
Evidence YES
A question of jurisdiction
289 Allison Street, Glasgow: TENEMENT
http://canmore.rcahms.gov.uk/en/site/148111/
TENEMENT (Scotland): http://purl.org/heritagedata/schemes/1/concepts/467
  A large building containing a number of rooms or flats, access to which is usually gained via a common stairway.
TENEMENT (England): http://purl.org/heritagedata/schemes/eh_tmt2/concepts/68997
  A parcel of land.
TENEMENT (Wales): http://purl.org/heritagedata/schemes/10/concepts/68997
TENEMENT BLOCK (England): http://purl.org/heritagedata/schemes/eh_tmt2/concepts/71489
  Use for speculatively built 19th century "model dwellings", rather than those built by a philanthropic society.
TENEMENT BLOCK (Wales): http://purl.org/heritagedata/schemes/10/concepts/71489
TENEMENT HOUSE (England): http://purl.org/heritagedata/schemes/eh_tmt2/concepts/71476
  Originally built as a family house. Converted into flats during the 19th or 20th century.
SC674834
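The TENEMENT example shows why indexing by concept URI is safer than indexing by term string. The short sketch below holds the three TENEMENT concepts from this slide in a list and looks them up both ways; the data structure and function names are assumptions made for illustration, and the Welsh entry has no definition because none is given on the slide.

```python
CONCEPTS = [
    {"uri": "http://purl.org/heritagedata/schemes/1/concepts/467",
     "label": "TENEMENT", "jurisdiction": "Scotland",
     "note": "A large building containing a number of rooms or flats, "
             "access to which is usually gained via a common stairway."},
    {"uri": "http://purl.org/heritagedata/schemes/eh_tmt2/concepts/68997",
     "label": "TENEMENT", "jurisdiction": "England",
     "note": "A parcel of land."},
    {"uri": "http://purl.org/heritagedata/schemes/10/concepts/68997",
     "label": "TENEMENT", "jurisdiction": "Wales",
     "note": None},  # no definition shown on the slide
]


def find_by_label(label):
    """A bare term string is ambiguous: it can match several concepts."""
    wanted = label.strip().upper()
    return [c for c in CONCEPTS if c["label"] == wanted]


def find_concept(label, jurisdiction):
    """Label plus jurisdiction (in effect, the scheme) picks out one concept."""
    for concept in find_by_label(label):
        if concept["jurisdiction"] == jurisdiction:
            return concept
    return None
```

A record indexed only with the string "TENEMENT" could mean a block of flats or a parcel of land, whereas a record indexed with the concept URI carries its jurisdiction with it.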
A question of jurisdiction
A Cruck House in Wick, Worcestershire
Cruck cottage in Wick, Philip Halling, http://creativecommons.org/licenses/by-sa/2.0/
Cruck Framed Byre, Latheron, Caithness
http://canmore.rcahms.gov.uk/en/site/86630/
SC683414
A bheil Gàidhlig agaibh?
The Cenotaph, George Square, Glasgow: http://canmore.rcahms.gov.uk/en/site/143264/
DP151933
Multilinguality
• Multilingual labels & notes
• Search in one language, retrieve another
• Potential to manage regional terms
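Searching in one language and retrieving another follows naturally once labels in each language hang off the same concept. The sketch below is illustrative only: the example.org URIs are placeholders, and the Gaelic labels are ordinary dictionary words chosen for the example, not labels taken from the RCAHMS thesaurus.

```python
# Hypothetical concepts with one preferred label per language tag.
CONCEPTS = {
    "http://example.org/concepts/castle": {"en": "CASTLE", "gd": "CAISTEAL"},
    "http://example.org/concepts/church": {"en": "CHURCH", "gd": "EAGLAIS"},
}


def translate(term, source_lang, target_lang):
    """Find the concept by its label in one language, return its label in another."""
    wanted = term.strip().upper()
    for concept_uri, labels in CONCEPTS.items():
        if labels.get(source_lang, "").upper() == wanted:
            return concept_uri, labels.get(target_lang)
    return None
```

The same structure could hold regional variants as additional labels on a concept, so that a search on a local term still resolves to the shared identifier.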
Challenges for RCAHMS
Controlled vocabularies online
• Integration of project deliverables into RCAHMS processes
• Managing candidate terms
• Publishing additional vocabularies
• Jurisdiction - a single British thesaurus for Cultural heritage?
• Adding images
• Moving the goalposts
Summary
• Controlled vocabularies online
  Linked data (SKOS)
  Downloadable files
• Linking out
  Mapping between the different thesauri
• Web services
  term suggestion, term validation, legacy data alignment
• Tools to align data with controlled vocabularies
  Browser-based ‘widget’ controls
http://www.heritagedata.org/blog/work-in-the-pipeline/
©University of Glamorgan
“the key to interoperability”
http://www.heritagedata.org/