A Linked Data Prototype for the Union Catalog of Digital Archives Taiwan
Museum Computing: An Approach to Bridging Cultures, Communities and Science
The 21st PNC Annual Conference and Joint Meetings, October 21-23, 2014
National Palace Museum, Taipei, Taiwan
Keh-Jiann Chen, Tyng-Ruey Chuang, Andrea Wei-Ching Huang, Chung-Hsi Hung, and Wan-Jung Shu
Institute of Information Science, Academia Sinica, Taipei, Taiwan
The corresponding author is Andrea Wei‐Ching Huang at {andreahg}@iis.sinica.edu.tw
Outline
1. Introduction & Motivation
2. Digital Archives Thesaurus (dat)
3. A Chinese Bottle in the Prototype
4. The dat Ontology & Prototype System
5. Conclusion & Future Works
6. Reference
Introduction / Motivation
o Union Catalog of Digital Archives Taiwan
o Why linked data
o Datasets we use for the experiment
Digital Archives Thesaurus (dat)
o Overview
o AAT hierarchy adaption
o Disambiguation skill
Speaker (1): Wan-Jung Shu
dat
1. Union Catalog of Digital Archives Taiwan:
Collections from more than 12 Institutions
Introduction & Motivation
1. Anthropology
2. Archaeology
3. Archeology
4. Archives
5. Biology
6. Chinese Artifacts
7. Chinese Paintings & Calligraphy
8. Full Text of Rare Chinese Books
9. Geology
10. Language
11. Map & Remote Sensing
12. Multimedia
13. News
14. Rare & Manuscript Collections
15. Research Reusing
16. Resource Integration for Applications
17. Stone Rubbing
17 Topic Subjects
Metadata (DC 15 elements): 5,214,602
Image: 4,032,112
Audio & Video Media: 48,591
o Academia Sinica
o Academia Historica
o National Museum of Natural Science
o National Central Library
o National Taiwan University
o National Palace Museum
o Taiwan Historica
o National Museum of History
o Chinese Taipei Film Archive
o Hakka Affairs Council
o National Archives Administration
o Council of Indigenous Peoples
o Open Requests for Proposals Projects
o …
o …
Catalogs in Web Context
• Need to be open.
• Need to be linkable.
• Need to provide links.
• Must be part of the network.
• Cannot be an end in itself.
• Allow for hackability.
Commonsense Cataloging
• A 2014 survey indicates that Google search results for over 36.6% of keywords include schema.org snippets.
• Pages using schema.org markup have higher Google rankings.
• Be present on the sites library users visit daily, such as Google, Wikipedia and social networks.
Modernize Catalogs
• Improve Visibility, Discoverability and Findability.
• Linking outside the catalog.
• Sharing of metadata.
• Move from a document-based model to a data-centric description model (e.g. MARC-based to BIBFRAME).
For Linking For Sharing For Finding
Introduction & Motivation: Why Linked (Open) Data ?
Reason 1: International Trends
Introduction & Motivation: Why Linked (Open) Data ?
Data Semantics
Thesaurus
Vocabulary
Ontology
What to wear depends on what the applications need.
Old Data, New Meaning, New Value
Reason 2: Semantic value added for data
Introduction & Motivation
1) For Digital Archives Thesaurus:
   Chinese Artifacts: 32,044
   Concepts: 1,667
2) For Linked Data Prototype: 5 subcategories of the Chinese Artifacts / 25 examples
   No. of Concepts: 167
   No. of Triples: 225
…
Chinese Artifacts subcategories:
o bamboo/wood lacquerware
o ceramic artifacts
o enamelware and glass artifacts
o jade/stone artifacts
o metal artifacts
Datasets we use for this prototype experiment
Digital Archives Thesaurus: overview
Chinese Art and Artifact Subsets : [concepts and guide terms : 3,088 ] / [terms : 4,538]
[Diagram: the Digital Archives Thesaurus groups terms (Term 1 … Term n) under concepts (Concept 1 … Concept N). It draws on the Union Catalog keyword dictionary, the AAT hierarchy adaption, and related terms of Chinese Artifacts.]
Union Catalog Keyword Dictionary: over 100,000 keywords
Source of related terms: art dictionaries, textbooks, journal papers
Digital Archives Thesaurus: AAT hierarchy adaption
Chinese Art and Artifact Subsets : [concepts and guide terms : 3,088 ] / [terms : 4,538]
Contribution to AAT
Equivalence relation
AAT dat
tagged term
Digital Archives Thesaurus: knowledge extraction from Chinese text
Digital Archive Thesaurus
CKIP segmentation process
銀鍍金
纍絲
點翠
珠寶
花蝶
簪
term extraction
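The segmentation and term extraction step above can be illustrated with a small sketch. It is not the CKIP system: it swaps in a plain greedy longest-match against a term dictionary, and the dictionary below is an assumption built only from the six terms shown on this slide. The input is the title of one of the sample artifacts.

```python
def segment(text, dictionary):
    """Greedy forward maximum matching (a stand-in for CKIP, not CKIP itself).

    At each position, take the longest dictionary term that matches;
    if none matches, emit a single character and move on.
    """
    max_len = max(len(term) for term in dictionary)
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in dictionary:
                tokens.append(candidate)
                i += size
                break
    return tokens


def extract_terms(text, dictionary):
    """Keep only the segments that are known thesaurus terms."""
    return [token for token in segment(text, dictionary) if token in dictionary]


# Assumed dictionary: just the six terms shown on the slide.
THESAURUS_TERMS = {"銀鍍金", "纍絲", "點翠", "珠寶", "花蝶", "簪"}

title = "清銀鍍金纍絲點翠嵌珠寶花蝶簪"
terms = extract_terms(title, THESAURUS_TERMS)
print(terms)
```

Characters that match no dictionary term (here 清 and 嵌) are segmented on their own and then dropped by the extraction step.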
Digital Archives Thesaurus: concept-terms-object
Concept N
洋彩
瓷胎洋彩
tag n
瓶 (bottle)
紙槌瓶
蕉葉紋
番蓮紋
開光
內填琺瑯 (champlevé)
如意雲紋
磁胎
銅胎
錦地
Digital Archives Thesaurus: disambiguation
Disambiguation skills
o Homograph distinguished by prefix
o Subject restriction
o DC elements restriction
Example of ambiguity: 金 in Chinese may represent
o Metal (material)
o Gold (material)
o Golden (color)
o Jin Dynasty (styles and periods)
Homograph distinguished by prefix
o 青花 (blue-and-white porcelain) as a type of object
o 青花 (ching-hwa glaze) as glazing material
o 青花 with a prefix character 以, 用 or 由 (meaning ‘use’) → use ching-hwa glaze
Digital Archives Thesaurus: Disambiguation-Part I
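A minimal sketch of the prefix rule, assuming it can be applied as a one-character look-behind. The function name and the second sample string are hypothetical; the first sample is the title of one of the sample artifacts.

```python
# Prefix characters that signal "use" according to the slide.
USE_PREFIXES = ("以", "用", "由")


def classify_homograph(text, term="青花"):
    """Return one reading per occurrence of `term` in `text`.

    An occurrence directly preceded by a "use" character is read as the
    glazing material; any other occurrence is read as the object type.
    """
    readings = []
    start = text.find(term)
    while start != -1:
        preceded_by_use = start > 0 and text[start - 1] in USE_PREFIXES
        readings.append("glazing material" if preceded_by_use else "object type")
        start = text.find(term, start + len(term))
    return readings


# Title of a sample artifact: no "use" prefix before 青花.
print(classify_homograph("清乾隆青花荔枝桃實執壺"))
# Hypothetical description fragment with the prefix 以.
print(classify_homograph("以青花繪山水"))
```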
Subject restriction
o 琉璃 as a kind of glazed pottery: tags in the ‘Pottery’ category
o 琉璃 as glass material: tags in the ‘Enamel and Glassware’ category
Digital Archives Thesaurus: Disambiguation-Part II
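The subject restriction can be read as a lookup keyed on both the term and the category of the record it appears in. The sketch below assumes exactly that; the table holds only the two senses named on the slide, and the function name is hypothetical.

```python
# (term, subject category) -> sense, using only the example from the slide.
SENSES_BY_SUBJECT = {
    ("琉璃", "Pottery"): "glazed pottery",
    ("琉璃", "Enamel and Glassware"): "glass material",
}


def resolve_by_subject(term, category):
    """Pick the sense of `term` allowed in `category`, or None if unknown."""
    return SENSES_BY_SUBJECT.get((term, category))


print(resolve_by_subject("琉璃", "Pottery"))
print(resolve_by_subject("琉璃", "Enamel and Glassware"))
```

Returning None for an unlisted category leaves the term untagged rather than guessing a sense.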
DC element restriction
1. DC elements used: title, type, date, subject, description
2. Example in title: an object type term (簪 = hairpin) must be the last word of the title
Digital Archives Thesaurus: Disambiguation-Part III
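The title rule can be sketched as a suffix check. This is a simplification: it tests whether the title string ends with a known object type term instead of segmenting the title first. The term set is an assumption taken from terms shown elsewhere in these slides, and the last sample title is hypothetical.

```python
# Assumed object type terms, taken from examples in the slides.
OBJECT_TYPE_TERMS = {"簪", "瓶", "紙槌瓶"}


def object_type_from_title(title, object_type_terms):
    """Return the object type term that ends the title, or None.

    Longer terms are tried first so that 紙槌瓶 wins over 瓶.
    """
    for term in sorted(object_type_terms, key=len, reverse=True):
        if title.endswith(term):
            return term
    return None


print(object_type_from_title("清銀鍍金纍絲點翠嵌珠寶花蝶簪", OBJECT_TYPE_TERMS))
print(object_type_from_title("銅琺瑯方瓶", OBJECT_TYPE_TERMS))
# Hypothetical title where 簪 is not the last word, so it is not an object type.
print(object_type_from_title("簪花圖", OBJECT_TYPE_TERMS))
```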
A Prototype System
o Framework overview
o How is a Chinese Bottle semantically represented in RDF triples?
A beta dat ontology
o domain knowledge representation of the Chinese Artifacts
o descriptions for curation and publication of the Artifacts
o data reusing and the use of the R4R Ontology
Speaker (2): Andrea Wei-Ching Huang
A Chinese Bottle in the Prototype
作者不詳(-)。[銅琺瑯方瓶]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/30/e5/f1.html
@prefix prv: <http://purl.org/net/provenance/ns#> .
@prefix dc: <http://purl.org/dc/terms/> .
@prefix r4r: <http://guava.iis.sinica.edu.tw/r4r/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix sp: <http://spinrdf.org/sp#> .
@prefix xhtml: <http://www.w3.org/1999/xhtml/vocab/#> .
@prefix void: <http://rdfs.org/ns/void#> .
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix dat: <http://dat.digitalarchives.tw/ontology.html#> .
@prefix schema: <http://schema.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix prvTypes: <http://purl.org/net/provenance/types#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix d2r: <http://sites.wiwiss.fu-berlin.de/suhl/bizer/d2r-server/config.rdf#> .
@prefix aat: <http://vocab.getty.edu/aat/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix map: <http://dat.digitalarchives.tw/resource/#> .
@prefix dbpedia: <http://dbpedia.org/resource/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix uc: <http://dat.digitalarchives.tw/resource/uc/> .
@prefix doap: <http://usefulinc.com/ns/doap#> .

<http://dat.digitalarchives.tw/data/Artifact/3204593>
    a foaf:Document , prv:DataItem ;
    dct:date "2014-11-24T06:55:37.329Z"^^xsd:dateTime ;
    prv:containedBy <http://dat.digitalarchives.tw/dataset> ;
    void:inDataset <http://dat.digitalarchives.tw/dataset> ;
    foaf:primaryTopic <http://dat.digitalarchives.tw/resource/Artifact/3204593> .

<http://dat.digitalarchives.tw/resource/Artifact/3204593>
    rdfs:isDefinedBy <http://dat.digitalarchives.tw/data/Artifact/3204593> ;
    dat:artifactType <http://vocab.getty.edu/aat/300010898> ,
        <http://dat.digitalarchives.tw/Concept/800000632> ;
    dat:componentForm <http://dat.digitalarchives.tw/Concept/800001205> ,
        <http://dat.digitalarchives.tw/Concept/800001103> ,
        <http://dat.digitalarchives.tw/Concept/800000915> ,
        <http://dat.digitalarchives.tw/Concept/800000886> ,
        <http://dat.digitalarchives.tw/Concept/800000913> ;
    dat:decorationSubject <http://dat.digitalarchives.tw/Concept/800000295> ;
    r4r:hasProvenance <http://trdf.sourceforge.net/provenance/ns.html#DataCreation> ;
    r4r:isPartOf <http://dat.digitalarchives.tw/data/Dataset/10000001> ;
    dct:created "unavailable " ;
    dct:instructionalMethod <http://vocab.getty.edu/aat/300053778> ;
    dct:title "銅琺瑯方瓶" ;
    schema:url <http://catalog.digitalarchives.tw/item/00/30/e5/f1.html> ;
    foaf:page <http://dat.digitalarchives.tw/page/Artifact/3204593> .
http://dat.digitalarchives.tw/concept/800000295
Search Concept (800000295:Shanshui)
Result in one artifact: numbered 3204593
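The concept search above amounts to asking which artifacts point at a given concept URI. A rough sketch of that lookup follows, using a plain Python list of triples in place of the prototype's actual store and query interface, which the slides do not show. The triples are a subset of the bottle's description; the helper names are hypothetical.

```python
ARTIFACT = "http://dat.digitalarchives.tw/resource/Artifact/3204593"
CONCEPT = "http://dat.digitalarchives.tw/Concept/"

# A subset of the bottle's triples, as (subject, predicate, object).
TRIPLES = [
    (ARTIFACT, "dat:artifactType", "http://vocab.getty.edu/aat/300010898"),
    (ARTIFACT, "dat:artifactType", CONCEPT + "800000632"),
    (ARTIFACT, "dat:componentForm", CONCEPT + "800001205"),
    (ARTIFACT, "dat:decorationSubject", CONCEPT + "800000295"),
    (ARTIFACT, "dct:title", "銅琺瑯方瓶"),
]


def artifacts_for_concept(triples, concept_uri):
    """Return the sorted subjects linked to `concept_uri` by any predicate."""
    return sorted({s for s, _predicate, o in triples if o == concept_uri})


def local_id(uri):
    """Return the last path segment of a URI, e.g. the artifact number."""
    return uri.rstrip("/").rsplit("/", 1)[-1]


hits = artifacts_for_concept(TRIPLES, CONCEPT + "800000295")
print([local_id(uri) for uri in hits])
```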
How is this Chinese Bottle semantically represented in RDF triples through our prototype?
The Prototype – I
Describing and representing, for publication, the concept relations between the Chinese Artifacts of the Digital Archives Taiwan and the Digital Archives Thesaurus.
Union Catalog Metadata
Digital Archives Thesaurus
The dat beta ontology
Chinese Artifacts Relational Database
Semantic Browsing
The dat concept making process
Chinese Knowledge and Information Processing (CKIP)
Chinese Word Segmentation System
Segmented Keyword List
Keyword Extraction
Tag
Extensions
Binary Relation
Overview of a Linked Data Prototype System using dat (Digital Archives Thesaurus) & dat Ontology
Union
Catalog Metadata
Chinese Knowledge and Information Processing (CKIP)
Chinese Word Segmentation System
Segmented Keyword List
Keyword Extraction
landscape
landscape
Digital
Archives Thesaurus
Tag
Extensions
Binary Relation
Before The Prototype
landscape shanshui
Shanshui
The Prototype – II
Overview of a Linked Data Prototype System using dat (Digital Archives Thesaurus)
Binary Relation
Every artifact item has been assigned a dat URI.
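The bottle example uses three URIs for one artifact: a resource URI for the thing itself, a data URI for its RDF description, and a page URI for the HTML view. A small sketch of deriving all three from an artifact number, assuming every item follows the same pattern as artifact 3204593 (the function name is hypothetical):

```python
BASE = "http://dat.digitalarchives.tw"


def artifact_uris(artifact_id):
    """Build the resource, data and page URIs for one artifact number."""
    return {
        "resource": f"{BASE}/resource/Artifact/{artifact_id}",
        "data": f"{BASE}/data/Artifact/{artifact_id}",
        "page": f"{BASE}/page/Artifact/{artifact_id}",
    }


uris = artifact_uris(3204593)
for kind, uri in uris.items():
    print(kind, uri)
```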
The Prototype – III
Overview of a Linked Data Prototype System using the dat Ontology
Artifact
Concept (dat)
Concept(aat)
Tag
dat:artifactType
dat:componentForm
dat:decorationSubject
dat:describedSubject
dat:designElement
dct:created
dct:instructionalMethod
dct:medium
schema:color
dat:hasTag
black ovals are the main modeling resources
white ovals are resources defined by local class definitions
grey ovals are resources defined by external class definitions
dash lines indicate mapping relation tasks not completed
skos:narrower
The Core Ontology: intellectual semantics of the Chinese Artifacts
The dat ontology – I
dcat:Dataset
Artifact
ObjectNameSource
UnionCatalog
rdfs:subClassOf
dct:title
schema:url
dat:Provenance
Curation & Publication: descriptions of the modelling objects
We use popular vocabularies such as DC terms and schema.org to relate Artifact to its preservation and technical descriptions.
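As an illustration of these descriptions, the sketch below writes the dct:title and schema:url statements for one artifact as Turtle text. It is a hand-rolled string builder, not the prototype's publishing pipeline; the helper names are hypothetical, and the values come from the bottle example.

```python
PREFIXES = (
    "@prefix dct: <http://purl.org/dc/terms/> .\n"
    "@prefix schema: <http://schema.org/> .\n"
)


def turtle_literal(value):
    """Quote a plain string literal, escaping backslashes and double quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def describe_artifact(uri, title, url):
    """Return Turtle statements giving an artifact its title and catalog URL."""
    lines = [
        f"<{uri}>",
        f"    dct:title {turtle_literal(title)} ;",
        f"    schema:url <{url}> .",
    ]
    return "\n".join(lines)


doc = describe_artifact(
    "http://dat.digitalarchives.tw/resource/Artifact/3204593",
    "銅琺瑯方瓶",
    "http://catalog.digitalarchives.tw/item/00/30/e5/f1.html",
)
print(PREFIXES + "\n" + doc)
```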
The dat ontology – II
dcat:Dataset
Artifact
r4r:isPartOf
r4r:hasProvenance
r4r:RRObject
rdfs:subClassOf
Reusing: descriptions relating the modelling objects to associated publications and the policy used
We do not use common vocabularies to describe Artifact and Dataset relations, because we wish to publish the dataset so that it can be reused by others.
In particular, we wish to publish a URI for this resource that can support dynamic contexts:
(1) Metadata: ready or not ready?
(2) Publish only, or can be reused?
(3) Joint publications such as article, data and code.
dat:Provenance
The dat ontology – III
dcat:Dataset
Artifact
Concept (dat)
ObjectNameSource
Concept(aat)
Tag
dat:artifactType
dat:componentForm
dat:decorationSubject
dat:describedSubject
dat:designElement
dct:created
dct:instructionalMethod
dct:medium
schema:color
r4r:isPartOf
UnionCatalog
rdfs:subClassOf
skos:Concept
dat:hasTag
dct:title
rdfs:subClassOf
schema:url
r4r:hasProvenance
r4r:RRObject
rdfs:subClassOf
black ovals are the main modeling resources
white ovals are resources defined by local class definitions
grey ovals are resources defined by external class definitions
dash lines indicate mapping relation tasks not completed
Beta: An Ontology for Publishing Chinese Artifacts as Linked Data Using the Digital Archives Thesaurus (dat)
skos:narrower skos:broader
skos:related
preservation & technical descriptions of modelling objects
dat:Provenance
The dat ontology
Conclusion & Future Works
http://dat.digitalarchives.tw/concept/
http://dat.digitalarchives.tw/ontology
http://dat.digitalarchives.tw/
Future Works - I
Wikipedia
dat:Provenance
Future Works - II
Wikipedia
Concept(Place)
Concept(People)
dat:Provenance
dcat:Dataset
16 other catalogs
Concept (domain local)
ObjectNameSource
Concept(domain external)
Tag
r4r:isPartOf
UnionCatalog
rdfs:subClassOf
skos:Concept
dat:hasTag
dct:title
rdfs:subClassOf
schema:url
r4r:hasProvenance
r4r:RRObject
rdfs:subClassOf
skos:narrower skos:broader
skos:related
Future Works - III
Wikipedia
cross-domain dat thesaurus
dat:Provenance
Reference
Article:
Bizer, Christian, and Richard Cyganiak. "D2r server-publishing relational databases on the semantic web." Poster at the 5th International Semantic Web Conference. 2006.
Bizer, Chris, Richard Cyganiak, and Tom Heath. "How to publish linked data on the web." (2007).
Huang, Andrea Wei-Ching and Tyng-Ruey Chuang, “Relations for Reusing (R4R) in a Shared Context: An Exploration on Research Publications and Cultural Objects”, Proc. of the 4th International Workshop on Semantic Digital Archives (SDA), in conjunction with International Digital Libraries Conference (DL2014), London, 8th-12th September 2014.
Malmsten, Martin. "Making a library catalogue part of the semantic web." Universitätsverlag Göttingen (2008): 146.
OCLC Linked Data, http://oclc.org/developer/develop/linked-data.en.html
LC Linked Data Service: Authorities and Vocabularies, http://id.loc.gov/
Code:
d2R, Database to RDF mapping engine and SPARQL server http://d2rq.org/, https://github.com/d2rq/d2rq
Huang, Andrea Wei-Ching and Tyng-Ruey Chuang, Relations for Reusing (R4R) Ontology, http://guava.iis.sinica.edu.tw/r4r
Huang, Andrea Wei-Ching, Chung-Hsi Hung, Wan-Jung Shu, Keh-Jiann Chen, and Tyng-Ruey Chuang, Beta: An Ontology for Publishing Chinese Artifacts as Linked Data Using the Digital Archives Thesaurus (dat), http://dat.digitalarchives.tw/ontology/
Data:
作者不詳(-)。[銅琺瑯方瓶]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/30/e5/f1.html
作者不詳(2500 B.C.-2200 B.C.)。[良渚文化晚期玉琮]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/0c/c0/4e.html
作者不詳(1199 B.C.-1000 B.C.)。[商後期□父丁方鼎]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/0c/be/f7.html
作者不詳(960 A.D.-1279 A.D.)。[宋官窯翠青琮式瓶]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/0c/c4/b5.html
作者不詳(960 A.D.-1279 A.D.)。[宋定窯劃花蓮花葵瓣口盤]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/5f/a4/74.html
作者不詳(1601 A.D.-1700 A.D.)。[明末清初銅胎琺瑯獸面紋方鼎式爐]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/33/0b/96.html
作者不詳(1601 A.D.-1700 A.D.)。[明 十七世紀 嵌玉石花鳥圓盒]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/33/0e/2d.html
作者不詳(1644 A.D.-1911 A.D.)。[清內填琺瑯纍絲瓜形盒]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/59/d2/1c.html
作者不詳(1644 A.D.-1911 A.D.)。[清玉香盒]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/11/1c/d8.html
作者不詳(1644 A.D.-1911 A.D.)。[清玉鎖環]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/33/48/ef.html
作者不詳(1644 A.D.-1911 A.D.)。[清伽南香手串(十八子)]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/5f/a0/ef.html
作者不詳(1644 A.D.-1911 A.D.)。[清周樂元玻璃內繪行旅圖鼻煙壺]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/59/d0/6a.html
作者不詳(1644 A.D.-1911 A.D.)。[清青玉琱花爐]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/11/0b/3f.html
作者不詳(1644 A.D.-1911 A.D.)。[清剔彩耕作圓瓣式盒]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/33/49/38.html
作者不詳(1644 A.D.-1911 A.D.)。[清留青竹雕臂擱]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/10/b4/73.html
作者不詳(1644 A.D.-1911 A.D.)。[清瑪瑙葵瓣口碗]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/1c/c6/06.html
作者不詳(1644 A.D.-1911 A.D.)。[清銀鍍金嵌珠鳳蝶牡丹鈿花]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/5f/a2/79.html
作者不詳(1644 A.D.-1911 A.D.)。[清銀鍍金纍絲點翠嵌珠寶花蝶簪]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/5f/a2/85.html
作者不詳(1644 A.D.-1911 A.D.)。[清銅鎏金葫蘆式執壺]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/42/b7/6d.html
作者不詳(1644 A.D.-1911 A.D.)。[清燒藍竹桃蘭芝花籃形銀片]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/59/d1/3c.html
作者不詳(1736 A.D.-1795 A.D.)。[清乾隆內填琺瑯番蓮紋瓶]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/0c/c5/bf.html
作者不詳(1736 A.D.-1795 A.D.)。[清乾隆青花荔枝桃實執壺]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/42/b4/00.html
作者不詳(1736 A.D.-1795 A.D.)。[清 乾隆(1736-1795) 剔彩山人水物四瓣式套盒]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/33/0e/34.html
作者不詳(1741 A.D.-)。[清乾隆六年磁胎畫琺瑯八哥膽瓶]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/42/b7/d0.html
作者不詳(1742 A.D.-)。[清乾隆窯琺瑯彩藍地開光花卉瓶]。《數位典藏與數位學習聯合目錄》。http://catalog.digitalarchives.tw/item/00/33/49/cf.html