E-Culture semantic search pilot
Uploaded by: guus-schreiber
Category: Technology
![Page 1: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/1.jpg)
MultimediaN
Pilot E-Culture
![Page 2: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/2.jpg)
2
Pilot E-Culture
Partners: VU, UvA, CWI, DEN, ICN
Subproject of MultimediaN, a 16 MEuro project on multimedia technology funded by the Dutch government
Aim: demonstrate added value of Semantic Web techniques for virtual heritage collections
![Page 3: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/3.jpg)
3
![Page 4: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/4.jpg)
4
Hypothesis
Semantic Web technology is particularly useful in knowledge-rich domains
or formulated differently
If we cannot show added value in knowledge-rich domains, then it may have no value at all
![Page 5: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/5.jpg)
5
Use case: painting style
Find paintings of a similar style
KLIMT, Gustav
Portrait of Adele Bloch-Bauer I
1907
Oil and gold on canvas
138 x 138 cm
Austrian Gallery, Vienna
![Page 6: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/6.jpg)
6
How can we find this other ‘Art nouveau’ painting?
MUNCH, Edvard
The Scream
1893
Oil, tempera and pastel on cardboard
91 x 73.5 cm
National Gallery, Oslo
![Page 7: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/7.jpg)
7
Issues w.r.t. the use case
Parse annotation to find matches with thesauri terms
– E.g. match artists to ULAN individuals
Artist-style links
– AAT contains styles; ULAN contains artists, but there is no link
• Learn link from corpora
• Derive it from other annotations
– Domain-specific rules/reasoning needed
• see example in SWRL doc
• Painters may have painted in multiple styles
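The "derive it from other annotations" option can be illustrated with a minimal Python sketch. It is not the pilot's implementation: the records, the placeholder work names and the function names are hypothetical, and the rule used (an artist inherits every style seen on any of their annotated works) is only one simple assumption. It does allow for painters with multiple styles.

```python
from collections import defaultdict

# Hypothetical toy annotations: (work, artist, style or None if unannotated).
ANNOTATIONS = [
    ("Portrait of Adele Bloch-Bauer I", "Klimt, Gustav", None),
    ("klimt-work-2", "Klimt, Gustav", "Art Nouveau"),
    ("The Scream", "Munch, Edvard", None),
    ("munch-work-2", "Munch, Edvard", "Art Nouveau"),
    ("munch-work-3", "Munch, Edvard", "Expressionism"),
    ("other-work", "Unknown artist", "Baroque"),
]


def artist_styles(annotations):
    """Derive artist -> set of styles from works that carry a style."""
    styles = defaultdict(set)
    for _work, artist, style in annotations:
        if style is not None:
            styles[artist].add(style)
    return styles


def similar_style_works(work, annotations):
    """Return other works that share at least one (explicit or derived) style."""
    styles = artist_styles(annotations)
    by_work = {w: (a, s) for w, a, s in annotations}
    artist, style = by_work[work]
    # Fall back to the artist's derived styles when the work has none.
    wanted = {style} if style else styles[artist]
    result = []
    for other, other_artist, other_style in annotations:
        if other == work:
            continue
        candidate = {other_style} if other_style else styles[other_artist]
        if wanted & candidate:
            result.append(other)
    return result


matches = similar_style_works("Portrait of Adele Bloch-Bauer I", ANNOTATIONS)
```

Because the derived link is a set of styles per artist, a work by a painter with several styles matches on any of them, which is exactly where domain-specific rules would be needed to narrow the result.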
![Page 8: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/8.jpg)
8
Natural-language processing: automatic annotation (text strings to concepts)
Distributed collections: cultuurwijzer.nl, OAI-based access
Reasoning support: time/space reasoning
Web interface: support for web collections
Presentation facilities: semantic presentation, device-specific
Interoperability: XML/RDF/OWL
Scalability: > 10,000,000 triples
Ontologies: WordNet, AAT, TGN, ULAN, Dutch labels
Search strategies: sibling search, semantic distance
Dublin Core: specializations, dumb-down
DIGITAL HERITAGE COLLECTIONS: semantic annotation, semantic search
Legend: BASELINE, ENHANCED FEATURES, NEW FEATURES
![Page 9: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/9.jpg)
9
Architecture
![Page 10: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/10.jpg)
10
Use of thesauri
RDF/OWL data models of Getty thesauri
– Issues: scope, preserving structure
WordNet: W3C SWBPD work
http://www.w3.org/TR/wordnet-rdf/
Multilingualism
– Dutch version of AAT
Existing collection metadata are parsed to find matches in thesauri (e.g. creator name => ULAN entry)
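A minimal Python sketch of the "creator name => ULAN entry" step follows. The slides do not describe the actual matching method, so this assumes plain normalized string matching (case, accents, "Last, First" order). The thesaurus entries and their `ulan:example-…` identifiers are placeholders, not real ULAN IDs.

```python
import unicodedata


def normalize(name):
    """Strip accents, lowercase, and reorder "Last, First" to "first last"."""
    text = unicodedata.normalize("NFKD", name)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    if "," in text:
        last, first = text.split(",", 1)
        text = f"{first} {last}"
    return " ".join(text.lower().split())


# Hypothetical thesaurus extract: entry id -> known labels.
THESAURUS = {
    "ulan:example-1": ["Klimt, Gustav", "Gustav Klimt"],
    "ulan:example-2": ["Munch, Edvard", "Edvard Munch"],
}


def build_index(thesaurus):
    """Map every normalized label to the set of entries that carry it."""
    index = {}
    for entry_id, labels in thesaurus.items():
        for label in labels:
            index.setdefault(normalize(label), set()).add(entry_id)
    return index


def match_creator(creator, index):
    """Return candidate entry ids for a creator string (possibly empty)."""
    return sorted(index.get(normalize(creator), set()))


INDEX = build_index(THESAURUS)
klimt = match_creator("KLIMT, Gustav", INDEX)
```

Returning a list rather than a single id keeps ambiguous names visible, so a later step can disambiguate them.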
![Page 11: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/11.jpg)
11
Distributed vs. centralized collection data
Minimal requirement: collection object has image URI
Preference for external metadata, accessed through protocol such as OAI
In practice, external metadata access is still cumbersome
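To make the OAI option concrete, the Python sketch below builds an OAI-PMH `ListRecords` request URL. The verb and the `metadataPrefix`, `set` and `resumptionToken` arguments come from the OAI-PMH protocol. The base URL and set name are hypothetical, and nothing here reflects how the pilot itself harvested metadata.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc",
                     set_spec=None, resumption_token=None):
    """Build an OAI-PMH ListRecords request URL (no network access)."""
    if resumption_token is not None:
        # In OAI-PMH, resumptionToken is an exclusive argument.
        params = {"verb": "ListRecords", "resumptionToken": resumption_token}
    else:
        params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
        if set_spec is not None:
            params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint and set name.
first_page = list_records_url("https://example.org/oai", set_spec="paintings")
next_page = list_records_url("https://example.org/oai", resumption_token="abc123")
```

A harvester would fetch `first_page`, parse the XML response, and keep requesting with the returned resumption token until none is given.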
![Page 12: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/12.jpg)
12
Search strategies
Basic search: keyword-oriented
Advanced search:
– Tweaking default search parameters
– Time-related queries
Faceted search
Relation search
– How are two URIs related?
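One way to read "How are two URIs related?" is as a shortest-path question over the RDF graph. The Python sketch below assumes that reading and uses breadth-first search over a toy triple set. The `ex:` URIs are hypothetical, and the `^` prefix marking an edge followed backwards is a notation chosen for this example.

```python
from collections import deque

# Hypothetical toy triples (subject, predicate, object).
TRIPLES = [
    ("ex:work1", "ex:creator", "ex:artistA"),
    ("ex:artistA", "ex:style", "ex:styleX"),
    ("ex:artistB", "ex:style", "ex:styleX"),
    ("ex:work2", "ex:creator", "ex:artistB"),
]


def find_relation(start, goal, triples):
    """Return the shortest chain of steps linking start to goal, or None."""
    neighbours = {}
    for s, p, o in triples:
        neighbours.setdefault(s, []).append((p, o))
        # "^" marks a triple traversed from object to subject.
        neighbours.setdefault(o, []).append((f"^{p}", s))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for pred, nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [(node, pred, nxt)]))
    return None


path = find_relation("ex:work1", "ex:work2", TRIPLES)
```

Here the two works turn out to be related through their creators, who share a style; the returned path spells out each step.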
![Page 13: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/13.jpg)
13
Keyword search with semantic clustering
1. B-tree of literals plus Porter stem and metaphone index
2. Find resources with matching labels
• Default resources are “Work”s
3. Find related resources by one-way graph traversal
• owl:inverseOf is used
• Threshold used for constraining search
4. Cluster results (group instances)
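The four steps above can be sketched in Python under several assumptions. A dictionary stands in for the B-tree, `crude_stem` is a toy suffix stripper rather than the Porter algorithm, and the metaphone index is omitted. The decay factor, the threshold value, the `ex:` graph and the choice to cluster by predicate path are illustrative, not taken from the pilot.

```python
from collections import defaultdict


def crude_stem(word):
    """Toy stand-in for a Porter stemmer: strip a few common suffixes."""
    word = word.lower()
    for suffix in ("ings", "ing", "es", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Hypothetical labels and edges; edges point from concept towards works.
LABELS = {"ex:klimt": "Gustav Klimt", "ex:artnouveau": "Art Nouveau",
          "ex:work1": "Portrait painting"}
EDGES = [
    ("ex:klimt", "ex:creatorOf", "ex:work1"),
    ("ex:klimt", "ex:creatorOf", "ex:work2"),
    ("ex:artnouveau", "ex:styleOf", "ex:klimt"),
    ("ex:artnouveau", "ex:styleOf", "ex:work3"),
]
WORKS = {"ex:work1", "ex:work2", "ex:work3"}


def build_index(labels):
    """Step 1: index every label token by its stem."""
    index = defaultdict(set)
    for resource, label in labels.items():
        for token in label.split():
            index[crude_stem(token)].add(resource)
    return index


def search(keyword, labels, edges, works, decay=0.5, threshold=0.2):
    """Steps 2-4: match labels, traverse one way to works, cluster by path."""
    index = build_index(labels)
    outgoing = defaultdict(list)
    for s, p, o in edges:
        outgoing[s].append((p, o))
    clusters = defaultdict(set)
    stack = [(hit, (), 1.0) for hit in index.get(crude_stem(keyword), ())]
    while stack:
        node, path, score = stack.pop()
        if node in works:
            clusters[path].add(node)
            continue
        for pred, nxt in outgoing[node]:
            # The score shrinks per hop; the threshold bounds the traversal.
            if score * decay >= threshold:
                stack.append((nxt, path + (pred,), score * decay))
    return {path: sorted(found) for path, found in clusters.items()}


results = search("nouveau", LABELS, EDGES, WORKS)
```

Grouping by the predicate path means works reached directly through a style end up in a different cluster from works reached through an artist of that style, so the relation behind each cluster can be shown to the user.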
![Page 14: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/14.jpg)
14
Demonstrator
![Page 15: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/15.jpg)
15
Search: WordNet patterns that increase recall without sacrificing precision
(Hollink)
![Page 16: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/16.jpg)
16
Triple statistics
![Page 17: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/17.jpg)
17
Status
4-year project, now in month 18
Short-term goals:
– Adding more ethnological collections
– Location-oriented presentation
– User studies with professional users (museum people) and interested lay persons
– Multi-lingual interface (English, Dutch, Indonesian)
![Page 18: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/18.jpg)
18
Issues
Getting access to collections is mainly a social process
– There is usually no principled objection to making data, metadata and thesauri publicly available, but it still feels threatening
Cultural heritage is a good area for a Semantic Web “island”:
– lots of domain-specific knowledge
– strong application pull
– enormous amount of existing annotations, which have been built up over centuries
![Page 19: E-Culture semantic search pilot](https://reader035.fdocuments.net/reader035/viewer/2022062419/5578f60ad8b42a675b8b4730/html5/thumbnails/19.jpg)
19
On-line demo
http://e-culture.multimedian.nl