Shebanq roma-2013-10-01
![Page 1: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/1.jpg)
Data Archiving and Networked Services
SHEBANQ
Dirk Roorda - researcher @ DANS, TLA
System for HEBrew Text: ANnotations for Queries and Markup
TEI pre-conference workshop: Query
Roma – 2013-10-01
![Page 2: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/2.jpg)
Overview
1. Context: text, data, research in Hebrew Bible
2. MdF database model, MQL query language
3. Sharing the research process
4. CLARIN-NL project: SHEBANQ
5. Towards new tools
![Page 3: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/3.jpg)
1 (of 5) Context
Text, data and research in the Hebrew Bible
![Page 4: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/4.jpg)
VU Amsterdam
Eep Talstra Centre for Bible and Computer
text + linguistic features => database
database + research questions => publications
![Page 5: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/5.jpg)
2 (of 5) MdF and MQL
• MdF database model
• MQL query language
![Page 6: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/6.jpg)
Monad Object Feature
1977-now: Eep Talstra et al. ECA, WIVU. Print reference (Google Books)
1988-1994: Crist-Jan Doedens. Text Databases: One Database Model and Several Retrieval Languages (Google Books reference)
2004: Ulrik Petersen. Emdros - a text database engine for analyzed or annotated text. COLING
![Page 7: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/7.jpg)
Monad-Object-Feature (diagram)
standard edition text: בראשית ברא אלהים את השמים ואת הארץ׃
monads (atomic chunks of text): 1 2 3 4 5 6 7 8 9 10 11
object types: word, subphrase, phrase_atom, phrase, clause_atom, sentence
word object features (example):
  lexeme_utf8 = ראשית
  old_lexeme_utf8 = ראשית
  vocalized_lexeme_utf8 = ראשית
  surface_consonants_utf8 = ראשית
  graphical_lexeme_utf8 = ראשי
clause_atom object features (example):
  clause_atom_number = 1
  clause_atom_relation = 0
  clause_atom_relation_daughter_tense = unknown
  clause_atom_relation_kind = No_relation
  clause_atom_relation_mother_tense = unknown
  clause_atom_relation_preposition_class = none
  clause_atom_type = xQtl
  indentation = 0
object ids shown: 84383, 59559, 34680, 77637, 77638, 40770
monad ranges shown (right-to-left layout): 7 .. 5, 11 .. 9, 11 .. 5, 11 .. 5, 11 .. 1, 11 .. 1
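The Monad-Object-Feature idea can be sketched in a few lines of Python. This is a minimal illustration only, not the Emdros implementation: the class name `TextObject`, its `contains` method, and the choice of monad 2 for the example word are all assumptions made for this sketch. Only the feature names and values come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class TextObject:
    """A typed object: the monads it occupies plus its feature values."""
    object_type: str
    monads: set
    features: dict = field(default_factory=dict)

    def contains(self, other: "TextObject") -> bool:
        """True if `other` lies entirely within this object's monads."""
        return other.monads <= self.monads


# Monads are plain integers numbering the atomic chunks of the text.
# Placing this word at monad 2 is an assumption made for illustration.
word = TextObject("word", {2}, {"lexeme_utf8": "ראשית"})
clause_atom = TextObject(
    "clause_atom",
    set(range(1, 12)),  # monads 1..11
    {"clause_atom_number": 1, "clause_atom_type": "xQtl", "indentation": 0},
)

print(clause_atom.contains(word))
```

The point of the sketch is that objects of every type share one coordinate system (the monads), so relations between a word and a clause_atom reduce to set operations on integers.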
![Page 8: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/8.jpg)
MQL query language
topographic, i.e. the structure of a query expression mirrors the structure of its query results with respect to:
• sequence
• embedding
![Page 9: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/9.jpg)
Example
    SELECT ALL OBJECTS
    WHERE
    [Clause
      [Phrase
        [Word FOCUS
          part_of_speech = verb AND
          lexeme = "FJM["
        ]
      ]
      ..
      [Phrase FOCUS
        phrase_function = Objc OR
        phrase_function = IrpO
      ]
      ..
      [Phrase FOCUS
        phrase_function = Objc OR
        phrase_function = IrpO
      ]
    ]
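What "topographic" means for the example query can be sketched with plain monad sets: the nesting of brackets corresponds to embedding, and the order of the blocks corresponds to sequence, with `..` allowing material in between. The sketch below is not how an MQL engine evaluates queries; the helper names `embedded_in` and `precedes` and the monad ranges are hypothetical, chosen only to illustrate the two relations.

```python
def embedded_in(inner, outer):
    """Embedding: every monad of `inner` also belongs to `outer`."""
    return set(inner) <= set(outer)


def precedes(a, b):
    """Sequence: all of `a` comes before all of `b` in the text."""
    return max(a) < min(b)


# Hypothetical monad ranges for one clause and three phrases inside it.
clause = range(1, 12)
verb_phrase = range(3, 4)
object_1 = range(5, 8)
object_2 = range(9, 12)

# The query shape: a clause embedding three phrases in this order.
matches = (
    all(embedded_in(p, clause) for p in (verb_phrase, object_1, object_2))
    and precedes(verb_phrase, object_1)
    and precedes(object_1, object_2)
)
print(matches)
```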
![Page 10: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/10.jpg)
3 (of 5) Sharing
Problem: how to share (intermediate) results of analysis
Solution: saving queries as annotations
![Page 11: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/11.jpg)
Lock-in
scholarly-bibles.com
Stuttgart Electronic Study Bible
⇒ massive dissemination
But
⇒ not the right dynamics for tool development
![Page 12: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/12.jpg)
Leiden: international workshop biblical scholarship
Desiderata:
new tool development
text transmission (variants)
linguistic analysis (features)
even combined!
a short history: 2012
leiden lorentz
![Page 13: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/13.jpg)
Hebrew Text in the Archive
urn:nbn:nl:ui:13-ikjj-ek
![Page 14: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/14.jpg)
Hebrew Text in the Archive
urn:nbn:nl:ui:13-ikjj-ek
how can people annotate our work?
![Page 15: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/15.jpg)
Research Data Cycle
![Page 16: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/16.jpg)
Research Data Cycle
Text transmission, tradition, editorial processes
Free University, theology faculty, server department, WIVU project
NWO projects
NWO projects
religious communities
theol. scholars
theol. scholars
enlightened lay people
scholarly-bibles.com
![Page 17: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/17.jpg)
Research Data Cycle
Text transmission, tradition, editorial processes
Free University, theology faculty, server department, WIVU project
NWO projects
NWO projects
religious communities
theol. scholars
theol. scholars
CLARIN SHEBANQ
linguists
Wider public: Annotation, Query Saving, via Linked Data
dig. hum
comp. hum
enlightened lay people
scholarly-bibles.com
Research Data Archiving
DANS
![Page 18: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/18.jpg)
3 (of 5) Sharing (cont'd)
Solution: Queries As Annotations
![Page 19: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/19.jpg)
queries-as-annotations

| model | query | example |
| --- | --- | --- |
| body | query instruction | SELECT ALL OBJECTS WHERE [Word FOCUS part_of_speech = verb AND lexeme = "שים"] |
| targets | query results in context | ו ישכם יעקב ב בקר ו יקח את ה אבן אשר שם מראשתיו ו ישם אתה מצבה ו יצק שמן על ראשה |
| annotation | published query | qu123 (just an identifier) |
| metadata | researcher, date created, date last run, research question | Janet Dyk; 2004-02-16; 2012-01-27; Can the verb שים have a double object? - article in Foundations for Syriac Lexicography |
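As a rough illustration of the queries-as-annotations mapping, a saved query can be held as one record with a body, targets and metadata. The sketch below assumes a hypothetical `QueryAnnotation` class and a JSON serialization; the field names are invented for this example and are not the Open Annotation vocabulary or the demonstrator's actual storage format. The values are those of the example on the slide, and the target is a placeholder rather than a real reference.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class QueryAnnotation:
    identifier: str          # the published query, e.g. "qu123"
    body: str                # the query instruction (MQL text)
    targets: list            # the query results in context
    researcher: str          # metadata from here on
    date_created: str
    date_last_run: str
    research_question: str


qa = QueryAnnotation(
    identifier="qu123",
    body='SELECT ALL OBJECTS WHERE '
         '[Word FOCUS part_of_speech = verb AND lexeme = "שים"]',
    targets=["<placeholder for a query result in context>"],
    researcher="Janet Dyk",
    date_created="2004-02-16",
    date_last_run="2012-01-27",
    research_question="Can the verb שים have a double object?",
)

# ensure_ascii=False keeps the Hebrew readable in the serialized record.
record = json.dumps(asdict(qa), ensure_ascii=False, indent=2)
print(record)
```

Keeping "date created" and "date last run" as separate fields matters because the targets depend on the state of the database at the time the query was run.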
![Page 20: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/20.jpg)
OpenAnnotation
openannotation.org
![Page 21: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/21.jpg)
provenance
![Page 22: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/22.jpg)
motivation
![Page 23: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/23.jpg)
demonstrator
datanetworkservice.nl/qaa
![Page 24: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/24.jpg)
demonstrator
datanetworkservice.nl/qaa
![Page 25: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/25.jpg)
demonstrator
datanetworkservice.nl/qaa
![Page 26: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/26.jpg)
demonstrator
datanetworkservice.nl/qaa
![Page 27: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/27.jpg)
demonstrator
![Page 28: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/28.jpg)
demonstrator
![Page 29: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/29.jpg)
demonstrator
![Page 30: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/30.jpg)
demonstrator
still missing:
• saving queries
• semantic-web enablement
• sustainability
![Page 31: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/31.jpg)
4 (of 5) Project
CLARIN-NL: SHEBANQ:
(A) Curation
(B) Demonstrator
![Page 32: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/32.jpg)
SHEBANQ
System for HEBrew Text: ANnotations for Queries
CLARIN-NL project
data curation: LAF
demonstrator: query saver
#!/etc bc
s/g$/q/
![Page 33: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/33.jpg)
Linguistic Annotation Framework
ISO 24612:2012
Nancy Ide, Laurent Romary
![Page 34: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/34.jpg)
![Page 35: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/35.jpg)
![Page 36: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/36.jpg)
![Page 37: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/37.jpg)
![Page 38: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/38.jpg)
feature definitions
![Page 39: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/39.jpg)
feature definitions
![Page 40: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/40.jpg)
TEI ISO-FS schema
![Page 41: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/41.jpg)
dcr:datcat on <fDecl> versus <f>
26,225,966 <f>s
2.5 GB redundant attribute material!
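A quick back-of-the-envelope check puts the two figures in relation. The sketch assumes that "GB" means 10^9 bytes and that the redundant material is spread evenly over all `<f>` elements; both are assumptions, not statements from the slide.

```python
F_ELEMENTS = 26_225_966          # number of <f> elements, from the slide
REDUNDANT_BYTES = 2.5e9          # 2.5 GB, assuming decimal gigabytes

# Average redundant attribute material carried by each <f> element.
bytes_per_f = REDUNDANT_BYTES / F_ELEMENTS
print(f"{bytes_per_f:.1f} bytes per <f>")
```

Under these assumptions the overhead comes to roughly 95 bytes per `<f>`, which is the order of magnitude of a single attribute holding a full data category URI. Declaring the attribute once per feature on `<fDecl>` avoids repeating it on every instance.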
![Page 42: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/42.jpg)
4 (of 5) Project
CLARIN-NL: SHEBANQ: (B) Demonstrator
![Page 43: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/43.jpg)
Query
select all objects where
[clause [phrase phrase_function = Objc [word FOCUS tense = infinitive_absolute] ]]
Buttons: Execute, Edit Query
Status: Executing query ... / Query executed
Query results: 313 results, paged (Prev / Next)
Columns: Passage, Text, Controls (view in context)
Passages shown: Gen 1:1, 2Chron 3:4, 1Sam 12:4, Ex 23:2
Sample text shown:
בראשית ברא אלהים את השמים ואת הארץ׃
ויאמר חזקיהו מה אות כי אעלה בית יהוה׃
Save this query
Name: valency ברא
Researcher: Oliver Glanz
Date created: 2013-08-25
Date last run: 2013-08-25
Project: Data and Tradition
Institute: VU/Eep Talstra Centre for Bible and Computer
Reason: irregular valency of ברא
Comments: needs to be combined with query on אלהים
Buttons: Save, Publish, Cancel
![Page 44: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/44.jpg)
Saved Query Results: 313 results, paged (Prev / Next)
Columns: Passage, Text, Controls (view in context)
Passages shown: Gen 1:1, 2Chron 3:4, 1Sam 12:4, Ex 23:2
Sample text shown:
בראשית ברא אלהים את השמים ואת הארץ׃
ויאמר חזקיהו מה אות כי אעלה בית יהוה׃
Information on this query
Name: valency ברא
Researcher: Oliver Glanz
Date created: 2013-08-25
Date last run: 2013-08-25
Project: Data and Tradition
Institute: VU/Eep Talstra Centre for Bible and Computer
Reason: irregular valency of ברא
Comments: needs to be combined with query on אלהים
Query Info
MQL query text:
select all objects where
[clause [phrase phrase_function = Objc [word FOCUS tense = infinitive_absolute] ]]
Persistent Identifier: urn:nbn:nl:ui:13-scpm-ji
http://www.persistent-identifier.nl/?identifier=urn...
![Page 45: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/45.jpg)
datanetworkservice.nl/qaa
![Page 46: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/46.jpg)
SHEBANQ: implementing Q-a-A
![Page 47: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/47.jpg)
5 (of 5) Towards new tools
• LAF tools
  • or generic graph algorithms
• Emdros tools
  • or generic database technology
• Linked Data tools
  • or generic SPARQL queries
![Page 48: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/48.jpg)
Side conditions
• development close to the researchers
  • preferably in their own institutions
• decent performance
  • within the scale of a laptop
• usable to researchers
  • that is: non-programmers
• persistence in mind
  • new results will be archived and re-enter the data cycle
![Page 49: Shebanq roma-2013-10-01](https://reader033.fdocuments.net/reader033/viewer/2022060119/558c91fcd8b42af2428b4758/html5/thumbnails/49.jpg)
thank you
slideshare.net/dirkroorda/
s/g$/q/
#!/etc bc Eep Talstra Centre for Bible and Computer