Quality and consistency in text alignment
![Page 1: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/1.jpg)
Quality and Consistency in Text Alignment
James R. Covington
Miklal Software Solutions
![Page 2: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/2.jpg)
Text alignment: utility as a function of quality and consistency
![Page 3: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/3.jpg)
Text alignment: utility
Machine translation
Text comparison
Preaching
Education in biblical languages
Textual criticism
Translation technique
Lexicography
Biblical interpretation
James R. Covington | Miklal Software Solutions | [email protected]
![Page 4: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/4.jpg)
Text alignment: quality and consistency
Machine translation: lots of data
Text comparison: big picture
Preaching
Education in biblical languages
Textual criticism
Translation technique
Lexicography
Biblical interpretation
![Page 5: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/5.jpg)
Text alignment: quality and consistency
Machine translation: lots of data
Text comparison: big picture
Preaching: bad sermon
Education in biblical languages: bad exam
Textual criticism: bad research
Translation technique
Lexicography
Biblical interpretation
![Page 6: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/6.jpg)
Text alignment: quality and consistency
Part 1: writing consistency standards
Part 2: designing a software tool to promote consistency
Part 3: post-processing quality control
![Page 7: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/7.jpg)
Writing consistency standards: guidelines for evaluating quality and consistency
![Page 8: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/8.jpg)
Writing consistency standards
Step 1: Engineering (our focus today)
Step 2: Proofing
Step 3: Revising
![Page 9: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/9.jpg)
Engineering: principles
Principle 1: as small as possible
“Each set of tokens being linked should be as small as possible.”
Principle 2: as large as necessary
“Each set of tokens being linked should be as large as necessary.”
Principle 1 > Principle 2
![Page 10: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/10.jpg)
Engineering: principles
Principle 1: as small as possible
Gen 12:4
![Page 11: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/11.jpg)
Engineering: principles
Principle 2: as large as necessary
Ex 34:6
![Page 12: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/12.jpg)
Engineering: principles
Principle 2: as large as necessary
Greek, as lemma (form): ἐν (ἐν), γαστήρ (γαστρὶ), ἔχω (ἔχουσα)
English: to be with child
Matt 1:18
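The two principles can be pictured as constraints on a link between two token sets. The sketch below is only an illustration: the `Link` class and its field names are hypothetical, and it assumes the slide treats the three Greek tokens and the four English tokens as one linked group.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """A hypothetical alignment link between two sets of tokens."""
    source: tuple  # source-language tokens (here Greek)
    target: tuple  # target-language tokens (here English)

    def size(self):
        # Total number of tokens in the link; Principle 1 keeps this small.
        return len(self.source) + len(self.target)


# Matt 1:18: the idiom does not map word for word, so the whole phrase
# is linked as one group (Principle 2: as large as necessary).
idiom = Link(source=("ἐν", "γαστρὶ", "ἔχουσα"),
             target=("to", "be", "with", "child"))
```

A link like this is larger than a one-to-one link, which is why it needs the justification of Principle 2.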
![Page 13: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/13.jpg)
Engineering: case-specific rules
Step 1: Identify grammatical structures in source language.
Step 2: Identify grammatical structures in target language used to translate structures from Step 1.
Step 3: Write a rule for each pair of grammatical structures.
![Page 14: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/14.jpg)
Engineering: case-specific rules
Step 1: Identify grammatical structures in source language.
Function words: articles, universal quantifier, prepositions, conjunctions
Substantives: nouns, pronouns, adjectives
Verbs: auxiliaries, subjects, objects, finite, volitional, infinitives, participles
Punctuation: quotation mark, question mark
![Page 15: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/15.jpg)
Engineering: case-specific rules
Step 1: Identify grammatical structures in source language.
Function words: articles, universal quantifier, prepositions, conjunctions
Substantives: nouns, pronouns, adjectives
Pronouns: personal, reflexive, possessive, reciprocal, demonstrative, relative, interrogative, indefinite, correlative
![Page 16: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/16.jpg)
Engineering: case-specific rules
Step 1: Hebrew structure
ל + infinitive construct
Gen 2:15
![Page 17: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/17.jpg)
Engineering: case-specific rules
Step 2: English structures
Case 1: English to + infinitive
Case 2: English infinitive
Gen 2:15
![Page 18: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/18.jpg)
Engineering: case-specific rules
Step 3: Rules
Case 1: English to + infinitive
Rule 1: link separately
Case 2: English infinitive
Rule 2: group ל and the infinitive; the infinitive is primary
Gen 2:15
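The two rules for ל + infinitive construct can be written as a small decision function. This is a minimal sketch, not the tool from the talk: the function name, the `"INF"` placeholder for the Hebrew infinitive, and the `(hebrew_group, english_group, primary)` tuple layout are all assumptions made for illustration.

```python
def link_lamed_infinitive(english_tokens):
    """Return links for a Hebrew ל + infinitive construct.

    english_tokens: the English words that translate the construction,
    e.g. ["to", "work"] (Case 1) or ["work"] (Case 2); these example
    words are illustrative only.
    Each link is a tuple (hebrew_group, english_group, primary).
    """
    if len(english_tokens) >= 2 and english_tokens[0].lower() == "to":
        # Rule 1: English has "to" + infinitive, so link the pieces separately.
        return [
            (("ל",), ("to",), "ל"),
            (("INF",), tuple(english_tokens[1:]), "INF"),
        ]
    # Rule 2: bare English infinitive, so group ל with the infinitive;
    # the infinitive is primary.
    return [(("ל", "INF"), tuple(english_tokens), "INF")]
```

Writing each case as an explicit branch makes undefined cases visible: anything that reaches neither branch cleanly is a candidate for a new rule.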
![Page 19: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/19.jpg)
Engineering: case-specific rules
Step 1: Greek structure: circumstantial participle
ἔχων ὑπʼ ἐμαυτὸν στρατιώτας
Step 2: English structures: participle phrase, subordinate clause, main clause, prepositional phrase, preposition
Luke 7:8
![Page 20: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/20.jpg)
Engineering: case-specific rules
ἔχων ὑπʼ ἐμαυτὸν στρατιώτας
Step 2: English structures
participle phrase: having soldiers under myself
subordinate clause: since I have soldiers under myself
main clause: and I have soldiers under myself
prepositional phrase: in having soldiers under myself
preposition: with soldiers under me (ESV)
Luke 7:8
![Page 21: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/21.jpg)
Engineering: case-specific rules
Step 3: Rules
Case 1: participle phrase (Gal 2:1)
Greek: συμπαραλαβὼν καὶ Τίτον
English: taking Titus along with me
Case 2: subordinate clause (Gal 2:3)
Greek: Ἕλλην ὤν
English: though he was a Greek
![Page 22: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/22.jpg)
Proofing and Revising
Proofing: multiple readers
time
consult work of other alignments
Revising: begin alignment
note problem spots
note undefined cases
revise and expand cases/rules
![Page 24: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/24.jpg)
Designing a software tool: an environment to facilitate accuracy and consistency
![Page 25: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/25.jpg)
Designing a software tool: goals
clarity: understand alignment correctly; find errors easily
speed: make changes quickly; dig deeper quickly
comparison: find parallels to check for consistency
![Page 26: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/26.jpg)
Designing a software tool: demo
[demo tool]
![Page 27: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/27.jpg)
Post-processing: checking for accuracy and consistency
![Page 28: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/28.jpg)
Post-processing: philosophy
Find as many algorithmically-detectable mistakes as possible.
Recall > Precision
Precision (low is acceptable): some hits will be false
Recall (must be high): % of mistakes caught
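The trade-off uses the standard definitions of the two measures, which can be computed as follows. This is a minimal sketch; the function name and the link IDs are made up for illustration.

```python
def precision_recall(flagged, actual_mistakes):
    """Compute (precision, recall) for a checker.

    flagged: IDs of links the checker reported (its "hits").
    actual_mistakes: IDs of links that really are mistakes.
    """
    flagged = set(flagged)
    actual = set(actual_mistakes)
    true_hits = flagged & actual
    # Precision: share of hits that are real mistakes.
    precision = len(true_hits) / len(flagged) if flagged else 0.0
    # Recall: share of real mistakes that were caught.
    recall = len(true_hits) / len(actual) if actual else 1.0
    return precision, recall


# Hypothetical run: 10 links flagged, 4 of them real mistakes,
# 5 real mistakes in total (link 42 was missed).
p, r = precision_recall(range(1, 11), [1, 2, 3, 4, 42])
```

Here a reviewer reads ten hits to find four real mistakes, which is the price paid for missing only one.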
![Page 29: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/29.jpg)
Post-processing: techniques
1. Natural Language Processing: conformity to consistency rules
uncommon links
improbable links
consistent treatment of n-grams
2. Graph theory: consistent primary status
![Page 30: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/30.jpg)
Natural language processing: rules
Articles
Gen 1:27
Zech 1:10
2 Sam 15:6
![Page 31: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/31.jpg)
Natural language processing: rules
Verbs: Are auxiliaries grouped with main verbs?
Do main verbs receive primary status?
Of: Is “of” grouped with nomen regens (construct)?
Waw: Is waw grouped with conjunctions that follow it?
![Page 32: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/32.jpg)
Natural language processing: rules
Hebrew definite direct object marker (את)
always unlinked (unless interpreted as preposition)
Jer 10:1
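A rule like this is easy to check mechanically. The sketch below assumes each Hebrew token is a dict with hypothetical keys `lemma`, `pos`, and `links`; the actual data model of the checker demonstrated in the talk is not shown in the slides.

```python
def find_linked_object_markers(tokens):
    """Yield את tokens that are linked although the rule says they should not be.

    tokens: iterable of dicts with hypothetical keys
      "lemma" - Hebrew lemma
      "pos"   - part-of-speech tag
      "links" - list of linked English tokens (empty if unlinked)
    """
    for tok in tokens:
        if tok["lemma"] != "את":
            continue
        if tok["pos"] == "preposition":
            # Interpreted as a preposition: a link is allowed.
            continue
        if tok["links"]:
            # Direct object marker with a link: flag it for review.
            yield tok
```

Every token this yields is a hit for a human to review, in keeping with the recall-over-precision philosophy.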
![Page 33: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/33.jpg)
Natural language processing: rules
[demo Hebrew definite direct object checker]
![Page 34: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/34.jpg)
Natural language processing: context
Uncommon link checker (global context): common tokens, uncommon link
Improbable link checker (local context): a more probable link is available (“unstable marriage”)
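The uncommon link check can be approximated with simple counting over the whole corpus. This is a sketch under stated assumptions: links are reduced to `(source_lemma, target_lemma)` pairs, and the two thresholds are arbitrary values chosen for the example, not figures from the talk.

```python
from collections import Counter


def uncommon_links(links, min_token_count=5, max_link_count=1):
    """Flag links that are rare even though both of their tokens are common.

    links: list of (source_lemma, target_lemma) pairs, one per link.
    """
    link_counts = Counter(links)
    source_counts = Counter(src for src, _ in links)
    target_counts = Counter(tgt for _, tgt in links)
    return sorted(
        pair for pair, n in link_counts.items()
        if n <= max_link_count
        and source_counts[pair[0]] >= min_token_count
        and target_counts[pair[1]] >= min_token_count
    )


# Toy corpus: two well-attested links and one odd pairing of common tokens.
toy = [("שם", "name")] * 6 + [("אמר", "say")] * 6 + [("שם", "say")]
flagged = uncommon_links(toy)
```

A rare link between rare tokens is not flagged, because there is too little evidence to call it unusual.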
![Page 35: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/35.jpg)
Natural language processing: context
[demo uncommon and improbable link checker]
![Page 36: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/36.jpg)
N-grams: consistent alignment
4-gram (Hebrew)
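One way to check n-grams for consistent treatment is to group every occurrence of the same n-gram by how it was aligned and report the n-grams with more than one treatment. The sketch below assumes each occurrence arrives as a `(ref, ngram, alignment)` triple, where `alignment` is any hashable description of the links; that representation is hypothetical.

```python
from collections import defaultdict


def inconsistent_ngrams(occurrences):
    """Return n-grams that are aligned in more than one way.

    occurrences: list of (ref, ngram, alignment), where ngram is a tuple
    of source tokens and alignment is a hashable description of its links.
    Result maps each inconsistent n-gram to {alignment: [refs]}.
    """
    seen = defaultdict(lambda: defaultdict(list))
    for ref, ngram, alignment in occurrences:
        seen[ngram][alignment].append(ref)
    return {
        ngram: dict(by_alignment)
        for ngram, by_alignment in seen.items()
        if len(by_alignment) > 1
    }
```

Listing the references under each treatment lets a reviewer see which treatment is the outlier.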
![Page 37: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/37.jpg)
Graph theory: primary status of שם “name”
Example groups linked to שם “name”
name
a/the name
the name of
a name for
renown
was named
she named
he called … name
![Page 38: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/38.jpg)
Graph theory: primary status of שם “name”
Goal: simple directed graph (i.e. no loops)
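Checking the "no loops" goal amounts to cycle detection in a directed graph. The slides do not spell out how the edges are built, so the sketch below assumes the graph is given as a dict mapping each node to its successors, and it treats both self-loops and longer cycles as loops. It uses a standard depth-first search; the function name is hypothetical.

```python
def find_cycle(graph):
    """Return one cycle as a list of nodes, or None if the graph has none.

    graph: dict mapping each node to an iterable of successor nodes.
    """
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {}
    path = []

    def visit(node):
        color[node] = GREY
        path.append(node)
        for nxt in graph.get(node, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                # nxt is already on the current path: a loop is closed.
                return path[path.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

The returned node list points the reviewer at the specific links whose primary status conflicts.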
![Page 39: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/39.jpg)
Graph theory: שוב (qal) “return”
Some graphs get complicated.
![Page 41: Quality and consistency in text alignment](https://reader031.fdocuments.net/reader031/viewer/2022032717/55d19156bb61eb934e8b45d7/html5/thumbnails/41.jpg)
Conclusions
1. Text alignment is useful inasmuch as it is accurate and consistent.
2. Achieving quality and consistency requires multiple strategies:
a. writing consistency standards (before)
b. software design (during)
c. post-processing (after)