Integrating Machine Translation with Translation Memory: A Practical Approach
Outline: Introduction, Methodology, Discussion
Panagiotis Kanavos and Dimitrios Kartsaklis
November 4, 2010
Panagiotis Kanavos and Dimitrios Kartsaklis Integrating MT with TM: A Practical Approach 1/ 18
Introduction
- Despite the ongoing research and the progress in the field, Machine Translation has not been widely accepted by the professional translation industry
- Common criticisms:
  - MT is only suitable for draft translations of e-mails and web pages
  - MT is not efficient for morphologically rich languages
  - MT is useful only to large companies owning a wealth of resources
- In a nutshell: MT is something for researchers to play around with
A Case Study
- How MT can be incorporated into professional translation workflows, with limited resources, in ways that significantly increase productivity.
- We combine both statistical and rule-based MT systems with Translation Memory software using two approaches:
  - The on-demand, sentence-by-sentence application of MT
  - The one-time application of MT to the whole translation project
- The case study is conducted in production conditions, with final deliverables that require the highest translation quality.
Methodology outline: Configuration; Segment-by-segment workflows; One-time MT application workflow
Our setting
- Language pair: English to Greek
- Text to be translated: two Informatics books, one technical guide and one academic textbook
- TM size: 140,000 TUs coming from in-domain texts
- Terminology DB size: 30,000 entries
- Fuzzy threshold: 70%
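To illustrate what a 70% fuzzy threshold means in practice, here is a minimal Python sketch of a TM lookup. It is only an illustration: commercial CAT tools use their own scoring formulas, and the token-level similarity from `difflib`, the function names `fuzzy_score` and `best_match`, and the inclusive comparison with the threshold are all assumptions made for this example.

```python
from difflib import SequenceMatcher
from typing import List, Optional, Tuple


def fuzzy_score(a: str, b: str) -> float:
    """Token-level similarity in percent (an assumed stand-in for a CAT tool's score)."""
    return 100.0 * SequenceMatcher(None, a.lower().split(), b.lower().split()).ratio()


def best_match(
    source: str,
    tm: List[Tuple[str, str]],      # (source segment, target segment) pairs
    threshold: float = 70.0,        # fuzzy threshold from the configuration slide
) -> Optional[Tuple[str, float]]:
    """Return (target, score) of the best TM entry at or above the threshold, else None."""
    best = None
    for tm_source, tm_target in tm:
        score = fuzzy_score(source, tm_source)
        # Assumption: a score exactly equal to the threshold still counts as a match.
        if score >= threshold and (best is None or score > best[1]):
            best = (tm_target, score)
    return best
```

Entries scoring below the threshold are simply not offered to the translator, which is what makes those segments candidates for MT in the workflows that follow.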
Software programs and combinations
- MT systems:
  - Statistical: Moses
  - Rule-based: Systran
- CAT programs:
  - Swordfish II (Java application) over Linux
  - Deja Vu X over MS Windows
  - Wordfast, an MS Word macro template
- Three combinations, based on practical factors:
  - Sentence-by-sentence workflow with Swordfish/Moses
  - Sentence-by-sentence workflow with Wordfast/Systran
  - One-time MT application workflow with Deja Vu X/Moses
Swordfish/Moses combination
- Swordfish: Allows connection to external programs or scripts
- Connection with Moses achieved with a custom Python script
- Basic workflow:
    if TM match > 80% then
        accept fuzzy match for post-edit
    else if 70% < TM match <= 80% then
        evaluate the fuzzy match
        if quality not acceptable then
            apply MT
        end if
    else
        apply MT
        if quality not acceptable then
            type the translation from scratch
        end if
    end if
    post-edit
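The workflow above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' Swordfish script: the function name `swordfish_moses_step`, the `translate_mt` callable standing in for the call to Moses, and the `is_acceptable` callable standing in for the translator's own judgment are all hypothetical.

```python
from typing import Callable, Optional, Tuple


def swordfish_moses_step(
    source: str,
    tm_match: Optional[Tuple[str, float]],   # (TM target text, score in percent) or None
    translate_mt: Callable[[str], str],      # hypothetical wrapper around the MT engine
    is_acceptable: Callable[[str], bool],    # stands in for the translator's evaluation
) -> Tuple[str, str]:
    """Return (draft, origin) for one segment; every draft still goes to post-editing."""
    score = tm_match[1] if tm_match else 0.0

    if tm_match and score > 80:
        # High fuzzy match: accept it directly for post-editing.
        return tm_match[0], "tm"

    if tm_match and 70 < score <= 80:
        # Borderline fuzzy match: keep it only if the translator finds it usable.
        if is_acceptable(tm_match[0]):
            return tm_match[0], "tm"
        return translate_mt(source), "mt"

    # No usable TM match: try MT, and fall back to manual translation.
    draft = translate_mt(source)
    if is_acceptable(draft):
        return draft, "mt"
    return "", "scratch"
```

Returning the origin alongside the draft is a design choice of this sketch; it makes it easy to count how many segments came from TM, from MT, or were typed from scratch.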
Panagiotis Kanavos and Dimitrios Kartsaklis Integrating MT with TM: A Practical Approach 6/ 18
IntroductionMethodology
Discussion
ConfigurationSegment-by-segment workflowsOne-time MT application workflow
Swordfish/Moses combination: Results
[Results figure not preserved; legend: Book 1 = instructive guide, Book 2 = textbook]
Wordfast/Systran combination
- Wordfast: A macro template working on top of MS Word
- Great deal of customization through MS Word macros
- Rule-based version of Systran, supporting user dictionaries
- Basic workflow:
    if TM match < 70% then
        apply pre-editing macros
        send segment to MT engine
        apply post-editing macros
        while MT result not good do
            amend Systran user dictionary and re-send segment to MT
        end while
    else
        accept the translation for post-edit
    end if
    post-edit
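The loop above, in which the user dictionary is amended until the MT output is good enough, can be sketched as follows. All names here (`wordfast_systran_step`, `pre_edit`, `post_edit_macro`, `amend_dictionary`) are hypothetical stand-ins for the Word macros and the Systran dictionary editing described on the slide. The `max_rounds` cap is an addition of this sketch (the slide's loop has no explicit bound), and re-applying the post-editing macros after each re-send is likewise an assumption.

```python
from typing import Callable, Optional, Tuple


def wordfast_systran_step(
    source: str,
    tm_match: Optional[Tuple[str, float]],        # (TM target text, score in percent) or None
    pre_edit: Callable[[str], str],               # stands in for the pre-editing macros
    translate_mt: Callable[[str], str],           # stands in for the rule-based MT engine
    post_edit_macro: Callable[[str], str],        # stands in for the post-editing macros
    is_good: Callable[[str], bool],               # the translator's judgment
    amend_dictionary: Callable[[str, str], None], # edits the MT user dictionary in place
    max_rounds: int = 5,                          # assumption: bound the retry loop
) -> str:
    """Return the draft that goes to post-editing for one segment."""
    if tm_match is not None and tm_match[1] >= 70:
        # A TM match at or above the fuzzy threshold is accepted as it is.
        return tm_match[0]

    prepared = pre_edit(source)
    draft = post_edit_macro(translate_mt(prepared))
    rounds = 0
    while not is_good(draft) and rounds < max_rounds:
        # Improve the dictionary, then re-send the same segment to MT.
        amend_dictionary(source, draft)
        draft = post_edit_macro(translate_mt(prepared))
        rounds += 1
    return draft
```

Every dictionary amendment persists beyond the current segment, so the cost of the loop is an investment that later segments benefit from.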
Wordfast/Systran combination: Results
[Results figure not preserved; legend: Book 1 = instructive guide, Book 2 = textbook]
Deja Vu X/Moses combination
- Deja Vu X: similar concept to Swordfish
- However: no way of integrating with an MT system, so the only option is pre-translation of the whole project with Moses
- Send for MT only segments with no TM matches or TM matches below 80%
- Pre-translation stage: [figure not preserved]
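As a rough illustration of the selection rule stated above (this is not a reconstruction of the slide's pre-translation figure), the segments to send to Moses could be filtered like this. The function name `select_for_mt` and the input shape are assumptions; a score of exactly 80% is kept out of the MT batch because the slide says "below 80%".

```python
from typing import List, Optional, Tuple


def select_for_mt(
    segments: List[Tuple[str, Optional[float]]],  # (source text, best TM score or None)
    threshold: float = 80.0,
) -> List[str]:
    """Return the source segments that should be pre-translated with MT."""
    return [
        source
        for source, score in segments
        # No TM match at all, or a match strictly below the threshold.
        if score is None or score < threshold
    ]
```

The selected segments would then be exported, machine-translated in one batch, and imported back into the project before the translator starts working.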
Deja Vu X/Moses combination
- Basic workflow:
    if TM match > 80% then
        accept the translation for post-edit
    else
        evaluate MT translation
        if quality not acceptable then
            if any TM match exists (between 70-80%) then
                accept the translation for post-edit
            else
                apply “auto-assemble” feature
                if quality not acceptable then
                    type the translation from scratch
                end if
            end if
        end if
    end if
    post-edit
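A minimal Python sketch of this decision chain follows. It assumes the MT draft already exists from the pre-translation stage; the names `dejavu_moses_step`, `auto_assemble` and `is_acceptable` are hypothetical, with the last one standing in for the translator's evaluation.

```python
from typing import Callable, Optional, Tuple


def dejavu_moses_step(
    source: str,
    tm_match: Optional[Tuple[str, float]],  # (TM target text, score in percent) or None
    mt_draft: str,                          # produced earlier, during pre-translation
    auto_assemble: Callable[[str], str],    # stands in for the CAT tool's auto-assemble
    is_acceptable: Callable[[str], bool],   # the translator's judgment
) -> Tuple[str, str]:
    """Return (draft, origin) for one segment; every draft still goes to post-editing."""
    if tm_match and tm_match[1] > 80:
        return tm_match[0], "tm"

    # Below 80%: the pre-translated MT output is evaluated first.
    if is_acceptable(mt_draft):
        return mt_draft, "mt"

    # MT rejected: fall back to a lower fuzzy match (70-80%) if one exists.
    if tm_match and 70 <= tm_match[1] <= 80:
        return tm_match[0], "tm"

    # Nothing usable so far: try auto-assemble, then manual translation.
    assembled = auto_assemble(source)
    if is_acceptable(assembled):
        return assembled, "auto-assemble"
    return "", "scratch"
```

Compared with the segment-by-segment workflows, the MT engine is never called here; the function only chooses among drafts that already exist.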
Deja Vu X/Moses combination: Results
[Results figure not preserved; legend: Book 1 = instructive guide, Book 2 = textbook]
Productivity increase
- MT & TM combination: Productivity increased to a level not possible by applying either technology in isolation: [figure not preserved]
Important factors
- Quantity and quality of TM entries
- The domain of the translation material used to train the statistical MT system
- The above impose serious limitations for those who work with small texts in many different domains. Rule-based systems are more suitable in such cases
- Language pair: Coding efficient user dictionaries with morphologically rich languages is difficult and requires some trial and error. Phrase-based systems like Moses have better performance
- Style of text: Productivity is higher with repetitive text and step-by-step instructions
- User expertise with all technologies involved
A proposal for a unified application
- For general acceptance by the professional translation community, MT should be integrated with TM into an intuitive unified system
- Basically a TM environment, with the MT engine as an extra component working on top of it
- MT suggestions should be presented in a controlled and selective way
- Basic components:
  - A 2-column translation grid for source and target segments
  - Terminology management
  - MT engine
  - Alignment tool
  - Quality assurance control
Advanced issues
- Automation of the training process with TM databases
- Statistical systems require considerable computing resources. A solution: MT as Software as a Service (SaaS)
- Terminology databases can be used for more than reference purposes
  - Additional entry fields for coding MT dictionary entries (Systran)
  - Linguistic information can be used for creating factored models (Moses)
- Automatic suggestions-as-you-type (TransType, Caitra)
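One possible first step towards automating the training process mentioned in the first point is to turn a TM export into the sentence-aligned parallel files that statistical MT training expects (one segment per line, same line number in both files). The sketch below assumes the TM is exported as TMX and that the language codes are `en` and `el`; the function names and file handling are illustrative, and tokenization and corpus cleaning are left out.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_to_parallel(tmx_text: str, src_lang: str = "en", tgt_lang: str = "el") -> List[Tuple[str, str]]:
    """Extract (source, target) segment pairs from a TMX document given as a string."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.iter("tuv"):
            # Older TMX versions use a plain "lang" attribute instead of xml:lang.
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # Keep only the primary language subtag, e.g. "en-US" -> "en".
                segs[lang.split("-")[0]] = "".join(seg.itertext()).strip()
        # Skip units that lack either language or contain an empty segment.
        if segs.get(src_lang) and segs.get(tgt_lang):
            pairs.append((segs[src_lang], segs[tgt_lang]))
    return pairs


def write_parallel(pairs: List[Tuple[str, str]], src_path: str, tgt_path: str) -> None:
    """Write the pairs as two line-aligned UTF-8 files."""
    with open(src_path, "w", encoding="utf-8") as src, open(tgt_path, "w", encoding="utf-8") as tgt:
        for source, target in pairs:
            # One segment per line: flatten any internal line breaks.
            src.write(" ".join(source.split()) + "\n")
            tgt.write(" ".join(target.split()) + "\n")
```

Running such an export on a schedule would let newly approved TM units flow into the next training run without manual work.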
Summary
- The combination of MT with TM results in a significant productivity increase that is not feasible in a TM-only environment
- Currently there is no straightforward way of doing that
- Work is in progress by the authors towards this purpose, in the form of a Software Specification document that will describe the design and the components of such a system in every detail
Thank you!
Any questions?