Machine Translation Quality Estimation - A Linguist's Approach
juan-rowda
MACHINE TRANSLATION QUALITY ESTIMATION
A Linguist’s Approach
WHAT IS MT QUALITY ESTIMATION?
Automatically providing a quality indicator for machine translation output without depending on human reference translations.
Our objective: Estimate quality and post-editing effort for eBay listing titles and descriptions
MT QUALITY ESTIMATION – A LINGUIST’S APPROACH
ONE big CHALLENGE
$$\min_{W} \sum_{t=1}^{T} \left\| W^{(t)} X^{(t)} - Y^{(t)} \right\|_2^2 + \lambda_s \|S\|_1 + \lambda_b \|B\|_{1,\infty} \quad \text{subject to: } W = S + B$$
or
“State-of-the-art QE explores different supervised linear or non-linear learning methods for regression or classification such as Support Vector Machines (SVM), different types of Decision Trees, Neural Networks, Elastic-Net, Gaussian Processes, Naive Bayes, among others”
(Machine Translation Quality Estimation Across Domains, de Souza et al., 2014)
A LINGUIST’S APPROACH
Using linguistic features from 3 dimensions:
COMPLEXITY, ADEQUACY, FLUENCY
FEATURES
Complexity:
• Length
• Polysemy
Adequacy:
• QA: terminology, patterns, blacklist, numbers
• Automated Post-Editing
• (POS)
• (NER)
Fluency:
• Misspellings
• Grammar errors
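A few of the complexity and adequacy checks listed above can be sketched in plain Python. This is only an illustration under stated assumptions: the function name `segment_features`, the `max_words` threshold, and the contents of `BLACKLIST` are hypothetical and do not come from the deck, which does not describe how each feature is computed.

```python
import re

# Hypothetical blacklist; a real one would come from the project's own QA rules.
BLACKLIST = {"lorem", "xxx"}

# Matches integers and simple decimals such as "16", "4.5" or "4,5".
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)?")


def segment_features(source: str, target: str, max_words: int = 20) -> dict:
    """Return simple complexity and adequacy indicators for one segment."""
    source_words = source.split()
    target_words = target.lower().split()

    # Adequacy: numbers in the source should reappear in the target.
    # Locale-specific number formats will cause false positives here.
    source_numbers = sorted(NUMBER_RE.findall(source))
    target_numbers = sorted(NUMBER_RE.findall(target))

    return {
        # Complexity: long source segments tend to be harder to translate.
        "too_long": len(source_words) > max_words,
        "number_mismatch": source_numbers != target_numbers,
        # Adequacy: count target words that appear on the blacklist.
        "blacklisted": sum(
            1 for word in target_words if word.strip(".,;:!?") in BLACKLIST
        ),
    }
```

Calling `segment_features("Set of 4 wheels", "Juego de 3 ruedas")` flags a number mismatch, which is the kind of issue a post-editor would need to correct.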
IMPLEMENTATION
CheckMate + LanguageTool
Reusable Profile
Detailed Report
Score
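The deck does not say how the score is derived from the report. One plausible way to turn issue counts into a single number is sketched below. The function `quality_score`, the 0 to 100 scale, and the issues-per-word normalization are all assumptions made for illustration, not the formula the tools actually use. Every feature carries the same weight here.

```python
def quality_score(issue_counts: dict, word_count: int) -> float:
    """Map issue counts to a 0-100 score; every feature weighs the same.

    Hypothetical formula: 100 * (1 - issues per word), floored at 0.
    """
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    total_issues = sum(issue_counts.values())
    issues_per_word = total_issues / word_count
    return round(max(0.0, 100.0 * (1.0 - issues_per_word)), 1)
```

For a 300-word sample with 3 misspellings, 2 grammar errors, and 1 terminology issue, this gives 100 * (1 - 6/300) = 98.0. Normalizing by word count keeps scores comparable between short samples and large files.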
TESTING
• One Language (es-LA)
• Short samples (~300 words)
• Bigger samples (~1000 words)
• Post-Edited files (~50,000 words)
• pt-BR, ru-RU, zh-CN
RESULTS
MEASURING RESULTS
SAMPLES - SCORE AND TIME ALIGN
FILES - SCORE AND ED ALIGN
Average edit distance (ED) (es-LA, descriptions) = 72
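Edit distance between raw MT output and its post-edited version is a common proxy for post-editing effort. The sketch below computes character-level Levenshtein distance plus a length-normalized variant. The deck does not state whether its ED figure is character-based or word-based, raw or normalized, so both choices here are assumptions, as is the helper name `normalized_edit_distance`.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: minimum insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_edit_distance(mt: str, post_edited: str) -> float:
    """Edit distance as a percentage of the longer string's length."""
    longest = max(len(mt), len(post_edited))
    if longest == 0:
        return 0.0
    return 100.0 * edit_distance(mt, post_edited) / longest
```

A file whose post-edited text is identical to the MT output yields 0, and higher values mean the post-editor changed more of the text.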
MT QE OVER TIME
SAMPLES - OTHER LANGUAGES
CHALLENGES
• False positives
• Matching score and post-editing effort
• Same weight for all features
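One way to check how well scores match post-editing effort is to correlate the two across samples. The sketch below computes a Pearson correlation coefficient in plain Python. The function name `pearson` and the idea of using correlation as the alignment measure are assumptions for illustration; the deck does not say how alignment was assessed.

```python
from math import sqrt


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation between two equally long numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return covariance / sqrt(var_x * var_y)
```

If higher scores mean better quality, a strongly negative coefficient between score and post-editing time (or edit distance) would suggest the score tracks effort. Values near zero would point to the false positives or uniform weighting noted above.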
WHAT’S NEXT
• Tracking scores over time
• Adding scores to our post-editing tool
• Adding new languages
• Researching new features
HOW CAN YOU USE THIS?
• Tailor the model to your needs
• Estimate quality at the file/segment level
• Target post-editing, discard bad content
• Estimate post-editing effort/time
• Compare MT systems
• Monitor MT system progress