KantanMT Analytics: The Missing Link in Machine Translation

description

www.kantanmt.com. Tony O’Dowd discusses the major developments in Machine Translation over the last few years, with a particular focus on measurement technologies. In the past, users of Machine Translation had considerable difficulty pre-evaluating the quality of their Machine Translation output. This led to industry confusion over both post-editing pricing and client Machine Translation project pricing. Speaking about the ‘confidence scoring’ technology co-developed with CNGL, Tony illustrates how LSPs and other users of Machine Translation can now accurately predict the quality of their Machine Translation output on a segment-by-segment basis.

Transcript of KantanMT Analytics: The Missing Link in Machine Translation

  • 1. No Hardware. No Software. No Hassle MT.

  • 2. KantanMT Analytics - The Missing Link
  • 3. What we aim to cover today? The MT & Quality Relationship: what is quality? Possible ways of measuring it; automated/manual methods. Who needs to measure quality: localisation stakeholders. The Missing Link, KantanMT Analytics: segment-level quality analysis; helping to build predictable business models. 45 mins presentation, 15 mins Q&A.
  • 4. What is KantanMT.com? A Statistical MT system: cloud-based, highly scalable, inexpensive to operate, quick to deploy. Our vision: to put Machine Translation customization, improvement, and deployment into your hands. Fully operational: 7 months. Active KantanMT engines: 6,632. Training words uploaded: 23,653,605,925. Member words translated: 362,291,925.
  • 5. The Quality & MT Relationship. Let's agree a model for defining quality! Quality Target (defined by client). No Quality (baseline). Taking into consideration the quality of MT outputs and the level of quality defined by your clients.
  • 6. Attributes of Quality. Attributes of Quality Model. Language attributes (Adequacy, Fluency): Adequacy is the meaning of generated texts as expressed in source/target; Fluency is comprehensibility & readability, with factors including grammar errors, word selection, and syntax. Task-oriented attributes (Productivity, Acceptability): Productivity is post-editing speed; Acceptability is a fit-for-purpose measurement, i.e. usable translations within the context of the end user/client. Diagram labels: Language, Task.
  • 7.
    Attributes of Quality. Same quality model as slide 6; the diagram labels now read Language, Translation Style and Task, Business Model.
  • 8. Attributes of Quality. Attributes of Quality Model. Language attributes: Fluency, Adequacy. Task-oriented attributes: Productivity, Acceptability. What we want? Fuzzy Match. Diagram labels: Language, Translation Style; Task, Business Model.
  • 9. Measuring MT Quality.
    Automated: fast, repeatable, objective, scalable, cheap; based on samples; can't be used by PMs for scope/cost predictions.
    Manual: slow, cumbersome, subjective, not scalable, expensive; based on samples; can't be used by PMs for scope/cost predictions.
  • 10. Measuring MT by hand! Sample translations are scored against a template.
    Error categories: Style, Wrong terminology, Wrong spelling, Source not translated/omissions, Capitalization, Syntax & grammar, Compliance with client specs, Wrong word form, Literal translation, Wrong part of speech, Text/information added, Punctuation, Tags and markup, Sentence structure, Locale adaptation, Spacing.
    Scores: Adequacy Score, Fluency Score, Overall Quality Score.
  • 11.
    Manual Framework: Adequacy Score (Range 1-5)
      5  Full Meaning: All meaning expressed in the source segment appears in the translated segment.
      4  Most Meaning: Most of the source segment meaning is expressed in the translated segment.
      3  Much Meaning: Much of the source segment meaning is expressed in the translated segment.
      2  Little Meaning: Little of the source segment is expressed in the translated segment.
      1  No Meaning: None of the meaning expressed in the source segment is expressed in the translated segment.
  • 12. Manual Framework: Fluency Score (Range 1-5)
      5  Native language fluency: No grammar errors, excellent word selection and good syntax. No post-editing required.
      4  Near native fluency: Few terminology/grammar errors. No impact on overall understanding of the meaning. Little post-editing required.
      3  Not very fluent: About half of the translation contains errors and requires post-editing.
      2  Little fluency: Wrong word choice, poor grammar and syntax. A lot of post-editing required.
      1  No fluency: Absolutely ungrammatical and doesn't make any sense. Re-translate from scratch.
  • 13. Manual Framework. Scoring template columns: Source; MT Target; Spacing; Syntax and Grammar; Locale Adaptation; Tags and Markup; Sentence Structure; Punctuation; Wrong Part of Speech; Style; Wrong Word Form; Capitalization; Text/Information added; Literal translation; Compliance with client specs; Source not Translated/Omissions; Wrong Spelling; Wrong terminology; Overall quality (1-4); Fluency (Score 1-5); Adequacy (Score 1-5).
  • 14. Manual Framework. Attributes of Quality Model. Language attributes: Fluency, Adequacy. Task-oriented attributes: Productivity, Acceptability. The diagram marks where Manual Methods sit in the model. Diagram labels: Language, Translation Style; Task, Business Model.
  • 15. Automated Methods. Many different methods available: BLEU, F-Measure, GTM, TER, NIST, METEOR, etc.
    Common characteristics: compute similarity of generated texts to reference texts; the smaller the difference, the better the quality! Broad adoption: industry & academia.
  • 16. Automated Methods: F-Measure. Recall & Precision metric, comparing a Reference Translation with an MT Output.
      Recall = correct / Ref-Len = 80%
      Precision = correct / MT-Len = 66%
      F-Measure = (Precision * Recall) / ((Precision + Recall) / 2) = 73%
    Flaw: no penalty for reordering.
  • 17. Automated Methods: WER (Word Error Rate). Minimum number of edits to transform the MT Output into the Reference Translation.
      WER = (substitutions + insertions + deletions) / reference-length
    Levenshtein distance measure. General indicator of post-editing effort.
  • 18. Automated Methods: BLEU Score. Put simply, measures how many words overlap between the Reference Translation and the MT Output, giving higher scores to sequential words. High correlation between BLEU and human judgement of translation quality.
  • 19. Automated Methods. KantanWatch can be used to track and monitor automated scores. (Shown: KantanWatch Reports.)
  • 20. Automated Methods. Improvements can be monitored during the build-measure-learn cycle of a KantanMT deployment. (Shown: KantanWatch Reports.)
  • 21. Automated Methods. Time-graphs offer a good overview of the maturing of a KantanMT engine. (Shown: KantanWatch Reports.)
  • 22. Automated Methods. Can also present a holistic view of the potential quality of KantanMT outputs. (Shown: KantanWatch Reports.)
  • 23. Automated Methods. Attributes of Quality Model. Language attributes: Fluency, Adequacy. Task-oriented attributes: Productivity, Acceptability. Metrics placed on the model: NIST, GTM, F-Measure, TER, BLEU, METEOR. Diagram labels: Language, Translation Style; Task, Business Model. Major flaw: all measurements are based on reference translations.
  • 24. Who uses these measurements?
    The Localisation Stakeholder Dilemma. Developers of MT engines: automated metrics (BLEU, METEOR, F-Measure, TER) are ideal and practical. No individual measurement has absolute meaning, but each points the quality curve in the right direction within a domain.
  • 25. Who needs to measure Quality? The Localisation Stakeholder Dilemma.
    Production teams (PMs, LEs and QEs): need segment measurements on quality and PE effort; determine a tiered segment post-edit rate; distribute post-editing tasks based on segment quality.
    Localisation managers: need productivity measurements to predict budget and schedule (aka Project Segment Reports); MT measurements need to fit business planning and charge models.
    Translators: unfortunately, don't get a fair deal; no segment information, just top-level project inferences based on samples.
  • 26. The Quality & MT Relationship. Diagram labels: Manual Methods, TER, BLEU, GTM, METEOR, F-Measure, NIST; MT Developers; Production.
  • 27. Conclusions. There are many automated MT quality measurements: mostly suitable for MT developers; not optimal for production teams; of no use to translators; all rely on reference texts to compute measurements. What's needed? Segment-level measurements that drive project schedule and charge model, have high correlation to human effort, and do not rely on reference texts to compute measurements.
  • 28. Attributes of Quality. Attributes of Quality Model. Language attributes: Fluency, Adequacy. Task-oriented attributes: Productivity, Acceptability. What you want: KantanMT Analytics. Diagram labels: Language, Translation Style; Task, Business Model.
  • 29. Introducing KantanMT Analytics. Segment-level scoring for MT output. Designed to make it possible to create predictable business models: project schedule, cost models. Co-developed by KantanMT.com and CNGL (Centre for Next Generation Localisation).
  • 30.
    KantanMT Analytics. Select the Analyse feature.
  • 31. KantanMT Analytics. Select the Analyse feature.
  • 32. KantanMT Analytics. KantanMT Analytics Report created; XML-based for consumption by TMS/GMS platforms.
  • 33. KantanMT Analytics. XLIFF document created; contains scores for each segment.
  • 34. The Missing Link. Attributes of Quality Model. Language attributes: Fluency, Adequacy. Task-oriented attributes: Productivity, Acceptability. KantanMT Analytics is placed on the model. Diagram labels: Language, Translation Style; Task, Business Model.
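
The F-Measure arithmetic on slide 16 can be checked with a short sketch. The slide gives only the percentages (recall 80%, precision 66%, F-Measure 73%); the example sentences below are invented so that 4 of 5 reference words and 4 of 6 MT words match, which reproduces those figures. Counting matches as an unordered word overlap is also an assumption, chosen because it shows the flaw the slide names: a scrambled output scores the same as a well-ordered one.

```python
from collections import Counter


def overlap_scores(reference: str, mt_output: str):
    """Return (precision, recall, f_measure) from unordered word overlap."""
    ref_tokens = reference.lower().split()
    mt_tokens = mt_output.lower().split()
    # "correct" = words shared by both texts, ignoring order,
    # with repeated words counted at most as often as they appear in both.
    correct = sum((Counter(ref_tokens) & Counter(mt_tokens)).values())
    precision = correct / len(mt_tokens) if mt_tokens else 0.0
    recall = correct / len(ref_tokens) if ref_tokens else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Same form as the slide: Precision * Recall / ((Precision + Recall) / 2)
    f_measure = (precision * recall) / ((precision + recall) / 2)
    return precision, recall, f_measure


# Invented example sentences (not from the deck).
reference = "the quick brown fox jumps"
mt_ordered = "the quick brown fox leaps over"
mt_scrambled = "fox brown over the leaps quick"

for mt in (mt_ordered, mt_scrambled):
    p, r, f = overlap_scores(reference, mt)
    print(f"{mt!r}: precision={p:.3f} recall={r:.3f} f={f:.3f}")
```

Both outputs get identical scores (precision 4/6, recall 4/5, F-Measure 8/11, about 73%), even though the second is unreadable. The slide's 66% for precision appears to be 4/6 rounded down.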
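
The WER definition on slide 17 can be sketched as a word-level Levenshtein distance divided by the reference length. The function names and example sentences here are illustrative and not taken from the deck; tokenising on whitespace is a simplifying assumption.

```python
def word_edit_distance(reference_tokens, hypothesis_tokens):
    """Minimum number of word substitutions, insertions and deletions."""
    # previous[j] = distance between the reference words seen so far
    # and the first j hypothesis words.
    previous = list(range(len(hypothesis_tokens) + 1))
    for i, ref_word in enumerate(reference_tokens, start=1):
        current = [i]
        for j, hyp_word in enumerate(hypothesis_tokens, start=1):
            substitute = previous[j - 1] + (ref_word != hyp_word)
            skip_ref_word = previous[j] + 1      # word missing from the output
            skip_hyp_word = current[j - 1] + 1   # extra word in the output
            current.append(min(substitute, skip_ref_word, skip_hyp_word))
        previous = current
    return previous[-1]


def wer(reference: str, mt_output: str) -> float:
    """Word Error Rate: edit distance divided by reference length."""
    ref_tokens = reference.split()
    if not ref_tokens:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(ref_tokens, mt_output.split()) / len(ref_tokens)


# Invented example: one substitution (fox -> cat) and one extra word (over).
print(wer("the quick brown fox jumps", "the quick brown cat jumps over"))
```

Here two edits against a five-word reference give a WER of 0.4. Because the divisor is the reference length, WER can exceed 1.0 when the output is much longer than the reference.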
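
Slides 25 and 33 together suggest a workflow: read a per-segment score from the XLIFF file, then assign each segment to a post-editing tier. The deck does not show how KantanMT Analytics stores its scores, so the sketch below makes two loud assumptions: that the score sits in the standard XLIFF 1.2 `match-quality` attribute of an `alt-trans` element, and that tiers are cut at 85 and 60. The tier names, thresholds, and sample document are all hypothetical.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = {"x": "urn:oasis:names:tc:xliff:document:1.2"}

# Hypothetical sample; a real KantanMT Analytics file may be laid out differently.
SAMPLE_XLIFF = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="demo.txt" source-language="en" target-language="de" datatype="plaintext">
    <body>
      <trans-unit id="1">
        <source>Save the file.</source>
        <alt-trans match-quality="92"><target>Speichern Sie die Datei.</target></alt-trans>
      </trans-unit>
      <trans-unit id="2">
        <source>Click Next to continue.</source>
        <alt-trans match-quality="71"><target>Klicken Sie auf Weiter, um fortzufahren.</target></alt-trans>
      </trans-unit>
      <trans-unit id="3">
        <source>The printer is out of paper.</source>
        <alt-trans match-quality="38"><target>Drucker ist aus Papier.</target></alt-trans>
      </trans-unit>
    </body>
  </file>
</xliff>"""


def tier_for(score: float) -> str:
    """Map a segment score to a post-edit tier (hypothetical thresholds)."""
    if score >= 85:
        return "light"
    if score >= 60:
        return "medium"
    return "full"


def segment_tiers(xliff_text: str):
    """Return (segment id, score, tier) for every scored trans-unit."""
    root = ET.fromstring(xliff_text)
    results = []
    for unit in root.iterfind(".//x:trans-unit", XLIFF_NS):
        alt = unit.find("x:alt-trans", XLIFF_NS)
        if alt is None or alt.get("match-quality") is None:
            continue  # unscored segment: leave it out of the tiering
        score = float(alt.get("match-quality").rstrip("%"))
        results.append((unit.get("id"), score, tier_for(score)))
    return results


for segment_id, score, tier in segment_tiers(SAMPLE_XLIFF):
    print(segment_id, score, tier)
```

A production team could attach a different per-word rate to each tier and sum over segments to estimate project cost, which is the kind of predictable business model slide 29 describes.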