TAUS MT SHOWCASE, The WeMT Program, Olga Beregovaya, Welocalize, 10 October 2013




WeMT Tools and Processes

We’ll talk about:

• MT Programs
• Metrics
• Engines
• Language Tools

Current MT Programs

• Dell – 27 languages
• Autodesk – 11 languages
• PayPal – 8 languages
• Cisco – 17 languages between 3 tiers
• Intuit – 20+ languages
• Microsoft (pre-project support)
• McAfee (pilot)
• … many more in pilot stage

MT Program: Path-to-Success Components

• A set of MT engines – "mix and match"
• TMT selection mechanisms
• Post-editing environment
• Processes and metrics
• Data gathering and reporting tool – what, how much, how fast and at what effort
• EDUCATION EDUCATION EDUCATION
• CHANGE

The recipe for success

Process and Workflow

All aspects of the localization ecosystem are taken into consideration.

Selecting the right MT engine
By using our MT engine selection Scorecard we make sure all important KPIs are taken into consideration at selection time.

Empowerment through education
Internal, by the use of customized Toolkits; external, through specialised trainings.

MT KPIs:
ü Productivity: Throughputs
ü Productivity: Delta
ü Quality: LQA
ü Quality: Automatic Scores
ü Cost
ü GlobalSight: Connectivity
ü GlobalSight: Tagging
ü Human Evaluation
ü Customization: Internal/External
ü Customization: Time

The feedback loop

Constructive communication from post-editor to MT provider

o Source content classification (i.e. marketing/UI/UA/UGC)
o Length of the source segment
o Source segment morpho-syntactic complexity
o Presence/absence of pre-defined glossary terms or multi-word glossary elements, UI elements, numeric variables, product lists, 'do-not-translate' and transliteration lists
o Tag density – metadata attributes and their representation in localization industry standard formats ("tags")
o ROC – quality levels based on content use ("impact")
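The segment-level factors above can be collected programmatically. A minimal sketch, assuming a hypothetical feature schema and illustrative regexes (not Welocalize's actual implementation):

```python
import re

def segment_features(source, glossary, dnt_terms):
    """Collect per-segment metadata of the kind listed above (illustrative)."""
    tokens = source.split()
    tags = re.findall(r"<[^>]+>", source)        # inline markup ("tags")
    return {
        "length_tokens": len(tokens),
        "tag_density": len(tags) / max(len(tokens), 1),
        "glossary_hits": [t for t in glossary if t.lower() in source.lower()],
        "dnt_hits": [t for t in dnt_terms if t in source],
        "has_numeric_variable": bool(re.search(r"%\d|\{\d+\}", source)),
    }

feats = segment_features(
    'Click <b>Save</b> to store {0} items in PayPal.',
    glossary=["save", "items"],
    dnt_terms=["PayPal"],
)
print(feats["length_tokens"], feats["dnt_hits"])  # 8 ['PayPal']
```

Features like these give the MT provider a structured way to correlate post-editing effort with source properties.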

3D Model: Expected productivity mapped to desired quality levels and source content complexity

MT Program Design - Source

Productivity - Throughputs
  Number of post-edited words per hour
Productivity - Delta
  Percentage difference between translation and post-editing time
Cost
  Extrapolation, cost per word
CMS - Connectivity
  Is there a connector in place? Quality/nature of source
Quality (Final) - LQA
  Internal quality verification
Quality (MT) - Automatic Scores
  A set of automatic scoring systems is used
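The two productivity KPIs reduce to simple arithmetic. A sketch, assuming "Delta" means the time saved in post-editing relative to translating from scratch (the slide does not pin down the exact formula):

```python
def throughput(words_post_edited, hours):
    """Productivity - Throughputs: post-edited words per hour."""
    return words_post_edited / hours

def productivity_delta(translation_secs_per_word, postedit_secs_per_word):
    """Productivity - Delta: percentage difference between translation
    and post-editing time (positive = post-editing is faster)."""
    return 100.0 * (translation_secs_per_word - postedit_secs_per_word) / translation_secs_per_word

print(throughput(4500, 6))            # 750.0 words/hour
print(productivity_delta(10.0, 7.0))  # 30.0 -> post-editing 30% faster
```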

MT Engine Selection Scorecard

We have tested and used many different engines, so we have seen the good, the bad and the ugly; now we can better appreciate what we have.

Scorecard - Metrics

Overall data

German
KPIs                                 #1  #2  #3  #4
Productivity                          4   4   4   4
Productivity Increase                 5   4   1   3
Quality - LQA                         2   2   1   2
Quality - Automatic Scores            3   3   3   3
Cost                                  4   2   3   3
GlobalSight - Connectivity            4   3   2   4
GlobalSight - Tagging                 4   2   4   2
Human Evaluation                      3   3   3   4
Customization - Internal/External     4   2   3   3
Customization - Time                  3   1   2   1
Total                                36  26  26  29

French
KPIs                                 #1  #2  #3  #4
Productivity                          4   5   3   4
Productivity Increase                 5   5   1   4
Quality - LQA                         5   3   3   4
Quality - Automatic Scores            3   4   3   3
Cost                                  4   2   3   3
GlobalSight - Connectivity            4   3   2   4
GlobalSight - Tagging                 4   2   2   2
Human Evaluation                      3   3   3   3
Customization - Internal/External     4   2   3   3
Customization - Time                  3   1   2   1
Total                                39  30  25  31

Productivity metrics

Automatic Scoring

Human Evaluation
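The scorecard totals are just column sums of the per-KPI scores. A sketch using the German figures (the data layout here is hypothetical, not the Scorecard's actual format):

```python
# Per-KPI scores for engines #1-#4 (German table).
german_scores = {
    "Productivity":                      [4, 4, 4, 4],
    "Productivity Increase":             [5, 4, 1, 3],
    "Quality - LQA":                     [2, 2, 1, 2],
    "Quality - Automatic Scores":        [3, 3, 3, 3],
    "Cost":                              [4, 2, 3, 3],
    "GlobalSight - Connectivity":        [4, 3, 2, 4],
    "GlobalSight - Tagging":             [4, 2, 4, 2],
    "Human Evaluation":                  [3, 3, 3, 4],
    "Customization - Internal/External": [4, 2, 3, 3],
    "Customization - Time":              [3, 1, 2, 1],
}

def totals(scores):
    """Sum each engine's column to get the scorecard totals."""
    return [sum(col) for col in zip(*scores.values())]

print(totals(german_scores))  # [36, 26, 26, 29] -> engine #1 wins for German
```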

Toolkits and Trainings

Our experience:
ü Most translators know and have experienced post-editing, but they have limited knowledge of any other related aspect (automatic scoring, output differences between RBMT and SMT...)
ü The majority of people who work in localization have heard about MT, but most of them still find it a daunting subject.

Our answer:
ü Continuous MT- and PE-related training and documentation for language providers
ü Customized Toolkits for different internal departments (Production, Quality, Sales, Vendor Management)

Transparency and Ownership

Theory - knowledge foundations
Practice - customized PE sessions for different client accounts
Transparency - process, engine selection/customization, evaluations
Responsibility - valid evaluations, constructive feedback, quality ownership

Training helps a lot: "After I was told some of the background information and tips and tricks for certain engines/outputs, I was much more relaxed and happy to give MT a go."

Legacy data - the best prediction tool
> Statistics from the legacy knowledge base

The feedback loop

"Engine retraining significantly improved the handling of tags and spaces around tags; this is a productive achievement, as it saves us a lot of manual corrections."

"For me the biggest advantage would be the possibility to implement a client terminology list [in SMT]."

"I wish we could easily fix the corpus for outdated terminology and characters."

"Teach the engine to properly cope with sentences containing more than one verb and/or verbs in progressive form."

Feedback and Engine Improvement

“Beyond the Engine” Tools

• Teaminology - crowdsourcing platform for centralized term governance; simultaneous concordance search of TMs and term bases => clean training data

• Dispatcher - a global community content translation application that connects user-generated content (UGC), including live chats, social media, forums, comments and knowledge bases, to customized machine translation (MT) engines for real-time translation

• Source Candidate Scorer - scoring of candidate sentences against historically good and bad sentences, based on POS and perplexity

• Corpus Preparation Toolkit - set of applications to maximize data preparation for MT engine training

Teaminology


Dispatcher

Source Candidate Scorer


Compares your source content to "the good" and "the bad" legacy segments and estimates potential suitability for MT
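The slide names POS and perplexity as the actual signals; as a rough stand-in for the idea, a sketch that compares a candidate's character trigrams against "good" and "bad" legacy segments (positive score = closer to the good set):

```python
def char_ngrams(text, n=3):
    """Character n-grams with padding, lowercased."""
    text = f" {text.lower()} "
    return {text[i:i + n] for i in range(len(text) - n + 1)}

def mt_suitability(candidate, good_legacy, bad_legacy):
    """Crude stand-in for the Scorer: best Jaccard similarity of the
    candidate's trigrams to each legacy set, good minus bad (range -1..1)."""
    cand = char_ngrams(candidate)

    def best_sim(corpus):
        best = 0.0
        for seg in corpus:
            grams = char_ngrams(seg)
            best = max(best, len(cand & grams) / len(cand | grams))
        return best

    return best_sim(good_legacy) - best_sim(bad_legacy)
```

For example, a clean UI sentence scores positively against a good-legacy set of similar UI strings, while chat-style text drifts toward the bad set.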

Corpus Preparation Suite

Variety of tools to prepare a corpus for training MT engines, such as:
• Deleting formatting tags from TMX
• Removing double spaces
• Removing duplicated punctuation (e.g. commas)
• Deleting segments where source = target
• Deleting segments containing only URLs
• Escaping characters
• Removing duplicate sentences
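The cleanup steps above can be sketched as a filter pipeline over (source, target) segment pairs; the regexes are illustrative, not the toolkit's actual rules:

```python
import re

def clean_pairs(pairs):
    """Apply the cleanup steps above to (source, target) pairs (sketch)."""
    seen = set()
    for src, tgt in pairs:
        src = re.sub(r"<[^>]+>", "", src)          # delete formatting tags
        tgt = re.sub(r"<[^>]+>", "", tgt)
        src = re.sub(r" {2,}", " ", src).strip()   # remove double spaces
        tgt = re.sub(r" {2,}", " ", tgt).strip()
        src = re.sub(r"([,;!?]){2,}", r"\1", src)  # duplicated punctuation
        tgt = re.sub(r"([,;!?]){2,}", r"\1", tgt)
        if src == tgt:                             # drop source = target
            continue
        if re.fullmatch(r"https?://\S+", src):     # drop URL-only segments
            continue
        if src in seen:                            # drop duplicate sentences
            continue
        seen.add(src)
        yield src, tgt

pairs = [
    ("Hello,,  <b>world</b>", "Hallo,  Welt"),
    ("http://example.com", "http://example.com"),
    ("Hello, world", "Hallo, Welt"),   # duplicate once cleaned
]
print(list(clean_pairs(pairs)))  # [('Hello, world', 'Hallo, Welt')]
```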

Corpus Preparation: TM Creator


Aggregates training data from various relevant sources

Corpus Preparation: TMX Splitter

Extracts the relevant training corpus based on the TMX metadata
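A metadata-based TMX split can be sketched with the standard library; the `x-domain` property name in the example is hypothetical:

```python
import xml.etree.ElementTree as ET

def split_tmx(tmx_source, prop_type, prop_value):
    """Keep only translation units whose <prop> metadata matches (sketch)."""
    tree = ET.parse(tmx_source)
    body = tree.getroot().find("body")
    for tu in list(body.findall("tu")):
        props = {p.get("type"): (p.text or "") for p in tu.findall("prop")}
        if props.get(prop_type) != prop_value:
            body.remove(tu)
    return tree

# e.g. split_tmx("legacy.tmx", "x-domain", "UI") would keep only
# translation units tagged as UI content for a UI-focused engine.
```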

Welocalize Moses Implementation

• Why? Far more control over engine quality, since we can control corpus preparation and output post-processing
• Control over metadata handling
• Ties into our company's open-source philosophy
• Experienced personnel in-house
• Can extend and customize Moses functionality as necessary
• Connector to TMS (GlobalSight) in place

RESULTS: In our internal tests with Moses/DoMT, we are getting automated scores similar to commercial engines for the languages into which we localize most. The same feedback has been received from human evaluators.

… And it works!

We are in a position to offer realistic discounts and aggressive timelines while providing quality levels appropriate for the content.

“Work-in-progress” Projects

•  Ongoing improvements to our adaptation of iOmegaT tool (Welocalize/CNGL)

•  Industry Partner in CNGL “Source Content Profiler” project

•  Adoption of TMTPrime (CNGL) - MT vs. Fuzzy Match selection mechanism

•  Language and content-specific pre-processing for the in-house Moses deployment

•  Teaminology – adding linguistic intelligence

Questions

[email protected]  We  speak  MT  -­‐  the  language  of  the  future