TAUS MT SHOWCASE, Microsoft Translator, Chris Wendt, Microsoft, 10 October 2013

TAUS MACHINE TRANSLATION SHOWCASE Microsoft Translator 09:30 – 09:50 Thursday, 10 October 2013 Chris Wendt Microsoft


Microsoft Translator Chris Wendt [email protected]

TAUS MT Showcase October 10, 2013 - Santa Clara, California

Why MT? The purpose

The Crude
§ Extent of localization
§ Data Mining & Business Intelligence
§ Globalized NLP
§ Triage for human translation

Research
§ Machine Learning
§ Statistical Linguistics
§ Same-language translation

The Good
§ Breaking down language barriers
§ Text, Speech, Images & Video
§ Language Preservation

NOT:
§ Spend less money
§ Take the job of human translators
§ Perform miracles

Microsoft Translator – Quick Facts

§ Linguistically informed statistical MT system
§ 41 languages – from any language to any other language
§ Runs in Microsoft Datacenter
§ Simple web service API: SOAP, REST, AJAX, OData, web site widget
§ 2 million characters/month free
§ Available in the Enterprise Agreement, as a monthly subscription
§ For extreme confidentiality situations available on-premise
§ Highly customizable:
  – Collaborative Translations – Involve community, coworkers and customers
  – Hub: Custom engine training via an easy-to-use UI
§ Web Scale
  – Powers translations in Bing, Microsoft Office, Microsoft SharePoint, Internet Explorer, Yammer
  – Powers translations in Facebook, Twitter, eBay, and many other government and enterprise sites


Microsoft Translator at a Glance

World-class Statistical Machine Translation: Built on over a decade of work at Microsoft Research

Big Data Powered: Trained with billions of “parallel” sentences (Bing index & licensed)

General Purpose System: Powers Bing Translator, supports 40+ languages, any-to-any

Unprecedented Customization Capability: Hub train before translation + CTF edit after translation

Powerful Cloud API: Rich, secure API enabling integrations, 99.9% availability

Fully integrated across the stack, Translator extends the value of the Microsoft platform and of the solutions you build on it for your customers. This includes consumer-facing applications such as Bing Translator, Bing Toolbar, Bing Dictionary, and the Windows Phone app.

+80,000 more.

A few of our customers and partners….

Enabling Translation in Many Products

Powerful Tools and Customization

Our machine learning and big-data based translation technology brings the power of instant translations to break down language barriers for users, developers, webmasters, translators and businesses. Robust, industry-leading tools such as the Hub and CTF allow for unprecedented customization of the translation experience.

Powerful API: Instant translation and language services in web, desktop and mobile applications. Highly scalable and robust cloud-based machine translation service from Microsoft. Supports SOAP, REST, AJAX, OData, and the Translator web site translation widget. Extensibility for development on SharePoint, Office, Windows Phone, and more.

Widget: Instant translations of web pages without the need to write any code. Use the AJAX API to roll your own widget. Use the integrated “Collaborative Translations” (CTF) functionality to tap into your community.

Hub: Custom translation portal to build, train, and deploy customized automatic language translation systems. Combine your data with Bing big data to tune the translation output to best fit your content. Free with any level of Translator subscription (including the free tier).

CTF: Override, modify or vote for the translated output to best fit the content. Provide the end user with alternative translations. Import the edits back into Hub for further training.

Integrates with your TM tool


Top translation tools support Microsoft Translator

Give these a try! (Demo)

Bing Translator

Lync Conversation Translator

Translator Widget for Webpages

Word Web App

Contextual Thesaurus

Price Competitively priced

§ Monthly subscription
§ Free for up to 2 million characters per month
§ Base price: $10 per million characters
§ Discounted for higher volumes
§ Paid by credit card or via Microsoft Enterprise Agreement


Extent of localization Methods of applying MT


Post-Editing
§ Goal: Human translation quality
§ Increase human translator’s productivity
§ In practice: 0% to 25% productivity increase
  – Varies by content, style and language

Raw publishing
§ Goals:
  – Good enough for the purpose
  – Speed
  – Cost
§ Publish the output of the MT system directly to end user
§ Best with bilingual UI
§ Good results with technical audiences


Post-Publish Post-Editing “P3”
§ Know what you are human translating, and why
§ Make use of community
  – Domain experts
  – Enthusiasts
  – Employees
  – Professional translators
§ Best of both worlds
  – Fast
  – Better than raw
  – Always current

The Triangle You can have only two. Not anymore!


[Figure: triangle of Price, Speed and Quality, with P3 (Post-Publishing Post-Edit)]

The cost/quality curve Optimize for the knee

[Figure: user satisfaction versus cost ($), for content ranging from low-pageview supporting content to highly visible marketing content]
§ No cost: No translation
§ Low cost: MT + TM + Community
§ High cost: Fully qualified HT
§ Very high cost: Expert-reviewed translation/transcreation
§ Good enough for the intended purpose

Collaboration: MT + Your community

[Diagram labels: Your community, Your Web Site, Your App, Microsoft Translator API, Microsoft Translator (Collaborative TM, Enormous language knowledge)]

Translation Request / Response
§ Match first
§ Translate if no match

What makes this possible – fully integrated 100% matching TM

Collaborative TM entries:
§ Rating 1 to 4: Unapproved
§ Rating 5 to 10: Approved
§ Rating -10 to -1: Rejected
1 to many is possible
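
The "match first, translate if no match" flow and the rating bands above can be sketched roughly as follows. This is a minimal illustration, not the Translator service's implementation: `TMEntry`, `status`, `translate` and the `machine_translate` callback are hypothetical names, and the choices to let only approved entries override MT and to prefer the highest-rated one are assumptions the slides do not state.

```python
from dataclasses import dataclass


@dataclass
class TMEntry:
    """One collaborative TM entry (hypothetical structure)."""
    translation: str
    rating: int


def status(rating: int) -> str:
    """Map a rating to the bands listed on the slide."""
    if 5 <= rating <= 10:
        return "approved"
    if 1 <= rating <= 4:
        return "unapproved"
    if -10 <= rating <= -1:
        return "rejected"
    raise ValueError(f"rating {rating} is outside the bands on the slide")


def translate(source, tm, machine_translate):
    """Match first; translate if no match.

    tm maps a source sentence to a list of entries, since one source
    sentence may have many stored translations (1 to many).
    """
    # Assumption: only approved entries are served in place of MT output.
    approved = [e for e in tm.get(source, []) if status(e.rating) == "approved"]
    if approved:
        # Assumption: the highest-rated approved entry wins.
        return max(approved, key=lambda e: e.rating).translation
    return machine_translate(source)
```

A caller would pass its own MT function, for example `translate("Hello", tm, my_mt_call)`, where `my_mt_call` stands in for whatever API client is in use.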

Making it easier for the approver – Pending edits highlight

Making it easier for the approver – Managing authorized users

Making it easier for the approver – Bulk approvals

What is Important? In this order

§ Quality
§ Access
§ Coverage

Measuring Quality: Human Evaluations Knowledge powered by people

§ Absolute
§ 3 to 5 independent human evaluators are asked to rate translation quality for 250 sentences on a scale of 1 to 4
  – Comparing to a human-translated sentence
  – No source language knowledge required


4  Ideal: Grammatically correct, all information included
3  Acceptable: Not perfect, but definitely comprehensible, and with accurate transfer of all important information
2  Possibly Acceptable: May be interpretable given context/time, some information transferred accurately
1  Unacceptable: Absolutely not comprehensible and/or little or no information transferred accurately

Also: Relative evals, against a competitor, or a previous version of ourselves
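
One way to turn such ratings into a single absolute score is sketched below. The slides do not say how the individual ratings are aggregated, so averaging per sentence and then over the test set is an assumption, and `sentence_score` and `system_score` are hypothetical names.

```python
from statistics import mean

# Rating scale as given on the slide.
SCALE = {
    4: "Ideal",
    3: "Acceptable",
    2: "Possibly Acceptable",
    1: "Unacceptable",
}


def sentence_score(ratings):
    """Average the ratings that independent evaluators gave one sentence."""
    if not 3 <= len(ratings) <= 5:
        raise ValueError("expected 3 to 5 evaluators per sentence")
    if any(r not in SCALE for r in ratings):
        raise ValueError("ratings must be 1, 2, 3 or 4")
    return mean(ratings)


def system_score(ratings_per_sentence):
    """Average the per-sentence scores over the whole test set.

    Assumption: a plain mean of means; the slides do not specify this.
    """
    return mean(sentence_score(r) for r in ratings_per_sentence)
```

For the evaluation described above, `ratings_per_sentence` would hold 250 lists, one per test sentence.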

Measuring Quality: BLEU* Cheap and effective – but be aware of the limits

§ A fully automated MT evaluation metric
  – Modified N-gram precision, comparing a test sentence to reference sentences
§ Standard in the MT community
  – Immediate, simple to administer
  – Correlates with human judgments
§ Automatic and cheap: runs daily and for every change
§ Not suitable for cross-engine or cross-language evaluations


* BLEU: BiLingual Evaluation Understudy

Results are always relative to the test set.
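
The modified n-gram precision at the core of BLEU can be sketched as below. This covers only that one component: a full BLEU score also combines several n-gram orders and applies a brevity penalty, which this sketch leaves out. The function names are illustrative and are not taken from any Microsoft tool.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return all n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Modified n-gram precision of one candidate against its references.

    Each candidate n-gram count is clipped to the largest count of that
    n-gram in any single reference, so repeating a common word does not
    inflate the score.
    """
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())
```

Because the score depends entirely on the chosen references, it can only be compared across runs that use the same test set.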

Measuring Quality In Context Real-world data

§ Instrumentation to observe user’s behavior
§ A/B testing
§ Polling


In-Context gives you the most useful results


Knowledge Base (since 2003)


Knowledge base feedback

Source: Martine Smets, Microsoft Customer Support


Knowledge Base Resolve Rate

[Chart: resolve rate, Human Translation versus Machine Translation]

Microsoft is using a customized version of Microsoft Translator

Statistical MT - The Simple View

[Diagram: statistical MT pipeline]
§ Data sources: web mined data, government data, Microsoft manuals, dictionaries, phrasebooks, publisher data
§ Collect and store parallel and target language data
§ Train statistical models (High-Performance Computing Cluster)
§ Translation Engine
§ User input: text, web pages, chat, etc.
§ Translation APIs and UX
§ Distributed Runtime
§ Translated Output



Remember the collaborative TM? There is more.

Collaboration: You, your community, and Microsoft

[Diagram labels: Your community, Your Web Site, Your App, Your TMs, Your previously translated documents, Your collaborators, Microsoft Translator Hub, Your custom MT system, Microsoft Translator API, Microsoft Translator (Collaborative TM, Enormous language knowledge)]

You, your community and Microsoft working together to create the optimal MT system for your terminology and style


Just visit http://hub.microsofttranslator.com to do it yourself

Office 2013 Beta Send-a-smile program

§ 107 languages
§ 234M words translated
§ $22B revenue, > 60% outside U.S.
§ > 100,000 Send-a-smiles received
§ > 500 bugs fixed

Example of Business Intelligence use

Contacts

Web site: www.microsoft.com/translator
Licensing & Pricing Questions: [email protected]
General & Customer Questions: [email protected]
