Conversational Speech Translation - Challenges and Techniques, by Chris Wendt, Microsoft
TAUS MT SHOWCASE, Microsoft Translator, Chris Wendt, Microsoft, 10 October 2013
TAUS MACHINE TRANSLATION SHOWCASE
Microsoft Translator 09:30 – 09:50 Thursday, 10 October 2013 Chris Wendt Microsoft
Microsoft Translator Chris Wendt [email protected]
TAUS MT Showcase October 10, 2013 - Santa Clara, California
Why MT? The purpose
The Crude
§ Extent of localization
§ Data Mining & Business Intelligence
§ Globalized NLP
§ Triage for human translation
Research
§ Machine Learning
§ Statistical Linguistics
§ Same-language translation
The Good
§ Breaking down language barriers
§ Text, Speech, Images & Video
§ Language Preservation
NOT:
§ Spend less money
§ Take the job of human translators
§ Perform miracles
Microsoft Translator – Quick Facts
§ Linguistically informed statistical MT system
§ 41 languages – from any language to any other language
§ Runs in Microsoft Datacenter
§ Simple web service API: SOAP, REST, AJAX, OData, web site widget
§ 2 million characters/month free
§ Available in the Enterprise Agreement, as a monthly subscription
§ For extreme confidentiality situations available on-premise
§ Highly customizable:
– Collaborative Translations – Involve community, coworkers and customers
– Hub: Custom engine training via an easy-to-use UI
§ Web Scale
– Powers translations in Bing, Microsoft Office, Microsoft SharePoint, Internet Explorer, Yammer
– Powers translations in Facebook, Twitter, eBay, and many other government and enterprise sites
Microsoft Translator at a Glance
World-class Statistical Machine Translation: Built on over a decade of work at Microsoft Research
Big Data Powered: Trained with billions of “parallel” sentences (Bing index & licensed)
General Purpose System: Powers Bing Translator, supports 40+ languages, any-to-any
Unprecedented Customization Capability: Hub to train before translation + CTF to edit after translation
Powerful Cloud API: Rich, secure API enabling integrations, 99.9% availability
Fully integrated across the stack, Translator extends the value of the Microsoft platform and of the solutions you build on it, including consumer-facing applications such as Bing Translator, Bing Toolbar, Bing Dictionary, and the Windows Phone app.
A few of our customers and partners… +80,000 more.
Enabling Translation in Many Products
Powerful Tools and Customization
Our machine learning and big-data based translation technology brings the power of instant translation to users, developers, webmasters, translators and businesses, breaking down language barriers. Robust, industry-leading tools such as the Hub and CTF allow for unprecedented customization of the translation experience.
Powerful API: Instant translation and language services in web, desktop and mobile applications. Highly scalable and robust cloud-based machine translation service from Microsoft. Supports SOAP, REST, AJAX, OData, and the Translator web site translation widget. Extensibility for development on SharePoint, Office, Windows Phone, and more.
Widget: Instant translations of web pages without the need to write any code. Use the AJAX API to roll your own widget. Use the integrated “Collaborative Translations” (CTF) functionality to tap into your community.
Hub: Custom translation portal to build, train, and deploy customized automatic language translation systems. Combine your data with Bing big data to tune the translation output to best fit your content. Free with any level of Translator subscription (including the free tier).
CTF: Override, modify or vote for the translated output to best fit the content. Offer the end user alternative translations. Import the edits back into Hub for further training.
Give these a try! (Demo)
Bing Translator
Lync Conversation Translator
Translator Widget for Webpages
Word Web App
Contextual Thesaurus
Price Competitively priced
§ Monthly subscription
§ Free for up to 2 million characters per month
§ Base price: $10 per million characters
§ Discounted for higher volumes
§ Paid by credit card or via Microsoft Enterprise agreement
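As a rough illustration of the figures above, the sketch below estimates a monthly bill at the base price. It rests on assumptions the slide does not state: that volumes up to the free allowance cost nothing, and that a month exceeding the allowance is billed for all of its characters at the base rate. The volume discounts are not quantified on the slide and are ignored here. The function name `estimate_monthly_cost` is made up for this example.

```python
FREE_CHARS_PER_MONTH = 2_000_000   # from the slide
BASE_PRICE_PER_MILLION = 10.00     # USD, from the slide


def estimate_monthly_cost(chars: int) -> float:
    """Rough monthly cost in USD at the base price.

    Assumptions (not stated on the slide): usage within the free
    allowance costs nothing; usage above it is billed in full at the
    base rate; volume discounts are ignored.
    """
    if chars < 0:
        raise ValueError("character count must be non-negative")
    if chars <= FREE_CHARS_PER_MONTH:
        return 0.0
    return chars / 1_000_000 * BASE_PRICE_PER_MILLION
```

Actual subscription tiers and discounts may differ, so treat the result as an order-of-magnitude figure only.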
Extent of localization Methods of applying MT
Post-Editing
§ Goal: Human translation quality
§ Increase human translator’s productivity
§ In practice: 0% to 25% productivity increase
– Varies by content, style and language
Raw publishing
§ Goals:
– Good enough for the purpose
– Speed
– Cost
§ Publish the output of the MT system directly to end user
§ Best with bilingual UI
§ Good results with technical audiences
Post-Publish Post-Editing “P3”
§ Know what you are human translating, and why
§ Make use of community
– Domain experts
– Enthusiasts
– Employees
– Professional translators
§ Best of both worlds
– Fast
– Better than raw
– Always current
The Triangle: You can have only two. Not anymore!
Triangle corners: Price, Speed, Quality
P3: Post-Publishing Post-Edit
The cost/quality curve: Optimize for the knee
Chart labels: User satisfaction; $; Good enough for the intended purpose
§ No cost: No translation
§ Low cost: MT + TM + Community
§ High cost: Fully qualified HT
§ Very high cost: Expert reviewed translation / transcreation
Content types shown: Highly visible marketing content; Low pageview supporting content
Collaboration: MT + Your community
Your community
Your Web Site
Your App
Microsoft Translator
Collaborative TM
Enormous language knowledge
Microsoft Translator API
Translation Request / Response
Match first
Translate if no match
What makes this possible – fully integrated 100% matching TM
Collaborative TM entries:
§ Rating 1 to 4: Unapproved
§ Rating 5 to 10: Approved
§ Rating -10 to -1: Rejected
§ 1 to many is possible
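The "match first, translate if no match" flow and the rating bands can be sketched as follows. This is an illustration only, not the Translator service's implementation: the names `TMEntry`, `rating_status` and `translate` are hypothetical, "100% matching" is modeled as an exact string lookup, and the choice to let only approved entries override MT (picking the highest-rated one) is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class TMEntry:
    translation: str
    rating: int  # -10..-1 rejected, 1..4 unapproved, 5..10 approved


def rating_status(rating: int) -> str:
    """Map a rating to the band named on the slide."""
    if 5 <= rating <= 10:
        return "approved"
    if 1 <= rating <= 4:
        return "unapproved"
    if -10 <= rating <= -1:
        return "rejected"
    raise ValueError(f"rating out of range: {rating}")


def translate(source: str,
              tm: Dict[str, List[TMEntry]],
              mt: Callable[[str], str]) -> str:
    """Match first; fall back to machine translation if no match.

    One source can map to many TM entries. Assumption: only approved
    entries are served, and the highest-rated one wins.
    """
    approved = [e for e in tm.get(source, [])
                if rating_status(e.rating) == "approved"]
    if approved:
        return max(approved, key=lambda e: e.rating).translation
    return mt(source)
```

Here `mt` stands in for a call to the translation engine; any function from source text to translated text will do for experimentation.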
Measuring Quality: Human Evaluations Knowledge powered by people
§ Absolute
§ 3 to 5 independent human evaluators are asked to rate translation quality for 250 sentences on a scale of 1 to 4
– Comparing to human translated sentence
– No source language knowledge required
4 Ideal: Grammatically correct, all information included
3 Acceptable: Not perfect, but definitely comprehensible, and with accurate transfer of all important information
2 Possibly Acceptable: May be interpretable given context/time, some information transferred accurately
1 Unacceptable: Absolutely not comprehensible and/or little or no information transferred accurately
Also: Relative evals, against a competitor, or a previous version of ourselves
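One way to turn such ratings into a single system score is sketched below. The slide does not say how the individual ratings are aggregated; averaging per sentence and then across sentences is an assumption, and `system_score` is a hypothetical name.

```python
from statistics import mean
from typing import List

SCALE = {4: "Ideal", 3: "Acceptable", 2: "Possibly Acceptable", 1: "Unacceptable"}


def system_score(ratings: List[List[int]]) -> float:
    """Average human rating for a test set.

    `ratings` holds one inner list per sentence, containing the scores
    (1 to 4) given by the 3 to 5 evaluators of that sentence.
    Assumption: plain mean per sentence, then mean over sentences.
    """
    per_sentence = []
    for scores in ratings:
        if not 3 <= len(scores) <= 5:
            raise ValueError("each sentence needs 3 to 5 evaluators")
        if any(s not in SCALE for s in scores):
            raise ValueError("scores must be integers from 1 to 4")
        per_sentence.append(mean(scores))
    return float(mean(per_sentence))
```

The same function can serve a relative evaluation: score two systems on the same sentences and compare the results.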
Measuring Quality: BLEU* Cheap and effective – but be aware of the limits
§ A fully automated MT evaluation metric
– Modified N-gram precision, comparing a test sentence to reference sentences
§ Standard in the MT community
– Immediate, simple to administer
– Correlates with human judgments
§ Automatic and cheap: runs daily and for every change
§ Not suitable for cross-engine or cross-language evaluations
§ Results are always relative to the test set.
* BLEU: BiLingual Evaluation Understudy
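The "modified N-gram precision" at the core of BLEU can be sketched as below. This covers only that one component: a full BLEU score also combines several n-gram orders and applies a brevity penalty, which are left out. Whitespace tokenization in the usage note and the function names are choices made for this example.

```python
from collections import Counter
from typing import List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    """All contiguous n-grams of a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate: Sequence[str],
                       references: List[Sequence[str]],
                       n: int) -> float:
    """Clipped n-gram precision of a candidate against references.

    Each candidate n-gram is credited at most as many times as it
    occurs in the single reference where it is most frequent.
    """
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref_counts: Counter = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref_counts[gram] = max(max_ref_counts[gram], count)
    clipped = sum(min(count, max_ref_counts[gram])
                  for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())
```

For instance, a degenerate candidate such as `"the the the the the the the".split()` scored against the references `"the cat is on the mat".split()` and `"there is a cat on the mat".split()` gets a unigram precision of 2/7, because "the" occurs at most twice in any one reference. The clipping is what stops repeated words from inflating the score.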
Measuring Quality In Context Real-world data
§ Instrumentation to observe user’s behavior
§ A/B testing
§ Polling
In-context measurement gives you the most useful results
Chart: Knowledge Base Resolve Rate, Human Translation vs. Machine Translation. Source: Martine Smets, Microsoft Customer Support
Microsoft is using a customized version of Microsoft Translator
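An A/B comparison of a rate such as the resolve rate (for example, pages served with human translation versus machine translation) can be checked for statistical significance with a two-proportion z-test. The sketch below is a generic textbook test, not the method used for the chart above, and the function name is made up for this example.

```python
from math import erfc, sqrt
from typing import Tuple


def two_proportion_z_test(success_a: int, total_a: int,
                          success_b: int, total_b: int) -> Tuple[float, float]:
    """Return (z, two-sided p-value) for the difference of two rates.

    Uses the pooled-proportion standard error, which is appropriate
    under the null hypothesis that both variants share one true rate.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("both groups need at least one observation")
    p_a = success_a / total_a
    p_b = success_b / total_b
    pooled = (success_a + success_b) / (total_a + total_b)
    se = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if se == 0:
        raise ValueError("rates are all 0 or all 1; test is undefined")
    z = (p_a - p_b) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided tail of the normal
    return z, p_value
```

A large p-value means the observed difference between the two variants is compatible with chance, which for this use case would suggest the machine-translated content performs about as well as the human-translated content.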
Statistical MT - The Simple View
Diagram labels, training side: Collect and store parallel and target language data; Train statistical models; High-Performance Computing Cluster; Translation Engine
Data sources: Web mined data; Government data; Microsoft manuals; Dictionaries; Phrasebooks; Publisher data
Diagram labels, runtime side: User Input (text, web pages, chat, etc.); Translation APIs and UX; Distributed Runtime; Translation Engine; Translated Output
Remember the collaborative TM? There is more.
Collaboration: You, your community, and Microsoft
Your community
Your Web Site
Your App
Your TMs
Your previously translated documents
Microsoft Translator
Collaborative TM
Microsoft Translator Hub
Enormous language knowledge
Microsoft Translator API
Your custom MT system
Your collaborators
You, your community and Microsoft working together to create the optimal MT system for your terminology and style
Office 2013 Beta Send-a-smile program
§ 107 languages
§ 234M words translated
§ $22B revenue, > 60% outside U.S.
§ > 100,000 Send-a-smiles received
§ > 500 bugs fixed
Example of Business Intelligence use
Contacts
Web site: www.microsoft.com/translator
Licensing & Pricing Questions: [email protected]
General & Customer Questions: [email protected]