New Breakthroughs in Machine Translation Technology

No Hardware. No Software. No Hassle MT. New Breakthroughs in Machine Translation Technology in association with #KantanWebinar

description

Tony O’Dowd takes us through some of the most innovative technologies offered on the KantanMT.com platform which are helping a growing community of KantanMT users to develop and self-manage custom Machine Translation engines in the cloud. Maxim Khalilov then illustrates bmmt’s journey with Machine Translation on KantanMT. He discusses what they have achieved so far in terms of MT engine development and showcases the value that his team is bringing to their growing international client base through the use of Machine Translation.

Transcript of New Breakthroughs in Machine Translation Technology

Page 1: New Breakthroughs in Machine Translation Technology

No Hardware. No Software. No Hassle MT.

New Breakthroughs in Machine Translation Technology

in association with #KantanWebinar

Page 2: New Breakthroughs in Machine Translation Technology

KantanMT.com

NO HARDWARE. NO SOFTWARE. NO HASSLE MT

Tony O’Dowd, Founder & Chief Architect

New Breakthroughs in Machine Translation Technology

Page 3: New Breakthroughs in Machine Translation Technology

What do we aim to cover today?

What is KantanMT.com?

Challenges of the L10N Industry

Making the right Project Management decisions

Going beyond the baseline of MT quality

Conclusions

15 minutes

Page 4: New Breakthroughs in Machine Translation Technology

What is KantanMT.com?

Statistical MT System

Cloud-based =

Highly scalable

Inexpensive to operate

Quick to deploy

Our Vision: To put Machine Translation

Customization

Improvement

Deployment

…into your hands

Active KantanMT Engines: 6,191

Training Words Uploaded: 28,243,234,615

Member Words Translated: 427,526,741

Fully Operational: 15 months

Page 5: New Breakthroughs in Machine Translation Technology

Initial Steps of any project are:

Determine Scope

How long will it take?

How much will it cost?

What is my margin?

Determine resources

How many Translators will I need?

Introducing KantanAnalytics™ …think Fuzzy-Match report and you’ve got it in one!

Challenge #1

How can Project Managers ‘manage’ Post-Editing Projects?

Page 6: New Breakthroughs in Machine Translation Technology

KantanAnalytics™

Kantan TotalRecall – Advanced TM

% of TM hits in this job

KantanMT – automated translations

% of automated translations for this job

Range of QE Scores: QE range defined to match existing fuzzy match ranges used by L10N industry

Quality Estimation Scores: Segment level QE scores – akin to fuzzy match scores

Word Counts – Project Stats

Can be used to develop Project Timeline and Tiered Pricing Model for Post-Editing Projects

Placeholder & Tag Counts: Used by PM for complexity surcharges

KantanAnalytics embeds QE scores into

TRADOS Studio

MemoQ

XLIFF
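The idea of grouping segment-level QE scores into fuzzy-match-style ranges and pricing each range differently can be sketched as follows. This is only an illustration: the band boundaries, the rate multipliers, and the function names `band_for` and `price_job` are assumptions made for the example and are not taken from KantanAnalytics.

```python
from collections import defaultdict

# Hypothetical QE bands modelled on common fuzzy-match ranges.
# Boundaries and rate multipliers are illustrative assumptions only.
BANDS = [
    (95, "95-100%", 0.30),
    (85, "85-94%", 0.50),
    (75, "75-84%", 0.70),
    (0, "0-74%", 1.00),
]


def band_for(score):
    """Return (label, rate multiplier) for a QE score between 0 and 100."""
    for lower_bound, label, multiplier in BANDS:
        if score >= lower_bound:
            return label, multiplier
    raise ValueError(f"score out of range: {score}")


def price_job(segments, full_rate_per_word):
    """Total the words in each band and compute a tiered post-editing price.

    segments: iterable of (word_count, qe_score) pairs.
    """
    words_per_band = defaultdict(int)
    total = 0.0
    for word_count, score in segments:
        label, multiplier = band_for(score)
        words_per_band[label] += word_count
        total += word_count * full_rate_per_word * multiplier
    return dict(words_per_band), round(total, 2)


# Four made-up segments: (word count, QE score).
segments = [(12, 97), (8, 88), (20, 60), (10, 79)]
words, price = price_job(segments, full_rate_per_word=0.10)
print(words)
print(price)
```

A project manager could use the per-band word counts for the timeline and the total for the quote, in the same way a fuzzy-match report is used for TM-based jobs.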

Page 7: New Breakthroughs in Machine Translation Technology

KantanAnalytics™

Helping PMs make the right business decisions!

Page 8: New Breakthroughs in Machine Translation Technology

KantanAnalytics™ - Helping PMs make the right decisions

Page 9: New Breakthroughs in Machine Translation Technology

Challenge #2: Going beyond the baseline and developing production-ready MT!

Easy to build 1st baseline engine

Aggregate Training Data – TM, Mono, Stock, Terminology

Use Cloud-based platform, like KantanMT.com

Real Challenge: How do these platforms go beyond the baseline engine and achieve higher levels of production quality?

Introducing Kantan BuildAnalytics

Data analytics and visualisation providing insights into the customisation of SMT engines.

Page 10: New Breakthroughs in Machine Translation Technology

Kantan BuildAnalytics™

Rapidly develop production-ready engines

Summary Report

Training Rejects Reports

F-Measure Analysis

BLEU Analysis

TER Analysis

GAP Analysis

Timeline Report

Deep Tuning

Page 11: New Breakthroughs in Machine Translation Technology

Kantan BuildAnalytics™

F-Measure Score: Measures word recall & precision of KantanMT engines

Distributions: Provides distribution of F-Measure scores across all reference translations

Kantan Insight™: Holistic analysis of score and advice on how to improve this for KantanMT engines

Detailed Analysis: Segment level F-Measure analysis to help SMT Developers improve training material
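As a rough illustration of what a segment-level F-Measure captures, the sketch below computes word precision, recall and their harmonic mean for one MT output against one reference translation. It assumes simple whitespace tokenisation and bag-of-words matching; the slides do not say how KantanMT tokenises or matches words, so treat `f_measure` as a hypothetical helper.

```python
from collections import Counter


def f_measure(candidate, reference):
    """Return (precision, recall, F1) from word overlap between two segments."""
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0, 0.0, 0.0
    # Counter intersection keeps, for each word, the smaller of the two counts.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / len(cand_tokens)  # share of MT words found in the reference
    recall = overlap / len(ref_tokens)      # share of reference words the MT produced
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Made-up example segments.
mt_output = "the engine was trained on customer data"
reference = "the engine is trained on the customer data"
precision, recall, f1 = f_measure(mt_output, reference)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```

Running this over every reference segment and plotting the scores would give a distribution of the kind the slide describes.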

Page 12: New Breakthroughs in Machine Translation Technology

Kantan BuildAnalytics™

Detailed Reports for: F-Measure, BLEU and TER
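To give a feel for what a TER-style score reports, here is a deliberately simplified sketch: the word-level edit distance between MT output and reference, divided by the reference length. Full TER also counts a block shift of several words as a single edit; this version leaves shifts out, so it is an approximation rather than the metric as reported by BuildAnalytics. The function names are made up for the example.

```python
def word_edit_distance(hyp_tokens, ref_tokens):
    """Levenshtein distance over words (insertions, deletions, substitutions)."""
    previous = list(range(len(ref_tokens) + 1))
    for i, hyp_word in enumerate(hyp_tokens, start=1):
        current = [i]
        for j, ref_word in enumerate(ref_tokens, start=1):
            substitution_cost = 0 if hyp_word == ref_word else 1
            current.append(min(
                previous[j] + 1,                      # delete a hypothesis word
                current[j - 1] + 1,                   # insert a reference word
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def simplified_ter(hypothesis, reference):
    """Edits per reference word; lower is better. Block shifts are not modelled."""
    hyp_tokens = hypothesis.split()
    ref_tokens = reference.split()
    if not ref_tokens:
        raise ValueError("reference must not be empty")
    return word_edit_distance(hyp_tokens, ref_tokens) / len(ref_tokens)


# Made-up example: one substitution plus one deletion over a 4-word reference.
score = simplified_ter("press the red button twice", "press the green button")
print(score)
```

A score of 0 means the output already matches the reference; higher values suggest more post-editing effort.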

Page 13: New Breakthroughs in Machine Translation Technology

Kantan BuildAnalytics™

Gap Analysis – quickest way of improving fluency

Page 14: New Breakthroughs in Machine Translation Technology

Kantan BuildAnalytics™

Training Rejects Report – Improve training data rapidly

Page 15: New Breakthroughs in Machine Translation Technology

Kantan BuildAnalytics™

Timeline – Tracks history of KantanMT engines

Page 16: New Breakthroughs in Machine Translation Technology

Kantan BuildAnalytics™ - Rapid MT Customisation

Page 17: New Breakthroughs in Machine Translation Technology

bmmt GmbH and KantanMT:

The Real-World Use of Machine Translation

Maxim Khalilov, Technical Lead

bmmt GmbH

[email protected]

KantanMT webinar, April 10, 2014

Page 18: New Breakthroughs in Machine Translation Technology

MT in industry: context and rationale

The combination of these two technologies, well-established TM and cutting-edge MT, plus post-editing allows the creation of a high-quality translation that reads just as well as a “classically” produced translation.

Page 19: New Breakthroughs in Machine Translation Technology

MT in industry: what about cost?

The cost structure changes when machine translation is integrated into the translation pipeline. When machine translation is adopted, the data preparation and quality assurance (editing) costs rise whereas translation costs fall to as low as zero. Most importantly, the total cost of translation is reduced dramatically as illustrated.

Page 20: New Breakthroughs in Machine Translation Technology

MT case study

Customer: a large German machinery manufacturer

Project: 51,000 words, technical documentation. English into German. Approach: hybrid MT/TM.

Settings: the files were processed through Trados Studio 2011.

Implementation: KantanMT

Description: Roughly 7,000 words came from TM as high matches. The remainder went through MT-based pretranslation, followed by a post-editing cycle, with the overall goal of producing the same level of quality as an all-human translation.

Training material: Our customer had not worked in this language combination before, so there was no TM to go on. But we knew that the English authors based their work on material that the customer had previously translated from German into English. Thus we reversed the language direction of the TM and trained a customer-specific engine with this TM.
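Reversing the language direction of a TM amounts to swapping the source and target side of every translation unit. The sketch below shows that step on an in-memory list; the `TranslationUnit` type, the `reverse_direction` function and the sample sentences are assumptions made for illustration and do not describe the format or tooling bmmt actually used.

```python
from typing import NamedTuple


class TranslationUnit(NamedTuple):
    """One TM entry; a hypothetical stand-in for a real TM record."""
    source_lang: str
    target_lang: str
    source: str
    target: str


def reverse_direction(units):
    """Swap source and target so a de->en TM can train an en->de engine.

    Units with an empty side are dropped because they cannot serve as
    parallel training data in either direction.
    """
    return [
        TranslationUnit(u.target_lang, u.source_lang, u.target, u.source)
        for u in units
        if u.source.strip() and u.target.strip()
    ]


# Made-up German-to-English entries; the second has an empty target.
german_to_english = [
    TranslationUnit("de", "en", "Motor abstellen.", "Switch off the engine."),
    TranslationUnit("de", "en", "Ölstand prüfen.", "  "),
]
english_to_german = reverse_direction(german_to_english)
print(english_to_german)
```

The reversed units keep the original sentence pairs, which is why text authored on the basis of the earlier English translations can be expected to match the engine's training data well.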

Results: 44,000 words were post-edited to a final quality level that the customer was very happy with.

Cost savings > 30%.

Page 21: New Breakthroughs in Machine Translation Technology

MT: benefits of KantanMT solution

Fully automated system training

One-click system customization

Automatic data pre-processing

Fully automated translation

Automatic pre- and post-processing

Quality assessment

KantanWatch

Gap Analysis

Reject Report

No worry about maintenance and infrastructure

Page 22: New Breakthroughs in Machine Translation Technology

MT: benefits of KantanMT solution

Transparent file format conversion

Training material conversion: TM conversion, monolingual material

Documents to translate: TMS format into MTable format

SDLXliff

Smooth terminology integration

Consistent terminology

Tag handling and mark-up transfer

Source: <x id="16480"/>SWord1 SWord2 SWord3 SWord4 <g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>

Target: <x id="16480"/>TWord1 TWord2 TWord3 TWord4 <g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>
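In the example above, the target keeps every inline tag from the source but places the two `<g>` elements in a different order to follow the target word order. A minimal check for that property might look like the sketch below; it assumes only `<x/>` and `<g>` elements with numeric `id` attributes, and the function names are hypothetical.

```python
import re
from collections import Counter

# Matches opening <g id="..."> and standalone <x id="..."/> tags only.
TAG_PATTERN = re.compile(r'<(x|g)\s+id="(\d+)"\s*/?>')


def tag_order(segment):
    """Return the (element, id) pairs in the order they appear in the segment."""
    return TAG_PATTERN.findall(segment)


def tags_preserved(source, target):
    """True when both segments carry the same tags, regardless of order."""
    return Counter(tag_order(source)) == Counter(tag_order(target))


source = ('<x id="16480"/>SWord1 SWord2 SWord3 SWord4 '
          '<g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>')
target = ('<x id="16480"/>TWord1 TWord2 TWord3 TWord4 '
          '<g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>')

print(tags_preserved(source, target))          # same set of tags on both sides
print(tag_order(source) != tag_order(target))  # but in a different order
```

A check of this kind can flag segments where mark-up was lost during translation before the file is converted back to its original format.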

Page 23: New Breakthroughs in Machine Translation Technology

bmmt GmbH

Founded in 2013 by a group of language industry experts who wanted to offer innovative translation technology solutions

Three operations centers in Germany: Munich, Berlin and Stuttgart

bmmt GmbH has relied heavily on KantanMT services since 2013

Primary industries: Automotive and Trucks, Machine Engineering, Telecommunications, Construction, IT

Types of documents: workshop texts, product catalogues & other highly repetitive information documents

Primary source language: German

Integration: SDL Trados, SDL WorldServer and others

Find more: www.machine-translation.eu

Page 24: New Breakthroughs in Machine Translation Technology

Berlin: Alt-Moabit 92, 10559 Berlin. Phone: +49 30-3117505-15. Fax: +49 30-3117505-20

Munich: Bernhard-Wicki-Straße 5, 80636 Munich. Phone: +49 89 2000037-17. Fax: +49 89 2000037-11

Stuttgart: Ruppmannstraße 33b, 70565 Stuttgart. Phone: +49 711 16646-66. Fax: +49 711 16646-50

bmmt [email protected]

Thank you

Page 25: New Breakthroughs in Machine Translation Technology

No Hardware. No Software. No Hassle MT.

New Breakthroughs in Machine Translation Technology

in association with #KantanWebinar

Speakers

Tony O’Dowd, [email protected]

Maxim Khalilov, [email protected]