Practical Visualization of ITS 2.0 Categories for Real World Localization Process

(C) 2013 Logrus International

Part of the Multilingual Web-LT Program

WHAT IS ITS AND WHY IT’S SO IMPORTANT

- The Internationalization Tag Set (ITS) is a set of attributes and elements designed to provide internationalization and localization support in XML and HTML documents
- It also defines implementations of these concepts
- XML developers can use the ITS namespace to integrate internationalization features directly into their own XML schemas and documents
- The set is currently almost ready/frozen
- We believe that this is one of the key standards for the localization industry
- The set includes a number of categories of crucial importance to translators:
  - Terminology and Localization Note metadata
  - Translate (yes/no) metadata to mark non-translatable text
- ITS metadata make it possible to include various instructions for translators in documents, add terminology and comments, and mark non-translatable segments
  - Will reduce inconsistency in adding translation instructions to documents
  - Provides a universal interface for transferring translation metadata between tools
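To illustrate the three categories named on this slide, the following Python sketch reads local ITS attributes from a small XML fragment. The sample document, its element names (doc, p, name, t) and the function collect_its are invented for illustration; only the namespace URI and the attribute names its:translate, its:locNote and its:term come from ITS. It is a minimal sketch, not the code of the plugin described in this presentation.

```python
import xml.etree.ElementTree as ET

ITS_NS = "http://www.w3.org/2005/11/its"  # ITS namespace URI

# Hypothetical sample document, invented for illustration.
SAMPLE = """<doc xmlns:its="http://www.w3.org/2005/11/its">
  <p its:locNote="Keep the product name in English.">Install <name its:translate="no">WidgetPro</name> first.</p>
  <p>Open the <t its:term="yes">control panel</t>.</p>
</doc>"""


def collect_its(xml_text):
    """Return (tag, leading text, ITS attributes) for each element with local ITS markup."""
    prefix = "{" + ITS_NS + "}"
    found = []
    for elem in ET.fromstring(xml_text).iter():
        # Keep only attributes in the ITS namespace and strip the namespace part.
        its_attrs = {k[len(prefix):]: v for k, v in elem.attrib.items() if k.startswith(prefix)}
        if its_attrs:
            found.append((elem.tag, (elem.text or "").strip(), its_attrs))
    return found


items = collect_its(SAMPLE)
for tag, text, attrs in items:
    print(tag, repr(text), attrs)
```

Each tuple pairs a piece of content with the instruction attached to it, which is the information a translator would otherwise have to look up in separate glossaries or style guides.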

WHY ARE WE DOING THIS: DETAILS

- To make it possible to comment on translatable content irrespective of its nature
- To make these instructions easily accessible to translators and editors
  - Including recommendations, instructions, terminology suggestions
  - Independent of translation tools
- Saving time:
  - The text is already marked with context information
  - One doesn’t have to think whether something NEEDS TO BE TRANSLATED or not
  - One doesn’t have to think whether something IS A TERM or not
- Key advantages/improvements:
  - Time (i.e. cost)
  - Quality (fewer translation errors)
- Also very important for machine translation applications (post-editing in context)

WHY ARE WE DOING THIS: WORKFLOW PARADIGM CHANGE

FROM:
- Bulk manual translation of “raw” content or post-editing of “raw” machine-translation output
- External terminology glossaries, localization instructions and reference data are matched with content indirectly, mostly in the translator’s head, on the fly, and only to the extent of his/her understanding of these instructions and personal skills

TO:
- Using natural language processing (NLP) tools and ITS metadata markup to pre-populate content to be translated or post-edited with context-related information
- External terminology glossaries, localization instructions and reference data are matched with content directly, through an automated process of preliminary linguistic analysis
- Pre-processing is controlled by dedicated qualified linguists/terminologists/editors

PROVIDED THAT:
- Glossaries, instructions and reference data are converted into a format compatible with NLP tools and ITS markup
- Corresponding content-searching algorithms are created (including fuzzy algorithms)
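The "TO" workflow depends on matching glossary entries against content automatically, including near matches. One possible way to sketch that step in Python is shown below. The glossary, the 0.85 similarity cutoff and the function find_terms are assumptions made for this example; the presentation does not say which matching algorithm the project uses, and difflib from the standard library is swapped in here as a simple stand-in for a real NLP tool.

```python
import difflib
import re

# Hypothetical glossary: source term -> instruction for the translator.
GLOSSARY = {
    "control panel": "Approved term; use the glossary translation.",
    "firewall": "Approved term; use the glossary translation.",
}


def find_terms(text, glossary, cutoff=0.85):
    """Return (start, end, matched text, glossary term) for exact and near matches."""
    words = [(m.start(), m.end(), m.group().lower()) for m in re.finditer(r"\w+", text)]
    hits = []
    for term in glossary:
        n = len(term.split())
        # Slide a window of n words over the text and compare it with the term.
        for i in range(len(words) - n + 1):
            window = words[i:i + n]
            candidate = " ".join(w for _, _, w in window)
            score = difflib.SequenceMatcher(None, candidate, term).ratio()
            if score >= cutoff:
                start, end = window[0][0], window[-1][1]
                hits.append((start, end, text[start:end], term))
    return sorted(hits)


hits = find_terms("Open the Control Panels and disable the firewal.", GLOSSARY)
for start, end, matched, term in hits:
    print(start, end, matched, "->", term)
```

The character offsets returned for each hit are what a pre-processing step would need in order to wrap the matched text in ITS Terminology markup; a linguist would still review the near matches, as the slide requires.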

WHAT IS BEING DEVELOPED

- ITS 2.0 implementation project, a part of the Multilingual Web-LT program funded by the EU
- Developing the ITS Browser Plugin as a building block of the future “Work In Context System” (WICS)
- Making it possible to view standard ITS (Internationalization Tag Set) translation-related metadata contained in XML, XLIFF, or HTML files
  - Can be done in parallel with translating using CAT tools or for reviewing materials
  - The JavaScript plugin would support most popular browsers
  - For previewing XML or XLIFF, standalone filters for conversion into HTML will be used
- Implementation:
  - Standard-based preview solution: HTML5, JavaScript, Web browser
  - A script located in the same folder as HTML files
  - The script is started by the browser automatically
  - It is expected that both scripts and filters will be publicly available

THE PROJECT IDEA

ITS metadata-enriched XML or XLIFF files: what’s inside?

Previewing ITS metadata in Web browser while translating content in any CAT tool

Standard-based preview solution: HTML5, JavaScript, Web browser

Next step: ITS metadata as a carrier for localization instructions and any reference data

THE WORK BREAKDOWN: PROJECT COMPONENTS

- Visual designs
- JavaScript scripts to render and navigate metadata and content
- Rich sample files
- Content format conversion algorithms:
  - XML+ITS -> HTML5+ITS*
  - XLIFF+ITS -> HTML5+ITS*
  - XML+ITS -> XLIFF+ITS (just an example)
  - HTML+ITS -> HTML5+ITS*

* For the purposes of visualization, some redundant ITS syntax options for HTML are not supported.
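The XML+ITS -> HTML5+ITS conversion listed above can be sketched roughly as follows. In HTML5, ITS 2.0 local attributes are written with an its- prefix and hyphenated lower-case names (its:locNote becomes its-loc-note), while Translate uses the native HTML translate attribute. Everything else here is an assumption for illustration: the functions html_attr_name and to_html are hypothetical, every source element is mapped to a SPAN, and global rules are ignored. The project's actual filters may work differently.

```python
import re
import xml.etree.ElementTree as ET

ITS_PREFIX = "{http://www.w3.org/2005/11/its}"


def html_attr_name(its_local_name):
    """Map an XML ITS attribute name to its HTML5 form (locNote -> its-loc-note)."""
    if its_local_name == "translate":
        return "translate"  # HTML5 has a native translate attribute
    return "its-" + re.sub(r"([A-Z])", r"-\1", its_local_name).lower()


def to_html(elem):
    """Recursively copy an XML element into an HTML span, keeping only ITS metadata."""
    span = ET.Element("span")  # simplification: no SPAN/DIV selection is attempted
    for name, value in elem.attrib.items():
        if name.startswith(ITS_PREFIX):
            span.set(html_attr_name(name[len(ITS_PREFIX):]), value)
    span.text = elem.text
    for child in elem:
        html_child = to_html(child)
        html_child.tail = child.tail  # text following the child stays in place
        span.append(html_child)
    return span


# Hypothetical source fragment, invented for illustration.
source = ET.fromstring(
    '<p xmlns:its="http://www.w3.org/2005/11/its" its:locNote="UI label">'
    'Press <key its:translate="no">Ctrl</key> now.</p>'
)
html = ET.tostring(to_html(source), encoding="unicode", method="html")
print(html)
```

Keeping the metadata as attributes on the generated elements means a browser script can later find annotated content with ordinary attribute selectors.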

THE PROJECT CORE: VISUAL DESIGNS

Screen space limitations in localization process:

THE PROJECT CORE: VISUAL DESIGNS (CONT.)

Collapsed view of metadata

THE PROJECT CORE: VISUAL DESIGNS (CONT.)

Expanded view of metadata

THE PROJECT CORE: VISUAL DESIGNS (CONT.)

Summary view of metadata

THE PROJECT CORE: VISUAL DESIGNS (CONT.)

Color highlighting to indicate metadata linked to content

THE PROJECT CORE: VISUAL DESIGNS (CONT.)

Visual “tags” to indicate metadata linked to content

THE PROJECT CORE: VISUAL DESIGNS (CONT.)

Visual tags to highlight metadata (example)

DEVELOPMENT STATUS

- Sample files: to be completed by end of May
- File conversion algorithms: to be completed by Sep 30:
  - XML+ITS -> XLIFF+ITS (July) (sample)
  - XML+ITS -> HTML5+ITS (August)
  - HTML+ITS -> HTML5+ITS (August)
  - XLIFF+ITS -> HTML5+ITS (September)
- Visualization scripts: to be completed by end of June

KNOWN ISSUES: FORMAT CONVERSIONS

- “Translation” of XPath expressions from source XML to target HTML
- XLIFF: MRK element to be used instead of SPAN
- Selection between SPAN and DIV elements in output HTML
- Merging external ITS rule files into internal list of rules
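Two of these issues, merging rule sets and handling XPath selectors, can be made concrete with a small Python sketch. It assumes that rules from an external file are processed before the document's own rules and that a later rule overrides an earlier one for the same node; both rule sets below are invented for the example, and merge_rules and apply_translate_rules are hypothetical helpers. Python's ElementTree supports only a limited XPath subset, so the sketch handles selectors beginning with // and nothing more.

```python
import xml.etree.ElementTree as ET

ITS = "{http://www.w3.org/2005/11/its}"

# Hypothetical rule sets: one standing in for an external file, one embedded in the document.
EXTERNAL_RULES = """<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0">
  <its:translateRule selector="//code" translate="no"/>
</its:rules>"""
INTERNAL_RULES = """<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0">
  <its:translateRule selector="//code[@role='label']" translate="yes"/>
</its:rules>"""


def merge_rules(*rule_docs):
    """Flatten several its:rules documents into one ordered list of (selector, translate)."""
    merged = []
    for doc in rule_docs:
        for rule in ET.fromstring(doc).iter(ITS + "translateRule"):
            merged.append((rule.get("selector"), rule.get("translate")))
    return merged


def apply_translate_rules(root, rules):
    """Return {element: translate value}; later rules override earlier ones."""
    result = {}
    for selector, value in rules:
        # ElementTree accepts only relative paths, so "//x" is rewritten as ".//x".
        # Other absolute selectors are not handled in this sketch.
        path = "." + selector if selector.startswith("//") else selector
        for elem in root.findall(path):
            result[elem] = value
    return result


doc = ET.fromstring("<doc><p>Run <code>ls</code> or click <code role='label'>OK</code>.</p></doc>")
rules = merge_rules(EXTERNAL_RULES, INTERNAL_RULES)
decisions = apply_translate_rules(doc, rules)
print([(e.text, v) for e, v in decisions.items()])
```

The selector rewrite is a very small instance of the XPath "translation" problem: once the source XML is converted to HTML with different element names, each selector would need a comparable, and far less trivial, rewrite.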

KNOWN ISSUES: METADATA VISUALIZATION

- Parsing local standoff markup along with other rules
- Parsing list of merged ITS rules
- Hyperlinks embedded in metadata
- Static definitions like “Do not translate” for Translate category
- Highlighting active ITS item
- Displaying summary of all ITS items
- Parsing nested ITS metadata
- Differences in JavaScript implementation between browsers
- Navigation through content and ITS items
- Fragmentation of content to avoid displaying large pieces of text
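Nested metadata and the summary view can be illustrated with the Translate category, whose value is inherited by child elements unless they override it. The Python sketch below walks a fragment, resolves the effective value for each piece of text, and counts the results. The sample fragment and the function walk are invented for illustration, and the sketch covers local its:translate attributes only, not global rules or other data categories.

```python
import xml.etree.ElementTree as ET
from collections import Counter

ITS = "{http://www.w3.org/2005/11/its}"


def walk(elem, inherited="yes", out=None):
    """Collect (text, effective translate value); the innermost setting wins."""
    out = [] if out is None else out
    effective = elem.get(ITS + "translate", inherited)
    if elem.text and elem.text.strip():
        out.append((elem.text.strip(), effective))
    for child in elem:
        walk(child, effective, out)
        # Text after a child belongs to the parent, so it gets the parent's value.
        if child.tail and child.tail.strip():
            out.append((child.tail.strip(), effective))
    return out


# Hypothetical fragment with one override nested inside a non-translatable paragraph.
fragment = ET.fromstring(
    '<p xmlns:its="http://www.w3.org/2005/11/its" its:translate="no">'
    'Do not touch <b its:translate="yes">except this</b> part.</p>'
)
segments = walk(fragment)
summary = Counter(value for _, value in segments)
print(segments)
print(dict(summary))
```

A visualization script faces the same resolution step before it can decide which highlight or tag to show for a given span of text.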

LIVE DEMO

The demo samples are built on the preliminary versions of visual designs and illustrate just a few ITS data categories:
- Localization Note
- Terminology
- Translate

THANK YOU!

Questions?