The Case for Connectivity - GALA Global | Globalization and … › sites › default › files ›...

The Case for Connectivity David Filip, ADAPT Centre, Trinity College Dublin Klaus Fleischmann, Kaleidoscope, GALA Board Member Serge Gladkoff, GALA Ambassador, Logrus Global TAPICC Standards Initiative


Page 1: The Case for Connectivity - GALA Global | Globalization and … › sites › default › files › ... · 2019-12-31 · • GILT services architects and developers • Content owners

The Case for Connectivity

David Filip, ADAPT Centre, Trinity College Dublin

Klaus Fleischmann, Kaleidoscope, GALA Board Member

Serge Gladkoff, GALA Ambassador, Logrus Global

TAPICC Standards Initiative

Page 2

Introduction Serge Gladkoff

GALA Ambassador

Logrus Global

Klaus Fleischmann

GALA Board of Directors

Kaleidoscope GmbH

David Filip Trinity College Dublin – ADAPT

Dr. David Filip is the Convener/Chair of the OASIS XLIFF OMOS TC. He also serves as the Liaison Officer, Secretary and Editor of OASIS XLIFF TC, XLIFF TC Liaison at Unicode Localization Interoperability (ULI) TC, Advisory Editorial Board Member for the MultiLingual Magazine, Programme Committee Member for the ASLING Translating and the Computer Conferences, Co-moderator Standards & Interoperability IG at JIAMCATT, NSAI expert at ISO TC 37 SC3 and SC5, ISO/IEC JTC1 WG9, WG10, SC38, TBX Steering Committee member, TAPICC Steering Committee member.

Page 3

Agenda

General Intro, Project

Status

(Short) Q&A

Track 1

Workshop

Tracks 2-4 Overview

Q&A

Page 4

TAPICC Goal

Enable Innovation and Growth Through

• Common base standards

• Interoperability

• Automation

• Collaboration

• Seamless and valid data exchange

Page 5

Interoperability

Page 6

Disclaimer: I know there are fewer than 380 lines on the screen, but you get the idea.

But…

Page 7

But…

Things change… …and become very costly to maintain

• For clients

• For LSPs

• For tools vendors

Page 8

myTool myWorkflow

The Vision: Standard APIs

1000+ IT firms

1500+ CMS systems

Page 9

Breakdown of Steps

Find, categorize and prioritize use cases

Find out what solutions already exist

Make this information retrievable

Harmonize business (meta)data models

Create implementable classes

Useful deliverables to GALA and the industry

Page 10

Properties

Community-Driven

Open-Source Legal Base

Interactive and collaborative

Administered by GALA

Pre-Standardization Level

Grounded in XLIFF and UBL

Page 11

Setup

Deliverables

• Categorized resource catalog

• Data model mapping

• API Classes

Mode

• Open Source

• Steering Committee

• Subcommittees

• Community participation

Page 12

Scope

Track 1

• Supply Chain Automation

• Business metadata model

• Payload standardization

Track 2

• Transfer of localizable content on segment / unit level

• Between localization or other tools

Track 3

• Markup / Enrichment of localizable content

• TM Matches, MT Output, Terminology, "Good enough" layout, QA data etc.

Track 4

• Enable a high-fidelity rendering of layout information

• To allow in-layout translation

Page 13

Benefits

Learning materials

Common data model to reuse

Consultations about XLIFF and CLDR, API and software

Sample code

Implementable classes

Savings on R&D

Automation and Interoperability

For GALA members, industry and project participants

Page 14

Questions so far?

Page 15

TRACK ONE WORKSHOP

Look at Status of Tracks 1-3

Discuss required business metadata for a "New Basics" common model

Define concrete subtasks

Find a "buy-side" steering committee member

Page 16

What happened so far…

September

Project Statement

Initial Call

Steering Committee

Infrastructure (Connect, GitHub)

Kickoffs in Montreal and Stuttgart

Definition of "tracks"

Open Source Policy and Project Charter

Compiled legacy data: COTI, XLIFF, TIPP, Linport, STS

https://www.gala-global.org/publications/translation-api-class-and-cases-project-statement-tapicc

https://www.gala-global.org/tapicc-legal-agreement

Special thanks to James Bryce Clark, OASIS

Page 17

The Plan for Track 1

Amsterdam: Brainstorm metadata

Then: harmonize and summarize

Create a draft data model / parameter set with canonical names and datatypes

Create prominent serialization (JSON, XML…)

Finalize data model as first deliverable (e.g. in MuleESB)

Defining the "least common denominator" of a universal data model is key.
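
The plan above (one data model with canonical names and datatypes, serialized to JSON and XML) could look roughly like the following Python sketch. Every field name and the `jobMetadata` element name are placeholders invented for illustration; TAPICC has not agreed on any of them at this point.

```python
import json
import xml.etree.ElementTree as ET
from dataclasses import dataclass, asdict, field
from typing import List


@dataclass
class JobMetadata:
    # Field names are illustrative placeholders, not agreed TAPICC names.
    title: str
    creator: str
    source_language: str                      # language tag such as "en-US"
    target_languages: List[str] = field(default_factory=list)
    deadline: str = ""                        # ISO 8601 timestamp kept as a string


def to_json(meta: JobMetadata) -> str:
    """Serialize the model to JSON with stable key order."""
    return json.dumps(asdict(meta), sort_keys=True)


def to_xml(meta: JobMetadata) -> str:
    """Serialize the same model to a simple XML document."""
    root = ET.Element("jobMetadata")          # hypothetical element name
    for name, value in asdict(meta).items():
        if isinstance(value, list):
            parent = ET.SubElement(root, name)
            for item in value:
                ET.SubElement(parent, "item").text = item
        else:
            ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


meta = JobMetadata("User manual", "Example LSP", "en-US",
                   ["de-DE", "fr-FR"], "2017-06-30T17:00:00Z")
json_text = to_json(meta)
xml_text = to_xml(meta)
```

Keeping the model in one place and deriving both serializations from it is the point of the exercise: the canonical names live in the class, not in either wire format.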

Page 18

The overall landscape

What exists already and can be leveraged? We don't want to reinvent wheels.

• XLIFF & OMOS

• COTI, CMIS

• TIPP, LINPORT, STS

• CLDR

• Many existing APIs

• Enthusiastic participation

Page 19

XLIFF Bitext Payload Standard - https://www.oasis-open.org/committees/xliff

CLDR http://cldr.unicode.org/

• The Unicode CLDR provides key building blocks for software to support the world's languages, with the largest and most extensive standard repository of locale data available

We will not reinvent the wheel

CMIS Content Management Interoperability Services

DQF / MQM issue typology

Page 20

TIPP Based on XLIFF for Payload, plus Metadata and Workflow https://code.google.com/archive/p/interoperability-now/

• Envelope concept

• Manifest XML contains all metadata – manifest.xml

• Package Object Container (Payload XLIFF) – resources.zip

• Requests and Responses

• Specifies tasks to be completed and responses to expect

• Strict XLIFF or generic

• Business Metadata (STS)

• Source content language, audience, complexity…

• Target content language, register, layout…

• Production tasks prepare, translate…

• Environment technology, references…

• Relationships permissions, submissions, expectations

Only one task per language?

Only one target language?

Implementations?

No API: package definition only
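
The envelope concept described on this slide (a `manifest.xml` carrying the metadata next to a `resources.zip` carrying the payload) can be sketched in a few lines of Python. Only the two file names come from the slide; the manifest element names and the helper `build_package` are assumptions made for illustration and do not reproduce the actual TIPP schema.

```python
import io
import zipfile
import xml.etree.ElementTree as ET


def build_package(metadata: dict, payload_files: dict) -> bytes:
    """Build an envelope: manifest.xml plus a nested resources.zip."""
    # Inner container holding the payload files (XLIFF in the strict case).
    inner = io.BytesIO()
    with zipfile.ZipFile(inner, "w", zipfile.ZIP_DEFLATED) as zf:
        for name, content in payload_files.items():
            zf.writestr(name, content)

    # Hypothetical manifest layout: one child element per metadata key.
    manifest = ET.Element("manifest")
    for key, value in metadata.items():
        ET.SubElement(manifest, key).text = value

    outer = io.BytesIO()
    with zipfile.ZipFile(outer, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("manifest.xml", ET.tostring(manifest, encoding="unicode"))
        zf.writestr("resources.zip", inner.getvalue())
    return outer.getvalue()


package = build_package(
    {"task": "translate", "sourceLanguage": "en", "targetLanguage": "de"},
    {"manual.xlf": "<xliff/>"},
)

# Reading the envelope back: list the outer entries and the payload entries.
with zipfile.ZipFile(io.BytesIO(package)) as outer_zip:
    names = sorted(outer_zip.namelist())
    with zipfile.ZipFile(io.BytesIO(outer_zip.read("resources.zip"))) as inner_zip:
        payload_names = inner_zip.namelist()
```

The sketch also makes the slide's criticism visible: this is a package definition only, and nothing in it says how the package travels between systems.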

Page 21

LINPORT Based on XLIFF for Payload, plus Metadata and Workflow - http://www.linport.org/

• Bilingual Task-level package

• STS as Business Metadata

• General Title, creator, date, identifier, contributor, rights

• Source Language, ID, type, audience, purpose, subject, term, volume, complexity, status

• Target Language(s), audience, purpose, content correspondence, term, format, guide, register...

• Workflow Layout, preparation, initial, quality check, technology, reference, workplace, copyright…

• Business Qualification, delivery, deadline, compensation, communication

Not a very flexible data model. Same info for all files / languages. Implementations?

No API: package definition only

Page 22

COTI German CMS manufacturers: DERCOM - http://www.dercom.de/projekte

• Most complete and adopted project

• 3 Levels

• Level 1: Simple wrapper file for exchange (control file and payload)

• Level 2: Automated transfer of level 1

• Level 3: Concrete API workflow (SOAP-based)

• API describes

• Workflow: Create, start, finish, but also reject, cancel, update

• Status updates: get/change metadata, report, download document

• Secure data transfer

• Issues

• Generic payload, not XLIFF or CMIS based

• Little "localization" metadata, but expandable

Extensive API (SOAP?)

No standard payload, little L10N metadata

Implemented and in use
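
The workflow actions listed above (create, start, finish, reject, cancel, update) lend themselves to a small state machine. The Python sketch below is an assumption-laden illustration: the action names come from the slide, but the state names and the allowed transitions are invented here and are not taken from the COTI specification.

```python
# Allowed transitions: state -> {action: next_state}.
# State names and the transition table are assumptions for illustration.
TRANSITIONS = {
    "new": {"create": "created"},
    "created": {
        "start": "in_progress",
        "reject": "rejected",
        "cancel": "cancelled",
        "update": "created",
    },
    "in_progress": {
        "finish": "finished",
        "cancel": "cancelled",
        "update": "in_progress",
    },
}


class Job:
    """A translation job that only accepts actions valid in its current state."""

    def __init__(self) -> None:
        self.state = "new"
        self.history = []

    def apply(self, action: str) -> str:
        allowed = TRANSITIONS.get(self.state, {})
        if action not in allowed:
            raise ValueError(
                f"action {action!r} not allowed in state {self.state!r}"
            )
        self.state = allowed[action]
        self.history.append(action)
        return self.state


job = Job()
for action in ("create", "start", "update", "finish"):
    job.apply(action)
```

An API built on such a table can reject out-of-order calls uniformly, whether the transport is SOAP or anything else.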

Page 23

STS - Structured Translation Specification http://www.ttt.org/specs, 21 parameters in a tree, human readable.

Source content

• Textual characteristics

• Source language

• Text type

• Audience

• Purpose

• Specialized language

• Subject field

• Terminology

• Volume

• Complexity

• Origin

Target content

• Target language information

• Target language

• Target terminology

• Audience

• Purpose

• Content correspondence

• Register

• File format

• Style

• Style Guide

• Style relevance

• Layout

Production Tasks

• Typical tasks

• Preparation

• Translation

• QA

• Additional tasks

Environment

• Technology

• Reference material

• Workplace

Relationships

• Permissions

• Copyright

• Recognition

• Restrictions

• Submissions

• Qualifications

• Deliverables

• Delivery (by)

• Deadline

• Expectations

• Compensation

• Communication

Page 24

Plus the crowd already out there

1000+ IT firms

1500+ CMS systems

Page 25

Track 1 – Supply Chain Automation Two levels make sense. But which info goes on which level?

Package level metadata

"Payload" metadata

Page 26

Package-Level Data "Least common denominator" and extensibility

Existing

• Creator, Date & time, Title, Contact info, Tool, Organization

• Source and target languages, deadline -> Or should this be on the payload level?

• Task information: Type, additional tasks

• Description, Comment

• Source and target terminology & TM data

• Copyright, digital signature

• Response creator

What else?

• MT admissibility and selection? Engine type?

• Monolingual or multilingual source/target?

• Project participants? (Translators, PMs...)

• Source system attributes (ID?)

• Workflow-relevant metadata? Quote vs. Project? Different project types?

• Custom metadata?

Page 27

Payload-Level Data "Least common denominator" and extensibility

Existing

• Creator, Date & time, Title, Tool etc.? Or only on package level?

• Source text information: Subject, reference, target audience, register etc.

• Target text information

What else?

• Commercial information: Volume, payment info, analysis data?

• Source system attributes (ID?)

• Reference external data (TMs, termbases, MT engine data)

• Quality requirements, QE metadata, DQF metadata?

• Custom metadata?

• Automation information (Error handling, subtasks...)
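
One possible answer to the recurring question of which information belongs on which level is to let package-level values act as defaults that any payload may override. The Python sketch below illustrates that idea only; the keys, file names, and the helper `effective_metadata` are hypothetical and not part of any TAPICC deliverable.

```python
from collections import ChainMap

# Package-level metadata: shared defaults for every payload in the package.
package_meta = {
    "source_language": "en-US",
    "deadline": "2017-06-30",
    "creator": "Example Corp",
}

# Payload-level metadata: only what differs per file.
payloads = {
    "manual.xlf": {"target_language": "de-DE"},
    "ui.xlf": {"target_language": "fr-FR", "deadline": "2017-06-15"},
}


def effective_metadata(payload_name: str) -> dict:
    """Merge both levels; payload values win over package defaults."""
    return dict(ChainMap(payloads[payload_name], package_meta))


manual = effective_metadata("manual.xlf")
ui = effective_metadata("ui.xlf")
```

With this arrangement a deadline or language can live on the package level and still be adjusted per file, which avoids forcing the same information onto all files and languages.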

Page 28

Your turn now!

Page 29

Tracks 2-4: What's going on

• XLIFF 2.1

• XLIFF OMOS – OM & JLIFF, TBX <-> XLIFF mapping

• TBX revision at ISO

• ULI starts a new work item on word count and similarity algorithms

• FREME: an NLP service framework

• UBL: a major business document exchange standard

Page 30

TAPICC tracks

• 1) business metadata (see above)

• 2) real time unit level exchange between CAT tools (all sorts) [XLIFF OMOS OM & JLIFF]

• 3) real time enrichment of bitext units with metadata

• Including matches [XLIFF OMOS - TMX successor],

• terminology [XLIFF OMOS – TBX mapping, FREME],

• entity disambiguation [FREME],

• error and QA reporting [XLIFF 2.1, FREME], you name it..

• 4) Real time previews of translated content in native

• Currently on the back burner

• Possibly XLIFF 2.2, some capability in XLIFF 2.0 already

Page 31

FREME: A content enrichment framework

• https://freme-project.github.io/community/

• GitHub repos

• https://github.com/freme-project

• Apache license

Page 32

FREME: Design of the framework

• Client makes a Web service request.

• The broker invokes the actual e-Service.

• The e-Services are part of the server (e.g. e-Entity), or provided externally (e.g. e-Translation).

• Supportive modules provide conversion of digital content formats or pipelining of services (e.g. e-Terminology followed by e-Translation)

• FREME = a framework, not a platform: modular approach & ease of extensibility
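
The broker and pipelining design described above can be mimicked locally to show the control flow. The Python sketch below does not call FREME and does not reflect its actual API: `e_terminology` and `e_translation` are stand-in functions with toy logic, and the `broker` function and service registry are assumptions made purely for illustration.

```python
from typing import Any, Callable, Dict, List

Document = Dict[str, Any]
Service = Callable[[Document], Document]


def e_terminology(doc: Document) -> Document:
    # Stand-in service: annotate known terms from a tiny hard-coded glossary.
    glossary = {"printer": "Drucker"}
    words = doc["text"].lower().split()
    terms = {w: glossary[w] for w in words if w in glossary}
    return {**doc, "terms": terms}


def e_translation(doc: Document) -> Document:
    # Stand-in service: a real one would call an MT engine. Here only the
    # terms annotated by the previous step are substituted.
    text = doc["text"]
    for source, target in doc.get("terms", {}).items():
        text = text.replace(source, target)
    return {**doc, "translation": text}


# Registry the broker dispatches on; services could equally be remote.
SERVICES: Dict[str, Service] = {
    "e-terminology": e_terminology,
    "e-translation": e_translation,
}


def broker(doc: Document, pipeline: List[str]) -> Document:
    """Invoke the named services in order, feeding each the previous output."""
    for name in pipeline:
        doc = SERVICES[name](doc)
    return doc


result = broker({"text": "restart the printer"},
                ["e-terminology", "e-translation"])
```

Because each service only reads and extends the document, new services can be added to the registry without touching the existing ones, which is the modularity the slide emphasizes.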

Page 33

FREME: All you need is standards

• HTTP to make web service requests

• No dependency on a given programming language

• Standards to represent enrichment information

• See next slide

• Write a wrapper for your existing tools to enable them to produce & consume the enrichment information

• Enable distributed data and language technology services

Page 34

XLIFF OMOS TC – Quick Facts

• Convened 8th Dec 2015, made great progress since

• Members: BYU, ENLASO, Hoyos Labs, Genivia, Intel, LRC, Microsoft, SDL, Spartan Software, TCD – ADAPT, UNIGE, Vistatec, WIPRO, +2 Individual

• Charter https://www.oasis-open.org/committees/xliff-omos/charter.php

• Purpose – Even more interoperability, NOT ONLY through XML. Take the data model to new environments. Facilitate roundtrip among XML and JSON pipelines and more..

• Scope/Deliverables

• Abstract Object Model for XLIFF 2 (XLIFF 2.x) https://github.com/oasis-tcs/xliff-omos-om/

• JSON Serialization of that model https://github.com/oasis-tcs/xliff-omos-jliff

• TBX mapping, TMX next

• Etc.

• IPR Mode – Non-Assertion

Page 35

XLIFF OMOS TC – Membership


Bryan Schnabel, Felix Sasaki

Page 36

XLIFF OMOS TC – Charter – Scope/Deliverables

https://www.oasis-open.org/committees/xliff-omos/charter.php

• Purpose – Even more interoperability, NOT ONLY through XML. Take the data model to new environments. Facilitate roundtrip among XML and JSON pipelines and more..

• Scope/Deliverables

• Abstract Object Model for XLIFF 2 (XLIFF 2.x) – serialization independent

• JSON Serialization of that model -> JLIFF 1

• TMX next – major new version with inline data model consistent with XLIFF 2

• Mappings: TBX-Basic mapping of the XLIFF 2 Glossary module (gls)

• APIs

• Reference Architectures, SOA, ESB ..
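
The idea of a serialization-independent object model that round-trips between XML and JSON can be shown on a deliberately tiny subset of XLIFF 2.0. In the Python sketch below, the XLIFF namespace, element names, and the `srcLang`/`trgLang` attributes are those of XLIFF 2.0; the intermediate JSON shape is an ad hoc assumption and is not JLIFF. The sketch handles one segment per unit, a single file with a hard-coded id, and no inline markup.

```python
import json
import xml.etree.ElementTree as ET

NS = "urn:oasis:names:tc:xliff:document:2.0"
Q = "{%s}" % NS  # qualified-name prefix for ElementTree lookups

XLIFF = """<xliff xmlns="urn:oasis:names:tc:xliff:document:2.0"
       version="2.0" srcLang="en" trgLang="de">
  <file id="f1">
    <unit id="u1">
      <segment>
        <source>Hello</source>
        <target>Hallo</target>
      </segment>
    </unit>
  </file>
</xliff>"""


def xliff_to_obj(xml_text: str) -> dict:
    """Read the XML into a plain object model (ad hoc shape, not JLIFF)."""
    root = ET.fromstring(xml_text)
    units = []
    for unit in root.iter(Q + "unit"):
        seg = unit.find(Q + "segment")
        units.append({
            "id": unit.get("id"),
            "source": seg.findtext(Q + "source"),
            "target": seg.findtext(Q + "target"),
        })
    return {"srcLang": root.get("srcLang"),
            "trgLang": root.get("trgLang"),
            "units": units}


def obj_to_xliff(obj: dict) -> str:
    """Write the object model back out as XLIFF 2.0 XML."""
    ET.register_namespace("", NS)
    root = ET.Element(Q + "xliff", {"version": "2.0",
                                    "srcLang": obj["srcLang"],
                                    "trgLang": obj["trgLang"]})
    file_el = ET.SubElement(root, Q + "file", {"id": "f1"})
    for u in obj["units"]:
        unit = ET.SubElement(file_el, Q + "unit", {"id": u["id"]})
        seg = ET.SubElement(unit, Q + "segment")
        ET.SubElement(seg, Q + "source").text = u["source"]
        ET.SubElement(seg, Q + "target").text = u["target"]
    return ET.tostring(root, encoding="unicode")


obj = xliff_to_obj(XLIFF)
as_json = json.dumps(obj)                       # XML -> JSON
round_tripped = xliff_to_obj(obj_to_xliff(json.loads(as_json)))  # and back
```

A real object model has to cover far more (multiple files, inline codes, modules), which is exactly why the TC defines it once, independently of any serialization.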

Page 37

XLIFF OMOS TC – Charter – IPR Mode

IPR Mode

Non-Assertion

A progressive IPR mode, a kind of RF but even easier on both IP owners and implementers. Great for Open Source adoption!

No need to negotiate RF licensing conditions for essential use of IP in standards implementations

Page 38

XLIFF OMOS TC – Charter – Audience

Audience

• Multilingual content and software architects and strategists, multilingual content publishers

• GILT services architects and developers

• Content owners and managers that seek to publish their content in multiple localized versions

• Software providers for internationalization, localization, and translation tools and processes, including language technology components

• Technical communicators employing localization tools and processes for multilingual publishing of their content

• Localization service providers who need to interact seamlessly with localizable and localized content of their customers

Page 40

XLIFF OMOS TC – Audience – FEISGILTT http://locworld.com/feisgiltt2016-cfp/

7th XLIFF Symposium

• Hot topics

• XLIFF Object Model

• XLIFF in JSON

Federated Interoperability Track

1st TMX Symposium

• Is this the time for TMX 2.0? If you are a stakeholder in the TMX community heavily relying on TMX 1.4b, we want to hear your feature wishlist. TMX related submissions can be proposed on the FEISGILTT EasyChair https://easychair.org/conferences/?conf=feisgiltt2016

Page 41

XLIFF OMOS TC – Audience - Join

Public TC page

https://www.oasis-open.org/committees/tc_home.php?wg_abbrev=xliff-omos

• Join

• Send a comment

https://www.oasis-open.org/committees/comments/index.php?wg_abbrev=xliff-omos

• Publicly archived mailing lists

https://lists.oasis-open.org/archives/xliff-omos/

Page 42

Questions?

Page 43

THANK YOU

Want to Contribute? Discussion

Join the Connect Group: www.gala-global.org/tapicc -> TAPICC group

Submit information on projects we have missed so far via the Connect Group

Participate in the public review of our output

Participate in collaborative R&D and content creation, contribute code