Pierre Cadieux
President, i18N Inc.
[email protected]
© 2010 i18N Inc. All rights reserved. AWG-2
Pierre Cadieux
B.Sc. & M.Sc. Computer Science
• Technologist (programmer) at heart
• Twenty years' experience
V.P. Technology at ALIS
• 12 years, starting as a programmer
• Arabic/Farsi bi-di component
• Licensed to Microsoft
Technology Director at Bowne
• First Web Globalization Model
Technology editor, LISA newsletter
Internationalization, U. of Montréal
President, i18N Inc.
© 2010 i18N Inc. All rights reserved. AWG-3
Your Multilingual Product!
Testing
WORKSHOPS
§ Internationalization & Localization Testing
§ Unicode Testing
§ Testing Methods and Tools
SERVICES
• Test plan development
• Test case design
• Outsourced multilingual testing
• Multiple platforms
Development
WORKSHOPS
§ Unicode for Programmers
§ C/C++ Internationalization
§ Java/J2EE Internationalization
§ .NET Internationalization
§ Oracle Internationalization
SERVICES
• On-site or remote coaching
• On-site development
• Outsourced internationalization projects
• Technologies: C/C++/C#, Java, J2EE, .NET, XML, HTML+JS, VB, Oracle, SQL Server,...
Architecture
WORKSHOPS
§ Internationalizing Software Architecture
SERVICES
• Internationalization audit
Requirements
WORKSHOPS
§ Managing Global Requirements
§ All About Internationalization
SERVICES
• Globalization assessment
and many more …
© 2010 i18N Inc. All rights reserved. AWG-4
We don't do translation…
We don't do localization…
We do not compete with localization vendors,
we partner with them.
We help localization vendors
expand their service offering.
© 2010 i18N Inc. All rights reserved. AWG-5
Web Site Globalization
[Diagram: a source site is localized into target sites (German, French, Spanish, Japanese).]
Globalization = Localization and Internationalization
© 2010 i18N Inc. All rights reserved. AWG-6
Content Globalization
[Diagram: SOURCE content (content management system, database, text files, multimedia files) is localized into TARGET(S) (German, French, Spanish, Japanese), each holding the same content types.]
Globalization = Localization and Internationalization
© 2010 i18N Inc. All rights reserved. AWG-7
Automated Localization Cycle
[Diagram: an automated localization cycle moves content from SOURCE to TARGET(S) repositories (content management system, database, text files, multimedia files). It is driven by Workflow data (workflow templates, jobs, metrics) and rests on Internationalization.]
© 2010 i18N Inc. All rights reserved. AWG-8
Central Data Structures
Workflow (workflow templates, jobs, metrics)
• IP-based engine
• Anyone/anywhere
• Stores rules, steps, roles, permissions, org. charts, etc.
• Stores process knowledge
Linguistic Assets (translation memories, glossaries, MT, ...)
• Stores translation knowledge
• TMs, glossaries
• Dictionaries
• Machine Translation
• Content & Context
© 2010 i18N Inc. All rights reserved. AWG-9
Data Maintenance
Workflow Management
• Process definition tool
• GUI interface + script
Linguistic Asset Maintenance
• Linguist’s console
• maintain TMs
• maintain glossaries
• configure MT
Job Management
• Management console
• Job tracking & control
© 2010 i18N Inc. All rights reserved. AWG-10
Content Interface
• Generic content access
• Various repositories
  - files
  - databases
  - CMS
  - CRM
  - Personalization systems
• Various locations (LAN)
© 2010 i18N Inc. All rights reserved. AWG-11
Workflow Interface
• Exchange events
• Integrated Mgmt Interface
• User roles & permissions
• Collaborative workflows…
© 2010 i18N Inc. All rights reserved. AWG-12
Change Detection
• Detects changes
  - modified files
  - DB updated fields
  - CMS events
• May ignore non-essential changes
• Manual or automatic
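One way to picture the change-detection step is a comparison of content fingerprints between two snapshots of the source. The sketch below is only an illustration under stated assumptions: the snapshot layout (a dict of item id to text), the function names, and the choice to treat whitespace-only edits as "non-essential" are all hypothetical, not part of the model.

```python
import hashlib
import re


def fingerprint(text):
    """Hash the text after collapsing whitespace, so layout-only edits are ignored."""
    normalized = re.sub(r"\s+", " ", text).strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def detect_changes(previous, current):
    """Compare two snapshots {item_id: text} and report what changed.

    Hypothetical helper: a real system would also watch DB fields and CMS events.
    """
    old = {item: fingerprint(text) for item, text in previous.items()}
    new = {item: fingerprint(text) for item, text in current.items()}
    return {
        "added": sorted(new.keys() - old.keys()),
        "modified": sorted(k for k in new.keys() & old.keys() if new[k] != old[k]),
        "removed": sorted(old.keys() - new.keys()),
    }
```

Calling `detect_changes` on a schedule would correspond to the automatic mode; calling it on demand corresponds to the manual mode.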
© 2010 i18N Inc. All rights reserved. AWG-13
Job Creation
• Creates meaningful jobs
• Provides content & context
• Specifies target languages
• Specifies other parameters
  - priorities, dependencies
  - deadline, budget
  - preferred resources
© 2010 i18N Inc. All rights reserved. AWG-14
Extraction
• Parses content formats
• Extracts text strings
• May handle attributes
• Converts to an internal format (often XML-based)
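As a rough illustration of extraction, the sketch below parses HTML with Python's standard `html.parser`, pulls out text strings and a few translatable attributes, and returns them as a simple list. The attribute set, the skipped elements, and the tuple-based internal format are assumptions made for this example; a production filter would typically serialize to an XML-based format instead.

```python
from html.parser import HTMLParser

# Assumed for illustration: which attributes carry translatable text.
TRANSLATABLE_ATTRS = {"alt", "title"}
# Elements whose content is code or styling, not text to translate.
SKIPPED_ELEMENTS = {"script", "style"}


class TextExtractor(HTMLParser):
    """Collects (kind, text) units from an HTML document."""

    def __init__(self):
        super().__init__()
        self.units = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_ELEMENTS:
            self._skip_depth += 1
        for name, value in attrs:
            if name in TRANSLATABLE_ATTRS and value:
                self.units.append((f"{tag}@{name}", value))

    def handle_endtag(self, tag):
        if tag in SKIPPED_ELEMENTS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.units.append(("text", text))


def extract(html):
    """Return the translatable units found in an HTML string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.units
```

Note that this sketch is one-way: it does not keep the skeleton needed to merge translated text back into the original file.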
© 2010 i18N Inc. All rights reserved. AWG-15
Segmentation & Leveraging
• Segments text (rules?)
• Analyzes content with:
  - translation memories
  - glossaries
  - machine translation
• Produces re-use info
  - count
  - quality
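Leveraging can be sketched as comparing each segment against the source side of a translation memory and counting words per match category. The example below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for a real fuzzy-matching engine; the 0.75 threshold, the three categories, and the function names are illustrative assumptions.

```python
from difflib import SequenceMatcher


def best_score(segment, tm_sources):
    """Highest similarity ratio (0.0 to 1.0) between a segment and any TM source."""
    return max(
        (SequenceMatcher(None, segment.lower(), src.lower()).ratio() for src in tm_sources),
        default=0.0,
    )


def leverage(segments, tm_sources, fuzzy_threshold=0.75):
    """Return word counts per re-use category: exact, fuzzy, new."""
    report = {"exact": 0, "fuzzy": 0, "new": 0}
    for segment in segments:
        words = len(segment.split())
        if segment in tm_sources:
            category = "exact"
        elif best_score(segment, tm_sources) >= fuzzy_threshold:
            category = "fuzzy"
        else:
            category = "new"
        report[category] += words
    return report
```

The resulting counts are the "re-use info" that a later costing step can consume.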
© 2010 i18N Inc. All rights reserved. AWG-16
Costing & Approval
• Uses leveraging data
• Computes cost
• Generates quote
• Gets approval if required
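The costing step can be sketched as simple arithmetic over leveraged word counts. Everything numeric below is a made-up assumption for illustration (the per-word rate, the payable percentage per match category, and the approval limit); actual pricing is set by contract.

```python
# Hypothetical pricing, in cents to avoid floating-point rounding.
RATE_CENTS_PER_WORD = 20
PAYABLE_PERCENT = {"exact": 30, "fuzzy": 60, "new": 100}


def quote(word_counts, approval_limit_cents=10_000):
    """Build a quote from word counts per match category.

    word_counts: e.g. {"exact": 1000, "fuzzy": 500, "new": 2000}
    """
    lines = {
        category: words * RATE_CENTS_PER_WORD * PAYABLE_PERCENT[category] // 100
        for category, words in word_counts.items()
    }
    total = sum(lines.values())
    return {
        "lines": lines,
        "total_cents": total,
        # Approval is requested only when the total exceeds the limit.
        "needs_approval": total > approval_limit_cents,
    }
```

For example, 1000 exact, 500 fuzzy and 2000 new words come to 6000 + 6000 + 40000 = 52000 cents under these assumed rates, which exceeds the assumed limit and would trigger approval.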
© 2010 i18N Inc. All rights reserved. AWG-17
Work Distribution
• Distributes work to offices and translators
• Vendor management
  - preferred translators
  - domains & certifications
  - other tasks (e.g. DTP)
• Resource leveling
© 2010 i18N Inc. All rights reserved. AWG-18
Translation
• The most important step
• Doing the work (1500-2000 words/day)
• Translator’s workbench:
  - translation memory
  - exact & fuzzy matches
  - terminology
  - concordance searches
© 2010 i18N Inc. All rights reserved. AWG-19
Review (Edit & Proof)
• Review translation
• Access to source & target
• Capacity to correct
• Language quality metrics?
© 2010 i18N Inc. All rights reserved. AWG-20
Functional Testing
• General testing facilities:
  - bug reporting
  - bug tracking
  - bug resolution
• Testing environment:
  - customer staging site
  - in-house replication
© 2010 i18N Inc. All rights reserved. AWG-21
Work Completion
• Milestone: “work done OK”
• Automatic update of central Linguistic Assets
• Perform Linguistic Asset Maintenance:
  - TM merge
  - TM partitioning & tagging
© 2010 i18N Inc. All rights reserved. AWG-22
Delivery & Notification
• Upload to staging site (if not already done in QA)
• WHAT to deliver WHERE
• Global-local collaboration
• Security
© 2010 i18N Inc. All rights reserved. AWG-23
Billing & Collecting
• Invoice generation
• Financial system interface
• Automated billing (customer)
• Automated payment (vendors)
© 2010 i18N Inc. All rights reserved. AWG-24
Job Archival
• Store workflow data
• Data mining
• Business intelligence:
  - quality metrics
  - vendor ratings
  - financial reports
  - process improvement
  - ISO 9000 certification
© 2010 i18N Inc. All rights reserved. AWG-25
The GMS Model
[Diagram: the complete model. CONTENT flows from SOURCE to TARGET(S) repositories (content management system, database, text files, multimedia files) through these WORKFLOW steps: Change Detection, Job Creation, Extraction, Segmentation & Leveraging, Costing & Approval, Work Distribution, Translation, Review (Edit & Proof), Functional Testing, Work Completion, Delivery & Notification, Billing & Collecting, Job Archival. Central data: Workflow (workflow templates, jobs, metrics) and Linguistic Assets (translation memories, glossaries, MT, ...), maintained through Workflow Management, Job Management and Linguistic Asset Maintenance. The whole model rests on Internationalization.]
(Human) Translation is still the central, most important step.
© 2010 i18N Inc. All rights reserved. AWG-26
xml:tm
XML Text Memory
• LISA OSCAR
• xml:tm 1.0, February 2007
• translation & author memory embedded in document
• reduces manual operations, number of files, and cost
• streamlines localization process (see also OAXAL)
© 2010 i18N Inc. All rights reserved. AWG-27
JCR
Java Content Repository API
• JSR 170:2005, JSR 283:2009
• Level 1:
  - read content/metadata
• Level 2:
  - adds write access to content/metadata
• Optional features:
  - version management
  - events (observation)
• JCR compliant: Oracle, MS, IBM, EMC, FileNet, Vignette, OpenText, Interwoven, …
© 2010 i18N Inc. All rights reserved. AWG-28
WfMC
Workflow Management Coalition
• Reference model: 5 interfaces
  - I/1: Process definition (XPDL)
  - I/2: Client applications (WAPI)
  - I/3: Invoked apps (WAPI)
  - I/4: Interoperability (Wf-XML)
  - I/5: Audit Data Spec (v1.1, Sep '98)
• Benefit: seamless integration
• Many other orgs & standards
© 2010 i18N Inc. All rights reserved. AWG-29
More Recent (Workflow) Standardization Players
© M. zur Muehlen 2003 REPRODUCED WITH PERMISSION
Business Process Management Initiative
• Business Process Modeling Language (BPML)
• Business Process Modeling Notation (BPMN)
• Business Process Query Language (BPQL)
Electronic Business XML (ebXML)
• Business Process Specification Schema (BPSS)
OASIS
• Business Transaction Protocol (BTP)
HP Labs / W3C
• Web Services Conversation Language (WSCL)
SUN/BEA / W3C
• Web Services Choreography Interface (WSCI)
IBM
• Web Services Flow Language (WSFL)
Microsoft
• XLANG
DARPA
• DARPA Agent Markup Language – Services (DAML-S)
Business Process Execution Language for Web Services (BPEL4WS)
© 2010 i18N Inc. All rights reserved. AWG-30
Process-related Standards
© M. zur Muehlen 2003 REPRODUCED WITH PERMISSION
© 2010 i18N Inc. All rights reserved. AWG-31
XLIFF
XML Localization Interchange File Format
• v1.2 Feb 2008; v2.0 on-going
• Stores extracted text + skeleton (mostly for software)
• Can also store:
  - binary objects (arbitrary files)
  - project data, history, phase
  - fuzzy and exact matches
  - etc.
• Represents job as it happens
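To make the "extracted text" part concrete, here is a minimal sketch that writes an XLIFF 1.2 document with Python's standard `xml.etree.ElementTree`. It covers only the core `file`, `body`, `trans-unit`, `source` and `target` elements; the function name, the input layout and the `plaintext` datatype are choices made for this example, and skeletons, binary objects and match data are left out.

```python
import xml.etree.ElementTree as ET

XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def build_xliff(units, original, source_lang, target_lang):
    """Serialize {unit_id: (source_text, target_text_or_None)} as XLIFF 1.2."""
    ET.register_namespace("", XLIFF_NS)  # emit XLIFF as the default namespace
    root = ET.Element(f"{{{XLIFF_NS}}}xliff", {"version": "1.2"})
    file_el = ET.SubElement(
        root,
        f"{{{XLIFF_NS}}}file",
        {
            "original": original,
            "source-language": source_lang,
            "target-language": target_lang,
            "datatype": "plaintext",
        },
    )
    body = ET.SubElement(file_el, f"{{{XLIFF_NS}}}body")
    for unit_id, (source, target) in units.items():
        unit = ET.SubElement(body, f"{{{XLIFF_NS}}}trans-unit", {"id": unit_id})
        ET.SubElement(unit, f"{{{XLIFF_NS}}}source").text = source
        if target is not None:  # untranslated units carry no <target> yet
            ET.SubElement(unit, f"{{{XLIFF_NS}}}target").text = target
    return ET.tostring(root, encoding="unicode")
```

A call such as `build_xliff({"1": ("Save", "Enregistrer"), "2": ("Cancel", None)}, "ui.properties", "en", "fr")` yields one translated and one pending unit, which is one way the file can represent a job in progress.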
© 2010 i18N Inc. All rights reserved. AWG-32
TWS
Translation Web Services (spec 1.0.3, May 2007)
• moribund; IDIOM was driver
• QuerySupport
  - services: xlat, DTP, eng.
  - language pairs
• GetQuote & AcceptQuote or ProcessJob
• Delivery & Notification
• TM upload & download
• Security
© 2010 i18N Inc. All rights reserved. AWG-33
Outside In
Oracle Outside In (ex-STELLENT, ex-INSO)
• De facto industry standard
• TextAccess extracts text
• ContentAccess more general
• Supports 260+ file formats
• Used in Stellent QuickView
• UNI-DIRECTIONAL, i.e. does not put text back in!
© 2010 i18N Inc. All rights reserved. AWG-34
ITS
Internationalization Tag Set
• W3C
• very powerful rules to specify localization info for XML docs; identifies:
  - Translatable elements & attributes (translatable attributes are to be avoided)
  - Inline vs. block elements
  - Sub-flows (e.g. inline footnotes)
© 2010 i18N Inc. All rights reserved. AWG-35
SRX
Segmentation Rules eXchange
• LISA OSCAR
• SRX 1.0:2004; SRX 2.0:2008
• Uses regular expressions
• Rules/exceptions per language
• Segmentation WG objectives:
  - describe segmentation rules
  - standardize segmentation
  - standardize word counting
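The idea of regular-expression rules with exceptions can be sketched as follows. Each rule has a "before" and an "after" pattern plus a break/no-break flag, and at each position the first matching rule decides. This mirrors the general shape of SRX rules, but the rule list, the abbreviations and the Python representation are illustrative assumptions, not an SRX parser.

```python
import re

# (is_break, before_pattern, after_pattern); exceptions are listed first
# because the first rule that matches a position wins.
ENGLISH_RULES = [
    (False, r"\b(?:Mr|Mrs|Dr|e\.g|i\.e)\.", r"\s"),  # no break after abbreviations
    (True, r"[.?!]", r"\s+"),                        # break after sentence punctuation
]


def segment(text, rules=ENGLISH_RULES):
    """Split text into segments using ordered break / no-break rules."""
    compiled = [
        (is_break, re.compile(f"(?:{before})\\Z"), re.compile(after))
        for is_break, before, after in rules
    ]
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            # The rule applies when 'before' ends at pos and 'after' starts at pos.
            if before.search(text[:pos]) and after.match(text, pos):
                if is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # first matching rule decides, whether break or exception
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments
```

This sketch trims whitespace from each segment for readability, which a real exchange format would need to account for. Per-language rule sets would simply be different rule lists passed as `rules`.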
© 2010 i18N Inc. All rights reserved. AWG-36
TMX
Translation Memory eXchange
• TMX 1.4b:2005; 2.0 on-going
• TM interchange
• Level 1: Plain Text Only
• Level 2: Text + Markup
• Supported by SDL/TRADOS, Catalyst, GlobalSight, STAR, Google Translate, etc.
• TMX logo certification
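As an illustration of TM interchange, the sketch below reads source/target pairs from a small TMX document using Python's standard `xml.etree.ElementTree`. It targets plain-text (Level 1) content; the sample document, the function name and the handling of inline elements are assumptions for this example.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE_TMX = """<tmx version="1.4">
  <header creationtool="example" creationtoolversion="1.0" segtype="sentence"
          o-tmf="none" adminlang="en" srclang="en" datatype="plaintext"/>
  <body>
    <tu>
      <tuv xml:lang="en"><seg>Hello world</seg></tuv>
      <tuv xml:lang="fr"><seg>Bonjour le monde</seg></tuv>
    </tu>
  </body>
</tmx>"""


def read_tmx(tmx_text, source_lang, target_lang):
    """Return (source, target) pairs for the requested language pair."""
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {}
        for tuv in tu.findall("tuv"):
            seg = tuv.find("seg")
            if seg is not None:
                # Inline elements are not interpreted; their text is concatenated.
                segs[tuv.get(XML_LANG, "").lower()] = "".join(seg.itertext())
        if source_lang in segs and target_lang in segs:
            pairs.append((segs[source_lang], segs[target_lang]))
    return pairs
```

Here `read_tmx(SAMPLE_TMX, "en", "fr")` returns the single English/French pair from the sample.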
© 2010 i18N Inc. All rights reserved. AWG-37
© 2010 i18N Inc. All rights reserved. AWG-38
TBX
TermBase eXchange
• LISA TBX = ISO 30042:2008
• Terminology interchange
• Concept-oriented (language-independent)
• Multilingual, non-directed
• See also TBX-Basic
© 2010 i18N Inc. All rights reserved. AWG-39
OLIF
Open Lexicon Interchange Format (v2.1:2005; v3 Beta available)
• Lexical / terminological exchange; focus on NLP lexicons and MT
• Directed & bilingual
• "Word-sense" oriented (concept within language)
• MT related info: test and actions
© 2010 i18N Inc. All rights reserved. AWG-40
GMX-V
Global Information Management Metrics eXchange - Volume
• LISA OSCAR v1.0:2007
• Word & character counts
• Multiple text units per file that can be translatable or not
• Multiple count categories:
  - inline element count
  - linking inline element count
  - numeric & measurements
  - punctuation, protected, etc.
© 2010 i18N Inc. All rights reserved. AWG-41
SAE J2450
• Translation Quality Metric for the automotive industry
• Seven error categories: wrong term, syntax, omission, agreement, spelling, punctuation, miscellaneous.
• Each can be minor or serious
• Two assignment rules
• Numerical weights
• Now in LISA QA Model app.
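The combination of categories, severities and numerical weights can be sketched as a weighted error count normalized by the size of the source text. The weights in the table below are placeholders assumed for illustration and should be checked against the standard itself; the function name and input layout are likewise hypothetical.

```python
# Assumed weights per category and severity, for illustration only.
WEIGHTS = {
    "wrong term": {"serious": 5, "minor": 2},
    "syntax": {"serious": 4, "minor": 2},
    "omission": {"serious": 4, "minor": 2},
    "agreement": {"serious": 4, "minor": 2},
    "spelling": {"serious": 3, "minor": 1},
    "punctuation": {"serious": 2, "minor": 1},
    "miscellaneous": {"serious": 3, "minor": 1},
}


def quality_score(errors, source_word_count):
    """Weighted error points per source word; lower is better.

    errors: list of (category, severity) tuples recorded by the reviewer.
    """
    if source_word_count <= 0:
        raise ValueError("source_word_count must be positive")
    points = sum(WEIGHTS[category][severity] for category, severity in errors)
    return points / source_word_count
```

With these assumed weights, one serious wrong term and one minor spelling error in a 100-word text give (5 + 1) / 100 = 0.06.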
© 2010 i18N Inc. All rights reserved. AWG-42
LISA QA Model
• v3.1 released Jan 2006
• Quality metric for GILT:
  - online help, manuals, software
• Covers:
  - linguistic testing
  - GUI validation
• Windows application:
  - Extra-fast data entry
  - Reporting
© 2010 i18N Inc. All rights reserved. AWG-43
Localization Model and Standards
[Diagram: the complete GMS model annotated with the standards covered in this presentation: JCR, Outside In, ITS, XLIFF, SRX, GMX-V, TMX, TBX, OLIF2, TWS, Wf-XML, xml:tm, SAE J2450 and the LISA QA Model.]
© 2010, i18N Inc. Developed by P. Cadieux: [email protected]
© 2010 i18N Inc. All rights reserved. AWG-44
For More Info On Standards
xml:tm: http://www.lisa.org/XML-Text-Memory-xml.107.0.html
GMX-V: http://www.lisa.org/Global-information-management-Metrics-eXchange-Volume-GMX-V.105.0.html
SRX: http://www.lisa.org/Segmentation-Rules-e.40.0.html
TWS: http://www.oasis-open.org/committees/trans-ws
XLIFF: http://www.oasis-open.org/committees/xliff
JCR: http://jcp.org/en/jsr/detail?id=283
OLIF: http://www.olif.net
LISA QA Model: http://www.lisa.org/LISA-QA-Model-3-1.124.0.html
SAE J2450: http://standards.sae.org/j2450_200508
TBX: http://www.lisa.org/Term-Base-eXchange.32.0.html
TMX: http://www.lisa.org/Translation-Memory-e.34.0.html
Wf-XML: http://www.wfmc.org/wfmc-wf-xml.html
All links verified valid on October 2, 2010.