Addison Phillips, Chair W3C Internationalization WG Towards the Promised Land: Globalization Developments in Web Standards

Page 1

Addison Phillips, Chair

W3C Internationalization WG

Towards the Promised Land:

Globalization Developments in Web Standards

Page 2

Presenter

• Globalization Architect, Amazon Lab126 (we make the Kindle)

• Chair, W3C Internationalization WG

Page 3

Acknowledgements

• This presentation owes much of its content to these contributors:

– Richard Ishida (W3C Internationalization Activity lead)

– Felix Sasaki (W3C MLW-LT)

– Aharon Lanin (Google, bidi maven)

– Norbert Lindenberg (ES-I18N)

– Koji Ishii (Rakuten)

Page 4

The Web: vastly improved or room for improvement?

• Why “the promised land”? The promise of a multilingual Web is being realized, and new W3C specifications help demonstrate that. Many features are implemented.

• Why only “towards”? We’ve waited a long time. Many features we’ll talk about today are not implemented yet or are only partially implemented.

Page 5

• What issues are more or less solved on the Web?

• What are we doing to address the remaining problems?

• How can you influence the outcomes?

Page 6

جعل شبكة الويب العالمية عالمية حقًّا!

وب جهانی را به درستی جهانی سازیم!

عالمگیر ویب کو حقیقی طور پر عالمگیر بنانا

Համաշխարհային ցանցն իրոք համաշխարհային դարձնելը

ᑖᑦᓱᒪ ᐃᑭᐊᖅᑭᕕᒃ ᓯᓚᕐᔪᐊᓕᒫᒥᒃ ᓈᕆᑎᑉᐹ.

"Дүниежүзілік торды" нағыз дүниежүзілік етеміз!

वर्ल्ड वाईड वेबलाई यथार्थमै विश्वव्यापी बनाउने !

የዓለም አቀፉን ድር በእውነት አለም አቀፍ ማድረግ!

Κάνοντας τον Παγκόσμιο Ιστό πραγματικά Παγκόσμιο

ਵਰਲਡ ਵਾਈਡ ਵੈਬ ਨੂੰ ਵਾਕਈ ਵਿਸ਼ਵ-ਵਿਆਪੀ ਬਨਾਉਣਾ !

缔造真正全球通行的万维网

ליצור מהרשת רשת כלל עולמית באמת!

ˈmeɪkɪŋ ðə wɜːld waɪd wɛb ˈtruːlɪ ˈwɜːldˈwaɪd

ワールド・ワイド・ウェッブを世界中に広げましょう

ធ្វើឲ្យវើលវ៉ាយវេបមានទូទាំងពិភពលោកពិតប្រាកដមែន!

전세계의 월드 와이드 웹으로 만들기 !

Gwneud y we fyd-eang yn wirioneddol fyd-eang!

การทำให้ World Wide Web แพร่หลายไปทั่วโลกอย่างแท้จริง

འཛམ་གླིང་ཡོངས་འབྲེལ་འདི་ ངོ་མ་འབད་རང་ འཛམ་གླིང་ཡོངས་ལུ་ཁྱབ་ཚུགསཔ་བཟོ་བ།

"The Path W3C follows to making text on the Web truly global is Unicode." Tim Berners-Lee

Unicode

Page 7

http://googleblog.blogspot.com/2012/02/unicode-over-60-percent-of-web.html

Unicode

Page 8

Encoding declarations

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang='en'>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
...

<!DOCTYPE html>
<html>
<head>
<meta charset=utf-8>
</head>
...

• Strong encouragement to use UTF-8.

• New meta charset declaration. Either approach will work, but check you don't have both.

• Must be completely within the first 1024 bytes of the file.
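The 1024-byte rule can be checked mechanically. The following is a minimal sketch, not the encoding sniffing algorithm that HTML5 defines: `findEarlyCharset` and `PRESCAN_LIMIT` are hypothetical names, and the regular expression only approximates how a browser reads a `<meta>` declaration.

```javascript
// Hypothetical helper: a rough approximation of the rule above,
// not the prescan algorithm defined by the HTML5 spec.
const PRESCAN_LIMIT = 1024;

function findEarlyCharset(bytes) {
  // Map each byte to one character so string offsets equal byte offsets.
  const head = Array.from(bytes.subarray(0, PRESCAN_LIMIT), (b) =>
    String.fromCharCode(b)
  ).join("");
  // Accepts <meta charset=utf-8> and the http-equiv form; the closing ">"
  // must also fall inside the window, so a truncated tag does not count.
  const match =
    /<meta\b[^>]*?charset\s*=\s*["']?\s*([A-Za-z0-9._:-]+)[^>]*>/i.exec(head);
  return match ? match[1].toLowerCase() : null;
}

const encoder = new TextEncoder();
const early = encoder.encode(
  "<!DOCTYPE html><html><head><meta charset=utf-8></head>"
);
const late = encoder.encode(
  "<!DOCTYPE html><!--" + "x".repeat(2000) + "--><meta charset=utf-8>"
);

console.log(findEarlyCharset(early)); // "utf-8"
console.log(findEarlyCharset(late)); // null (declaration starts after byte 1024)
```

A build step could run a check like this over generated pages and flag any file where the result is `null`.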

Page 9

HTML5 Encoding Spec

• Rules for determining, parsing, handling legacy encodings.

Page 10

<h2><a id="რჩეული">რჩეული ფოტოსურათი</a></h2>

<p><a href="/wiki/ჭიამაია" title="ჭიამაია" class="mw-redirect">ჭიამაია</a> (Coccinellidae), ხოჭოების ოჯახს ეკუთვნის. აქვს ამობურცული, მომრგვალო ან ოვალური სხეული. ზურგზე ღია ფონზე შავი ლაქები აყრია, იშვიათად ...

Unicode versions and ids

Page 11

History: CharMod

CharMod was the start of the Internationalization Activity, based on requirements originally published in 1998. So how is this news?

Page 12

NFD: I◌́zeli◌́to◌̋u◌̈l

NFC: Ízelítőül

Ha a világ beszélni akarna, Unicode-ul szólalna meg. Regisztráljon már most a Tizedik Nemzetközi Unicode Konferenciára, melyet 1997. március 10-12-én rendeznek Meinz-ban, Németországban. Ezen a konferencián az iparág több neves szakértője is résztvesz. Ízelítőül a témákból: a világháló és a Unicode nemzetközisítése és lokalizálása, a Unicode alkalmazása működő rendszerekben és alkalmazásokban, szövegelrendezésnél, és többnyelvű számítógépeken.

Normalization
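The difference between the two forms is easy to see from script. This sketch uses the standard `String.prototype.normalize` method on the sample word from the slide; `sameText` is a hypothetical helper name, and comparing in NFC is one reasonable choice rather than a requirement.

```javascript
// "Ízelítőül" written with precomposed letters (NFC).
const composed = "\u00CDzel\u00EDt\u0151\u00FCl";
// The same word with each accent split into a combining mark (NFD).
const decomposed = composed.normalize("NFD");

console.log(composed.length); // 9
console.log(decomposed.length); // 13 (four letters gained a combining mark)
console.log(composed === decomposed); // false: code-unit comparison sees a difference

// Hypothetical helper: compare two strings after bringing both to NFC.
function sameText(a, b) {
  return a.normalize("NFC") === b.normalize("NFC");
}

console.log(sameText(composed, decomposed)); // true
```

Identifiers, ids, and selectors that look identical on screen can fail to match for exactly this reason.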

Page 13

Evolution & Revolution

Page 14

W3C ،نشاط التدويل

W3Cنشاط التدويل، ✘

✔<description dir="rtl">W3Cنشاط التدويل، </description>

Bidirectional text support

Page 15

Bidi isolation for inserted text

<span dir=rtl>לילית</span> - 3 reviews✘

Page 16

Bidi isolation for inserted text

• CSS3 added the “isolate” value to the unicode-bidi property.

• HTML5 adds a new <bdi> element, with unicode-bidi:isolate in the default stylesheet.

• The <output> element behaves the same way.
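When text is inserted from a database or user input, a template helper can apply the isolation automatically. A minimal sketch, assuming the page is built by string templating; `escapeHtml` and `wrapInBdi` are hypothetical names, not part of any library.

```javascript
// Hypothetical helper: escape characters that are special in HTML.
function escapeHtml(text) {
  const entities = {
    "&": "&amp;",
    "<": "&lt;",
    ">": "&gt;",
    '"': "&quot;",
    "'": "&#39;",
  };
  return text.replace(/[&<>"']/g, (ch) => entities[ch]);
}

// Hypothetical helper: wrap inserted text in <bdi> so its direction
// cannot affect the surrounding text (here, the "- 3 reviews" part).
function wrapInBdi(text) {
  return `<bdi>${escapeHtml(text)}</bdi>`;
}

const restaurant = "לילית";
console.log(`${wrapInBdi(restaurant)} - 3 reviews`);
```

The design choice is to isolate every inserted value, since the template author usually cannot know the direction of the data in advance.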

Page 18

Determining direction at run time

• HTML5 adds new “auto” value for the dir attribute.

• CSS3 adds a “plaintext” value to the unicode-bidi property to allow per-paragraph auto-direction, primarily for use on <textarea> and <pre> elements.

• dir=auto sets the unicode-bidi CSS property to “plaintext” for <textarea> and <pre> elements, to “bidi-override isolate” for <bdo> elements, and to “isolate” otherwise.

• dir=auto estimates the direction from the first strong character, following the UBA method.

<p>Your search - <span class=booktitle dir=auto>הצהרות CSS</span> - did not match any documents.</p>
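The first-strong heuristic can be approximated in a few lines. This is a simplified sketch, not the Unicode Bidirectional Algorithm: it assumes that letters in the Hebrew and Arabic code point ranges are strong right-to-left and every other letter is strong left-to-right. `estimateDirection` is a hypothetical name.

```javascript
// Assumption: these ranges (Hebrew through Arabic Extended, plus the
// Hebrew and Arabic presentation forms) stand in for the real bidi classes.
const RTL_RANGES = /[\u0590-\u08FF\uFB1D-\uFDFF\uFE70-\uFEFC]/;
const LETTER = /\p{L}/u;

// Return the direction of the first letter found; digits, punctuation,
// and spaces are skipped because they are not strong characters.
function estimateDirection(text, fallback = "ltr") {
  for (const ch of text) {
    if (!LETTER.test(ch)) continue;
    return RTL_RANGES.test(ch) ? "rtl" : "ltr";
  }
  return fallback;
}

console.log(estimateDirection("הצהרות CSS")); // "rtl"
console.log(estimateDirection("CSS הצהרות")); // "ltr"
console.log(estimateDirection("123 - ?")); // "ltr" (no strong character, fallback used)
```

Where the browser supports it, `dir=auto` remains the better option; a script like this is only useful when the direction has to be decided outside the browser.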

Page 19

Unicode Isolate Controls

Four new code points:

• U+2066 LEFT-TO-RIGHT ISOLATE (LRI)

• U+2067 RIGHT-TO-LEFT ISOLATE (RLI)

• U+2068 FIRST STRONG ISOLATE (FSI)

• U+2069 POP DIRECTIONAL ISOLATE (PDI)

FSIפיצה!PDI - 3 reviews ==> !3 - פיצה reviews
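In plain text, where no markup is available, the same effect comes from wrapping the inserted string in the control characters listed above. A minimal sketch; `isolateWithControls` is a hypothetical name, and whether the result displays correctly depends on the renderer supporting these code points.

```javascript
// The four isolate controls from the list above.
const ISOLATE_START = { ltr: "\u2066", rtl: "\u2067", auto: "\u2068" }; // LRI, RLI, FSI
const PDI = "\u2069";

// Hypothetical helper: wrap text in an isolate; "auto" uses FSI, which
// takes its direction from the first strong character of the text.
function isolateWithControls(text, dir = "auto") {
  const start = ISOLATE_START[dir];
  if (start === undefined) {
    throw new RangeError(`unknown direction: ${dir}`);
  }
  return start + text + PDI;
}

const label = `${isolateWithControls("פיצה!")} - 3 reviews`;
console.log(label.codePointAt(0).toString(16)); // "2068"
console.log(label.includes(PDI)); // true
```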

Page 20

Unicode Isolates -> HTML Markup

• http://www.w3.org/International/wiki/Html-bidi-isolation Needs comments!

– @direction (isolating)

– Other options rejected:

• Change dir to be isolating

• Use <bdi> for isolation

• Add ‘rli’ ‘lri’ to @dir (<span dir="rli">)

• Add @isolate (<span dir="rtl" isolate>)

Page 21

Other bidi changes

• Reporting the chosen direction of <input> and <textarea> in form submissions (@dirname)

• <br> should serve as a bidi separator

• Block elements as bidi separators (isolating)

• <title> supports the dir attribute

• <option> supports the dir attribute and is displayed accordingly, both in the dropdown and after being chosen

Page 22

Implementers of user agents need to be prodded by the public to support the developing marketplace!

[Figure: hanging, alphabetic, and ideographic baselines, illustrated with Latin, Bengali, Han, and Devanagari sample characters]

CSS3

Page 23

Requirements for Japanese Layout

Page 24

What about my language?

• Other language groups interested in building documents can do so

– Korean nearing FPWD

– Indic languages

– ???

Page 25

Vertical text

Page 26

Writing Mode

CSS3 has a new module for “writing mode” that supports vertical text.

http://www.w3.org/TR/css3-writing-modes/

Page 27

Ruby annotation

Page 28

<ruby>凝<rt>ぎょう</rt>視<rt>し</rt></ruby>

<ruby><rb>凝</rb><rt>ぎょう</rt></ruby> <ruby><rb>視</rb><rt>し</rt></ruby>

<ruby><rbc><rb>凝</rb><rb>視</rb></rbc><rtc><rt>ぎょう</rt><rt>し</rt></rtc></ruby>

Ruby annotation

Page 29

Ruby Annotation

• http://rishida.net/misc/ruby/ruby-authoring.html

Page 30

Zusätzlich erleichtert PLS die Eingrenzung von Anwendungen, indem es Aussprachebelange von anderen Teilen der Anwendung abtrennt.

* { hyphens: auto; }

Zusätzlich er-leichtert PLS die Eingrenzung von Anwendungen, in-dem es Aussprac-hebelange von an-deren Teilen der Anwendung ab-trennt.

Hyphenation

Page 31

Hyphenation Support

• Hyphenation support is starting to become available.

– Still works best with embedded (server-side) hinting

– Language support??

Still in flux… development needed

Page 33

<!DOCTYPE html>
<html lang=it>
<head>
<meta http-equiv=Content-Language content="en, it">
</head>
...

• Attributes indicate the language of text inside that element for text processors. Only one language value allowed.

• Meta elements indicate the language of the expected readership. Multiple languages are ok.

• Attributes override other declarations.

• The meta element with Content-Language is now non-conforming.

Language declarations

Page 34

BCP 47 improvements

• Basis for Java 7, JavaScript, PHP, .NET, and other locale systems

• -u- extension

– Unicode Locales (RFC 6067)

• -t- extension

– Transliterations and transformations (RFC 6497)

• :lang pseudo-class

– CSS selection
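A language tag with a -u- extension can be taken apart from script. This sketch assumes a runtime that provides `Intl.Locale` and `Intl.getCanonicalLocales`, both of which arrived in JavaScript engines after this talk; the tags themselves are illustrative.

```javascript
// German with phonebook collation requested through the -u- extension.
const german = new Intl.Locale("de-DE-u-co-phonebk");
console.log(german.language); // "de"
console.log(german.region); // "DE"
console.log(german.collation); // "phonebk"

// Thai with the Buddhist calendar and Thai digits.
const thai = new Intl.Locale("th-TH-u-ca-buddhist-nu-thai");
console.log(thai.calendar); // "buddhist"
console.log(thai.numberingSystem); // "thai"

// Tags are case-insensitive; canonicalization fixes the casing.
console.log(Intl.getCanonicalLocales("EN-us-U-CO-PHONEBK")); // [ "en-US-u-co-phonebk" ]
```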

Page 35

<time datetime="2004-08-08">8 สิงหาคม ๒๕๔๗</time>

<form>

<input type="date">

</form>

Improved Date/Time Support
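The `<time>` element pairs a machine-readable value with text for the reader, and the two can be generated from the same date. A minimal sketch using the ECMAScript Internationalization API; `timeElement` is a hypothetical helper, and the exact localized text depends on the locale data shipped with the runtime.

```javascript
// Hypothetical helper: build a <time> element whose datetime attribute is
// locale-neutral while the visible text follows the requested locale.
function timeElement(date, locale) {
  const machine = date.toISOString().slice(0, 10); // YYYY-MM-DD in UTC
  const human = new Intl.DateTimeFormat(locale, {
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC", // keep the visible date in step with the UTC attribute
  }).format(date);
  return `<time datetime="${machine}">${human}</time>`;
}

const date = new Date(Date.UTC(2004, 7, 8)); // 8 August 2004

console.log(timeElement(date, "en-US"));
// Buddhist calendar and Thai digits requested through the -u- extension.
console.log(timeElement(date, "th-TH-u-ca-buddhist-nu-thai"));
```

With full locale data, the Thai output should show the Buddhist-era year 2547 in Thai digits, matching the markup on the slide.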

Page 36

Locale Sensitivity

• Still an issue for the Web

– Date pickers not locale or language sensitive

– No markup-based control over format

– Time zone support is spotty

Page 37

JavaScript gets locales at last!

• ECMAScript ‘intl’ extension work

– Locales based on BCP 47 language tags

– Date, number formatting

– Collation

– and more…

• Core spec addressing Unicode needs, particularly supplementary character support

http://wiki.ecmascript.org/doku.php?id=strawman:i18n_api
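The supplementary character issue is visible in any string that contains a code point above U+FFFF. This sketch relies on features added to the core language in ECMAScript 2015 (`\u{...}` escapes, string iteration, `codePointAt`); `countCodePoints` is a hypothetical helper name.

```javascript
// "a", U+1F600 (a supplementary character), "b"
const text = "a\u{1F600}b";

console.log(text.length); // 4: length counts UTF-16 code units
console.log(text.codePointAt(1).toString(16)); // "1f600"
console.log(text.charCodeAt(1).toString(16)); // "d83d": only the high surrogate

// Hypothetical helper: string iteration walks code points, not code units.
function countCodePoints(s) {
  return Array.from(s).length;
}

console.log(countCodePoints(text)); // 3
```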

Page 38

ES I18N Spec

• Internationalization API Specification

• Developed by ECMA TC 39 + experts

• Collation, number, date & time formatting

• Started fall 2010

• Implementations and test suite in progress

• Approved in December 2012
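Collation and number formatting from this API look like the following in practice. A minimal sketch, assuming a runtime with locale data for German and Swedish; the word list and the number are illustrative.

```javascript
const words = ["z", "ä", "a"];

// German sorts "ä" next to "a"; Swedish treats it as a letter after "z".
const german = [...words].sort(new Intl.Collator("de").compare);
const swedish = [...words].sort(new Intl.Collator("sv").compare);

console.log(german.join(" ")); // "a ä z"
console.log(swedish.join(" ")); // "a z ä"

// Grouping and decimal separators follow the locale as well.
console.log(new Intl.NumberFormat("de-DE").format(1234567.891)); // "1.234.567,891"
console.log(new Intl.NumberFormat("en-US").format(1234567.891)); // "1,234,567.891"
```

Passing `compare` straight to `sort` works because the collator's `compare` is a bound function.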

Page 40

Webapps at W3C

• Various technologies that make Web-based applications possible are under development. Some samples:

– IDL

– Web sockets, Web storage, Web workers

– XHR

– Widgets

– Selectors

– File APIs

– DOM

Page 41

The Widget Spec

• Widget containers deliver “apps” cross-platform based on HTML5

– Extensive localization model

– Ability to set base locale

<widget xmlns="http://www.w3.org/ns/widgets" defaultlocale="en">
<name short="Weather"> Weather! a totally awesome application! </name>
<name short="آب و هوا" xml:lang="fa" dir="rtl"><span dir="ltr" xml:lang="en">Weather!</span> برنامه واقعا بزرگ</name>
</widget>

Page 42

ITS 2.0

• Internationalization Tag Set (ITS) 2.0

• Currently being defined in the W3C MultilingualWeb-LT Working Group

• Latest draft: 6 December 2012 (“Last Call”) http://www.w3.org/TR/its20/

• WG homepage: http://www.w3.org/International/multilingualweb/lt/

• ITS 2.0 test suite: https://github.com/finnle/ITS-2.0-Testsuite/

Page 43

“Translate” locally in HTML5 or XML (example: DocBook)

<!DOCTYPE html>
<html> ...
<p>The <span translate=no>World Wide Web Consortium</span> is making the World Wide Web worldwide!</p>
...
</html>

<db:article ...>
<db:para>The <db:emphasis its:translate="no">World Wide Web Consortium</db:emphasis> is making the World Wide Web worldwide!</db:para>
...
</db:article>

Part of HTML5!!

Page 44

markup for bidirectional text

normalization

working with case sensitivity

more information about date & time

Capturing guidance for spec developers and implementers (and you)

Page 45

Tests

Page 46

Articles

Tutorials

Technical notes

Tests

Talks

Tools

Reviews

http://www.w3.org/International/

Internationalization Activity

Page 47

http://validator.w3.org/i18n-checker/

Checker tool

1. Discover

2. Check

Page 48

Get involved!

• Follow the discussions on the internationalization mailing lists (e.g., www-international), and track other technologies for internationally relevant topics. Follow our RSS feeds and Twitter channels (@webi18n and @multilingweb).

• Read and review specifications (http://www.w3.org/TR/tr-technology-drafts) and send comments to the www-international list or direct to the Working Group.

• Discuss local requirements for the Multilingual Web, and if you identify missing features, find ways to coordinate proposals.

• Use features needed for non-Latin script support and push implementers to include more in browsers and authoring tools.

• Join the Working Group

Page 49

The Web needs your help

this is your Web – not the W3C's

we need you to make the Web worldwide

get involved

Thank you

http://www.inter-locale.com/whitepaper/imug2013