Addison Phillips, Chair
W3C Internationalization WG
Towards the Promised Land:
Globalization Developments in Web Standards
Presenter
• Globalization Architect, Amazon Lab126 (we make the Kindle)
• Chair, W3C Internationalization WG
Acknowledgements
• This presentation owes much of its content to these contributors:
– Richard Ishida (W3C Internationalization Activity lead)
– Felix Sasaki (W3C MLW-LT)
– Aharon Lanin (Google, bidi maven)
– Norbert Lindenberg (ES-I18N)
– Koji Ishii (Rakuten)
The Web: vastly improved or room for improvement?
• Why “the promised land”? The promise of a multilingual Web is being realized, and new W3C specifications help demonstrate that. Many features are implemented.
• Why only “towards”? We’ve waited a long time. Many features we’ll talk about today are not implemented yet or are only partially implemented.
• What issues are more or less solved on the Web?
• What are we doing to address the remaining problems?
• How can you influence the outcomes?
جعل شبكة الويب العالمية عالمية حقًا!
وب جهانی را به درستی جهانی سازیم!
عالمگیر ویب کو حقیقی طور پر عالمگیر بنانا
Համաշխարհային ցանցն իրոք համաշխարհային դարձնելը
ᑖᑦᓱᒪ ᐃᑭᐊᖅᑭᕕᒃ ᓯᓚᕐᔪᐊᓕᒫᒥᒃ ᓈᕆᑎᑉᐹ.
"Дүниежүзілік торды" нағыз дүниежүзілік етеміз!
वर्ल्ड वाईड वेबलाई यथार्थमै विश्वव्यापी बनाउने !
የዓለም አቀፉን ድር በእውነት አለም አቀፍ ማድረግ!
Κάνοντας τον Παγκόσμιο Ιστό πραγματικά Παγκόσμιο
ਵਰਲਡ ਵਾਈਡ ਵੈਬ ਨੂੰ ਵਾਕਈ ਵਿਸ਼ਵ-ਵਿਆਪੀ ਬਨਾਉਣਾ !
缔造真正全球通行的万维网
ליצור מהרשת רשת כלל עולמית באמת!
ˈmeɪkɪŋ ðə wɜːld waɪd wɛb ˈtruːlɪ ˈwɜːldˈwaɪd
ワールド・ワイド・ウェッブを世界中に広げましょう
ធ្វើឲ្យវើលវ៉ាយវិបមានទូទាំងពិភពលោកពិតប្រាកដមែន!
전세계의 월드 와이드 웹으로 만들기 !
Gwneud y we fyd-eang yn wirioneddol fyd-eang!
การทำให้ World Wide Web แพร่หลายไปทั่วโลกอย่างแท้จริง
འཛམ་གླིང་ཡོངས་འབྲེལ་འདི་ ངོ་མ་འབད་རང་ འཛམ་གླིང་ཡོངས་ལུ་ཁྱབ་ཚུགསཔ་བཟོ་བ།
"The Path W3C follows to making text on the Web truly global is Unicode." Tim Berners-Lee
Unicode
http://googleblog.blogspot.com/2012/02/unicode-over-60-percent-of-web.html
Unicode
Encoding declarations
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
"http://www.w3.org/TR/html4/strict.dtd">
<html lang='en'>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
</head>
...
<!DOCTYPE html>
<html>
<head>
<meta charset=utf-8>
</head>
...
• Strong encouragement to use UTF-8.
• New meta charset declaration. Either approach will work, but check you don't have both.
• The declaration must fit entirely within the first 1024 bytes of the file.
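A rough way to check the 1024-byte rule from script is sketched below. The function name findMetaCharset is made up for illustration, and the regular expression is a deliberate simplification, not the HTML5 encoding-sniffing algorithm.

```javascript
// Hypothetical helper (not part of any spec or library): a simplified check,
// NOT the HTML5 prescan algorithm. It looks for either <meta charset=...> or
// the older http-equiv form, but only inside the first 1024 bytes.
function findMetaCharset(html) {
  const bytes = new TextEncoder().encode(html); // UTF-8 bytes of the source
  // Decode only the first 1024 bytes; a declaration cut off by the limit will not match.
  const head = new TextDecoder("utf-8").decode(bytes.slice(0, 1024));
  const pattern = /<meta\s[^>]*charset\s*=\s*["']?([A-Za-z0-9_:.-]+)[^>]*>/i;
  const match = pattern.exec(head);
  return match ? match[1].toLowerCase() : null;
}

// A declaration near the top is found.
console.log(findMetaCharset("<!DOCTYPE html><html><head><meta charset=utf-8></head>"));

// A declaration pushed past the first 1024 bytes by a long comment is not.
const padded = "<!DOCTYPE html><html><head><!--" + "x".repeat(1100) + "--><meta charset=utf-8></head>";
console.log(findMetaCharset(padded));
```

Long comments or inline scripts ahead of the declaration are the usual reason it slips past the limit, so it is safest to put it first inside head.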
HTML5 Encoding Spec
• Rules for determining, parsing, handling legacy encodings.
<h2><a id="რჩეული">რჩეული ფოტოსურათი</a></h2>
<p><a href="/wiki/ჭიამაია" title="ჭიამაია" class="mw-redirect">ჭიამაია</a> (Coccinellidae), ხოჭოების ოჯახს ეკუთვნის. აქვს ამობურცული, მომრგვალო ან ოვალური სხეული. ზურგზე ღია ფონზე შავი ლაქები აყრია, იშვიათად ...
Unicode versions and ids
History: CharMod
CharMod (Character Model for the World Wide Web) was the start of the Internationalization Activity, based on requirements originally published in 1998. So how is this news?
NFD: I◌́zeli◌́to◌̋u◌̈l
NFC: Ízelítőül
Ha a világ beszélni akarna, Unicode-ul szólalna meg. Regisztráljon már most a Tizedik Nemzetközi Unicode Konferenciára, melyet 1997. március 10-12-én rendeznek Meinz-ban, Németországban. Ezen a konferencián az iparág több neves szakértője is résztvesz. Ízelítőül a témákból: a világháló és a Unicode nemzetközisítése és lokalizálása, a Unicode alkalmazása működő rendszerekben és alkalmazásokban, szövegelrendezésnél, és többnyelvű számítógépeken.
Normalization
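The NFC/NFD difference in the Hungarian word above can be seen from script. This is a small sketch using the standard String.prototype.normalize method; the helper name sameText is an assumption made for illustration.

```javascript
// "Ízelítőül" written with escapes so the starting form is unambiguous (precomposed letters).
const nfc = "\u00CDzel\u00EDt\u0151\u00FCl".normalize("NFC");
const nfd = nfc.normalize("NFD"); // base letters followed by combining marks

console.log([...nfc].length); // 9 code points
console.log([...nfd].length); // 13 code points: four accents became separate marks
console.log(nfc === nfd);     // false: same text to a reader, different code points

// Hypothetical helper: compare two strings after bringing both to the same form.
function sameText(a, b) {
  return a.normalize("NFC") === b.normalize("NFC");
}

console.log(sameText(nfc, nfd)); // true
```

Comparing identifiers, ids, or search keys without normalizing first is where the two forms usually cause trouble.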
Evolution & Revolution
✘
W3C ،نشاط التدويل
W3Cنشاط التدويل، ✘
✔<description dir="rtl">W3Cنشاط التدويل، </description>
Bidirectional text support
Bidi isolation for inserted text
✘ <span dir=rtl>לילית</span> - 3 reviews
• CSS3 added the “isolate” value to the unicode-bidi property.
• HTML5 adds a new <bdi> element, with unicode-bidi:isolate in the default stylesheet.
• The <output> element behaves the same way.
Determining direction at run time
• HTML5 adds new “auto” value for the dir attribute.
• CSS3 adds a “plaintext” value to the unicode-bidi property to allow per-paragraph auto-direction, primarily for use on <textarea> and <pre> elements.
• dir=auto sets the unicode-bidi CSS property to “plaintext” for <textarea> and <pre> elements, to “bidi-override isolate” for <bdo> elements, and to “isolate” otherwise.
• dir=auto estimates the direction from the first strong character in the text, following the Unicode Bidirectional Algorithm (UBA).
<p>Your search - <span class=booktitle dir=auto> תורהצה
CSS</span> - did notיוותדודיק תתתת match any documents.</p>
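The first-strong-character idea behind dir=auto can be sketched in script. This is a simplification under stated assumptions: the function name estimateDirection is made up, only four right-to-left scripts are listed, and script membership stands in for the Bidi_Class property that the real UBA uses.

```javascript
// Simplified sketch: letters in these scripts count as strong right-to-left,
// any other letter counts as strong left-to-right. The real algorithm uses
// the Unicode Bidi_Class property, which JavaScript regexes do not expose.
const RTL_SCRIPT = /[\p{Script=Hebrew}\p{Script=Arabic}\p{Script=Syriac}\p{Script=Thaana}]/u;
const LETTER = /\p{L}/u;

// Hypothetical helper: returns "rtl" or "ltr" based on the first letter found.
function estimateDirection(text) {
  for (const ch of text) {            // iterate by code point, not by code unit
    if (!LETTER.test(ch)) continue;   // digits, punctuation, and spaces are not strong
    return RTL_SCRIPT.test(ch) ? "rtl" : "ltr";
  }
  return "ltr"; // no strong character at all: this sketch falls back to ltr
}

console.log(estimateDirection("\u05E4\u05D9\u05E6\u05D4! - 3 reviews")); // "rtl" (Hebrew first)
console.log(estimateDirection("CSS \u05D4\u05E6\u05D4\u05E8\u05D5\u05EA")); // "ltr" (Latin first)
console.log(estimateDirection("123 - ...")); // "ltr" (nothing strong)
```

As the second call shows, a first-strong estimate can guess wrong for text that merely starts with a Latin acronym, which is why it is an estimate and not a replacement for explicit markup.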
Unicode Isolate Controls
Four new codepoints:
• U+2066 LEFT-TO-RIGHT ISOLATE (LRI)
• U+2067 RIGHT-TO-LEFT ISOLATE (RLI)
• U+2068 FIRST STRONG ISOLATE (FSI)
• U+2069 POP DIRECTIONAL ISOLATE (PDI)
FSIפיצה!PDI - 3 reviews ==> !3 - פיצה reviews
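In plain text, where markup is not available, inserted strings can be wrapped in the isolate controls from script. A minimal sketch follows; the function names isolate and reviewLine are illustrative only.

```javascript
const FSI = "\u2068"; // FIRST STRONG ISOLATE
const PDI = "\u2069"; // POP DIRECTIONAL ISOLATE

// Hypothetical helper: wrap inserted text so its direction is decided from its
// own first strong character and does not affect the surrounding text.
function isolate(text) {
  return FSI + text + PDI;
}

// Hypothetical template mirroring the "3 reviews" example.
function reviewLine(placeName, count) {
  return `${isolate(placeName)} - ${count} reviews`;
}

const line = reviewLine("\u05E4\u05D9\u05E6\u05D4!", 3);
console.log(line);
console.log(line.startsWith(FSI)); // true: the controls are invisible but present
```

Rendering depends on the display engine supporting the isolate controls, so where HTML is available the markup route (dir and <bdi>) is the safer choice.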
Unicode Isolates -> HTML Markup
• http://www.w3.org/International/wiki/Html-bidi-isolation (needs comments!)
– @direction (isolating)
– Other options rejected:
  • Change dir to be isolating
  • Use <bdi> for isolation
  • Add ‘rli’ ‘lri’ to @dir (<span dir="rli">)
  • Add @isolate (<span dir="rtl" isolate>)
Other bidi changes
• Reporting the chosen direction of <input> and <textarea> in form submissions (@dirname)
• <br> should serve as a bidi separator
• Block elements as bidi separators (isolating)
• <title> supports the dir attribute
• <option> supports the dir attribute and is displayed accordingly, both in the dropdown and after being chosen
Implementers of user agents need to be prodded by the public to support the developing marketplace!
[Figure: sample glyphs from different scripts aligned on the alphabetic, ideographic, and hanging baselines]
CSS3
Requirements for Japanese Layout
What about my language?
• Other language groups interested in building such documents can do so
– Korean nearing FPWD (First Public Working Draft)
– Indic languages
– ???
Vertical text
Writing Mode
CSS3 has a new module for “writing mode” that supports vertical text.
http://www.w3.org/TR/css3-writing-modes/
Ruby annotation
<ruby>凝<rt>ぎょう</rt>視<rt>し</rt></ruby>
<ruby><rb>凝</rb><rt>ぎょう</rt></ruby> <ruby><rb>視</rb><rt>し</rt></ruby>
<ruby><rbc><rb>凝</rb><rb>視</rb></rbc><rtc><rt>ぎょう</rt><rt>し</rt></rtc></ruby>
Ruby annotation
• http://rishida.net/misc/ruby/ruby-authoring.html
Zusätzlich erleichtert PLS die Eingrenzung von Anwendungen, indem es Aussprachebelange von anderen Teilen der Anwendung abtrennt.
* { hyphens: auto; }
Zusätzlich er-leichtert PLS die Eingrenzung von Anwendungen, in-dem es Aussprac-hebelange von an-deren Teilen der Anwendung ab-trennt.
Hyphenation
Hyphenation Support
• Hyphenation support is starting to become available.
– Still works best with embedded (server-side) hinting
– Language support??
Still in flux… development needed
<!DOCTYPE html>
<html lang=it>
<head>
<meta http-equiv=Content-Language content="en, it">
</head>
...
• Attributes indicate the language of text inside that element for text processors. Only one language value allowed.
• Meta elements indicate the language of the expected readership. Multiple languages are ok.
• Attributes override other declarations.
Language declarations
• The meta element with Content-Language is now non-conforming.
BCP 47 improvements
• Basis for Java 7, JavaScript, PHP, .NET and other locale systems
• -u- extension
– Unicode Locales (RFC 6067)
• :lang pseudo-class
– CSS selection
• -t- extension
– Transliterations and transformations (RFC 6497)
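To see what the -u- extension carries, a language tag can be taken apart from script. The sketch below uses Intl.Locale, which was standardized in ECMAScript after this talk was given, so treat it as a present-day illustration and not as something the slides describe; the tags themselves are just examples.

```javascript
// A BCP 47 tag with a -u- extension: Thai, Thailand, Buddhist calendar, Thai digits.
const locale = new Intl.Locale("th-TH-u-ca-buddhist-nu-thai");

console.log(locale.language);        // "th"
console.log(locale.region);          // "TH"
console.log(locale.calendar);        // "buddhist"  (from -u-ca-)
console.log(locale.numberingSystem); // "thai"      (from -u-nu-)

// Building a tag from parts: German with phonebook collation.
const phonebook = new Intl.Locale("de", { collation: "phonebk" });
console.log(phonebook.toString());   // "de-u-co-phonebk"

// Case is normalized when tags are canonicalized.
console.log(Intl.getCanonicalLocales("EN-us")[0]); // "en-US"
```

The same tag strings are what the lang attribute and the :lang selector consume, which is why one tag syntax across markup, CSS, and script matters.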
<time datetime="2004-08-08">8 สิงหาคม ๒๕๔๗</time>
<form>
<input type="date">
</form>
Improved Date/Time Support
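One way to produce a <time> element like the one above is to generate the machine-readable value and the localized text from the same date. This is a sketch only: the function name timeElement is an assumption, it relies on the Intl.DateTimeFormat API, and the exact localized text depends on the locale data shipped with the JavaScript engine.

```javascript
// Hypothetical helper: machine-readable date in the attribute, localized text as content.
function timeElement(date, localeTag) {
  const machine = date.toISOString().slice(0, 10); // YYYY-MM-DD, in UTC
  const human = new Intl.DateTimeFormat(localeTag, {
    year: "numeric",
    month: "long",
    day: "numeric",
    timeZone: "UTC", // format in UTC so the text agrees with the attribute
  }).format(date);
  return `<time datetime="${machine}">${human}</time>`;
}

const date = new Date(Date.UTC(2004, 7, 8)); // months are zero-based: 7 is August

console.log(timeElement(date, "en-US"));
// Thai with the Buddhist calendar and Thai digits, selected through the -u- extension.
console.log(timeElement(date, "th-TH-u-ca-buddhist-nu-thai"));
```

The datetime attribute stays the same in both calls; only the human-readable content changes with the locale.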
Locale Sensitivity
• Still an issue for the Web
– Date pickers not locale or language sensitive
– No markup-based control over format
– Time zone support is spotty
JavaScript gets locales at last!
• ECMAScript ‘intl’ extension work
– Locales based on BCP 47 language tags
– Date, number formatting
– Collation
– and more…
• Core spec addressing Unicode needs, particularly supplementary character support
http://wiki.ecmascript.org/doku.php?id=strawman:i18n_api
ES I18N Spec
• Internationalization API Specification
• Developed by Ecma TC39 + experts
• Collation, number, date & time formatting
• Started fall 2010
• Implementations and test suite in progress
• Approved in December 2012
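A short sketch of the collation and number formatting pieces of that API, using Intl.Collator and Intl.NumberFormat. The sample strings and numbers are invented for illustration, and results assume an engine with full locale data.

```javascript
const letters = ["z", "ä", "a"];

// Collation is locale-sensitive: German sorts ä with a, Swedish sorts it after z.
const german = [...letters].sort(new Intl.Collator("de").compare);
const swedish = [...letters].sort(new Intl.Collator("sv").compare);
console.log(german);  // [ 'a', 'ä', 'z' ]
console.log(swedish); // [ 'a', 'z', 'ä' ]

// Number formatting: the same value with locale-specific separators.
const value = 1234567.891;
console.log(new Intl.NumberFormat("en-US").format(value)); // "1,234,567.891"
console.log(new Intl.NumberFormat("de-DE").format(value)); // "1.234.567,891"
```

The locale argument is a BCP 47 language tag, so the -u- extension keys can be passed in the tag itself.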
Demos
• Locale Extension
– http://norbertlindenberg.com/javascript/demos/Collation.html
– http://norbertlindenberg.com/javascript/demos/DateTimeFormat.html
• Core Extension
– http://norbertlindenberg.com/javascript/demos/RegExp.html
– http://norbertlindenberg.com/javascript/demos/Supplementary.html
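The supplementary character issue on the core side looks like this in practice. The sketch uses features that landed in ECMAScript 2015 (code point escapes, codePointAt, string iteration, the regex u flag), which is after this talk; it is not taken from the linked demos.

```javascript
// One supplementary character (U+1F600) between two ASCII letters.
const text = "a\u{1F600}b";

console.log(text.length);       // 4: length counts UTF-16 code units, so the emoji counts twice
console.log([...text].length);  // 3: string iteration walks code points

console.log(text.charCodeAt(1).toString(16));  // "d83d": only the high surrogate
console.log(text.codePointAt(1).toString(16)); // "1f600": the whole character

// Without the u flag, "." matches one code unit, so a lone supplementary character fails.
console.log(/^.$/.test("\u{1F600}"));  // false
console.log(/^.$/u.test("\u{1F600}")); // true
```

Code that truncates, reverses, or counts strings by index is where surrogate pairs tend to get split.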
Webapps at W3C
• Various technologies that make Web-based applications possible are under development. Some samples:
– IDL
– Web sockets, Web storage, Web workers
– XHR
– Widgets
– Selectors
– File APIs
– DOM
The Widget Spec
• Widget containers deliver “apps” cross-platform based on HTML5
– Extensive localization model
– Ability to set base locale
<widget xmlns="http://www.w3.org/ns/widgets" defaultlocale="en">
<name short="Weather"> Weather! a totally awesome application! </name>
<name short="آب و هوا" xml:lang="fa" dir="rtl"><span dir="ltr" xml:lang="en">Weather!</span> برنامه واقعا بزرگ</name>
</widget>
ITS 2.0
• Internationalization Tag Set (ITS) 2.0
• Currently being defined in the W3C MultilingualWeb-LT Working Group
• Latest Draft 6 December 2012 (“Last Call”): http://www.w3.org/TR/its20/
• WG Homepage: http://www.w3.org/International/multilingualweb/lt/
• ITS 2.0 test suite: https://github.com/finnle/ITS-2.0-Testsuite/
“Translate” locally in HTML5 or XML (example: DocBook)
<!DOCTYPE html>
<html> ...
<p>The <span translate=no>World Wide Web Consortium</span> is making the World Wide Web worldwide!</p>
...
</html>
<db:article ...>
<db:para>The <db:emphasis its:translate="no">World Wide Web Consortium</db:emphasis> is making the World Wide Web worldwide!</db:para>
...
</db:article>
Part of HTML5 !!
markup for bidirectional text
normalization
working with case sensitivity
more information about date & time
…
Capturing guidance for spec developers and implementers (and you)
Tests
Articles
Tutorials
Technical notes
Talks
Tools
Reviews
http://www.w3.org/International/
Internationalization Activity
http://validator.w3.org/i18n-checker/
Checker tool
1. Discover
2. Check
Get involved!
• Follow the discussions on the internationalization mailing lists (e.g. the www-international list), and track other technologies for internationally relevant topics. Follow our RSS feeds and Twitter channels (@webi18n and @multilingweb).
• Read and review specifications (http://www.w3.org/TR/tr-technology-drafts) and send comments to the www-international list or direct to the Working Group.
• Discuss local requirements for the Multilingual Web, and if you identify missing features, find ways to coordinate proposals.
• Use features needed for non-Latin script support and push implementers to include more in browsers and authoring tools.
• Join the Working Group
The Web needs your help
this is your Web – not the W3C's
we need you to make the Web worldwide
get involved
Thank you
http://www.inter-locale.com/whitepaper/imug2013