CLDR: The Common Locale Data Repository, Locales for the World

  • CLDR: The Common Locale Data Repository

    Locales for the World
    Lisa Moore, George Rhoten, Mark Davis, Steven Loomis

    Dublin, Ireland, October 2006

    Agenda
      • Why CLDR?
      • CLDR data
      • Tools and vetting
      • Today and the future

    Locales: does anything stay the same?

    "Theatre Center News: The date of the last version of this document was 2003320. A copy can be obtained for $50,0 or 1.234,57 . We would like to acknowledge contributions by the following authors (in alphabetical order): Alaa Ghoneim, Behdad Esfahbod, Ahmed Talaat, Eric Mader, Asmus Freytag, Avery Bishop, and Doug Felt."

    Locales: the many differences
      • Locales specify user preferences
      • Linguistic and cultural differences
          • Languages, scripts, writing systems, ordering, directionality, formatting, numbers, sizes
      • Even in the same locale, interoperability issues across platforms
      • Global economics has increased the need for greater globalization support in computer systems
      • Everyone expects more!

    Add the Universal Character Encoding
      • Unicode: Unique character codes for all languages

    The Need for Common Locale Data
      • Computing environments often contain a variety of operating systems and software.
      • Historically, locale-sensitive data research has been done by individuals and/or companies.
      • Because of political changes, it is easy for locale data to become out of date.
      • It is difficult to get complete agreement on correctness.

    Common Locale Data Project
      • Began as the Common XML Locale Repository (CXLR), developed by OpenI18N in 2003
      • CLDR project began in 2004
      • Hosted by the Unicode Consortium
          • http://www.unicode.org/cldr/
      • Goals:
          • Common, necessary software locale data for all world languages
          • Collect and maintain locale data
          • XML format for effective interchange
          • Freely available

    CLDR in use (partial list)
      • Libraries and Environments
          • ICU: International Components for Unicode
          • JDK: Java Development Kit
      • Operating Systems
          • Solaris
          • AIX
          • MacOS X
      • Applications
          • OpenOffice.org
          • Acrobat
          • ModernBill

    Agenda
      • Why CLDR?
      • CLDR data
      • Tools and vetting
      • The future

    What is a Locale?
      • A locale is an identifier referring to linguistic and cultural preferences
          • en_US, en_GB, ja_JP
      • These preferences can change over time due to cultural and political reasons
          • Introduction of new currencies, like the euro
          • Changes to the standard sorting of Spanish
      • Many of these preferences have varying degrees of standardization
          • 12- and 24-hour formats in the United States
      • This is a very broad topic
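
To make the identifier idea concrete, here is a minimal sketch that splits identifiers of the form language_TERRITORY, like the en_US, en_GB and ja_JP examples on this slide. The names `LocaleId` and `parse_locale` are hypothetical, and the sketch deliberately ignores scripts, variants and keywords, which real locale identifiers can also carry.

```python
from typing import NamedTuple, Optional


class LocaleId(NamedTuple):
    """Hypothetical container for the two parts handled by this sketch."""
    language: str
    territory: Optional[str]


def parse_locale(identifier: str) -> LocaleId:
    """Split a simple language[_TERRITORY] identifier such as "en_US"."""
    # Accept "-" as a separator too, then split on "_".
    parts = identifier.replace("-", "_").split("_")
    language = parts[0].lower()
    territory = parts[1].upper() if len(parts) > 1 and parts[1] else None
    return LocaleId(language, territory)


for name in ("en_US", "en_GB", "ja_JP"):
    print(parse_locale(name))
```

Two identifiers that share a language but differ in territory, such as en_US and en_GB, come out as distinct values, which is the point of the slide: the same language can carry different preferences.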

    Types of Locale Data
      • Dates/time/calendar formats
      • Number/currency formats
      • Measurement system
      • Collation specification
          • Sorting
          • Searching
          • Matching
      • Translated names for language, territory, script, time zones, currencies
      • Script and characters used by a language

    Locale Data Markup Language
      • Locale data described using XML
      • CLDR data uses LDML
      • Structure of CLDR controlled by the Locale Data Markup Language (LDML) specification: http://unicode.org/reports/tr35
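
Because LDML is plain XML, any XML parser can read it. The sketch below parses a small fragment with Python's standard library. The element names follow my reading of the LDML specification, while the fragment itself and the helper `display_names` are illustrative assumptions, not content copied from a CLDR release.

```python
import xml.etree.ElementTree as ET

# Illustrative fragment only; the values are made up for this example.
SAMPLE_LDML = """\
<ldml>
  <localeDisplayNames>
    <languages>
      <language type="en">English</language>
      <language type="ja">Japanese</language>
    </languages>
    <territories>
      <territory type="IE">Ireland</territory>
    </territories>
  </localeDisplayNames>
</ldml>
"""


def display_names(ldml_text: str, group: str, item: str) -> dict:
    """Map each element's type attribute to its display name."""
    root = ET.fromstring(ldml_text)
    path = f"localeDisplayNames/{group}/{item}"
    return {el.get("type"): el.text for el in root.findall(path)}


languages = display_names(SAMPLE_LDML, "languages", "language")
territories = display_names(SAMPLE_LDML, "territories", "territory")
print(languages)
print(territories)
```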

    LDML Data Categories

    Names

      • Provides translated display names for languages, territories, scripts, variants and keywords used in CLDR.
      • Most of this information is at the language level, since it typically does not vary by territory, only language.
      • An example: ICU Locale Explorer

    Names Examples
    From ga.xml (Irish):
      • Afar
      • Abcáisis
      • Araibis
      • Andóra
      • Aontas na nÉimíríochta Arabacha

    Characters

      • Allows for creation of exemplar character sets.
      • An exemplar set specifies the set of characters that must be present in order to properly render the language.
      • The auxiliary exemplar set defines additional characters that may appear in foreign words or phrases.
      • Lower case only
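
One use of an exemplar set is to check whether a piece of text stays within the characters a language needs. The sketch below does that with a plain Python set. `BASIC_LATIN` is a hypothetical exemplar set invented for the example (CLDR expresses its sets in a bracketed range notation instead), and the text is lower-cased first because exemplar sets are lower case only.

```python
def missing_characters(text: str, exemplars: set) -> set:
    """Return the letters in text that the exemplar set does not cover."""
    return {ch for ch in text.lower() if ch.isalpha() and ch not in exemplars}


# Hypothetical exemplar set for illustration, not taken from CLDR data.
BASIC_LATIN = set("abcdefghijklmnopqrstuvwxyz")

print(missing_characters("Locales for the World", BASIC_LATIN))
print(missing_characters("Caf\u00e9", BASIC_LATIN))
```

The second call reports the accented letter, which is the kind of character an auxiliary set would list for foreign words.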

    Date Formats

      • Defines representation of calendars using various calendaring systems (Gregorian, Buddhist, Islamic, Japanese, etc.)
      • Defines formatting for dates, times, eras and time zones
          • wide, abbreviated, or narrow
      • Date and time formats use patterns of letters to define proper formatting
      • Week information
      • Relative day/time translations (for example, yesterday, tomorrow, etc.)
      • An example: ICU Locale Explorer
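
The "patterns of letters" idea can be sketched with a tiny formatter. It handles only five numeric fields (yyyy, MM, dd, HH, mm) and is an assumption-laden simplification: real LDML patterns define many more letters, field widths and quoted literal text, none of which are covered here. The function name `format_with_pattern` is hypothetical.

```python
import re
from datetime import datetime

# Field letters handled by this sketch; real LDML patterns define many more.
_FIELDS = {
    "yyyy": lambda d: f"{d.year:04d}",
    "MM": lambda d: f"{d.month:02d}",
    "dd": lambda d: f"{d.day:02d}",
    "HH": lambda d: f"{d.hour:02d}",
    "mm": lambda d: f"{d.minute:02d}",
}
_TOKEN = re.compile(r"yyyy|MM|dd|HH|mm")


def format_with_pattern(moment: datetime, pattern: str) -> str:
    """Replace each known field token in pattern with its value."""
    return _TOKEN.sub(lambda m: _FIELDS[m.group(0)](moment), pattern)


moment = datetime(2006, 10, 5, 14, 30)
print(format_with_pattern(moment, "dd/MM/yyyy HH:mm"))  # 05/10/2006 14:30
print(format_with_pattern(moment, "yyyy-MM-dd"))        # 2006-10-05
```

The same date renders differently purely because the pattern changed, which is why the pattern can live in locale data rather than in code.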

    Characters / Dates Examples
    From ga.xml (Irish):
      • [a b-e f-i j-o p-u v-z]
      • [ ]
      • Domh
      • Luan

    Time Zone Names

      • Based on the Olson time zone database
      • Localized display names for standard, daylight, and generic representations of time zones.
      • Short and long display names.

    Numbers

      • Specifies proper localized formatting of numeric quantities
          • Decimal
          • Scientific
          • Currency
          • Percentages
      • Includes localized decimal, thousands separators, currency symbols, etc.
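
As a rough illustration of localized separators, the sketch below formats the same value with two different pairs of decimal and thousands symbols. The symbols are passed in by hand as assumptions; a real implementation would read them from the locale data, and the helper name `format_decimal` is hypothetical.

```python
def format_decimal(value: float, decimal_sep: str, group_sep: str,
                   digits: int = 2) -> str:
    """Format value with the given decimal and grouping separators."""
    # Format with Python's fixed "," grouping and "." decimal first,
    # then swap in the caller's symbols.
    plain = f"{value:,.{digits}f}"
    integer_part, _, fraction = plain.partition(".")
    result = integer_part.replace(",", group_sep)
    return result + (decimal_sep + fraction if fraction else "")


print(format_decimal(1234.57, ".", ","))  # 1,234.57
print(format_decimal(1234.57, ",", "."))  # 1.234,57
```

Splitting the string before swapping symbols avoids the classic bug where replacing "," with "." and then "." with "," clobbers the first substitution.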

    Time Zones / Currencies
    From ga.xml (Irish) and root.xml:
      • Meán-Am Greenwich
      • Am Samhraidh na hÉireann