I18n with PHP 5.3

PHP Internationalization with ICU By Stas Malyshev, Zend Technologies

Talk by Stas Malyshev of Zend at ZendCon 2009

Transcript of I18n with PHP 5.3

Page 1: I18n with PHP 5.3

PHP Internationalization with ICU

By Stas Malyshev, Zend Technologies

Page 2: I18n with PHP 5.3

What and why?

•ICU - http://icu-project.org/ (IBM)

•Unicode

•CLDR - http://cldr.unicode.org/

Page 3: I18n with PHP 5.3

Intl extension

•Locale

•Collator

•Number & Currency formatter

•Date & Time formatter

•Message & Choice formatter

•Normalizer

•Graphemes

•IDN

•Calendars

•Resources

Page 4: I18n with PHP 5.3

Intl extension

•Dual API OO and procedural

•Same implementation underneath

collator_create() == new Collator()

numfmt_format() == NumberFormatter::format()

locale_get_default() == Locale::getDefault()
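As a minimal sketch of the dual API, the snippet below formats the same number through both styles. The sample value and variable names are only illustrative.

```php
<?php
// Object-oriented style
$fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
$oo = $fmt->format(1234567);

// Procedural style: same formatter, created and used through functions
$fmtProc = numfmt_create('en_US', NumberFormatter::DECIMAL);
$proc = numfmt_format($fmtProc, 1234567);

// Both styles share one implementation, so the strings are identical
var_dump($oo === $proc);
```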

Page 5: I18n with PHP 5.3

Locale

•Relies on ICU locales

<language>[_<script>]_<country>[_<variant>][@<keywords>]

•Default locale

new Collator(Locale::DEFAULT_LOCALE)

Locale::setDefault, Locale::getDefault

You can also pass null to get the default locale

Page 6: I18n with PHP 5.3

Locale

Locale pieces

getPrimaryLanguage($locale)

getScript($locale)

getRegion($locale)

getAllVariants($locale)

getKeywords($locale)
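A short sketch of pulling a locale ID apart with these getters. The locale IDs are sample values chosen for illustration.

```php
<?php
$locale = 'sr_Latn_RS';

$language = Locale::getPrimaryLanguage($locale); // 'sr'
$script   = Locale::getScript($locale);          // 'Latn'
$region   = Locale::getRegion($locale);          // 'RS'

// Variants come back as a list of subtags
$variants = Locale::getAllVariants('sl_IT_NEDIS');

// Keywords come back as an associative array: array('collation' => 'phonebook')
$keywords = Locale::getKeywords('de_DE@collation=phonebook');

var_dump($language, $script, $region, $variants, $keywords);
```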

Page 7: I18n with PHP 5.3

Locale

Locale display pieces

getDisplayName($locale, $in_locale = null)

getDisplayLanguage($locale, $in_locale = null)

getDisplayScript($locale, $in_locale = null)

getDisplayRegion($locale, $in_locale = null)

Example:

Locale::getDisplayScript("zh-Hant-TW", "en-US") returns “Traditional Chinese” (the exact name depends on the ICU data version)

Page 8: I18n with PHP 5.3

Locale building blocks

•parseLocale() - returns array composed of locale subtags

•composeLocale() - creates locale ID out of subtags

parseLocale('sr-Latn-RS') returns

array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')

composeLocale(array('language'=>'sr', 'script'=>'Latn', 'region'=>'RS')) returns 'sr_Latn_RS'
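The two building blocks combine naturally: parse an ID, change one subtag, compose it back. A minimal sketch, with the region swap chosen only as an illustration:

```php
<?php
// Split the locale ID into its subtags
$parts = Locale::parseLocale('sr-Latn-RS');

// Change one subtag (hypothetical: switch the region to Montenegro)
$parts['region'] = 'ME';

// Build a locale ID again; composeLocale joins subtags with underscores
$id = Locale::composeLocale($parts);

echo $id, "\n"; // sr_Latn_ME
```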

Page 9: I18n with PHP 5.3

Locale guessing

•acceptFromHttp - Accept-Language to locale

•lookup – find in the list

•filterMatches – are they the same?
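A sketch of the three guessing helpers together. The Accept-Language header and the list of supported locales are made-up inputs, and the exact casing of returned IDs may vary.

```php
<?php
// Hypothetical Accept-Language header, as a browser might send it
$header = 'en-US,en;q=0.8';
$fromHttp = Locale::acceptFromHttp($header);

// Hypothetical list of locales the application has translations for
$supported = array('de', 'en', 'fr');

// lookup(): best match for the requested locale, falling back to 'en'
$best = Locale::lookup($supported, 'de-CH', true, 'en');

// filterMatches(): does the language tag fall under the given range?
$matches = Locale::filterMatches('de-DE', 'de', false);

var_dump($fromHttp, $best, $matches);
```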

Page 10: I18n with PHP 5.3

Collator

•Comparing, sorting strings

•Collation level (strength)

•All ICU collator attributes

Numeric collation

Ignoring punctuation

•Not yet: custom “tailoring” rules
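To illustrate strength and attributes, here is a small sketch: numeric collation changes how digit runs sort, and primary strength ignores accents and case. The file names and words are sample data.

```php
<?php
$coll = new Collator('en_US');

// Default: digits are compared character by character
$plain = array('file10', 'file9', 'file100');
$coll->sort($plain);   // file10, file100, file9

// Numeric collation: digit runs are compared by numeric value
$coll->setAttribute(Collator::NUMERIC_COLLATION, Collator::ON);
$numeric = array('file10', 'file9', 'file100');
$coll->sort($numeric); // file9, file10, file100

// Primary strength: only base letters matter
$loose = new Collator('en_US');
$loose->setStrength(Collator::PRIMARY);
$same = $loose->compare('cote', 'Côte'); // 0, i.e. equal

var_dump($plain, $numeric, $same);
```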

Page 11: I18n with PHP 5.3

Collator

$coll = new Collator("fr_CA");
if ($coll->compare("côte", "coté") < 0) {
    echo "less\n";
} else {
    echo "greater\n";
}

côte < coté

Page 12: I18n with PHP 5.3

Collator

$strings = array("cote", "côte", "Côte", "coté","Coté", "côté", "Côté", "coter");

$coll = new Collator("fr_CA"); 

$coll->sort($strings);

cote, côte, Côte, coté, Coté, côté, Côté, coter

sort($array, $flags)

asort($array, $flags)

sortWithSortKeys($array)
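A brief sketch of how the three sorting methods differ: sort() and sortWithSortKeys() reindex the array, while asort() keeps the keys attached to their values. The keys and words are sample data.

```php
<?php
$coll = new Collator('fr_CA');
$words = array('coté', 'cote', 'côte');

// sort(): compares strings pairwise, result is reindexed
$sorted = $words;
$coll->sort($sorted);

// sortWithSortKeys(): same order, but computes a sort key per string first
$viaKeys = $words;
$coll->sortWithSortKeys($viaKeys);

// asort(): same order, but the original keys are preserved
$byId = array('a' => 'coté', 'b' => 'cote', 'c' => 'côte');
$coll->asort($byId);

var_dump($sorted === $viaKeys, array_keys($byId));
```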

Page 13: I18n with PHP 5.3

NumberFormatter

•Formatting and parsing

•Numbers and currency

numfmt_create($locale, $style, $pattern = null)

NumberFormatter::PATTERN_DECIMAL

NumberFormatter::DECIMAL

NumberFormatter::CURRENCY

NumberFormatter::PERCENT

NumberFormatter::ORDINAL

NumberFormatter::DURATION

NumberFormatter::SCIENTIFIC

NumberFormatter::SPELLOUT
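A few of these styles in action, as a sketch for the en_US locale. The values and the '#,##0.00' pattern are illustrative choices.

```php
<?php
// SPELLOUT: numbers as words
$spell = new NumberFormatter('en_US', NumberFormatter::SPELLOUT);
$words = $spell->format(42);                    // forty-two

// PERCENT: fraction shown as a percentage
$pct = new NumberFormatter('en_US', NumberFormatter::PERCENT);
$percent = $pct->format(0.25);                  // 25%

// CURRENCY: formatCurrency() takes an ISO 4217 currency code
$cur = new NumberFormatter('en_US', NumberFormatter::CURRENCY);
$money = $cur->formatCurrency(1234.5, 'USD');   // $1,234.50

// PATTERN_DECIMAL: the third argument supplies an explicit pattern
$pat = new NumberFormatter('en_US', NumberFormatter::PATTERN_DECIMAL, '#,##0.00');
$fixed = $pat->format(1234.5);                  // 1,234.50

echo $words, "\n", $percent, "\n", $money, "\n", $fixed, "\n";
```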

Page 14: I18n with PHP 5.3

NumberFormatter

Formatting

$fmt = new NumberFormatter('en_US', NumberFormatter::DECIMAL);
echo $fmt->format(1234);
// result is 1,234

$fmt = new NumberFormatter('de_CH', NumberFormatter::DECIMAL);
echo $fmt->format(1234);
// result is 1'234

Page 15: I18n with PHP 5.3

NumberFormatter

Parsing

$fmt = new NumberFormatter('de_DE', NumberFormatter::DECIMAL);
$num = '1.234,567 min';

$fmt->parse($num, NumberFormatter::TYPE_DOUBLE, $pos);
// result is 1234.567, $pos = 9

$fmt->parse($num, NumberFormatter::TYPE_INT32);
// result is 1234

Page 16: I18n with PHP 5.3

MessageFormatter

•Formatting and parsing whole messages, including data inside

•Also allows choice between things printed:

0≤are no files|1≤is one file|1<are many files
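The choice pattern above can be embedded in a full message. This sketch uses "#", which ICU accepts as the ASCII equivalent of "≤". The surrounding sentence and the nested {0,number,integer} are illustrative additions.

```php
<?php
// Choice argument: picks a sub-message based on the value of {0}
$pattern = 'There {0,choice,0#are no files|1#is one file|1<are {0,number,integer} files}.';
$fmt = new MessageFormatter('en_US', $pattern);

$none = $fmt->format(array(0)); // There are no files.
$one  = $fmt->format(array(1)); // There is one file.
$many = $fmt->format(array(5)); // There are 5 files.

echo $none, "\n", $one, "\n", $many, "\n";
```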

Page 17: I18n with PHP 5.3

MessageFormatter

$fmt = new MessageFormatter("en_US", "{0,number,integer} monkeys on {1,number,integer} trees make {2,number} monkeys per tree");
echo $fmt->format(array(4560, 123, 4560/123));

$fmt = new MessageFormatter("de", "{0,number,integer} Affen über {1,number,integer} Bäume um {2,number} Affen pro Baum");
echo $fmt->format(array(4560, 123, 4560/123));

Page 18: I18n with PHP 5.3

IntlDateFormatter

•Allows using locale-dependent canned patterns

•Short, medium, long, full date & time

Full: Tuesday, April 12, 1952 AD or 3:30:42pm PST

Long: January 12, 1952 or 3:30:32pm

Medium: Jan 12, 1952

Short: 12/13/52 or 3:30pm

•Also allows free-form patterns

"yyyy.MM.dd G 'at' HH:mm:ss vvvv"

1996.07.10 AD at 15:08:56 Pacific Time
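A sketch of a free-form pattern used for both formatting and parsing. It uses a shortened pattern (no era or zone fields) and the UTC time zone so the result does not depend on zone-name data. These choices are assumptions made for the example.

```php
<?php
// The sixth constructor argument is the free-form pattern
$fmt = new IntlDateFormatter(
    'en_US',
    IntlDateFormatter::FULL,
    IntlDateFormatter::FULL,
    'UTC',
    IntlDateFormatter::GREGORIAN,
    "yyyy.MM.dd 'at' HH:mm:ss"
);

// Timestamp 0 is the Unix epoch
$text = $fmt->format(0);                      // 1970.01.01 at 00:00:00

// The same pattern parses text back into a timestamp
$ts = $fmt->parse('1970.01.02 at 00:00:00');  // 86400

echo $text, "\n", $ts, "\n";
```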

Page 19: I18n with PHP 5.3

IntlDateFormatter

$fmt = new IntlDateFormatter("en_US", IntlDateFormatter::FULL, IntlDateFormatter::FULL, 'America/Los_Angeles', IntlDateFormatter::GREGORIAN);
echo $fmt->format(0);
// Wednesday, December 31, 1969 4:00:00 PM PT

$fmt = new IntlDateFormatter("de-DE", IntlDateFormatter::FULL, IntlDateFormatter::FULL, 'America/Los_Angeles', IntlDateFormatter::GREGORIAN);
echo $fmt->format(0);
// Mittwoch, 31. Dezember 1969 16:00 Uhr GMT-08:00

Page 20: I18n with PHP 5.3

Normalizer

•Brings Unicode text to one of the normal forms: NFC, NFD, NFKC, NFKD

•normalize(), isNormalized()

$combining_ring_above = "\xCC\x8A"; // 'COMBINING RING ABOVE' (U+030A)
$chars = Normalizer::normalize('A' . $combining_ring_above, Normalizer::FORM_C);
echo urlencode($chars);
// %C3%85, i.e. 'LATIN CAPITAL LETTER A WITH RING ABOVE' (U+00C5)
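isNormalized() pairs with normalize(): check first, convert only if needed. The sketch below also shows that form D reverses the composition. The variable names are illustrative.

```php
<?php
// 'A' followed by COMBINING RING ABOVE (U+030A): decomposed form, 3 bytes
$decomposed = 'A' . "\xCC\x8A";

// Not in form C yet, because a precomposed character exists
$wasNfc = Normalizer::isNormalized($decomposed, Normalizer::FORM_C);

// Compose into U+00C5 (2 bytes in UTF-8)
$composed = Normalizer::normalize($decomposed, Normalizer::FORM_C);

// Form D splits it back into base letter + combining mark
$roundTrip = Normalizer::normalize($composed, Normalizer::FORM_D);

var_dump($wasNfc, strlen($composed), $roundTrip === $decomposed);
```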

Page 21: I18n with PHP 5.3

Grapheme functions

•Graphemes are multi-char entities, like letter + accent mark(s)

•Same as string functions, but operate on grapheme units

•strlen, substr, strpos, strstr equivalents: grapheme_strlen(), grapheme_substr(), grapheme_strpos(), grapheme_strstr()

•Extraction function – extract to fill limited buffer, but always keep graphemes whole
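A sketch contrasting byte-based and grapheme-based functions on a string that contains a letter plus combining mark. The sample string is made up for illustration.

```php
<?php
// 'x', then 'a' + COMBINING RING ABOVE (one grapheme, 3 bytes), then 'y'
$s = "xa\xCC\x8Ay";

$bytes     = strlen($s);                 // 5 bytes
$graphemes = grapheme_strlen($s);        // 3 graphemes

// Second grapheme, returned whole with its combining mark
$second = grapheme_substr($s, 1, 1);

// Position counted in graphemes, not bytes
$posY = grapheme_strpos($s, 'y');        // 2

// Fill a 3-byte buffer: the second grapheme does not fit whole, so it is left out
$chunk = grapheme_extract($s, 3, GRAPHEME_EXTR_MAXBYTES);

var_dump($bytes, $graphemes, $second === "a\xCC\x8A", $posY, $chunk);
```

A plain substr($s, 0, 3) would have cut the combining sequence in the middle.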

Page 22: I18n with PHP 5.3

IDN

עברית.idn.icann.org ↔ xn--5dbqzzl.idn.icann.org

русский.idn.icann.org ↔ xn--h1acbxfam.idn.icann.org

•idn_to_ascii

•idn_to_utf8
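A round-trip sketch of the two IDN functions. The domain is a made-up example under the reserved .example TLD. Both functions are called with their default options here; some later PHP versions may warn about the default IDNA variant, which is not covered by this sketch.

```php
<?php
// "bücher.example", with ü written as UTF-8 bytes (hypothetical domain)
$unicode = "b\xC3\xBCcher.example";

// Unicode label -> ASCII-compatible (Punycode) form
$ascii = idn_to_ascii($unicode);   // xn--bcher-kva.example

// And back to UTF-8
$back = idn_to_utf8($ascii);

echo $ascii, "\n";
var_dump($back === $unicode);
```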

Page 23: I18n with PHP 5.3

TODO

•ResourceHandler

•Transliteration

•StringSearch

•Tighter integration with other modules in 6.0

Page 24: I18n with PHP 5.3

Thanks!

See http://php.net/intl for further information.