L10n and I18n in the Real World Dan Moore Moore Consulting June 9, 2005.


L10n and I18n in the Real World

Dan Moore, Moore Consulting

June 9, 2005

My experience

● Contractor who worked on a 3-year-old ecommerce-ish site
  – Added to an existing working application
● Java on Tomcat with Oracle backend
● 36 countries, 15 languages, two frameworks
● Vast majority of work outlined here was done by Zia Consulting
  – They get the blame!
● Reflects state of system as of February

Your experience

● How many folks have worked on i18n applications?
  – J2SE
  – J2EE
  – J2ME
● What locales (any programming language)?
  – European
  – Asian
  – Other

Outline

● General process
● Definitions
● Display
● Locale identification
● Data flow
● Issues
● Resources
● Ask questions anytime

General process

[Diagram: a Key (“HELLO_KEY”) and a Locale (“en_US”) are passed to the Framework, which consults a Datastore and produces the Output (“Hello there.”).]
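The lookup in this diagram can be sketched in a few lines of Java. This is a minimal illustration, not the code from the project: the class name `MessageLookup`, the in-memory map standing in for the datastore, and the fallback behavior are all assumptions.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in for the "Framework" box: key + locale in, text out.
final class MessageLookup {
    // Stand-in for the "Datastore" box: locale -> (key -> text).
    private final Map<Locale, Map<String, String>> store = new HashMap<>();
    private final Locale fallback;

    MessageLookup(Locale fallback) {
        this.fallback = fallback;
    }

    void put(Locale locale, String key, String text) {
        store.computeIfAbsent(locale, l -> new HashMap<>()).put(key, text);
    }

    // Try the requested locale first, then the fallback locale, then echo the key.
    String text(Locale locale, String key) {
        Map<String, String> empty = Collections.emptyMap();
        String hit = store.getOrDefault(locale, empty).get(key);
        if (hit == null) {
            hit = store.getOrDefault(fallback, empty).get(key);
        }
        return hit != null ? hit : key;
    }

    public static void main(String[] args) {
        MessageLookup lookup = new MessageLookup(Locale.US);
        lookup.put(Locale.US, "HELLO_KEY", "Hello there.");
        System.out.println(lookup.text(Locale.US, "HELLO_KEY")); // prints "Hello there."
    }
}
```

Echoing the key when nothing matches is one possible choice; it makes untranslated strings easy to spot during QA.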

Definitions

● Locale
  – ISO standard: language_country_variant
  – fr_CA, en_US_jive
● Internationalization
● Localization
● Character set
  – UTF-8 or ASCII
● Bundle (resource)

Bundle Examples

● Greeting.properties
  – HELLO_KEY=Hello there.
  – GOODBYE_KEY=Bye
● Greeting_fr.properties
  – HELLO_KEY=Bonjour.
  – GOODBYE_KEY=Au revoir
● Greeting_kr.properties
  – HELLO_KEY=\uc0ac\uc774\ud2b8\ub9f5
  – GOODBYE_KEY=\uc0bc\ucd74\ud2b2\ub345
● One bundle, multiple files
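To show how "one bundle, multiple files" behaves, here is a hedged sketch using the JDK's `PropertyResourceBundle`. In a deployed application `ResourceBundle.getBundle("Greeting", locale)` reads the `.properties` files from the classpath and chains them. To stay self-contained, this sketch parses the file contents from strings instead. The helper classes `TextBundle` and `GreetingBundles` are hypothetical, and the French text deliberately omits GOODBYE_KEY (unlike the slide) so the fallback to the base file is visible.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical helper: a bundle parsed from .properties text, with an optional parent.
final class TextBundle extends PropertyResourceBundle {
    private TextBundle(String propertiesText, ResourceBundle parent) throws IOException {
        super(new StringReader(propertiesText));
        setParent(parent); // keys missing here are looked up in the parent
    }

    static ResourceBundle parse(String propertiesText, ResourceBundle parent) {
        try {
            return new TextBundle(propertiesText, parent);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}

final class GreetingBundles {
    // Contents of Greeting.properties from the slide.
    static ResourceBundle base() {
        return TextBundle.parse("HELLO_KEY=Hello there.\nGOODBYE_KEY=Bye\n", null);
    }

    // Assumption for illustration: GOODBYE_KEY left out to demonstrate fallback.
    static ResourceBundle french() {
        return TextBundle.parse("HELLO_KEY=Bonjour.\n", base());
    }

    // The \\u sequences reach the parser as literal backslash-u escapes,
    // exactly as they would appear in the file on disk.
    static ResourceBundle korean() {
        return TextBundle.parse("HELLO_KEY=\\uc0ac\\uc774\\ud2b8\\ub9f5\n", base());
    }
}
```

`GreetingBundles.french().getString("GOODBYE_KEY")` falls through to the base file and returns "Bye", while the Korean value is decoded from its escapes into real characters at load time.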

Display

● Web
● PDF/Email
● Localized features
● Other localization possibilities

Web

● Some class matches keys and locales to generate text
● Jetspeed (Velocity)
  – $l10n.HELLO_KEY
  – Path-like configuration
● Expresso (taglib)
  – <bean:message key="HELLO_KEY" schema="com.zia..." />
  – Schema pointed to one bundle
  – Also provided a Java method to do so
● Struts is similar to Expresso

PDF/Email

● iText
  – Specify encoding, possibly specify TT font
  – com.lowagie.text.pdf.BaseFont
  – createFont() method
● Character set in content-type header
  – text/plain;charset=utf-8
  – text/html; charset=utf-8
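The content-type half of this slide can be illustrated with the JDK alone. The sketch below builds the header value and encodes the body with the same charset, so the two cannot drift apart. The class and method names are hypothetical, and the iText side is not shown because it depends on an external library.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Locale;

// Hypothetical helper: keep the declared charset and the actual byte encoding in sync.
final class ContentTypes {
    // Builds a value such as "text/html; charset=utf-8".
    static String withCharset(String mediaType, Charset charset) {
        return mediaType + "; charset=" + charset.name().toLowerCase(Locale.ROOT);
    }

    // Encodes the body with the charset that the header announces.
    static byte[] encodeBody(String body, Charset charset) {
        return body.getBytes(charset);
    }

    public static void main(String[] args) {
        Charset utf8 = StandardCharsets.UTF_8;
        System.out.println(withCharset("text/html", utf8));
        System.out.println(encodeBody("Bonjour.", utf8).length + " bytes");
    }
}
```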

Localized features

● Different sections of the site were localized in different languages
  – Product A supported for en and de, but Product B only for en
  – In db, tie features (Product A) to locales
  – Always have a fallback locale of en
● Allow users to change locale easily
● Locale-specific fields on forms
  – In db, tied fields (last name) on forms to locales
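The feature-to-locale mapping with an English fallback might look roughly like this. It is a sketch only: the talk kept this mapping in database tables, whereas the hypothetical `FeatureLocales` class below holds it in memory.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory version of the "tie features to locales" table.
final class FeatureLocales {
    static final Locale FALLBACK = Locale.ENGLISH;

    private final Map<String, Set<Locale>> supported = new HashMap<>();

    void support(String feature, Locale... locales) {
        supported.computeIfAbsent(feature, f -> new HashSet<>()).addAll(Arrays.asList(locales));
    }

    // Use the requested locale if the feature is localized for it, else fall back to en.
    Locale resolve(String feature, Locale requested) {
        Set<Locale> locales = supported.getOrDefault(feature, Collections.emptySet());
        return locales.contains(requested) ? requested : FALLBACK;
    }

    public static void main(String[] args) {
        FeatureLocales features = new FeatureLocales();
        features.support("Product A", Locale.ENGLISH, Locale.GERMAN);
        features.support("Product B", Locale.ENGLISH);
        System.out.println(features.resolve("Product A", Locale.GERMAN)); // prints "de"
        System.out.println(features.resolve("Product B", Locale.GERMAN)); // prints "en"
    }
}
```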

Other localization possibilities

● Currency
  – DecimalFormat class
● Dates
  – Not localized: one common format
● Name of company
● Sorting
  – Not localized that I saw
  – Could have been done on client side
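As a rough illustration of the currency and sorting items, the sketch below formats an amount with locale-specific separators via `DecimalFormat` and sorts strings with `java.text.Collator`. The pattern, the class name, and the choice to sort on the server are assumptions. The talk notes that sorting was not localized in the actual system.

```java
import java.text.Collator;
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helpers for locale-aware number formatting and sorting.
final class LocalizedFormats {
    // Formats with two decimals, using the locale's grouping and decimal separators.
    // No currency symbol is added; that would be a separate decision.
    static String amount(double value, Locale locale) {
        DecimalFormat format =
                new DecimalFormat("#,##0.00", DecimalFormatSymbols.getInstance(locale));
        return format.format(value);
    }

    // Returns a sorted copy using the locale's collation rules instead of raw code points.
    static List<String> sorted(List<String> words, Locale locale) {
        List<String> copy = new ArrayList<>(words);
        copy.sort(Collator.getInstance(locale));
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(amount(1234.5, Locale.US));      // prints "1,234.50"
        System.out.println(amount(1234.5, Locale.GERMANY)); // prints "1.234,50"
        System.out.println(sorted(Arrays.asList("cherry", "Banana", "apple"), Locale.ENGLISH));
    }
}
```

Plain `String` ordering would put "Banana" before "apple" because uppercase letters sort first by code point; the collator orders them alphabetically.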

Locale identification

● How do you handle statelessness?
● Custom solution vs. headers
  – Headers: browser
  – Cookies, URL rewriting or hidden form fields
  – Localization feature required folks to switch
  – Other business reasons (pass locale via params)
  – Consider headers
● Look at user set
  – Everyone knows some English; technical crowd
● Locale choice page
  – Image “Please choose your country...”
  – Drop-down box (in English)
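One way to combine the two approaches is to prefer an explicit locale (from a parameter, cookie, or hidden field) and fall back to the browser's Accept-Language header. The sketch below uses the JDK's `Locale.LanguageRange` and `Locale.lookup` (Java 8 and later). The class name, the parameter format (`de_DE`), and the order of precedence are assumptions rather than what the project did.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical resolver: explicit choice first, then Accept-Language, then a default.
final class LocaleResolver {
    static Locale resolve(String explicit, String acceptLanguage,
                          List<Locale> supported, Locale fallback) {
        // Accept "de_DE" style values by converting them to language-tag form.
        Locale chosen = match(explicit == null ? null : explicit.replace('_', '-'), supported);
        if (chosen == null) {
            chosen = match(acceptLanguage, supported);
        }
        return chosen != null ? chosen : fallback;
    }

    // Returns the best supported locale for the given ranges, or null if none match.
    private static Locale match(String ranges, List<Locale> supported) {
        if (ranges == null || ranges.trim().isEmpty()) {
            return null;
        }
        try {
            return Locale.lookup(Locale.LanguageRange.parse(ranges), supported);
        } catch (IllegalArgumentException malformed) {
            return null; // ignore ill-formed input and let the caller fall back
        }
    }

    public static void main(String[] args) {
        List<Locale> supported = Arrays.asList(Locale.ENGLISH, Locale.FRENCH, Locale.GERMAN);
        System.out.println(resolve(null, "fr-CA,fr;q=0.8,en;q=0.5", supported, Locale.ENGLISH));
    }
}
```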

Data Flow

● Message bundles
  – Infrequently changing
● Database loads
  – Frequently changing
● Why not all in database?

Message bundles process

[Diagram: message bundle flow between the client and us. Labels, in rough order: keys defined (HELLO_KEY=Hi); Excel/Access; export; native .txt files; native2ascii; .properties files (HELLO_KEY=\uc0ac\uc774\ud2b8\ub9f5); manual deployment; framework.]

Message bundles continued

● Dynamic generation of strings
  – Sample value: Email Dan now!
  – EMAIL_KEY1=Email and EMAIL_KEY2=now!
  – EMAIL_KEY=Email {0} now!
  – Struts allows this in its taglib; Jetspeed and Expresso don't
● Access/Excel
  – Character limits (1024)
● Images and common properties
  – Separate property file
● Native2ascii
  – Ant task
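The `{0}` placeholder style in EMAIL_KEY is the syntax of the JDK's `java.text.MessageFormat`. A minimal sketch follows (the class name `EmailLabel` and the second pattern are hypothetical). It shows why a single parameterized key works better than the EMAIL_KEY1/EMAIL_KEY2 split: a translator can move the placeholder anywhere in the sentence.

```java
import java.text.MessageFormat;

// Hypothetical helper around MessageFormat for the EMAIL_KEY example.
final class EmailLabel {
    // Substitutes the name for {0} in the pattern taken from the bundle.
    static String label(String pattern, String name) {
        return MessageFormat.format(pattern, name);
    }

    public static void main(String[] args) {
        System.out.println(label("Email {0} now!", "Dan")); // prints "Email Dan now!"
        // Made-up pattern with the placeholder first, as another language might need.
        System.out.println(label("{0}: email now!", "Dan")); // prints "Dan: email now!"
    }
}
```

With two separate keys the code is forced to emit prefix, name, suffix in that order, which breaks for languages with a different word order.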

Native2ascii example

<native2ascii encoding="UTF-8" src="indir" dest="outdir" ext="properties">
  <include name="*.txt"/>
  <exclude name="readme.txt"/>
  <exclude name="CVS/**"/>
</native2ascii>
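For readers who have not used native2ascii, the core of the conversion is small: every character outside ASCII is rewritten as a `\uXXXX` escape. The sketch below is an approximation for illustration, not the tool's source, and the class name `AsciiEscaper` is made up.

```java
// Hypothetical re-implementation of the core of the native2ascii conversion.
final class AsciiEscaper {
    // Leaves ASCII untouched and writes every other UTF-16 code unit as \\uXXXX.
    static String escape(String text) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c < 0x80) {
                out.append(c);
            } else {
                out.append(String.format("\\u%04x", (int) c));
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The input is written with Java escapes so this file itself stays ASCII.
        System.out.println(escape("HELLO_KEY=\uc0ac\uc774\ud2b8\ub9f5"));
    }
}
```

The `main` method prints the same line that appears in the Korean properties file earlier in the deck.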

Data process

[Diagram: data flow between the client and us. Labels, in rough order: legacy systems; feeds export; SQL Server; Access; ODBC; export to UTF-8 text; text files; sqlldr; test staging tables; test db; PL/SQL; prod staging; prod.]

Data process continued

● Oracle
  – NLS_LANG=american_america.AL32UTF8
● Sqlldr
  – CHARACTERSET UTF-8
  – Case study on sqlldr i18n on OTN
● PL/SQL
● Set up your database browser
● Tried ODBC
  – 3 rows/sec

Issues

● Translation time
● QA of output
  – External testing resources
● Scheduling restart times
  – “It's 5 o'clock somewhere.”
● Locale FK (foreign key) everywhere that data is displayed
● Custom locale code (not en_US, rather eng_US)
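Custom codes like eng_US need a translation layer at the boundary with Java's `Locale`. The sketch below assumes the custom language part is the ISO 639-2 three-letter code. That holds for "eng", but the talk does not say it was the general rule. The class name `CustomLocaleCodes` is hypothetical.

```java
import java.util.Locale;
import java.util.MissingResourceException;

// Hypothetical converter between Java locales and "eng_US" style codes.
// Assumption: the custom language part is the ISO 639-2 three-letter code.
final class CustomLocaleCodes {
    static String toCustom(Locale locale) {
        return locale.getISO3Language() + "_" + locale.getCountry();
    }

    static Locale fromCustom(String code) {
        String[] parts = code.split("_", 2);
        String region = parts.length > 1 ? parts[1] : "";
        // Scan the JDK's two-letter codes for one whose three-letter form matches.
        for (String twoLetter : Locale.getISOLanguages()) {
            try {
                if (Locale.forLanguageTag(twoLetter).getISO3Language().equals(parts[0])) {
                    return new Locale.Builder().setLanguage(twoLetter).setRegion(region).build();
                }
            } catch (MissingResourceException noThreeLetterCode) {
                // No three-letter form known for this language; skip it.
            }
        }
        throw new IllegalArgumentException("Unknown custom locale code: " + code);
    }

    public static void main(String[] args) {
        System.out.println(toCustom(Locale.US));  // prints "eng_US"
        System.out.println(fromCustom("eng_US")); // prints "en_US"
    }
}
```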

Resources

● ziaconsulting.com
● blogs.msdn.com/michkap/default.aspx
● java.sun.com/docs/books/tutorial/i18n/
● mooreds.com/weblog/archives/000199.html
● joelonsoftware.com/Unicode.html
● ppewww.ph.gla.ac.uk/~flavell/www/lang-neg.html
● mooreds.com/i18n/
● databasejournal.com/features/oracle/article.php/3493691
● ant.apache.org/manual/OptionalTasks/native2ascii.html

Thanks

● Reviewers
  – Ben Galde
  – Susan Mowery
  – Corey Snipes
  – Karen Josey
● Ziaconsulting
  – Mike Mahon
● Y'all