Go Global Fearless(I18N & L10N)

Go Global Fearless. Conquer the world by Internationalizing your product! D Venkata Rajesh, Principal QA Engineer, Progress Software


Page 1: Go Global Fearless(I18N & L10N)

Go Global Fearless

Conquer the world by Internationalizing your product!

D Venkata Rajesh, Principal QA Engineer, Progress Software

Page 2: Go Global Fearless(I18N & L10N)

© 2014 Progress Software Corporation. All rights reserved.

Agenda

Introduction – I18N, L10N

All about I18N & L10N - Terminology!

Unicode – Deep dive into details

Localization testing tips

Page 3: Go Global Fearless(I18N & L10N)

Why L10N?

The top 10 global websites have 81% of their user base outside America

92% of the Top 25 Grossing iPhone apps in China use Chinese names

80% of the Top 25 Grossing Android apps in Japan use Japanese names

 

41% of total global app revenue came from Asia, while North America generated 31% and Europe 23%

72.4% of global consumers indicated that they prefer to use their native language when shopping online

Sources: KPCB, Common Sense Advisory, App Annie 2014

Page 4: Go Global Fearless(I18N & L10N)

Localization in the trending mobile and cloud era: social media

Page 5: Go Global Fearless(I18N & L10N)

I18N and L10N

Internationalization is the process of designing a software application so that it can be adapted to various languages and regions without changes to the source code

Localization is the process of customizing a software application that was originally designed for a domestic market so that it can be released in foreign markets

Page 6: Go Global Fearless(I18N & L10N)

Internationalization Process

Page 7: Go Global Fearless(I18N & L10N)

Internationalization process

Source code

Hard coded contents

Resource bundles

Move contents to a properties file

MessagesBundle_fr_FR.properties
MessagesBundle_en_US.properties
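The move from hard-coded strings to per-locale property files might be sketched as below. This is a minimal Python illustration, not the Java resource bundle mechanism that the file names suggest. The keys `greeting` and `farewell`, the sample translations, and the helpers `parse_properties` and `get_message` are all assumptions made for the example.

```python
# Hypothetical bundle contents. In practice each string would live in its own
# file, e.g. MessagesBundle_en_US.properties and MessagesBundle_fr_FR.properties.
BUNDLES = {
    "en_US": "greeting = Hello\nfarewell = Goodbye\n",
    "fr_FR": "# French translations\ngreeting = Bonjour\n",
}


def parse_properties(text):
    """Parse simple 'key = value' lines, skipping blanks and '#' comments."""
    messages = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        messages[key.strip()] = value.strip()
    return messages


def get_message(key, locale, default_locale="en_US"):
    """Look up a key for the locale, falling back to the default locale."""
    for name in (locale, default_locale):
        messages = parse_properties(BUNDLES.get(name, ""))
        if key in messages:
            return messages[key]
    raise KeyError(key)


french_greeting = get_message("greeting", "fr_FR")   # found in fr_FR
french_farewell = get_message("farewell", "fr_FR")   # falls back to en_US
print(french_greeting, french_farewell)
```

The source code only ever refers to keys, so adding a language means adding a file rather than editing code.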

Page 8: Go Global Fearless(I18N & L10N)

Evolution

Character sets, Code pages, Encoding

Page 9: Go Global Fearless(I18N & L10N)

Process of text to encoding

À é ç Љ Д 文 字 नि خ

À → U+00C0 → 11000011 10000000 (UTF-8)
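The character, code point, and bit pattern on this slide can be reproduced in a few lines. The sketch below assumes the bits shown are the UTF-8 encoding of À, which matches the values on the slide.

```python
ch = "\u00C0"  # À

# Step 1: character to code point, written in U+ hex notation.
code_point = f"U+{ord(ch):04X}"

# Step 2: code point to bytes, using UTF-8 as the encoding.
utf8_bytes = ch.encode("utf-8")

# Step 3: show each byte as 8 bits.
bits = " ".join(f"{byte:08b}" for byte in utf8_bytes)

print(code_point, utf8_bytes.hex(), bits)
```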

Page 10: Go Global Fearless(I18N & L10N)

Code Pages

ISO code pages

Code page      Name
ISO 8859-1     Latin-1
ISO 8859-2     Latin-2
ISO 8859-3     Latin-3
ISO 8859-4     Latin-4
ISO 8859-5     Cyrillic
ISO 8859-6     Arabic
ISO 8859-7     Greek
ISO 8859-8     Hebrew
ISO 8859-9     Latin-5
ISO 8859-10    Latin-6
ISO 8859-11    Thai
ISO 8859-13    Latin-7
ISO 8859-14    Latin-8
ISO 8859-15    Latin-9
ISO 8859-16    Latin-10

Microsoft code pages

Code page      Name
CP 1250        Latin 2
CP 1251        Cyrillic
CP 1252        Latin 1
CP 1253        Greek
CP 1254        Latin 5
CP 1255        Hebrew
CP 1256        Arabic
CP 1257        Baltic
CP 1258        Viet Nam
CP 874         Thai

IBM code pages

Code page      Name
37             USA/Canada - CECP
256            International #1
259            Symbols, Set 7
273            Germany F.R./Austria - CECP
274            Old Belgium Code Page
275            Brazil - CECP
276            Canada (French) - 94
278            Finland, Sweden - CECP
280            Italy - CECP
281            Japan (Latin) - CECP
282            Portugal - CECP
284            Spain/Latin America - CECP
285            United Kingdom - CECP
850            Personal Computer - Multilingual Page
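Why code pages matter can be shown by decoding the same byte under several of them. The sketch below uses Python's built-in codec names (`iso8859_1`, `cp1252`, `cp1251`, `iso8859_7`, `cp037`); the choice of byte values is an assumption made for illustration.

```python
raw = bytes([0xE9])  # one byte, no encoding information attached

# The same byte means a different character in each code page.
decoded = {
    name: raw.decode(name)
    for name in ("iso8859_1", "cp1252", "cp1251", "iso8859_7")
}
for name, ch in decoded.items():
    print(f"{name:10} 0xE9 -> {ch} (U+{ord(ch):04X})")

# IBM code page 37 is EBCDIC based: even the letter 'A' sits at a
# different byte value than in the ASCII-based pages.
ebcdic_a = bytes([0xC1]).decode("cp037")
print("cp037      0xC1 ->", ebcdic_a)
```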

Page 11: Go Global Fearless(I18N & L10N)

Common Encoding Problems

Tofu: hollow boxes

Mojibake: garbage characters

Question marks (conversion not supported)
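Two of these problems can be reproduced directly; the following is a small sketch with sample strings chosen for illustration. Tofu is not shown because it is a font issue (the text is correct but no glyph is available), so it cannot be demonstrated in code alone.

```python
text = "\u00C0"  # À

# Mojibake: bytes written as UTF-8 but read back with the wrong code page.
mojibake = text.encode("utf-8").decode("cp1252")

# Question marks: the target code page has no slot for these characters,
# so the encoder substitutes '?' when asked to replace instead of fail.
cjk = "\u6587\u5B57"  # 文字
question_marks = cjk.encode("iso8859_1", errors="replace")

# Related case: invalid input bytes decode to U+FFFD, the replacement character.
replacement = b"\xC3".decode("utf-8", errors="replace")

print(mojibake, question_marks, repr(replacement))
```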

Page 12: Go Global Fearless(I18N & L10N)

Unicode

Deep dive into Normalization, Compatibility, Replacement characters ..

Page 13: Go Global Fearless(I18N & L10N)

Unicode - Encodes the world’s scripts

Code space of up to 0x10FFFF (about 1.1 million code points)

Currently encodes 120,737 characters

Currently allocated code points 264,256

U+0041 <= hex notation

Plane       Allocated code points   Assigned characters
0 BMP       65,392                  55,181
1 SMP       14,000                  11,833
2 SIP       53,424                  53,386
3 TIP       16,672                  799
14 SSP      368                     337
15 PUA-A    65,536
16 PUA-B    65,536
Totals      264,256                 120,737
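The U+ hex notation and the plane numbers in the table follow directly from the code point value. A minimal sketch, with `notation` and `plane` as hypothetical helper names:

```python
import sys


def notation(ch):
    """Format a character's code point in U+XXXX hex notation."""
    return f"U+{ord(ch):04X}"


def plane(ch):
    """Each plane holds 0x10000 code points, so the plane is the value >> 16."""
    return ord(ch) >> 16


# The code space ends at 0x10FFFF: 17 planes of 65,536 code points each.
total_code_points = sys.maxunicode + 1

print(notation("A"), plane("A"))                      # BMP character
print(notation("\U0001F600"), plane("\U0001F600"))    # SMP character
print(notation("\U00020000"), plane("\U00020000"))    # SIP character
print(total_code_points)
```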

Page 14: Go Global Fearless(I18N & L10N)

Four Normalization Forms

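Form D: canonical decomposition

Form C: canonical decomposition followed by composition

Form KD: compatibility decomposition

Form KC: compatibility decomposition followed by composition

Ways to represent Ǻ (U+01FA):

U+01FA
U+00C5 U+0301
U+212B U+0301
U+0041 U+030A U+0301
U+00C1 U+030A
U+0041 U+0301 U+030A

The last two place the acute (U+0301) before the ring (U+030A). They can look the same on screen, but they are not canonically equivalent to U+01FA, so normalization does not unify them with the others.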
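The normalization forms can be tried on the sequences listed above with Python's standard `unicodedata` module. This is a sketch: the grouping into `equivalent` and `reordered` reflects the fact that U+0301 and U+030A share a combining class, so their relative order is preserved by normalization. The ligature U+FB01 is an extra sample added to show the compatibility forms.

```python
import unicodedata

target = "\u01FA"  # Ǻ

# Sequences with the ring (U+030A) applied before the acute (U+0301).
equivalent = ["\u01FA", "\u00C5\u0301", "\u212B\u0301", "A\u030A\u0301"]
# Sequences with the acute applied before the ring.
reordered = ["\u00C1\u030A", "A\u0301\u030A"]


def nfc(s):
    return unicodedata.normalize("NFC", s)


def nfd(s):
    return unicodedata.normalize("NFD", s)


all_compose = all(nfc(s) == target for s in equivalent)
none_compose = all(nfc(s) != target for s in reordered)
decomposed = nfd(target)  # A + ring + acute

# Compatibility forms also fold "look-alike" characters such as the fi ligature.
ligature = "\uFB01"
nfc_ligature = nfc(ligature)                              # unchanged
nfkc_ligature = unicodedata.normalize("NFKC", ligature)   # plain "fi"

print(all_compose, none_compose, [f"U+{ord(c):04X}" for c in decomposed])
```

Normalizing both sides to the same form before comparing or indexing strings avoids mismatches between equivalent sequences.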

Page 15: Go Global Fearless(I18N & L10N)

Unicode Encoding Forms

UTF-32
• Uses 32-bit code units
• All characters are the same width

UTF-16
• Uses 16-bit code units
• BMP characters use one 16-bit code unit
• Supplementary characters use two special 16-bit code units: a “surrogate pair”

UTF-8
• Uses 8-bit code units (bytes!)
• It’s a multi-byte encoding!
• Characters use between 1 and 4 bytes
• ASCII is ASCII in UTF-8
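The size differences between the three encoding forms are easy to measure. The sketch below uses four sample characters chosen for illustration, and the little-endian and big-endian codec variants so that no byte order mark is added to the output.

```python
samples = {
    "A": "A",                    # ASCII
    "e-acute": "\u00E9",         # Latin-1 range
    "CJK": "\u6587",             # BMP, outside Latin
    "emoji": "\U0001F600",       # supplementary plane
}

# Bytes needed per character in UTF-8, UTF-16 and UTF-32.
sizes = {
    name: tuple(len(ch.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le"))
    for name, ch in samples.items()
}
for name, (u8, u16, u32) in sizes.items():
    print(f"{name:8} UTF-8={u8}  UTF-16={u16}  UTF-32={u32}")

# A supplementary character becomes a surrogate pair in UTF-16.
surrogate_pair = samples["emoji"].encode("utf-16-be").hex()

# ASCII text is byte-for-byte identical in UTF-8.
ascii_same = "Hello".encode("utf-8") == "Hello".encode("ascii")
print(surrogate_pair, ascii_same)
```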

Page 16: Go Global Fearless(I18N & L10N)

Localization testing tips

Page 17: Go Global Fearless(I18N & L10N)

Case study: A Website + 10 languages + 4 Browsers + 20 test cases

Localization (L10N) testing
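The case study numbers multiply quickly. Assuming every test case is run in every language on every browser (the slide does not say so explicitly), the size of the test matrix can be worked out as follows. The language, browser, and test case names are placeholders.

```python
from itertools import product

# Placeholder names; only the counts come from the case study.
languages = [f"lang_{i}" for i in range(1, 11)]     # 10 languages
browsers = [f"browser_{i}" for i in range(1, 5)]    # 4 browsers
test_cases = [f"TC{i:02d}" for i in range(1, 21)]   # 20 test cases

# One run per (language, browser, test case) combination.
runs = list(product(languages, browsers, test_cases))
print(len(runs))
```

Under that assumption the matrix has 10 × 4 × 20 = 800 runs, which is why localization testing usually needs prioritization or automation.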

Page 18: Go Global Fearless(I18N & L10N)

Localization testing UI checks

Layout
• Text truncation
• Control truncation
• Misalignment
• Overlapping
• Tabbing order
• Oversized dialogs
• Different layout in general

Hot keys
• Duplicated hotkeys
• Missing hotkey
• Inappropriate hotkey

Text
• Un-translated text
• Mistranslated text
• Unexpected text
• Inconsistent translation
• Technical inaccuracy
• Double space after full stop
• Wrong alphabetical order
• Wrong date/time format
• Corrupt characters

Graphics
• Missing graphics
• Different graphics
• Un-translated icons
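Some of these checks lend themselves to automation. As one example, the hotkey checks could be sketched as below, assuming the common convention that `&` marks the hotkey letter in a menu label. The convention, the sample German labels, and the helper names are assumptions, not something the slides specify.

```python
from collections import defaultdict


def hotkey(label):
    """Return the lowercase letter following '&', or None if there is none."""
    index = label.find("&")
    if index == -1 or index + 1 >= len(label):
        return None
    return label[index + 1].lower()


def check_hotkeys(labels):
    """Report labels without a hotkey and hotkeys shared by several labels."""
    missing = [label for label in labels if hotkey(label) is None]
    by_key = defaultdict(list)
    for label in labels:
        key = hotkey(label)
        if key is not None:
            by_key[key].append(label)
    duplicated = {key: group for key, group in by_key.items() if len(group) > 1}
    return missing, duplicated


# Hypothetical translated menu: two labels ended up sharing the hotkey 'd'.
labels = ["&Datei", "&Bearbeiten", "&Darstellung", "Hilfe"]
missing, duplicated = check_hotkeys(labels)
print(missing, duplicated)
```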

Page 19: Go Global Fearless(I18N & L10N)

Pseudo Localization testing

A way to evaluate a website or software product’s readiness for the localization process

Considered a part of the internationalization testing process

1. Identify hard-coded strings that should be translatable

2. Find strings in the source files that shouldn’t be translated

3. Identify design and layout issues that will affect the software or site when it is translated
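A pseudo-localizer can be very small. The sketch below is one possible approach, not a standard tool: it swaps some ASCII letters for accented look-alikes, pads the string to simulate text expansion, and wraps it in brackets so truncation is visible. The 30% expansion factor, the `{name}` placeholder syntax, and the function name are assumptions.

```python
import re

# Map a few ASCII letters to accented look-alikes (written as escapes).
ACCENTED = str.maketrans(
    "AEIOUaeiouCcNn",
    "\u00C0\u00C9\u00CE\u00D6\u00DB\u00E0\u00E9\u00EE\u00F6\u00FB\u00C7\u00E7\u00D1\u00F1",
)
# Placeholders such as {count} must survive untouched.
PLACEHOLDER = re.compile(r"(\{[^{}]*\})")


def pseudo_localize(text, expansion=0.3):
    """Return an accented, padded, bracketed version of a UI string."""
    parts = PLACEHOLDER.split(text)
    converted = "".join(
        part if PLACEHOLDER.fullmatch(part) else part.translate(ACCENTED)
        for part in parts
    )
    padding = "~" * max(1, round(len(text) * expansion))
    return f"[{converted}{padding}]"


sample = pseudo_localize("Save {count} files")
print(sample)
```

Any string that still appears in plain ASCII in the pseudo-localized build is likely hard-coded, a mangled placeholder points to text that should not have been translated, and a missing closing bracket points to truncation.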

Page 20: Go Global Fearless(I18N & L10N)

Regional differences

Page 21: Go Global Fearless(I18N & L10N)

More localization testing

Aspect: Challenge

Limitation of screen size: Character count and font of characters differ in various languages.

Direction: Some languages are written left to right, whereas others are written right to left.

Spelling rules and upper and lower case conversions: Rules differ based on locale.

Regional Standards: Applications may have to be compatible with not only national languages, but also the regional languages.

Data Handling: Different data storages and processing mechanisms along with different encoding/code pages.

Context and Special Characters: The translation of special characters needs to be handled carefully as different characters may have different meanings in different languages.

Collation And Sorting: Sorting and collation rules differ in various languages.
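The sorting and case conversion rows can be illustrated with plain Python. This is a rough sketch only: `rough_sort_key` is a hypothetical helper that strips accents and case, which approximates but does not implement real locale-specific collation (that needs something like the Unicode Collation Algorithm with locale tailoring).

```python
import unicodedata


def rough_sort_key(word):
    """Approximate key: drop combining marks, then ignore case."""
    decomposed = unicodedata.normalize("NFD", word)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


words = ["zebra", "\u00E9clair", "apple", "Zulu"]

# Plain sorting compares code points: uppercase first, accented letters last.
by_code_point = sorted(words)
# The rough key puts the accented word next to its unaccented neighbours.
by_rough_key = sorted(words, key=rough_sort_key)
print(by_code_point)
print(by_rough_key)

# Case conversion can change the length of a string.
german_upper = "stra\u00DFe".upper()
# Python's upper() is not locale-aware: Turkish would expect a dotted capital I here.
default_upper_i = "i".upper()
print(german_upper, default_upper_i)
```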
