Free Pascal compiler internationalisation

Post on 09-Jan-2016


Rimgaudas Laucius

Institute of Mathematics and Informatics, Vilnius University

Lithuania

Introduction

• Institute of Mathematics and Informatics, Informatics Methodology Department

– Software localisation

– Teaching of informatics and programming

– E-learning and standards

– Informatics terminology

• Vilnius University

– Localisation course

Localisation in Lithuania

• One of the four priorities emphasised in the strategic project for the development of the information society in Lithuania is:

“to uphold the inheritance of Lithuanian language and culture implementing the information technologies and telecommunications”

Open Source in Lithuania

• The 2004 study “Open Source in Education” found that integrating open source software into education has a large positive economic and pedagogical effect

• Education requires high quality and fully localised software

• Open source software is more flexible in terms of localisation

Free Pascal compiler

• Excellent, open source compiler

• Works under all widely used operating systems: Windows, Linux and others

• Widely used: it has been used in international, Baltic and national Lithuanian olympiads in informatics for a few years already

• Replacement for obsolete Turbo Pascal system in Lithuanian schools

FPS

Compilers’ internationalisation

• Internationalisation is part of the software development process, so the internationalisation of development tools is very important

• Most contemporary software development tools are not internationalised enough

• Although this research was done on the Free Pascal compiler, most of the issues presented are common to most compilers

Programming language standards

• Internationalisation is related to programming language standards

• Pascal programming language standards

• Standards of other languages

Examples of internationalised compilers

• There are not many such examples

• One of the best-known internationalised programming systems is LOGO

• Vector Pascal

Structure of Free Pascal

• Free Pascal is a system made up of the compiler program itself and the run-time library (RTL)

• Compiler and RTL interaction:

• Sometimes changing the compiler also requires changing the RTL

Support of multilingual source code

• This is the first stage of compiler internationalisation

• Many scripts require more than an 8-bit character set

UTF-8 implementation

• Unicode ~ UTF-8

• Some utilities used by compilers do not support pure Unicode: 16-bit Unicode chars may be treated as pairs of 8-bit chars. For example, U+0900 is read as the bytes 09 00 (tab and end of string)

• UTF-8 allows step-by-step implementation of the lexical extensions
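The tab and end-of-string problem mentioned above can be checked at the byte level. The following is a minimal sketch in Python, used here only for illustration; the helper name `has_ascii_lookalike_bytes` is an assumption and not part of Free Pascal. It shows that the 16-bit form of U+0900 contains bytes that an 8-bit tool reads as ASCII control characters, while the UTF-8 form of the same character does not.

```python
def has_ascii_lookalike_bytes(text, encoding):
    """Return True if encoding the text yields any byte below 0x80.

    For non-ASCII text, such bytes can be mistaken for ASCII control or
    punctuation characters by tools that work one 8-bit char at a time.
    """
    return any(b < 0x80 for b in text.encode(encoding))


sample = "\u0900"  # the code point used in the example above

utf16 = sample.encode("utf-16-be")  # two bytes: 0x09 (tab) and 0x00 (NUL)
utf8 = sample.encode("utf-8")       # three bytes, each 0x80 or above

print(utf16.hex(" "), has_ascii_lookalike_bytes(sample, "utf-16-be"))
print(utf8.hex(" "), has_ascii_lookalike_bytes(sample, "utf-8"))
```

Because every byte of a multi-byte UTF-8 sequence is 0x80 or above, byte-oriented utilities pass such text through without misreading it as control characters.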

Lexical extensions

• Strings

• Identifiers

• Directives

• Reserved words

• Operators

• Numbers

Strings

• WideString implementation issues

– Compatibility with other systems

– Ambiguity

– Conversions between Unicode and other character sets
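The conversion issue can be illustrated with a small sketch, written in Python rather than Pascal for brevity. The helper name `to_legacy` and the choice of the Baltic code page `cp1257` as the example target are assumptions made for this illustration. The point is that a conversion from Unicode to an 8-bit character set can fail, so the caller needs a defined way to handle characters that do not fit.

```python
def to_legacy(text, charset="cp1257"):
    """Convert Unicode text to a legacy 8-bit charset.

    Returns the encoded bytes, or None when some character has no
    representation in the target charset.
    """
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        return None


lithuanian = "ąžuolas"
encoded = to_legacy(lithuanian)      # fits: one byte per character
devanagari = to_legacy("\u0900")     # does not fit in a Baltic code page

print(encoded is not None, devanagari is None)
```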

Ambiguity example

procedure go(const s: WideString); begin ... end;

procedure go(const s: String); begin ... end;

begin
  Go('Hi');
end.

Which of the overloaded procedures should be called?

Unicode support layer

• The Unicode support layer wraps OS APIs in an OS-independent way

• Under Win9x it uses the Microsoft Layer for Unicode (MSLU)

Identifiers

• Identifiers have to convey the meaning of an object clearly and be easy to understand and memorise. The best way to support this is to allow identifiers written in the programmer's native language

• Unicode Standard Annex #31: Identifier and Pattern Syntax
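As a rough illustration of what an identifier check along the lines of Annex #31 looks like, the sketch below relies on Python's own identifier rules, which are themselves based on the XID_Start and XID_Continue properties from that annex (plus the underscore). The function name `is_valid_identifier` and the sample names are assumptions; a Pascal compiler would apply its own profile of the annex.

```python
import unicodedata


def is_valid_identifier(name):
    """Approximate a UAX #31 identifier check using Python's own rules."""
    # Normalise first so that visually identical spellings compare equal.
    normalised = unicodedata.normalize("NFKC", name)
    return normalised.isidentifier()


# Hypothetical Lithuanian identifiers; the last two are deliberately invalid.
for candidate in ["skaičius", "plotas_2", "2plotas", "mano kintamasis"]:
    print(candidate, is_valid_identifier(candidate))
```

Letters with diacritics are accepted, while a leading digit or an embedded space is still rejected, so the usual lexical rules carry over to non-ASCII scripts.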

Directives

• Names

• Parameters

– Logical (ON, OFF)

– Strings ({$warning Possible malfunctioning})

– File names ({$includepath ..\inc})

Reserved words

• Unification myth

– Compared 13 similar programming languages (Algol, Pascal, Modula, Ada, C, Java,…)

– Only ~3% of reserved words are the same

– 56% occur in only one language

• Unambiguous translation is possible

Example of localised reserved words
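The example from the original slide is not preserved in this transcript. As a stand-in, here is a hypothetical sketch in Python of how localised reserved words could be mapped back to the canonical ones. The Lithuanian spellings in `LOCALISED` are illustrative assumptions and are not taken from the presentation; `is_unambiguous` and `to_canonical` are likewise made-up names.

```python
import re

# Hypothetical Lithuanian spellings for a few Pascal reserved words.
LOCALISED = {
    "pradžia": "begin",
    "pabaiga": "end",
    "jei": "if",
    "tai": "then",
    "kitaip": "else",
}


def is_unambiguous(table):
    """True if the mapping is one-to-one and no localised word is itself a canonical one."""
    targets = list(table.values())
    return len(set(targets)) == len(targets) and not (set(table) & set(targets))


# Either a quoted Pascal string literal or a run of word characters.
TOKEN = re.compile(r"'[^']*'|\w+")


def to_canonical(source, table=LOCALISED):
    """Replace localised reserved words, leaving string literals untouched."""
    def swap(match):
        text = match.group(0)
        if text.startswith("'"):
            return text  # string literal: keep as written
        return table.get(text.lower(), text)
    return TOKEN.sub(swap, source)


print(is_unambiguous(LOCALISED))
print(to_canonical("pradžia jei x tai rašyk('jei') pabaiga."))
```

A one-to-one table like this is what makes the translation unambiguous: each localised word has exactly one canonical counterpart, and the word inside the string literal is left alone.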

Operators

• Unicode has all mathematical symbols needed to express mathematical operations

• Example:
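The slide's own example is not preserved in this transcript. A hypothetical illustration in Python: the mapping below from Unicode mathematical symbols to Pascal's ASCII operators is an assumption chosen for this sketch, not a list taken from the presentation.

```python
# Hypothetical mapping from Unicode mathematical symbols to Pascal operators.
SYMBOL_TO_OPERATOR = str.maketrans({
    "\u2264": "<=",  # less-than or equal to
    "\u2265": ">=",  # greater-than or equal to
    "\u2260": "<>",  # not equal to
    "\u00d7": "*",   # multiplication sign
})


def to_ascii_operators(source):
    """Rewrite Unicode operator symbols as their ASCII Pascal spellings.

    Simplification: string literals and comments are not skipped, so a
    real lexer would have to apply this per token instead.
    """
    return source.translate(SYMBOL_TO_OPERATOR)


print(to_ascii_operators("if a \u2264 b \u00d7 2 then"))
```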

Numbers

• Decimal digits can be written in various scripts.

• Example:
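The slide's own example is not preserved in this transcript. As an illustration, the Python sketch below reads a decimal number written with the digits of any script, using the decimal value that the Unicode character database assigns to each digit. The function name `parse_decimal` is an assumption.

```python
import unicodedata


def parse_decimal(text):
    """Convert a string of decimal digits from any script to an integer.

    unicodedata.decimal raises ValueError for characters that are not
    decimal digits, so invalid input is rejected rather than ignored.
    """
    value = 0
    for ch in text:
        value = value * 10 + unicodedata.decimal(ch)
    return value


devanagari = parse_decimal("\u0967\u0968\u0969")  # Devanagari digits one, two, three
arabic_indic = parse_decimal("\u0664\u0662")      # Arabic-Indic digits four, two
print(devanagari, arabic_indic)
```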

Decimal separator

• USA, GB: ‘.’

• Most European countries: ‘,’

• Localising the decimal separator may cause ambiguity. A solution needs to extend the syntax of numbers.

25,88 – real number

25, 88 – two numbers
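One possible reading of these two examples is a whitespace-sensitive rule: a comma placed directly between digits is a decimal separator, while a comma followed by a space separates list items. The Python sketch below implements only that assumed rule; the pattern and the name `read_numbers` are illustrative and not the extension proposed in the presentation.

```python
import re

# Digits, optionally followed by a comma and more digits with no space between.
NUMBER = re.compile(r"\d+(?:,\d+)?")


def read_numbers(text):
    """Return the numbers in text, treating 'd,d' as a decimal fraction."""
    return [float(token.replace(",", ".")) for token in NUMBER.findall(text)]


print(read_numbers("25,88"))
print(read_numbers("25, 88"))
```

Under this rule the first input yields one real number and the second yields two numbers, at the cost of making whitespace significant inside argument lists.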

Punctuation

• Spaces: general U+0020, nonbreaking U+00A0, ideographic U+3000, etc

• Quotes: “English”, „Lithuanian“

• Etc
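For the space characters listed above, a lexer has to classify by Unicode property rather than compare against U+0020 alone. A minimal Python sketch follows; the sample statement is made up for illustration.

```python
SPACES = {
    "U+0020 general": "\u0020",
    "U+00A0 non-breaking": "\u00a0",
    "U+3000 ideographic": "\u3000",
}

# Each of these is classified as whitespace by its Unicode properties.
for label, ch in SPACES.items():
    print(label, ch.isspace())

# Splitting on Unicode whitespace separates the tokens of a statement
# even when the author typed non-ASCII spaces between them.
tokens = "x\u00a0:=\u30001".split()
print(tokens)
```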

Bi-directional text

• Bi-directional text is an issue of text representation, not of the compiler

Unicode file names support

• Handling of files requires the OS API, so it has to be done via the RTL’s Unicode support layer

• Compilers have to use MSLU under Win9x

Input/Output

• File input/output requires additional support for Unicode encoding

• The Windows console does not support Unicode

– It can be replaced, but is that the best solution?
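For file input, one common form of such additional support is detecting the encoding from a byte order mark (BOM) before decoding. The Python sketch below is an assumption about how that step could look, not a description of the Free Pascal RTL; the name `decode_source` and the UTF-8 default are illustrative choices.

```python
import codecs

# Checked in order. Assumption: UTF-32 input is not expected, so its BOMs
# are left out of this table.
BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def decode_source(data, default="utf-8"):
    """Decode raw file bytes, using a leading BOM to pick the encoding."""
    for bom, encoding in BOMS:
        if data.startswith(bom):
            return data[len(bom):].decode(encoding)
    return data.decode(default)  # no BOM: fall back to the default


with_bom = decode_source(codecs.BOM_UTF16_LE + "ąč".encode("utf-16-le"))
without_bom = decode_source(b"begin end.")
print(with_bom, without_bom)
```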

Localisation framework

• Strings and other resources have to be externalised for easy localisation

• Localisation kits have to be prepared
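Externalised strings can be pictured as per-language message catalogues looked up by message id, with a fallback to the source language when a translation is missing. The Python sketch below is illustrative only: the ids, the texts and the name `message` are hypothetical, and in practice each catalogue would live in its own external file handed to translators.

```python
# Hypothetical catalogues keyed by locale, then by message id.
CATALOGUES = {
    "en": {
        "syntax_error": "Syntax error, {expected} expected",
        "file_not_found": "File {name} not found",
    },
    "lt": {
        "syntax_error": "Sintaksės klaida, laukta {expected}",
        # "file_not_found" is deliberately left untranslated.
    },
}


def message(msg_id, locale, **params):
    """Look up a message in the given locale, falling back to English."""
    catalogue = CATALOGUES.get(locale, {})
    template = catalogue.get(msg_id, CATALOGUES["en"][msg_id])
    return template.format(**params)


print(message("syntax_error", "lt", expected=";"))
print(message("file_not_found", "lt", name="a.pas"))
```

Keeping placeholders such as `{expected}` named rather than positional lets translators reorder them to suit the grammar of the target language.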

• Questions?

• Thank you

• Contact E-mail: