Lecture 6: Data Representation II


Reading: Sebesta 5.4-5.7, 7.3-7.4
(Supplementary texts: Pratt 5.1.4, 6.1.4, 6.4; Tucker & Noonan 3.1, 4.3, 5.4.1, 5.5.1)
(Resources on C and Pascal are available on the internet)

Overview

Types
- Motivation for typed languages
- Issues in type checking
    - How to type check?
    - How to cater for polymorphism?
    - Type equivalence
    - When to type check?
    - Strongly and weakly typed languages

1. Motivation for typed languages

Untyped languages: any operation can be performed on any data. Example: assembly.

    movi 5 r0      // Move integer 5 (2's complement) to r0
    addf 3.6 r0    // Treat the bit representation in r0 as a
                   // floating-point representation and add 3.6 to it

Result? You can be sure that r0 does not contain 8.6!

(+) Flexibility: "I can do anything I want to and you can't stop me."
(-) Ease of error checking: programs are prone to errors, especially huge ones. "I am human, my brain is limited, I can't remember and monitor everything."

Typed languages:
- A type represents a set of values.
- Programs, procedures and operators are functions from an input type to an output type.

Type checking is the activity of ensuring that the operands or arguments of an operator or procedure are of compatible types, using a set of rules that associate a type with every expression in the language. These rules are known as the type system.

A type error results when an operator is applied to an operand of an inappropriate or incompatible type.

Output of a type system:
- There are type errors (with respect to the type system) => the program is NOT type-safe.
- There are no type errors (with respect to the type system) => the program is type-safe.

What the type checker (TC) says versus what is really the case:

                                        Program really     Program really does
                                        has errors         not have errors
    TC says there are type errors       Usually true       Possible (TC errs on the conservative side)
    TC says there are no type errors    ?????

When TC says that there are no type errors, the program MAY still have errors:

1. It may still have type errors due to unsafe features of the language. This is due to bad type system design.

2. It may have logic errors. This shows that type errors are but one of the many kinds of error you encounter.

When TC says that there are type errors, this is usually true of a program that really has errors, and still possible for a program that really does not have errors (TC errs on the conservative side).

Typed languages:
(+) Error detection
(+) Program documentation
(-) Loss of flexibility (but that's OK: I don't lose much freedom, since I don't usually program in that way in the first place. I gain more than I lose.)

2. Issues in Type Checking
- How to type-check?
- How to cater for polymorphism?
- What is your definition of "compatible type"?
- When to perform type checking?
- Is your language strongly or weakly typed?

2.1 How to type-check?

Definition: type statements are of the form

    <expr> : <type>

meaning that the expression <expr> is of the type <type> (the ':' symbol reads "is of the type").

Examples:

    3 : int
    3+4 : int
    3.14 : real
    "abc" : String
    while (x < 5) {x++;} : Stmt

Definition: type rules are of the form

    e1 : t1    e2 : t2    ...    en : tn
    ------------------------------------ (rule name)
             f e1 e2 ... en : t

where each ei : ti is a type statement and n ≥ 0.

The rule is interpreted as "IF e1 is of type t1 and ... and en is of type tn THEN f e1 e2 ... en is of type t."

Examples of type rules:

Rule for constants:

    1 : int    2 : int    3 : int

Rule for addition:

    E1 : int    E2 : int
    -------------------- (+)
       E1 + E2 : int

Rule for boolean comparison:

    E1 : int    E2 : int
    -------------------- (==)
      E1 == E2 : bool

Rule for assignment statement:

    x : T    E : T
    --------------- (:=)
    x := E; : Stmt

Rule for if-statement:

    E1 : bool    S1 : Stmt    S2 : Stmt
    ----------------------------------- (if)
       if (E1) {S1} else {S2} : Stmt

Rules of type checking:
- Type of a value => known in advance.
- Type of a variable => known from its declaration.
- Type of a function => known from the types of its arguments and the type of its result (both given in its declaration).
- Type of an expression => inferred from its sub-expressions.

Given the program:

    int x;
    ...
    x := x+1;
    ...

and given the rules for constants, (+), (==), (:=) and (if):

A program or expression is type-safe if we can construct a derivation tree that gives a type for that program or expression. Each line below is derived, by the rule named on it, from the lines indented beneath it:

    x := x+1; : Stmt          by (:=), from
        x : int
        x+1 : int             by (+), from
            x : int
            1 : int

Given the program:

    int x; float y;
    ...
    if (x == 3) {
        y := x;
    } else {
        x := x+1;
    }
    ...

and given the same rules for constants, (+), (==), (:=) and (if), try to construct a derivation tree. Each line is derived, by the rule named on it, from the lines indented beneath it:

    if (x==3) {y:=x;} else {x:=x+1;} : Stmt     by (if), from
        x==3 : bool                             by (==), from
            x : int
            3 : int
        y:=x; : Stmt                            by (:=), from
            ???                                 (y : float but x : int, so no single T fits)
        x:=x+1; : Stmt                          by (:=), from
            x : int
            x+1 : int                           by (+), from
                x : int
                1 : int

Follow the rules! Try to build the tree. Cannot build the tree => not type-safe.


2.2 How to cater for Polymorphism

Polymorphism = poly (many) + morph (form)

Polymorphism is the ability of a data object to take on or assume many different forms.

Polymorphism can be categorized into 2 types:
- Ad-hoc polymorphism
- Universal polymorphism

Cardelli and Wegner's classification (1985):

    Polymorphism
        Ad-hoc
            Coercion       (this lecture)
            Overloading    (this lecture)
        Universal
            Parametric     (covered in FP & OO)
            Inclusion      (covered in FP & OO)

Ad-hoc polymorphism is obtained when a function works, or appears to work, on several different types (which may not exhibit a common structure) and may behave in unrelated ways for each type.

Universal polymorphism is obtained when a function works uniformly on a range of types; these types normally exhibit some common structure.

2.2 Polymorphism: Coercion

A coercion is an operation that converts the type of an expression to another type. It is done automatically by the language compiler. (If the programmer manually forces a type conversion, it is called casting.)

       E : int
    ------------- (Int-Float Coercion)
      E : float

Example:

    int x; float y;
    ...
    y := x;
    ...

Example of the use of coercion:

    int x; float y;
    ...
    if (x == 3) {
        y := x;
    } else {
        x := x+1;
    }
    ...

To the rules for constants, (+), (==), (:=) and (if), add a new rule:

       E : int
    ------------- (Int-Float Coercion)
      E : float

The derivation tree can now be completed. Each line is derived, by the rule named on it, from the lines indented beneath it:

    if (x==3) {y:=x;} else {x:=x+1;} : Stmt     by (if), from
        x==3 : bool                             by (==), from
            x : int
            3 : int
        y:=x; : Stmt                            by (:=), from
            y : float
            x : float                           by (Int-Float Coercion), from
                x : int
        x:=x+1; : Stmt                          by (:=), from
            x : int
            x+1 : int                           by (+), from
                x : int
                1 : int

A coercion can be widening or narrowing.

A widening coercion converts a value to a type that can include (at least approximations of) all of the values of the original type, e.g. int to float. Widening is safe most of the time, but it can be unsafe in certain cases.

A narrowing coercion converts a value to a type that cannot store (even approximations of) all of the values of the original type, e.g. float to int. Narrowing is unsafe: information is lost during the conversion.

Theoretically speaking, int ⊆ float.

Coercions: (+) Increase flexibility in programming.

Example:

    float x, y, z;
    int a, b, c;

If I have no coercions, and I intend to add y and a and store the result in x, then writing

    x = y + ((float) a);

is too much of a hassle. Therefore coercion is good.

Coercions: (-) Decrease reliability (error detection).

Example:

    float x, y, z;
    int a, b, c;

If I have coercions, and I intend to add x and y and store the result in z, but I accidentally write

    z = x + a;

then my error will go undetected, because the compiler will simply coerce a to a float. Therefore coercion is bad.

How many coercions do languages provide?
- A lot of them: PL/I, Fortran, C, C++
- Fewer: Java (permits only widening)
- Very few: Ada

2.2 Polymorphism: Overloading

An overloaded operation has different meanings, and different types, in different contexts.

    E1 : int    E2 : int
    -------------------- (+-int)
       E1 + E2 : int

    E1 : float    E2 : float
    ------------------------ (+-float)
        E1 + E2 : float

Example of the use of overloading:

    int x, y, z; float a, b, c;
    ...
    if (x == 3) {
        x := y + z;
    } else {
        a := b + c;
    }
    ...

To the rules for constants, (+), (==), (:=) and (if), add a new rule:

    E1 : float    E2 : float
    ------------------------ (+-float)
        E1 + E2 : float

Each line of the derivation tree is derived, by the rule named on it, from the lines indented beneath it:

    if (x==3) {x:=y+z;} else {a:=b+c;} : Stmt   by (if), from
        x==3 : bool                             by (==), from
            x : int
            3 : int
        x:=y+z; : Stmt                          by (:=), from
            x : int
            y+z : int                           by (+), from
                y : int
                z : int
        a:=b+c; : Stmt                          by (:=), from
            a : float
            b+c : float                         by (+-float), from
                b : float
                c : float

Overloading: (+) Increase flexibility in programming.

Examples are when the user wants to use one operator to express similar ideas:

    int a, b, c;
    int p[10], q[10], r[10];
    int x[10][10], y[10][10], z[10][10];
    a = b * c;   // Integer multiplication
    p = a * q;   // Scalar multiplication
    x = y * z;   // Matrix multiplication

Therefore overloading is good.

Overloading: (-) Decrease reliability (error detection).

Examples are when the user intends to use the operator in one context, but accidentally uses it in another.

Example: in many languages, the minus sign is overloaded for both unary and binary uses, so

    x = z-y    and    x = -y

will both compile. What if I intend to write the first, but accidentally leave out the 'z'?

Even for common operations, overloading may not be good.

Example:

    int sum, count;
    float average;
    ...
    average = sum / count;

Since sum and count are integers, integer division is performed first, and only then is the result coerced to float. That's why Pascal has div for integer division and / for floating-point division.

Overloading: (-) Decrease reliability (error detection).

Design questions for overloading:

Do you allow the user to perform overloading (flexibility)? Or are all overloaded functions predefined in the language (controlled reliability)?

If you allow the user to perform overloading, can the user overload the existing operators of the language? (E.g. C++ allows you to overload +, -, *, / for user-defined types to the extent that + can become * and * can become +!!!) Again, it is power and flexibility versus reliability (the dangers of misuse).

2.2 Polymorphism: Summary

Coercion and overloading: use them, but don't abuse them. Use them wisely; don't overdo it.

Just like fire: useful, and yet dangerous if not managed carefully.



2.3 Type Equivalence

    type   // type definitions
        Q = array [1..10] of integer;
        S = array [1..10] of integer;
        T = S;
    var    // variable declarations
        a : Q;
        b : S;
        c : T;
        d : array [1..10] of integer;
    begin
        a := b;   // Is this allowed? That is, are a and b of the same type?
        a := c;   // Is this allowed?
        a := d;   // Is this allowed?
        b := c;   // Is this allowed?
    end.

The same program with more suggestive type names:

    type   // type definitions
        Queue = array [1..10] of integer;
        Stack = array [1..10] of integer;
        Tree = Stack;
    var    // variable declarations
        a : Queue;
        b : Stack;
        c : Tree;
        d : array [1..10] of integer;
    begin
        a := b;   // Is this allowed? That is, are a and b of the same type?
        a := c;   // Is this allowed?
        a := d;   // Is this allowed?
        b := c;   // Is this allowed?
    end.

If you said "yes" to most of these, chances are that you are adopting structural equivalence. If you said "no" most of the time, then it is likely that you are adopting name equivalence.

Difference between type names and anonymous types.

The type of a variable is described either:
- through a type name, which is (1) a name defined using a type definition command (e.g. 'type' in Pascal, 'typedef' in C), or (2) a primitive numeric type (e.g. int, float); or
- directly through a type constructor (e.g. array-of, record-of, pointer-to). In this case, the variable has an anonymous type, that is, a type without a type name.

Example: in the program with the type definitions Q, S and T and the variables a, b, c and d:
- Q, S and T are type names.
- d has a type, but d does not have a type name.

When are two types equivalent (≡)?

Rule 1: For any type name T, T ≡ T.
Rule 2: If C is a type constructor and T1 ≡ T2, then C T1 ≡ C T2.
Rule 3: If it is declared that type name = T, then name ≡ T.
Rule 4 (Symmetry): If T1 ≡ T2, then T2 ≡ T1.
Rule 5 (Transitivity): If T1 ≡ T2 and T2 ≡ T3, then T1 ≡ T3.

What rules do you want to use?

Structural equivalence uses all of Rules 1 to 5 to check for type equivalence.

(Pure) name equivalence uses only Rule 1. Unless two variables have the same type name, they are treated as having different types.

Declarative equivalence leaves out Rule 2 and uses Rules 1, 3, 4 and 5.

Example:

    type   // type definitions
        Q = array [1..10] of integer;
        S = array [1..10] of integer;
        T = S;
    var    // variable declarations
        a, x : Q;
        b : S;
        c : T;
        d : array [1..10] of integer;
        e : array [1..10] of integer;
    begin
        a := x;   // Is this allowed? That is, are a and x of the same type?
        a := b;   // Is this allowed?
        a := c;   // Is this allowed?
        a := d;   // Is this allowed?
        b := c;   // Is this allowed?
        d := e;   // Is this allowed?
    end.

Answers under structural equivalence (SE), name equivalence (NE) and declarative equivalence (DE):

    Statement    SE     NE     DE
    a := x;      yes    yes    yes
    a := b;      yes    no     no
    a := c;      yes    no     no
    a := d;      yes    no     no
    b := c;      yes    no     yes
    d := e;      yes    no     no

Under DE, b := c is allowed because T = S is declared (R3). a := d is not allowed: without R2, two separate occurrences of the array constructor are not related to each other.

R1: For any type name T, T ≡ T.
R2: If C is a type constructor and T1 ≡ T2, then C T1 ≡ C T2.
R3: If it is declared that type name = T, then name ≡ T.
R4 (Symmetry): If T1 ≡ T2, then T2 ≡ T1.
R5 (Transitivity): If T1 ≡ T2 and T2 ≡ T3, then T1 ≡ T3.

Name equivalence:
- Easy to implement checking, since we need only compare the names.
- Very restrictive and inflexible. Example:

      type idxtype = 1..100;
      var count : integer;
          index : idxtype;

Structural equivalence:
- Harder to implement, since entire structures must be compared. There are other issues to consider too, e.g. are arrays with the same size but different subscripts the same type? (Similarly for records and enumerations.)
- More flexible, yet the flexibility can be bad too. Example:

      type celsius = real;
           fahrenheit = real;
      var x : celsius;
          y : fahrenheit;
      ...
      x := y;   // Allowed?

Different languages adopt different rules, and the rules may change for one language (people can change their minds too!).

- Pascal: before 1982, unknown; ISO 1982, declarative equivalence; ISO 1990, structural equivalence.
- C: structural equivalence, except for structs and unions, for which C uses declarative equivalence. If the two structs are in different files, then C goes back to structural equivalence.
- C++: name equivalence.
- Haskell/SML: structural equivalence.


2.4 When to perform Type Checking?

    When is the variable bound to the type?    When can I type check?

    Compile time (static type binding)         In theory, you can choose to type check at compile time
                                               or at run time. In practice, languages try to do as much
                                               of it statically as possible. E.g. SML, Pascal.

    Run time (dynamic type binding)            No choice but to do dynamic type checking.
                                               E.g. JavaScript, APL.

Static type checking: done at compile time.
(+) Done only once
(+) Earlier detection of errors
(-) Less program flexibility (fewer shortcuts and tricks)

Dynamic type checking: done at run time.
(-) Done many times
(-) Late detection of errors
(-) More memory needed, since we need to maintain type information for all the current values in their respective memory cells
(-) Slows down overall execution, since extra code is inserted into the program to detect type errors
(+) Program flexibility (allows you to 'hack' dirty code)

Refer to Pratt 5.1.4 for a detailed discussion. Sebesta 5.5 is brief.

Hybrid: type check statically as much as possible. Whatever cannot be type checked statically is checked dynamically.

Are there such cases? Yes, when a language provides a construct that allows a memory location to store values of different types at different times during execution (next section).


2.5 Strong Type Systems

A programming language is defined to be strongly typed if type errors are always detected STATICALLY.

A language with a strong type system only allows type-safe programs to be successfully compiled into executables. (Otherwise, the language is said to have a weak type system.)

Programs accepted by a strong type system are guaranteed to execute without type errors. (The only errors left to contend with are logic errors.)

[Figure: diagram of sets of programs. Regions: all programs; programs from WEAK type systems; programs from STRONG type systems; programs which are SAFE to execute.]

Which languages are strongly typed?

    Language    Strongly typed?   Why?
    Fortran     No                Allows a variable of one type to refer to a value of another type through the EQUIVALENCE keyword.
    Ada         No                Library function UNCHECKED_CONVERSION suspends type checking.
    Modula-3    No                Same as Ada, through use of the keyword LOOPHOLE.
    C, C++      No                1. Forced conversion of type through type casting. 2. Union types can compromise type safety.
    Java        No                Type casting.
    Pascal      Almost            Variant records can compromise type safety.
    SML         Yes
    Haskell     Yes               All variables have STATIC TYPE BINDING.

2.5 Weak Type Systems: Variant Records

Variant records in C (via the union keyword) compromise type safety.

    ...
    typedef union {
        int X;
        float Y;
        char Z[4];
    } B;
    ...
    B P;

The variant parts all have overlapping (the same) L-value!!! Problems can occur. What happens in the code below?

    P.X = 142;
    printf("%o\n", P.Z[3]);

All 3 data objects have the same L-value and occupy the same storage. There is no enforcement of type checking: poor language and type system design.

Variant records in Pascal try to overcome C's deficiency: they have a tagged union type.

    type whichtype = (inttype, realtype);
    type uniontype = record
        case V : whichtype of
            inttype  : (X : integer);
            realtype : (Y : real);
    end;

But the compiler usually doesn't check the consistency between the variant and the tag, so we can 'subvert' the tag field:

    var P : uniontype;
    ...
    P.V := inttype;
    P.X := 142;
    P.V := realtype;   // type safety compromised

Not only that, the tag field is optional!!!

    type whichtype = (inttype, realtype);
    type uniontype = record
        case whichtype of
            inttype  : (X : integer);
            realtype : (Y : real);
    end;


End of Lecture