
K.K. WAGH POLYTECHNIC, NASHIK-03 DEPARTMENT OF COMPUTER TECHNOLOGY

Chapter-05

Compiler


OVERVIEW OF LANGUAGE PROCESSING SYSTEM

By V. A. Pathan

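Stages of the language processing system, with the input and output of each:

1. Preprocessor: Skeletal Source Program -> Source Program

2. Compiler: Source Program -> Target Assembly Program

3. Assembler: Target Assembly Program -> Relocatable Machine Code

4. Linker-editor/Loader: Relocatable Machine Code, together with library and relocatable object files -> Absolute Machine Code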

Statements used in a program:

1. Arithmetic statements, e.g. position = initial + rate * 60

2. Non-arithmetic statements: if, goto, break, etc.

3. Non-executable statements: declarations, macros, etc.

Compiler

A compiler is a translator program that translates a program written in a high-level language (the source program) into an equivalent program in machine-level language (the target program). An important part of a compiler's job is reporting errors in the source program to the programmer.

Phases of Compiler

Source Program

1. Lexical Analyzer

2. Syntax Analyzer

3. Semantic Analyzer

4. Intermediate Code Generator

5. Code Optimizer

6. Code Generator

Target Program

The Symbol Table Manager and the Error Handler interact with all of the phases above.

Symbol Table Management

An essential function of a compiler is to record the identifiers used in the source program and collect information about various attributes of each identifier.

These attributes may provide information about the storage allocated for an identifier, its type, its scope (where in the program it is valid), and so on.
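As a rough illustration of recording identifiers and their attributes, here is a minimal Python sketch of a scoped symbol table. The class name, the attribute names (`type`, `scope_level`, `address`) and the stack-of-dictionaries layout are assumptions made for this example, not something the slides prescribe.

```python
class SymbolTable:
    """A stack of dictionaries, one per open scope (an illustrative layout)."""

    def __init__(self):
        self.scopes = [{}]            # scopes[0] is the global scope

    def enter_scope(self):
        self.scopes.append({})

    def exit_scope(self):
        self.scopes.pop()

    def declare(self, name, type_, address=None):
        """Record an identifier and its attributes in the current scope."""
        current = self.scopes[-1]
        if name in current:
            raise KeyError(f"'{name}' is already declared in this scope")
        current[name] = {
            "type": type_,
            "scope_level": len(self.scopes) - 1,
            "address": address,       # usually filled in by storage assignment
        }
        return current[name]

    def lookup(self, name):
        """Search from the innermost scope outwards; None if undeclared."""
        for scope in reversed(self.scopes):
            if name in scope:
                return scope[name]
        return None


table = SymbolTable()
table.declare("rate", "real")
table.enter_scope()
table.declare("rate", "int")          # shadows the outer declaration
inner = table.lookup("rate")
table.exit_scope()
outer = table.lookup("rate")
```

Inside the nested scope the lookup finds the inner `rate`; after the scope is closed, the same lookup finds the outer one again.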


Error Detection and Reporting (Error Handling)

Each phase can encounter errors.

After detecting an error, a phase must somehow deal with that error, so that compilation can proceed, allowing further errors in the source program to be detected.

A compiler that stops when it finds the first error is not as helpful as it could be.

The syntax and semantic analysis phases usually handle a large fraction of the errors detectable by the compiler.

The lexical phase can detect errors where the characters remaining in the input do not form any token of the language.


During semantic analysis the compiler tries to detect constructs that have the right syntactic structure but no meaning to the operation involved.

For example, an attempt to add two identifiers, one of which is the name of an array and the other the name of a procedure.


1. Lexical Analysis

This is the first phase of the compiler. It is also called the scanning or linear scanning phase.

The compiler scans the source code from left to right, character by character, and groups these characters into tokens. Each token represents a logically cohesive sequence of characters, such as a variable name, a keyword, or a multi-character operator (>=, ==, !=, etc.).

The main functions of this phase are:

1. Identify the lexical units in a source statement and produce as output a sequence of tokens that the parser uses for syntax analysis.

2. Classify tokens into different lexical classes (e.g. constants, reserved words, variables) and enter them in different tables.

3. Build the literal table, identifier table and uniform symbol table.


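Figure: Source Program -> Lexical Analyzer -> (Tokens) -> Parser. The parser requests each token with "Get Next Token"; the lexical analyzer and the parser both use the Symbol Table.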

Example: position = initial + rate * 60

    position : identifier <id1>
    =        : operator
    initial  : identifier <id2>
    +        : operator
    rate     : identifier <id3>
    *        : operator
    60       : literal / constant

Resulting token stream: id1 = id2 + id3 * 60
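The scanning step can be sketched in a few lines of Python. This is only a minimal illustration: the token classes, the regular expressions and the function name `tokenize` are assumptions for the example, and a real scanner would also separate keywords from identifiers.

```python
import re

# Token classes and patterns are illustrative; a real language has many more.
TOKEN_SPEC = [
    ("literal",    r"\d+(?:\.\d+)?"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("operator",   r">=|<=|==|!=|[=+\-*/<>]"),
    ("skip",       r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC))


def tokenize(source):
    """Scan left to right; return (tokens, ids) for one statement."""
    ids = {}          # identifier name -> "id1", "id2", ...
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            # No token of the language starts here: a lexical error.
            raise SyntaxError(f"illegal character {source[pos]!r} at {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "skip":
            continue
        if kind == "identifier":
            # Reuse the existing id for a name that has been seen before.
            text = ids.setdefault(text, f"id{len(ids) + 1}")
        tokens.append((kind, text))
    return tokens, ids


tokens, ids = tokenize("position = initial + rate * 60")
```

Here `ids` plays the role of a very small identifier table, and `tokens` corresponds to the token stream handed to the parser.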


Databases Used In Lexical Analysis Phase

1. Source Program : Original form of program written in high level language; appears to the compiler as a string of characters.

2. Terminal Table: A permanent database that has an entry for each terminal symbol (e.g. arithmetic operators, keywords, non-alphanumeric symbols). Each entry consists of the terminal symbol, an indication of its classification, and its precedence.

Entry format: Symbol | Indicator | Precedence

3. Literal Table: Created by the lexical analyzer to describe all literals used in the source program. There is only one entry for each literal, consisting of a value, a number of attributes, an address denoting the location of the literal at execution time, and other information.

Entry format: Literal | Base | Scale | Precision | Other Information | Address


4. Identifier Table: Contains all variables in the program and temporary storage, together with the information needed to reference or allocate storage for them.

Entry format: Name | Data attributes | Address

5. Uniform Symbol Table: Consists of a full or partial list of the tokens as they appear in the program. It is created by lexical analysis and used for syntax and semantic analysis.


2. Syntax Analysis

This phase is also called the parsing or hierarchical scanning phase.

The compiler determines whether the tokens recognized by the scanner form a syntactically legal statement.

The following operations are performed in this phase:

i. Obtain tokens from the lexical analyzer.

ii. Check whether the expression is syntactically correct.

iii. Report syntax errors, if any.

iv. Determine the statement class, i.e. whether it is an assignment statement, a condition statement, etc.

v. Construct hierarchical structures called parse trees, which represent the syntactic structure of the program.


Figure: Source Program -> Lexical Analyzer -> (Tokens) -> Parser -> (Parse Tree) -> Rest of Phases of Compiler. The parser requests each token with "Get Next Token"; the lexical analyzer and the parser both use the Symbol Table.

2. Syntax Analysis- Parse Tree

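Syntax tree for position = initial + rate * 60 (children are indented under their parent):

    =
      position
      +
        initial
        *
          rate
          60

The same tree with the identifiers replaced by their token names:

    =
      id1
      +
        id2
        *
          id3
          60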

Parse tree for position = initial + rate * 60 (children are indented under their parent):

    assignment statement
      identifier: position
      =
      expression
        expression
          identifier: initial
        +
        expression
          expression
            identifier: rate
          *
          expression
            number: 60
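One common way to build such a tree is a recursive-descent parser. The Python sketch below is an assumption-laden illustration, not the method used in the slides: the grammar (assignment, expression, term, factor), the function name `parse_assignment` and the nested-tuple tree format are all chosen for the example. It produces the compact syntax tree (operators as interior nodes) rather than the full parse tree with `expression` nodes, and its simple tokenizer silently skips characters it does not recognise.

```python
import re


def parse_assignment(source):
    """Parse 'identifier = expression' into a nested-tuple syntax tree."""
    tokens = re.findall(r"[A-Za-z_]\w*|\d+(?:\.\d+)?|[=+\-*/()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = peek()
        pos += 1
        return tok

    def expect(tok):
        if peek() != tok:
            raise SyntaxError(f"expected {tok!r}, found {peek()!r}")
        advance()

    def factor():                    # factor -> identifier | number | ( expression )
        tok = advance()
        if tok == "(":
            node = expression()
            expect(")")
            return node
        if tok is None or not re.fullmatch(r"\w+(?:\.\d+)?", tok):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok

    def term():                      # term -> factor { (*|/) factor }
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def expression():                # expression -> term { (+|-) term }
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    target = advance()
    if target is None or not re.fullmatch(r"[A-Za-z_]\w*", target):
        raise SyntaxError("assignment must start with an identifier")
    expect("=")
    tree = ("=", target, expression())
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree


tree = parse_assignment("position = initial + rate * 60")
```

Because `term` handles `*` and `/` below `expression`, multiplication binds tighter than addition, which is why `rate * 60` ends up as a subtree of `+`.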


3. Semantic Analysis

A semantic analyzer checks the source program for semantic errors and collects type information for code generation.

The semantic analysis phase has the following functions:

i. Check phrases for semantic errors, e.g. in a language that does not implicitly convert a real value to an integer, int x = 10.5 should be detected as a semantic error.

ii. Maintain the symbol table, which contains information about each identifier in the program. This information includes the identifier's type, its scope, etc.


iii. Using the symbol table, the semantic analyzer enforces a large number of rules, such as:

a. Every identifier is declared before it is used.

b. No identifier is used in an inappropriate context (e.g. adding a string to an integer).

c. Subroutine or function calls have the correct number and types of arguments.

d. Every function contains at least one statement that specifies a return value.

Example: position = initial + rate * 60

Syntax tree before semantic analysis (children are indented under their parent):

    =
      id1
      +
        id2
        *
          id3
          60

Tree after semantic analysis, with the integer constant converted to real:

    =
      id1
      +
        id2
        *
          id3
          int to real
            60
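The conversion step can be sketched as a small type checker in Python. Everything here is an assumption made for illustration: the function name `check`, the nested-tuple tree format, the two types `int` and `real`, and the rule that assigning a real to an int is an error. The example also assumes that position, initial and rate are declared as real, which is what the inserted conversion in the slide implies.

```python
def check(node, types):
    """Return (typed_tree, type) for a nested-tuple syntax tree.

    `types` maps identifier names to "int" or "real" (a stand-in for the
    symbol table). Numeric literals without a '.' are treated as int.
    """
    if isinstance(node, str):
        if node[0].isdigit():
            return node, ("real" if "." in node else "int")
        if node not in types:
            raise NameError(f"identifier '{node}' used before declaration")
        return node, types[node]

    op, left, right = node
    left, ltype = check(left, types)
    right, rtype = check(right, types)
    if op == "=":
        if ltype == "int" and rtype == "real":
            raise TypeError("cannot assign a real value to an int variable")
        if ltype == "real" and rtype == "int":
            right = ("inttoreal", right)
        return (op, left, right), ltype
    # Arithmetic: if the operand types differ, widen the int operand.
    if ltype != rtype:
        if ltype == "int":
            left = ("inttoreal", left)
        else:
            right = ("inttoreal", right)
        ltype = "real"
    return (op, left, right), ltype


declared = {"position": "real", "initial": "real", "rate": "real"}
tree = ("=", "position", ("+", "initial", ("*", "rate", "60")))
typed_tree, result_type = check(tree, declared)
```

The returned tree has an `inttoreal` node wrapped around the constant 60, and an undeclared name or an int variable receiving a real value is reported as an error.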


4. Intermediate Code Generation / Interpretation

The intermediate code generator produces a program in a different language, at an intermediate level between the source code and the machine code. An intermediate language often resembles a simple assembly language. The generation of intermediate code offers the following advantages:

i. Flexibility: a single lexical analyzer/parser can be used to generate code for several different machines by providing separate back-ends that translate a common intermediate language to a machine-specific assembly language.

ii. Intermediate code is used in interpretation. The intermediate code is executed directly rather than being translated into binary code and stored.


Figure: Source Program -> Lexical Analyzer -> (Tokens) -> Parser -> (Syntax Tree) -> Intermediate Code Generator -> Intermediate Code. The parser requests each token from the lexical analyzer with "Get Next Token".

e.g. x + y * z can be translated as

    t1 = y * z
    t2 = x + t1

where t1 and t2 are compiler-generated temporary names.

e.g. position = initial + rate * 60 is translated as

    t1 = inttoreal(60)
    t2 = id3 * t1
    t3 = id2 + t2
    id1 = t3
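Generating this kind of code from a syntax tree amounts to a post-order walk that introduces one temporary per interior node. The Python sketch below assumes a nested-tuple tree with an optional `inttoreal` node; the function name `to_three_address` and the tree format are illustrative choices, not taken from the slides.

```python
def to_three_address(tree):
    """Flatten a nested-tuple syntax tree into three-address statements."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"t{counter}"

    def emit(node):
        """Emit code for `node` and return the name that holds its value."""
        if isinstance(node, str):
            return node                       # identifier or constant
        if node[0] == "inttoreal":
            arg = emit(node[1])
            temp = new_temp()
            code.append(f"{temp} = inttoreal({arg})")
            return temp
        op, left, right = node
        left_name = emit(left)
        right_name = emit(right)
        temp = new_temp()
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    op, target, value = tree
    if op != "=":
        raise ValueError("top-level node must be an assignment")
    code.append(f"{target} = {emit(value)}")
    return code


tree = ("=", "id1", ("+", "id2", ("*", "id3", ("inttoreal", "60"))))
code = to_three_address(tree)
```

For this tree the sketch emits the same four statements as the example above, apart from spacing around the operators.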


5. Code Optimization

Optimization improves programs by making them smaller or faster or both.

The goal of code optimization is to translate a program into a new version that computes the same result more efficiently – by taking less time, memory space, and other system resources.

Code optimization is achieved in 2 ways:

a) Rearranging computations in a program to make them execute more efficiently.

b) Eliminating redundancies in a program.


Example: position = initial + rate * 60

Intermediate code before optimization:

    t1 := inttoreal(60)
    t2 := id3 * t1
    t3 := id2 + t2
    id1 := t3

Intermediate code after optimization:

    t1 := id3 * 60.0
    id1 := id2 + t1

The compiler can deduce that the conversion of 60 from integer to real representation can be done once and for all at compile time, so the inttoreal operation can be eliminated.

Besides, t3 is used only once, to transmit its value to id1.

It then becomes safe to substitute id1 for t3, whereupon the last statement of the intermediate code is not needed and the optimized code results.


Optimization Types

1. Machine-dependent optimization: performed during the code generation phase.

2. Machine-independent optimization: performed in a separate optimization phase. Four techniques are used in this phase:

   a. Elimination of common sub-expressions.
   b. Compile-time computation.
   c. Boolean expression optimization.
   d. Moving invariant computations outside of loops.

3. Local transformations: applied over small segments of a program.

4. Global transformations: applied over larger segments consisting of loops or function bodies.


5. Code Optimization- Data Bases Used

Matrix:

This is the major database used by the optimization phase.

To support insertion and deletion of entries, chaining information (a forward pointer and a backward pointer) is added to each matrix entry. This avoids the need to reorder and relocate matrix entries when an entry is added or deleted.

The forward pointer is the index of the next matrix entry and allows the code generation phase to go through the matrix in the proper order.

The backward pointer is the index of the previous matrix entry and allows sequencing backwards through the matrix as may be needed by an optimization technique.

Entry format: Operator | Operand1 | Operand2 | Forward Pointer | Backward Pointer


5. Code Optimization- Elimination of common sub expression.

Elimination of common sub-expressions

Sub-expressions that yield the same value and occur in the same statement are common sub-expressions.

e.g. Consider the following statements:

    B = A
    A = C * D * (D * C + B)

The elimination algorithm has the following steps:

1. Place the matrix in a form in which common sub-expressions can be recognized.

2. Recognize two sub-expressions as being equivalent.

3. Eliminate one of them.

4. Alter the rest of the matrix to reflect the elimination of this entry.


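Source code:

    B = A
    A = C * D * (D * C + B)

Matrix before optimization:

    Entry  Operator  Operand1  Operand2  Pointer fields
    M1     =         B         A         1  2
    M2     *         C         D         2  3
    M3     *         D         C         3  4
    M4     +         M3        B         4  5
    M5     *         M2        M4        5  6
    M6     =         A         M5        6  ?

Matrix after steps 1 & 2 (operands placed in a consistent order, so that M2 and M3 are recognizably equivalent):

    Entry  Operator  Operand1  Operand2  Pointer fields
    M1     =         B         A         1  2
    M2     *         C         D         2  3
    M3     *         C         D         3  4
    M4     +         B         M3        4  5
    M5     *         M2        M4        5  6
    M6     =         A         M5        6  ?

Matrix after steps 3 & 4 (M3 duplicates M2, so M4 now refers to M2 and the chain from M2 skips M3):

    Entry  Operator  Operand1  Operand2  Pointer fields
    M1     =         B         A         1  2
    M2     *         C         D         2  4
    M3     *         C         D         2  4
    M4     +         B         M2        4  5
    M5     *         M2        M4        5  6
    M6     =         A         M5        6  ?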
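The four steps can be tried out on this matrix with a short Python sketch. It is deliberately simplified, and the simplifications are assumptions of the example: entries are plain (operator, operand1, operand2) tuples without pointer fields, an eliminated entry is replaced by None instead of being unchained, and no operand is assumed to be reassigned between the two occurrences of a sub-expression. The function name is hypothetical.

```python
COMMUTATIVE = {"+", "*"}


def eliminate_common_subexpressions(matrix):
    """Return (new_matrix, replaced); replaced maps duplicates to originals.

    `matrix` is a list of (operator, operand1, operand2); entry i is "M{i+1}".
    Eliminated entries become None so that the remaining names stay valid.
    """
    seen = {}       # normalised (op, a, b) -> name of first entry computing it
    replaced = {}   # eliminated entry name -> surviving entry name
    result = []
    for index, (op, a, b) in enumerate(matrix, start=1):
        name = f"M{index}"
        # Step 4: redirect operands that referred to an eliminated entry.
        a, b = replaced.get(a, a), replaced.get(b, b)
        # Step 1: order the operands of commutative operators consistently.
        if op in COMMUTATIVE:
            a, b = sorted((a, b))
        key = (op, a, b)
        # Steps 2 and 3: an identical earlier entry makes this one redundant.
        if op != "=" and key in seen:
            replaced[name] = seen[key]
            result.append(None)
            continue
        seen[key] = name
        result.append((op, a, b))
    return result, replaced


before = [
    ("=", "B", "A"),
    ("*", "C", "D"),
    ("*", "D", "C"),
    ("+", "M3", "B"),
    ("*", "M2", "M4"),
    ("=", "A", "M5"),
]
after, replaced = eliminate_common_subexpressions(before)
```

After the pass, the third entry is gone and the fourth entry reads `+ B M2`, matching the final matrix above.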


5. Code Optimization- Compile time evaluation.

Compile time evaluation (computation)

Certain computations in a program that involve only constants can be performed during compilation, which saves both space and execution time.

The main optimization of this type is constant folding. If all the operands in an expression are constants, the operation can be performed at compile time itself. The result of the operation, itself a constant, then replaces the original expression.

e.g. the assignment A = 2 * 150 / B; can be replaced by A = 300 / B;

Matrix before optimization:

    M1   *   2    150
    M2   /   M1   B
    M3   =   A    M2

Matrix after optimization (entry M1 is removed):

    M2   /   300  B
    M3   =   A    M2
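Constant folding over a matrix can be sketched as follows in Python. The tuple layout, the function name `fold_constants` and the choice to mark a folded entry as None are assumptions for this illustration; note also that Python's `/` always produces a float, which a real compiler would handle according to the source language's rules.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}


def fold_constants(matrix):
    """Fold entries whose operands are both numeric constants.

    `matrix` is a list of (operator, operand1, operand2); entry i is "M{i+1}".
    Folded entries become None and later references use the constant instead.
    """
    values = {}     # entry name -> constant computed at compile time
    result = []
    for index, (op, a, b) in enumerate(matrix, start=1):
        # Replace references to folded entries by their constant value.
        a, b = values.get(a, a), values.get(b, b)
        both_constant = isinstance(a, (int, float)) and isinstance(b, (int, float))
        if op in OPS and both_constant and not (op == "/" and b == 0):
            values[f"M{index}"] = OPS[op](a, b)
            result.append(None)
            continue
        result.append((op, a, b))
    return result


before = [("*", 2, 150), ("/", "M1", "B"), ("=", "A", "M2")]
after = fold_constants(before)
```

The first entry is computed at compile time and disappears, and the division now uses the constant 300 directly. Division by a constant zero is left unfolded so that the error surfaces at run time.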


5. Code Optimization-Boolean expression optimization

Boolean expression optimization

Properties of Boolean expressions can be used to shorten their computation.

e.g. in a statement IF A OR B OR C THEN …, where A, B, and C are expressions:

Rather than generating code that always tests each of the expressions A, B, and C, code is generated so that if A is computed as true, B OR C is not computed, and similarly if B is true, C is not computed.
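A minimal Python sketch of the jump code a compiler might emit for such an OR chain is shown below. The function name `gen_or_test`, the label names and the textual `if ... goto` instruction format are all hypothetical; they only illustrate that each operand gets its own early exit.

```python
def gen_or_test(conditions, true_label, false_label):
    """Emit jump code for IF c1 OR c2 OR ... with early exit on the first true."""
    code = []
    for cond in conditions:
        # As soon as one operand is true the whole OR is true, so jump out.
        code.append(f"if {cond} goto {true_label}")
    code.append(f"goto {false_label}")   # reached only if every test failed
    return code


jumps = gen_or_test(["A", "B", "C"], "L_then", "L_else")

# The same behaviour can be observed at run time: Python's `or` stops at the
# first true operand, so B and C are never evaluated here.
evaluated = []


def operand(name, value):
    evaluated.append(name)
    return value


result = operand("A", True) or operand("B", False) or operand("C", False)
```

With A true, only the first test runs; the remaining tests are skipped in both the generated jump code and the run-time demonstration.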


5. Code Optimization-Move invariant computations outside of loops

Move invariant computations outside of loops

If a computation within a loop depends only on variables that do not change within that loop, the computation may be moved outside the loop.

This involves three steps:

1. Recognition of invariant computations.

2. Discovering where to move the invariant computation.

3. Moving the invariant computation.

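Example (b = 20 does not change from one iteration to the next, so it is an invariant computation):

    i = 0;
    do {
        printf("%d", i);
        a = a + 10;
        b = 20;
        i++;
    } while (i < 10);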
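The recognition step can be approximated with a small Python sketch. It is a single-pass simplification under stated assumptions: the loop body is given as a list of (target, expression) assignments, statements with side effects such as the printf call are left out, and an assignment counts as invariant only if its expression reads no variable assigned in the loop and its target is assigned exactly once. The function name `split_invariants` is hypothetical, and a production compiler would need further checks (for instance that the loop executes at least once, as a do-while loop does).

```python
import re


def split_invariants(body):
    """Split loop-body assignments into (hoisted, remaining).

    `body` is a list of (target, expression) pairs in source order.
    """
    targets = [target for target, _ in body]
    hoisted, remaining = [], []
    for target, expr in body:
        reads = set(re.findall(r"[A-Za-z_]\w*", expr))
        if reads.isdisjoint(targets) and targets.count(target) == 1:
            hoisted.append((target, expr))      # safe to compute before the loop
        else:
            remaining.append((target, expr))    # must stay inside the loop
    return hoisted, remaining


loop_body = [("a", "a + 10"), ("b", "20"), ("i", "i + 1")]
hoisted, remaining = split_invariants(loop_body)
```

For the loop above, only `b = 20` is hoisted; `a = a + 10` and the increment of `i` depend on values that change in every iteration.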


Storage Assignment

The purpose of this phase is to:

1. Assign storage to all variables referenced in the source program.

2. Assign storage to all temporary locations that are necessary for intermediate code generation.

3. Assign storage to literals.

4. Ensure that the storage is allocated and appropriate locations are initialized.

Databases used:

1. Identifier table

2. Literal table

3. Matrix

4. Temporary storage table


Temporary storage table:

Created by the interpretation phase to describe the temporary results of computations in the matrix. This table may be implemented as part of the identifier table, since much of the information is of the same format.

Entry format: Mi | Base | Scale | Precision | Storage Class | Other Information | Address


Static allocation:

Static allocation means that the data is allocated at a place in memory whose size and address are both known at compile time. Furthermore, the allocated memory stays allocated throughout the execution of the program.

The storage allocation phase first scans through the identifier table, assigning a location to each entry with a storage class of static. It uses a location counter, initialized to zero, and performs the following steps for each such entry:

1. Update the location counter with any necessary boundary alignment.

2. Assign the current value of the location counter to the address field of the variable.

3. Calculate the length of the storage needed by the variable.

4. Update the location counter by adding this length to it.
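The location-counter procedure can be sketched in Python as follows. The dictionary fields `name`, `length` and `alignment` (both in bytes) are hypothetical stand-ins for identifier-table attributes, and the length is taken as given rather than derived from the variable's type.

```python
def assign_static_storage(identifiers):
    """Assign addresses to static variables using a location counter.

    `identifiers` is a list of dicts with the fields "name", "length" and
    "alignment"; an "address" field is filled in for each entry.
    Returns the final value of the location counter.
    """
    location_counter = 0
    for entry in identifiers:
        align = entry["alignment"]
        # Step 1: round the counter up to the next multiple of the alignment.
        location_counter = (location_counter + align - 1) // align * align
        # Step 2: the variable lives at the current counter value.
        entry["address"] = location_counter
        # Steps 3 and 4: advance the counter past the variable.
        location_counter += entry["length"]
    return location_counter


identifier_table = [
    {"name": "c", "length": 1, "alignment": 1},
    {"name": "x", "length": 4, "alignment": 4},
    {"name": "d", "length": 8, "alignment": 8},
]
total = assign_static_storage(identifier_table)
addresses = {entry["name"]: entry["address"] for entry in identifier_table}
```

The one-byte variable is followed by three bytes of padding so that the four-byte variable starts on a four-byte boundary.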


6. Code Generation Phase

The final phase of the compiler is the generation of target code consisting normally of relocatable machine code or assembly code.

A knowledge of instructions and addressing modes in target computer is necessary for code generation phase.

Memory locations are selected for each of the variables used by the program.

Intermediate instructions are translated into a sequence of machine instructions that perform the same task.


Example: position = initial + rate * 60

Intermediate code:

    t1 := inttoreal(60)
    t2 := id3 * t1
    t3 := id2 + t2
    id1 := t3

After code optimization:

    t1 := id3 * 60.0
    id1 := id2 + t1

After code generation:

    MOV id3, R2
    MUL #60.0, R2
    MOV id2, R1
    ADD R2, R1
    MOV R1, id1
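A deliberately naive translation from three-address statements to this style of two-operand assembly can be sketched in Python. It is an assumption-heavy illustration: the function names, the single working register and the requirement that operands be separated by spaces are all choices made for the example. Unlike the code shown in the slide, it stores every temporary back to memory instead of keeping it in a register, so it emits one more instruction.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL", "/": "DIV"}


def operand(text):
    """Constants become immediate operands (#60.0); names are used as-is."""
    return f"#{text}" if text[0].isdigit() else text


def generate(three_address_code, register="R1"):
    """Translate 'x := y op z' and 'x := y' statements one at a time."""
    assembly = []
    for statement in three_address_code:
        target, expr = [part.strip() for part in statement.split(":=")]
        parts = expr.split()
        # Load the first operand, apply the operator, then store the result.
        assembly.append(f"MOV {operand(parts[0])}, {register}")
        if len(parts) == 3:
            assembly.append(f"{OPCODES[parts[1]]} {operand(parts[2])}, {register}")
        assembly.append(f"MOV {register}, {target}")
    return assembly


assembly = generate(["t1 := id3 * 60.0", "id1 := id2 + t1"])
```

Keeping t1 in a second register, as the slide's code does with R2, removes the extra store and reload; deciding which values stay in registers is the job of register allocation.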


Questions

Q:1 Give standard code definitions for +, ∗, –, = and generate code for the following expression:

COST = RATE ∗ (START – FINISH) + 2 ∗ RATE ∗ (START – FINISH – 100)

Q:2 Explain the use of the reduction table in a compiler.

Q:3 Describe the uniform symbol table and explain the process of tokenising with an example.

Q:4 List and give the syntax of the database tables used in the lexical analysis phase of a compiler.

Q:5 Explain the code optimization phase of a compiler.

Q:6 Explain in detail machine-dependent optimization.

Q:7 Explain compile time compute optimization with an example.

Q:8 Explain the purpose of the storage assignment phase of a compiler.

Q:9 Write a short note on optimization.


Q:10 Describe the main function of the lexical phase of a compiler.

Q:11 Explain four purposes of the storage assignment phase of a compiler.

Q:12 Describe the interpretation phase of a compiler.

Q:13 With a neat diagram, explain the intermediate phase of a compiler.

Q:14 Explain the purpose of the various phases of a compiler. Clearly mention the required input and the output generated by each of these phases.

Q:15 Define syntactic analysis.

Q:16 Consider a statement: z := a + b * c – d / e

Here, z, b, e are integers and a, c, d are float.
