UNIT V Compiler Design
1 | P a g e
UNIT-V
Symbol Table & Run-Time Environments
Symbol Table
A symbol table is a data structure used by the compiler to keep track of
the semantics of variables, i.e. it stores scope and binding information
about names.
The symbol table is built in the lexical and syntax analysis phases.
It is used by various later phases as follows: the semantic analysis phase
refers to the symbol table for type-conflict checks, and code generation
refers to it to determine how much run-time space is allocated and what
type of space is allocated.
Use of symbol table
To achieve compile-time efficiency, the compiler makes use of the symbol table.
It associates lexical names with their attributes.
The items stored in a symbol table are,
Variable names
Constants
Procedure names
Literal constants & strings
Compiler-generated temporaries
Labels in the source program
The compiler uses the following types of information from the symbol table,
Data type
Name
Procedure declarations
Offset in storage
In case of structure or record, a pointer to structure table
For parameters, whether pass by value or reference
Number and type of arguments passed
Base address
Types of symbol table
Ordered symbol table
Here, variable entries are made in alphabetical (sorted) order.
Searching an ordered symbol table can be done using linear or binary
search.
Advantages :
Searching for a particular variable is efficient.
Relationship between variables can be established easily.
Disadvantages:
Insertion of an element is costly if there is a large number of entries in
the table.
Unordered symbol table
In this type of table, variable entries are not made in sorted manner.
Each time before inserting a variable in the table, a lookup is made to
check whether it is already present in the symbol table or not.
If the variable is not present, then an entry is made.
Advantage:
Insertion of variable is easier.
Disadvantage:
Searching is done using linear search.
For larger tables the method turns out to be inefficient, because a lookup
is made before every insertion.
How are names stored in the symbol table?
There are two ways to store the names in a symbol table.
Fixed-length names:
A fixed space is allocated for every symbol in the table. Space is wasted
in this type of storage if the name of the variable is small.
For example, consider the names calculate, sum, a and b.

Fixed-length storage:

Name        Attribute
CALCULATE   Float...
SUM         Float...
A           Int...
B           Int...

Variable-length names:
Only the amount of space actually required by each string is used to store
the names. A name is stored with the help of its starting index and its
length:

Starting index   Length   Attribute
0                10       Float...
10               4        Float...
14               2        Int...
16               2        Int...

0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17
c a l c u l a t e $ s  u  m  $  a  $  b  $
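As a minimal C sketch of the variable-length scheme (the structure and function names here are illustrative, not from the text), all names can be kept in one character pool with a '$' terminator, while the table records only a starting index, a length and an attribute:

```c
#include <assert.h>
#include <string.h>

#define POOL_MAX  256
#define NAMES_MAX 32

/* One character pool holds all names; each table entry records where
   its name starts in the pool and how long it is. */
static char pool[POOL_MAX];
static int pool_used = 0;

struct name_entry {
    int start;          /* starting index in the pool */
    int length;         /* length including the '$' terminator */
    const char *attr;   /* attribute, e.g. "float" or "int" */
};

static struct name_entry names[NAMES_MAX];
static int name_count = 0;

/* Append a name to the pool, terminated by '$' as in the example,
   and record its starting index and length. Returns the entry index. */
int add_name(const char *name, const char *attr)
{
    int start = pool_used;
    int n = (int)strlen(name);
    memcpy(pool + pool_used, name, n);
    pool[pool_used + n] = '$';
    pool_used += n + 1;
    names[name_count].start = start;
    names[name_count].length = n + 1;
    names[name_count].attr = attr;
    return name_count++;
}
```

Adding calculate, sum, a and b in that order reproduces the starting indices 0, 10, 14 and 16 from the table above.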
Symbol table management
Symbol table management is required for the following reasons,
For quick insertion of an identifier and related information.
For quick search of an identifier.
Following are commonly used data structures for symbol table
construction
List Data structure
Self organizing list/ Linked list
Binary tree
Hash tables
List data structure
A linear list is the simplest mechanism for implementing a symbol table.
In this method an array is used to store names and associated information.
New names are added in the order they arrive.
An "available" pointer is maintained just past the end of all stored records.
Name 1 Info 1
Name 2 Info 2
Name 3 Info 3
.
.
.
.
.
.
Name n Info n
To retrieve information about a name we start from the beginning and
search up to the available pointer.
If we reach the available pointer without finding the name, we report an
error: "use of undeclared name".
While inserting a new name we must ensure that it is not already there.
If it is already there, another error is reported: "multiply defined
name".
The advantage is that it takes the least amount of space.
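A minimal C sketch of this list organization (the names st_insert and st_lookup are illustrative) might look like:

```c
#include <assert.h>
#include <string.h>

#define TABLE_MAX 64

struct entry { const char *name; const char *info; };

static struct entry table[TABLE_MAX];
static int available = 0;   /* "available" pointer: one past the last record */

/* Linear search from the beginning up to the available pointer.
   Returns the index of the name, or -1 ("use of undeclared name"). */
int st_lookup(const char *name)
{
    for (int i = 0; i < available; i++)
        if (strcmp(table[i].name, name) == 0)
            return i;
    return -1;
}

/* Insert a new name at the available pointer. Returns -1 on
   "multiply defined name", else the new index. */
int st_insert(const char *name, const char *info)
{
    if (st_lookup(name) != -1)
        return -1;
    table[available].name = name;
    table[available].info = info;
    return available++;
}
```

A failed st_lookup corresponds to the "use of undeclared name" error, while st_insert for a name already present corresponds to "multiply defined name".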
Self organizing list
This symbol table representation uses linked list. A link field is added to
each record.
We search the records in the order pointed by the link field.
A pointer “first”, is maintained to point to the first record of the symbol
table.
When the name is referenced or created it is moved to the front of the list.
The most frequently referred names thus tend to be at the front of the
list, so the access time for them is the least.
Insertion is easier.
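The move-to-front behaviour can be sketched in C as follows (a simplified illustration; the node layout and function names are assumptions):

```c
#include <assert.h>
#include <string.h>
#include <stdlib.h>

struct node {
    const char *name;
    const char *info;
    struct node *link;
};

static struct node *first = NULL;   /* points to the first record */

/* Insert at the front: newly created names are most recently used. */
void sl_insert(const char *name, const char *info)
{
    struct node *n = malloc(sizeof *n);
    n->name = name;
    n->info = info;
    n->link = first;
    first = n;
}

/* Search along the links; on a hit, move the record to the front so
   frequently referenced names drift toward the head of the list. */
struct node *sl_lookup(const char *name)
{
    struct node *prev = NULL;
    for (struct node *p = first; p; prev = p, p = p->link) {
        if (strcmp(p->name, name) == 0) {
            if (prev) {                 /* unlink and move to front */
                prev->link = p->link;
                p->link = first;
                first = p;
            }
            return p;
        }
    }
    return NULL;
}
```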
Binary trees
The symbol table is represented as a binary tree whose nodes have the form,

  Left child | Symbol | Information | Right child

The left child stores the address of the previous symbol and the right
child stores the address of the next symbol.
The symbol field is used to store the name of the symbol and information
field is used to store all attributes/information of the symbol.
The structure is basically a binary search tree (BST), in which every node
in the left subtree is less than the parent and every node in the right
subtree is greater than the parent.
Hence insertion of a symbol is efficient, and the searching process is
efficient as well.
Example: create a BST structure for the following code,
int m, n, p;
int compute(int a, int b, int c)
{
    int t = a + b + c;
    return t;
}
void main()
{
    int k;
    k = compute(10, 20, 30);
}
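Insertion into such a BST-based symbol table can be sketched in C (the names are illustrative; the info field stands for the attribute information):

```c
#include <assert.h>
#include <string.h>
#include <stdlib.h>

struct bst {
    const char *symbol;
    const char *info;
    struct bst *left, *right;
};

/* Insert keeps the BST property: names that compare smaller than the
   parent go left, larger go right; duplicates are ignored. */
struct bst *bst_insert(struct bst *root, const char *symbol, const char *info)
{
    if (root == NULL) {
        struct bst *n = malloc(sizeof *n);
        n->symbol = symbol;
        n->info = info;
        n->left = n->right = NULL;
        return n;
    }
    int cmp = strcmp(symbol, root->symbol);
    if (cmp < 0)
        root->left = bst_insert(root->left, symbol, info);
    else if (cmp > 0)
        root->right = bst_insert(root->right, symbol, info);
    return root;
}

struct bst *bst_lookup(struct bst *root, const char *symbol)
{
    while (root) {
        int cmp = strcmp(symbol, root->symbol);
        if (cmp == 0)
            return root;
        root = cmp < 0 ? root->left : root->right;
    }
    return NULL;
}
```

Inserting the names from the example above (m, n, p, compute, a, b, c, t, main, k) in their order of appearance makes m the root, with names comparing less than m in its left subtree.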
Hash tables
Hashing is an important technique used to search the records of symbol
table. This method is superior to list organization.
In hashing scheme two tables are maintained- a hash table and a symbol
table.
The hash table consists of k entries, from 0 to k-1. These entries are
pointers into the symbol table, pointing to the names stored there.
To determine whether a name is in the symbol table, we use a hash
function h such that h(name) returns an integer between 0 and k-1.
We can then search for any name using this function.
The hash function should result in a uniform distribution of names in the
symbol table, so that there is a minimum number of collisions.
The advantage of hashing is that quick search is possible; the
disadvantages are that it is complicated to implement, extra space is
required, and obtaining the scope of variables is difficult.
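A chained hash table over the symbol table can be sketched in C (k = 11 and the particular hash function are arbitrary choices for illustration):

```c
#include <assert.h>
#include <string.h>

#define K 11          /* hash table entries, 0 to k-1 */
#define SYMS_MAX 64

struct sym { const char *name; const char *info; int next; };

static int hash_table[K];        /* heads of chains; -1 = empty */
static struct sym syms[SYMS_MAX];
static int sym_count = 0;

/* A simple hash function h(name) returning an integer in 0..k-1. */
static unsigned h(const char *name)
{
    unsigned v = 0;
    while (*name)
        v = v * 31 + (unsigned char)*name++;
    return v % K;
}

void ht_init(void)
{
    for (int i = 0; i < K; i++)
        hash_table[i] = -1;
}

int ht_lookup(const char *name)
{
    for (int i = hash_table[h(name)]; i != -1; i = syms[i].next)
        if (strcmp(syms[i].name, name) == 0)
            return i;
    return -1;
}

/* Collisions are resolved by chaining entries that hash to the same
   slot through the symbol table's next links. */
int ht_insert(const char *name, const char *info)
{
    unsigned slot = h(name);
    syms[sym_count].name = name;
    syms[sym_count].info = info;
    syms[sym_count].next = hash_table[slot];
    hash_table[slot] = sym_count;
    return sym_count++;
}
```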
Runtime environment
The compiler demands a block of memory from the OS. This memory is
used for executing the compiled program.
This block of memory is called run time storage.
The run time storage is subdivided to hold the following,
The generated target code
Data objects
Information which keeps track of procedure activations
The size of generated code is fixed. Hence the target code occupies
statically determined area of the memory.
Compiler places the target code at the lower end of the memory.
The amount of memory required by the data objects is known at the
compile time and hence data objects also can be placed at the static data
area of memory.
To maximize the utilization of space at run time, the other two areas, stack
and heap are used at opposite ends of the remaining address space.
Stack is used to store data structures called activation records that gets
generated during procedure calls.
Heap area is the area of run time storage which is allocated to variables
during run-time.
Size of stack and heap is not fixed. i.e. it may grow or shrink during
program execution.
Storage Organization
The executing target program runs in its own logical address space in which each
program value has a location.
The management and organization of this logical address space is shared between
the compiler, operating system and target machine. The operating system maps the
logical addresses into physical addresses, which are usually spread throughout
memory.
Run-time storage comes in blocks, where a byte is the smallest unit of
addressable memory. Typically four bytes form a machine word. Multi-byte
objects are stored in consecutive bytes and given the address of the first byte.
The storage layout for data objects is strongly influenced by the addressing
constraints of the target machine.
A character array of length 10 needs only enough bytes to hold 10
characters, but a compiler may allocate 12 bytes to satisfy alignment,
leaving 2 bytes unused.
This unused space due to alignment considerations is referred to as padding.
The size of some program objects is known at compile time; such objects
can be placed in an area called static.
The dynamic areas used to maximize the utilization of space at run time are
stack and heap.
Storage organization strategies
Three different strategies based on the division of run time storage
Static allocation
Stack allocation
Heap allocation
Static Allocation
The size of data objects is known at compile time. The names of these
objects are bound to storage at compile time, and such allocation is
called static allocation. The amount of storage does not change during
run time. At compile time, the compiler can fill in the addresses at which
the target code will find the data it operates on.
The main limitation is that recursive procedures are not supported by this
type of allocation, because of its static nature.
Stack allocation
Here the storage is organized as a stack (LIFO).
This stack is also called the control stack.
As an activation begins, its activation record is pushed onto the stack,
and on completion of the activation the corresponding activation record
is popped off.
The local variables are stored in each activation record, so locals are
bound to fresh storage on each new activation.
Data structures can be created dynamically under stack allocation.
Memory addressing is done using pointers or index registers, hence stack
allocation is slower than static allocation.
Heap allocation
If the values of non-local variables must be retained even after an
activation ends, such retention is not possible with stack allocation
because of its LIFO nature. Hence, heap allocation is used in such
situations.
Heap allocation allocates contiguous blocks of memory when required,
for storage of activation records or other data objects.
This memory can be deallocated when the activation ends.
Heap management can be done by creating a linked list for the free blocks
and when any memory is deallocated that block can be appended to the
list.
Activation record
Activation record is a block of memory used for managing information
needed by a single execution of a procedure. The contents of activation
record are,
Temporaries: temporary variables used during evaluation of expressions.
Local data: it is the data that is local to the execution of procedure.
Saved machine status: this field holds information about the status of the
machine just before the procedure is called, such as the machine registers
and the program counter.
Control link: an optional field that points to the activation record of the
calling procedure; this link is also called the dynamic link.
Access link: an optional field pointing to non-local data that the called
procedure needs but that is found elsewhere, i.e. in another activation
record.
Actual parameters: passed during call.
Return value: stores result of function call.
(Figure: activation tree representing calls during an execution of quicksort)
(Figure: downward-growing stack of activation records)
Block and non block structure storage allocation
The storage allocation is done for two types of data,
Local data
Non-local data
The local data can be handled using activation record whereas non local data
can be handled using scope information.
Access to Local Data
The local data can be accessed with the help of the activation record.
An offset relative to the base pointer of an activation record locates each
local variable within that record.
Hence, a reference to any variable x in a procedure = base pointer of the
procedure's activation record + offset of x from the base pointer.
E.g. consider the following,
procedure A
    int a;
    procedure B
        int b;
        body of B;
    body of A;
Contents of the stack along with the base pointer are shown below,
(Figure: activation record for A on the stack, showing return value, saved
registers, parameters and locals; the local a lies at a fixed offset from
the base pointer)
Access to non local Data
A procedure may sometimes refer to variables which are not local to it.
Such variables are called as non-local variables.
For non-local names there are two types of scope rules: static and
dynamic.
The static scope rule is also called lexical scope. In this type the scope is
determined by examining the program text.
Languages that use static scope rules are called block-structured
languages.
Dynamic scope rules determine the scope of a declaration at run time, by
considering the current activations.
Static or lexical scope
A block is a sequence of statements containing local data declarations and
enclosed within delimiters.
Blocks can be nested.
The scope of a declaration in a block-structured language is given by the
most closely nested rule.
(Figure: stack with activation records for A and B; the local b lies at a
fixed offset within B's record)
E.g.
scope_test()
{                       /* block B1 */
    int p, q;
    {                   /* block B2 */
        int p;
        {               /* block B3 */
            int r;
        }
        ......
    }
    ....
    {                   /* block B4 */
        int q, s, t;
    }
}
The storage for the names corresponding to a particular block can be shown
below.
(Figure: storage layout for the names in blocks B1-B4)
Lexical scope can be implemented using access links and displays.
Access link:
The implementation of lexical scope can be obtained by adding a pointer to
each activation record. These pointers are called access links.
If a procedure p is nested immediately within a procedure q, then the access
link in an activation record of p points to the most recent activation record of q.
Display:
It is expensive to traverse down access links every time a particular non-local
variable is accessed. Access to non-locals can be speeded up by maintaining
an array of pointers called a display.
In display,
An array of pointers to activation record is maintained.
Array is indexed by nesting level
The pointers point to only accessible activation record.
The display changes when a new activation occurs and it must be reset
when control returns from the new activation.
Display stack
The advantage of using a display is that, if p is executing and it needs to access
an element x belonging to some procedure q, we need to look only in display[i],
where i is the nesting depth of q. We follow the pointer display[i] to the activation
record for q, wherein x is found at some offset.
The compiler knows what i is, so it can access display[i] easily. Hence there is no
need to follow a long chain of access links.
(Figure: display with entries display[0], display[1], display[2] pointing to the
accessible activation records)
Heap Management
The heap is the portion of memory that holds data whose lifetime is not tied to
a single activation, for example dynamically allocated objects that are to be
used throughout the program.
Hence managing the heap is important.
A special software called Memory Manager manages allocation and
deallocation of memory.
Memory Manager:
Two basic functions of memory manager are,
Allocation: when a program requests memory for a variable, the
memory manager produces a chunk of heap memory of the requested size.
Deallocation: the memory manager deallocates the space and adds it to the
pool of free space so that it can be reused.
Desired Properties of memory managers:
Space efficiency: should minimize the total heap space required by a
program
Program efficiency: should make good use of space, so that the program
runs faster.
Low overhead: allocation and deallocation themselves should be efficient.
Two types of memory allocation techniques are,
Explicit allocation
Implicit allocation
Explicit allocation and deallocation are done by the programmer, using
constructs such as new and dispose, whereas implicit memory management
is done by the compiler using run-time support packages.
Explicit allocation
The simplest technique is explicit allocation of fixed-size blocks.
In this technique a free list is used, which is a set of free blocks linked to
each other in a list structure. Memory is allocated from this list.
Allocation removes a block from the head of the free list, and deallocation
links the freed block back onto the list.
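The fixed-size-block free list can be sketched in C as follows (the sizes and names are illustrative):

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_COUNT 8

/* Each free block reuses its own first word as the link to the next
   free block, so the free list costs no extra space. */
union block {
    union block *next;
    unsigned char payload[32];
};

static union block heap_area[BLOCK_COUNT];
static union block *free_list = NULL;

void heap_init(void)
{
    free_list = NULL;
    for (int i = 0; i < BLOCK_COUNT; i++) {
        heap_area[i].next = free_list;   /* link block into free list */
        free_list = &heap_area[i];
    }
}

/* Allocation removes the head of the free list. */
void *block_alloc(void)
{
    union block *b = free_list;
    if (b)
        free_list = b->next;
    return b;
}

/* Deallocation links the block back onto the free list. */
void block_free(void *p)
{
    union block *b = p;
    b->next = free_list;
    free_list = b;
}
```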
Explicit allocation of variable sized blocks
Due to frequent memory allocation and deallocation, the heap memory
becomes fragmented.
For allocating variable sized blocks we use strategies like first fit, worst fit and
best fit.
Implicit allocation
Implicit allocation is performed by the run-time support package, which
needs to know when a storage block is no longer in use.
Each storage block carries bookkeeping information for this purpose, such
as a reference count.
Reference count (RC): a special counter used during implicit allocation. If
a block is referred to by another block, its reference count is incremented
by one.
If the value of RC drops to 0, the block can be deallocated.
Marking techniques: an alternative approach for determining whether a
block is in use. In this method the user program is suspended temporarily,
and starting from the frozen pointers the blocks that are in use are marked.
Parameter passing Mechanism
Types of parameters
Formal : parameters used in the function definition.
Actual: parameters passed during function call.
What is l-value & r-value?
The r-value is the value of the expression on the right side of the
assignment operator.
The l-value is the address of the memory location (or variable) on the left
side of the assignment operator.
What are the Parameter passing methods ?
Call by value/pass by value.
Call by address/ Pass by reference.
Pass by copy-restore
Pass by Name
Example: swapping of two numbers (using the C language) is used below to
illustrate the mechanisms.
Pass by Value:
In the pass-by-value mechanism, the calling procedure passes the r-values of
the actual parameters, and the compiler puts them into the called procedure's
activation record. The formal parameters then hold the values passed by the
calling procedure. If the values held by the formal parameters are changed,
there is no impact on the actual parameters.
Pass by Reference:
In pass by reference mechanism, the l-value of the actual parameter is copied to the
activation record of the called procedure. This way, the called procedure now has
the address (memory location) of the actual parameter and the formal parameter
refers to the same memory location. Therefore, if the value pointed to by the
formal parameter is changed, the change is seen on the actual parameter, since
both refer to the same location.
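The swap example mentioned earlier can be written in C; pass-by-value leaves the actuals unchanged, while pass-by-reference (simulated with pointers, since C itself passes by value) swaps them:

```c
#include <assert.h>

/* Pass by value: x and y receive copies of the r-values, so changes
   to the formals have no impact on the actual parameters. */
void swap_by_value(int x, int y)
{
    int t = x;
    x = y;
    y = t;
}

/* Pass by reference (simulated with pointers): the l-values of the
   actuals are passed, so the swap is visible to the caller. */
void swap_by_reference(int *x, int *y)
{
    int t = *x;
    *x = *y;
    *y = t;
}
```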
Pass by Copy-restore:
This parameter-passing mechanism works like 'pass-by-reference' except
that the changes to the actual parameters are made only when the called
procedure ends.
Upon a call, the values of the actual parameters are copied into the activation
record of the called procedure. Manipulating the formal parameters has no
immediate effect on the actual parameters, but when the called procedure
ends, the final values of the formal parameters are copied back to the
l-values of the actual parameters.
Example:
int y;
calling_procedure()
{
    y = 10;
    copy_restore(y);   /* l-value of y is passed */
    printf("%d", y);   /* prints 99 */
}
copy_restore(int x)
{
    x = 99;   /* y still has value 10 (unaffected) */
    y = 0;    /* y is now 0 */
}
When copy_restore ends, the final value of the formal parameter x is copied
back to the actual parameter y. Even though y was changed before the
procedure ended, the copied-back value of x overwrites it, making the call
behave like call by reference.
Pass by Name
Languages like Algol provide a parameter-passing mechanism that works like
the preprocessor in the C language. In the pass-by-name mechanism, the call
is in effect replaced by the procedure body, with the argument expressions
textually substituted for the corresponding formal parameters, so that the
body works on the actual parameters, much like pass-by-reference.
Garbage collection
The process of collecting unused memory (previously allocated to
variables/objects and no longer needed) and returning it to a pool so that it
can be reused is called GARBAGE COLLECTION.
A few languages support automatic garbage collection; in other languages
memory must be managed explicitly.
The basic idea is to keep track of what memory is referenced, and when it is
no longer accessible, reclaim it. Example: a linked list whose nodes are no
longer reachable.
Reference count garbage collectors
Garbage collection (GC) works as follows.
When an application needs some free space to allocate nodes and there is
no free space available to allocate memory to the objects then a system
routine called GARBAGE COLLECTOR is invoked.
The routine then searches the system for the nodes that are no longer
accessible from the external pointers. These nodes can be made available
for reuse by adding them to the available pool.
Reference count is a special counter used during implicit memory
allocation. If any block is referred by any other block then its reference
count is incremented by one.
The block is deallocated as soon as the reference count becomes zero.
Garbage collectors of this kind are called reference-count garbage
collectors.
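Reference counting can be sketched in C as follows (a simplified illustration; a real collector must also handle pointers stored inside blocks, and cycles, which plain reference counting cannot reclaim):

```c
#include <assert.h>
#include <stdlib.h>

struct block {
    int rc;        /* reference count */
    void *data;
};

static int deallocated = 0;   /* for demonstration only */

struct block *block_new(void)
{
    struct block *b = calloc(1, sizeof *b);
    b->rc = 1;     /* one reference: the creator */
    return b;
}

/* A new reference to the block increments its count by one. */
void block_ref(struct block *b)
{
    b->rc++;
}

/* Dropping a reference decrements the count; the block is
   deallocated as soon as the count becomes zero. */
void block_unref(struct block *b)
{
    if (--b->rc == 0) {
        deallocated++;
        free(b);
    }
}
```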
Advantages of GC are,
The manual memory management done by the programmer (using malloc,
realloc, free) is time consuming and error prone.
Reusability of memory is achieved using garbage collection.
Disadvantages are,
The execution of program is stopped or paused during garbage collection.
Sometimes a situation called Thrashing occurs.
CODE GENERATION
Introduction
The final phase of a compiler is the code generator.
It receives an intermediate representation (IR) along with the information
in the symbol table, and produces a semantically equivalent target program.
The code generator's main tasks are:
Instruction selection
Register allocation and assignment
Instruction ordering
Issues in the Design of Code Generator:
The most important criterion is that the code generator produce correct code.
Input to the code generator: IR + Symbol table
We assume the front end produces a low-level IR, i.e. one in which the
values of names can be directly manipulated by the target machine, and
that syntactic and semantic errors have already been detected.
The target program
Common target architectures are RISC, CISC and stack-based machines.
In this chapter we use a very simple RISC-like computer, with the addition
of some CISC-like addressing modes.
Instruction selection:
The code generator must map the IR program into a code sequence that can be
executed by the target machine. The complexity of this mapping depends on,
The level of the IR (intermediate representation)
The nature of the instruction-set architecture
The desired quality of the generated code (speed & size)
Register allocation
The two sub-problems are,
Register allocation: selecting the set of variables that will reside in
registers at each point in the program.
Register assignment: selecting the specific register that a variable resides in.
Evaluation ordering
The order in which computations are performed can affect the efficiency
of the target code, because some computation orders require fewer
registers to hold intermediate results than others.
The Target Language
The target language supports the following operations,
Load operations: LD r,x and LD r1, r2
Store operations: ST x,r
Computation operations: OP dst, src1, src2
Unconditional jumps: BR L
Conditional jumps: Bcond r, L like BLTZ r, L
The simple target machine model uses following addressing modes.
variable name: x
indexed address: a(r) like, LD R1, a(R2) means R1=contents(a + contents(R2))
integer indexed by a register : like LD R1, 100(R2)
Indirect addressing mode: *r means the memory location whose address is held
in the location given by the contents of register r, and *100(r) means
contents(contents(100 + contents(r)))
immediate constant addressing mode: like LD R1, #100
The three-address statement x = y - z can be implemented by the machine instructions,
LD R1, y
LD R2, z
SUB R1, R1, R2
ST x, R1
Suppose a is an array whose elements are 8-byte real numbers. Then,
b = a[i]
LD R1, i //R1 = i
MUL R1, R1, 8 //R1 = R1 * 8
LD R2, a(R1) //R2=contents(a+contents(R1))
ST b, R2 //b = R2
a[j] = c
LD R1, c //R1 = c
LD R2, j // R2 = j
MUL R2, R2, 8 //R2 = R2 * 8
ST a(R2), R1 //contents(a+contents(R2))=R1
x=*p
LD R1, p //R1 = p
LD R2, 0(R1) // R2 = contents(0+contents(R1))
ST x, R2 // x=R2
A conditional-jump three-address instruction such as
if x < y goto M
LD R1, x // R1 = x
LD R2, y // R2 = y
SUB R1, R1, R2 // R1 = R1 - R2
BLTZ R1, M // if R1 < 0 jump to M
Basic blocks and flow graphs
A flow graph is a graph representation of intermediate code that is helpful for
code generation.
We partition the intermediate code into basic blocks, which are maximal
sequences of three-address instructions with the properties that,
The flow of control can only enter the basic block through the first
instruction in the block; that is, there are no jumps into the middle of the
block.
Control leaves the block without halting or branching, except possibly
at the last instruction in the block.
The basic blocks become the nodes of a flow graph, whose edges indicate
which block can follow which other blocks.
Rules for finding leaders
The first three-address instruction in the intermediate code is a leader.
Any instruction that is the target of a conditional or unconditional jump is a
leader.
Any instruction that immediately follows a conditional or unconditional jump
is a leader.
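The three leader rules can be sketched in C over a simplified instruction representation (the struct layout is an assumption for illustration):

```c
#include <assert.h>

/* A three-address instruction, reduced to what leader-finding needs:
   whether it jumps, and if so to which instruction index. */
struct instr {
    int is_jump;   /* conditional or unconditional jump? */
    int target;    /* index of the jump target, -1 if none */
};

/* Mark leaders by the three rules: the first instruction, every jump
   target, and every instruction immediately following a jump. */
void find_leaders(const struct instr *code, int n, int *leader)
{
    for (int i = 0; i < n; i++)
        leader[i] = 0;
    if (n > 0)
        leader[0] = 1;                    /* rule 1 */
    for (int i = 0; i < n; i++) {
        if (code[i].is_jump) {
            leader[code[i].target] = 1;   /* rule 2 */
            if (i + 1 < n)
                leader[i + 1] = 1;        /* rule 3 */
        }
    }
}
```

Each leader then begins a basic block that extends up to, but not including, the next leader.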
Example: intermediate code to set a 10x10 matrix to the identity.
(The source loop and its three-address code are omitted in this copy; the
flow-graph example below refers to that three-address code.)
Flow graph
Once basic blocks are constructed, the flow of control between these block can
be represented using edges.
There is an edge from B to C if and only if it is possible for the first
instruction of block C to immediately follow the last instruction of block B.
There are two ways such an edge can be justified.
There is a conditional or unconditional jump from the end of B to the beginning of C
C immediately follows B in the original order of the three address
instructions and B does not end in an unconditional jump.
Example: for the three-address code mentioned above, the leader instructions are
1, 2, 3, 10, 12 and 13, by the rules above (instruction 1 is the first instruction;
the others are jump targets or immediately follow jumps). Using these leaders we
obtain 6 basic blocks, which are then connected by edges as shown below.
(Figure: the resulting flow graph)
A Simple code generator
One of the primary issues in code generation is deciding how to use registers to
best advantage. The principal uses of registers are,
In most machine architectures, some or all of the operands of an operation must
be in registers in order to perform the operation.
Registers make good temporaries - places to hold the result of a subexpression
while a larger expression is being evaluated.
Registers are used to hold global values that are computed in one basic block
and used in other blocks.
Registers are often used to help with run-time storage management, for
example, to manage the run-time stack, including the maintenance of stack
pointers and possibly the top elements of the stack itself.
Descriptors
For each available register, a register descriptor keeps track of the variable
names whose current value is in that register. Since we shall use only those
registers that are available for local use within a basic block, we assume that
initially, all register descriptors are empty. As the code generation progresses,
each register will hold the value of zero or more names.
For each program variable, an address descriptor keeps track of the location or
locations where the current value of that variable can be found. The location
might be a register, a memory address, a stack location, or some set of more
than one of these. The information can be stored in the symbol-table entry for
that variable name.
The Code-generation Algorithm
This algorithm uses a function getReg(I), which selects registers for each
memory location associated with instruction I.
Use getReg(x = y + z) to select registers for x, y, and z. Call these Rx, Ry and
Rz.
If y is not in Ry (according to the register descriptor for Ry), then issue an
instruction LD Ry, y', where y' is one of the memory locations for y (according
to the address descriptor for y).
Similarly, if z is not in Rz, issue an instruction LD Rz, z', where z' is a
location for z.
Issue the instruction ADD Rx , Ry, Rz.
Rules for updating the register and address descriptors
1. For the instruction LD R, x
Change the register descriptor for register R so it holds only x.
Change the address descriptor for x by adding register R as an additional
location.
2. For the instruction ST x, R, change the address descriptor for x to include its
own memory location.
3. For an operation such as ADD Rx, Ry, Rz implementing a three-address
instruction x = y + z:
Change the register descriptor for Rx so that it holds only x.
Change the address descriptor for x so that its only location is Rx. Note
that the memory location for x is no longer in the address descriptor for x.
Remove Rx from the address descriptor of any variable other than x.
4. When we process a copy statement x = y, after generating the load of y into
register Ry, if needed, and after managing descriptors as for all load statements
(per rule 1):
Add x to the register descriptor for Ry.
Change the address descriptor for x so that its only location is Ry .
Example
Let us consider a basic block containing the following 3-address code.
t=a-b
u=a-c
v=t+u
a=d
d=v+u
The instructions generated, and the changes in the register and address
descriptors, are shown below.
(Figure omitted: the generated code with the descriptor contents after each
statement.)
Rules for picking register Ry for y
If y is currently in a register, pick a register already containing y as Ry. Do not
issue a machine instruction to load this register, as none is needed.
If y is not in a register, but there is a register that is currently empty, pick one
such register as Ry.
The difficult case occurs when y is not in a register, and there is no register that
is currently empty. We need to pick one of the allowable registers anyway, and
we need to make it safe to reuse.
Peephole optimizations
An alternative approach to code generation is to generate naïve code and then
improve the quality of the target code by applying optimizations.
A simple but effective technique for locally improving the target code is
peephole optimization.
This is done by examining a sliding window of target instructions (called
peephole) and replacing the instruction sequences within the peephole by a
shorter or faster sequence whenever possible.
Characteristics of peephole optimizations
Redundant-instruction elimination
Flow-of-control optimizations
Algebraic simplifications
Use of machine idioms
Redundant-instruction elimination
LD R0, a
ST a, R0
If the store carries no label, the second instruction is redundant and can be
deleted, since the value of a is already in R0.
Eliminating unreachable code
Example 1:
sum = 0;
if (sum)
    printf("%d", sum);   /* unreachable: sum is always 0 */
Example 2:
int fun(int a, int b)
{
    c = a + b;
    return c;
    printf("%d", c);     /* unreachable: follows the return */
}
Flow-of-control optimizations
goto L1
...
L1: goto L2
can be replaced by:
goto L2
Algebraic simplifications
There is no end to the amount of algebraic simplification that can be attempted
through peephole optimization. Only a few algebraic identities occur frequently
enough that implementing them is worthwhile. For example, statements such as
x := x + 0
or
x := x * 1
are often produced by straightforward intermediate code-generation algorithms,
and they can be eliminated easily through peephole optimization.
Reduction in Strength: Reduction in strength replaces expensive operations by
equivalent cheaper ones on the target machine. Certain machine instructions are
considerably cheaper than others and can often be used as special cases of more
expensive operators.
For example, x² is invariably cheaper to implement as x*x than as a call to an
exponentiation routine. Fixed-point multiplication or division by a power of two is
cheaper to implement as a shift. Floating-point division by a constant can be
implemented as multiplication by a constant, which may be cheaper.
x² → x * x
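These strength reductions can be shown directly in C (the shift forms assume non-negative fixed-point operands, since shifting negative values rounds differently from division):

```c
#include <assert.h>

/* x * x in place of a call to an exponentiation routine */
int square(int x)
{
    return x * x;
}

/* Fixed-point multiplication and division by a power of two
   replaced by shifts. */
int times8(int x)
{
    return x << 3;   /* x * 8 */
}

int div4(int x)
{
    return x >> 2;   /* x / 4, for non-negative x */
}
```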
Use of Machine Idioms:
Some target machines have machine idioms: single instructions for operations
that would otherwise take several instructions. Replacing such sequences by the
equivalent machine instruction improves efficiency. E.g. some machines have
auto-increment and auto-decrement addressing modes, which perform an addition
or subtraction as a side effect of an operand access.
Register Allocation and Assignment
Instructions involving register operands are shorter and faster than those involving
operands in memory.
The use of registers is subdivided into two sub problems:
Register allocation – the set of variables that will reside in registers at a point in
the program is selected.
Register assignment – the specific register that a variable will reside in is picked.
Following are the techniques.
Global Register Allocation
Usage Counts
Register Assignment for Outer Loops
Register Allocation by Graph Coloring
Global register allocation
The algorithm explained previously does local (block-based) register allocation.
This requires that all live variables be stored back to memory at the end of each block.
To save some of these stores and the corresponding loads, we might arrange
to assign registers to frequently used variables and keep these registers
consistent across block boundaries (globally).
Some options are:
Keep the values of variables used in loops in registers
Use a graph-coloring approach for more global allocation
Usage counts
The usage count of a variable x measures how often x is used in the basic
blocks of a loop, and hence the benefit of keeping x in a register there.
The usage count gives an idea of how many units of cost can be saved by
selecting a specific variable for global register allocation.
The approximate saving for keeping x in a register in a loop L is,
∑ over all blocks B in L of ( use(x, B) + 2 * live(x, B) )
where:
For each use of x in B before any definition of x in the block, we add
one unit of saving, giving use(x, B).
live(x, B) is 1 if x is live on exit from B and is assigned a value in B
(adding 2 units of saving), else 0.
Ex:
Here, the usage counts of a, b, c, d, e and f are 4, 5, 3, 6, 4 and 4
respectively. Hence, if we have three registers (R1, R2, R3), they are
allocated to the variables with the highest usage counts: d, b, and one of
a, e or f (breaking the tie arbitrarily).
(Figure: flow graph of an inner loop, from which the usage counts above are computed)
Register assignment for outer loops
Consider two loops L1 and L2, where L2 is nested inside L1, and suppose a
register is to be assigned to some variable a.
The following criteria should be adopted for register assignment for outer loops.
If a is allocated a register in L2, it need not be allocated one in L1 - L2
(the part of L1 outside L2).
If a is allocated in L1 but not in L2, then store a on entrance to L2 and
load a on leaving L2.
If a is allocated in L2 and not in L1, then load a on entrance to L2 and
store a on exit from L2.
Register allocation by Graph coloring
When we need a register but all registers are occupied, we must free some
register for reuse. The selection is done using the following technique
(graph coloring), in two passes.
In the first pass, machine instructions are selected as though there were
unlimited registers: each variable is assigned a symbolic register.
In the second pass, a register-interference graph is constructed. In this
graph each node is a symbolic register, and an edge connects two nodes
when one is live at a point where the other is defined.
A graph-coloring algorithm is then used to assign physical registers so
that no two interfering symbolic registers receive the same physical
register.