Code Generation Compiler Baojian Hua [email protected].

Date posted: 21-Dec-2015

Code Generation

Compiler

Baojian Hua

[email protected]

Front End (pipeline figure): source code -> lexical analyzer -> tokens -> parser -> abstract syntax tree -> semantic analyzer -> IR

Back End (pipeline figure): IR -> instruction selector -> register allocator -> instruction scheduler, with Assem and TempMap as the intermediate results

Code Generation

  • Generating code for some ISA
    • this course uses x86
  • Many components
    • instruction selection, register allocation, scheduling, …
  • Many different strategies
    • for now, we concentrate on a simple one: the stack machine
    • later in this course, we will turn to more advanced (and sophisticated) ones

What’s a stack machine?

  • A stack machine has only an operand stack and no (or few) registers
    • all computation is performed on the operand stack
    • the architecture is very simple and uniform
  • Long history
    • dates back at least to the 1970s
    • renewed industry interest in the recent decade: Sun’s JVM, Microsoft’s CLR, etc.

Stack Machine ISA: s86

prog -> instr prog

-> (empty)

instr -> push v

-> pop id

-> add

-> sub

-> times

-> divide

v -> num

-> id

// Sample Program

push 8

push 2

push x

times

sub
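To see what the sample program computes, here is a minimal Python sketch of an s86 interpreter. The name `run_s86`, the one-instruction-per-line text format and the dictionary used as the variable store are assumptions made for illustration; the slides do not define them. `divide` is assumed to be integer division.

```python
def run_s86(program, env):
    """Run s86 text (one instruction per line); return (stack, env)."""
    stack = []
    env = dict(env)  # variable store: name -> integer value
    for line in program.strip().splitlines():
        parts = line.split()
        if not parts or parts[0].startswith("//"):
            continue  # skip blank lines and comments
        op = parts[0]
        if op == "push":
            v = parts[1]
            # v -> num | id
            stack.append(int(v) if v.lstrip("-").isdigit() else env[v])
        elif op == "pop":
            env[parts[1]] = stack.pop()
        elif op in ("add", "sub", "times", "divide"):
            right = stack.pop()  # the operand pushed last
            left = stack.pop()
            if op == "add":
                stack.append(left + right)
            elif op == "sub":
                stack.append(left - right)
            elif op == "times":
                stack.append(left * right)
            else:
                stack.append(left // right)  # assumption: integer division
        else:
            raise ValueError(f"unknown s86 instruction: {line}")
    return stack, env


sample = """
push 8
push 2
push x
times
sub
"""
stack, _ = run_s86(sample, {"x": 3})
print(stack)  # [2], i.e. 8 - 2*3
```

Note the operand order: the right operand is popped first, so `sub` computes left minus right.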

The simple expression language

// recall our simple expression language

exp -> num

-> id

-> exp + exp

-> exp - exp

-> exp * exp

-> exp / exp

-> (exp)

// or in ML

datatype exp

= Num of int

| Id of string

| Add of exp * exp

| Sub of exp * exp

| Times of exp * exp

| Divide of exp * exp

// Sample Program

8-2*x

Code generation from exp to s86

C (num) = push num

C (id) = push id

C (e1 + e2) = C (e1); C (e2); add

C (e1 - e2) = C (e1); C (e2); sub

C (e1 * e2) = C (e1); C (e2); times

C (e1 / e2) = C (e1); C (e2); divide

Code generation from exp to s86, cont’

// or in ML (push, add, etc. stand for emitting that s86 instruction)

fun C (e) =
  case e
   of Num i => push i
    | Id s => push s
    | Add (e1, e2) =>
        (C (e1);
         C (e2);
         add)
    | … => (* similar *)

Example

C (8-2*x) = C(8); C(2*x); sub
          = push 8; C(2*x); sub
          = push 8; C(2); C(x); times; sub
          = …
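The translation function C can be sketched as a recursive Python function that returns a list of s86 instructions. The class names `Num`, `Id` and `BinOp` and the function name `compile_exp` are assumptions for illustration; they stand in for the ML datatype on the earlier slide.

```python
from dataclasses import dataclass


@dataclass
class Num:
    value: int


@dataclass
class Id:
    name: str


@dataclass
class BinOp:
    op: str        # one of "+", "-", "*", "/"
    left: object
    right: object


OPCODES = {"+": "add", "-": "sub", "*": "times", "/": "divide"}


def compile_exp(e):
    """C(e): return the list of s86 instructions for expression e."""
    if isinstance(e, Num):
        return [f"push {e.value}"]
    if isinstance(e, Id):
        return [f"push {e.name}"]
    if isinstance(e, BinOp):
        # C(e1); C(e2); then the operator instruction
        return compile_exp(e.left) + compile_exp(e.right) + [OPCODES[e.op]]
    raise TypeError(f"not an expression: {e!r}")


# 8 - 2*x
ast = BinOp("-", Num(8), BinOp("*", Num(2), Id("x")))
print("\n".join(compile_exp(ast)))
```

For `8-2*x` this prints the same five instructions as the s86 sample program: `push 8`, `push 2`, `push x`, `times`, `sub`.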

Moral

  • Code generation for a stack machine is dirt simple
    • a recursive equation from the point of view of math
    • a recursive function from the point of view of CS
    • think before you hack!
  • But we have more to say about:
    • variable storage
    • more language features: statements, declarations, functions, etc.

Address space

  • The address space is the way programs use memory
    • highly architecture and OS dependent
    • below is the typical layout of 32-bit x86/Linux

    0xffffffff
                  OS
    0xc0000000
                  stack
                  heap
                  data
                  text
    0x08048000
    0x00100000
                  BIOS, VGA
    0x00000000

Static Storage

  • Static storage is an area of space in the data section
    • a typical use is to hold C/C++ file-scope variables (static) and extern variables (global)
  • The exp language has only static variables
    • all of them can be stored in the static section
    • so a pass is required to collect all variables
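The collecting pass is a plain tree walk. Below is a minimal Python sketch; the tuple encoding of expressions, such as `("id", "x")` and `("-", e1, e2)`, and the name `collect_vars` are assumptions for illustration only.

```python
def collect_vars(e, seen=None):
    """Return the variable names used in e, in order of first occurrence."""
    if seen is None:
        seen = []
    tag = e[0]
    if tag == "id":
        if e[1] not in seen:
            seen.append(e[1])
    elif tag != "num":
        # binary operator: visit both operands
        collect_vars(e[1], seen)
        collect_vars(e[2], seen)
    return seen


# 8 - 2*x
ast = ("-", ("num", 8), ("*", ("num", 2), ("id", "x")))
print(collect_vars(ast))  # ['x']
```

Each collected name then gets one slot in the data section.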

Declarations

// scale exp a bit

prog -> decs exp

decs -> int id; decs

-> (empty)

exp -> …

// or in ML

datatype decs =

T of {var: string,

ty: tipe} list

// Sample Program

int x;

8-2*x;

Code generation rules

D (int id; decs) =
    id:
      .int 0
    D (decs)

D ( ) =        // empty declaration list: emit nothing
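The D rules can be sketched in Python as follows. The list-of-pairs input mirrors the ML `decs` record list; the name `compile_decs` is an assumption, and the loop is an iterative equivalent of the recursive rule rather than a literal transcription.

```python
def compile_decs(decs):
    """D(decs): a label and a zero-initialised .int for each declared variable."""
    lines = []
    for var, ty in decs:
        if ty != "int":
            raise ValueError(f"unsupported type: {ty}")
        lines.append(f"{var}:")
        lines.append("    .int 0")
    return lines  # an empty declaration list yields no output


# int x;
print("\n".join(compile_decs([("x", "int")])))
```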

Statement

// scale the exp language a bit by adding the following:

s -> id = e;

-> if (e) s else s

// compile:

CS (id = e;) = C (e); pop id

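Statement, cont’

// s86 should also be modified: it needs labels and the jumps jz and jmp!

// compile:

CS (if (e) s1 else s2) =
    C (e)
    jz .Lfalse
  .Ltrue:
    CS (s1)
    jmp .Lend
  .Lfalse:
    CS (s2)
  .Lend:

// .Ltrue, .Lfalse and .Lend must be fresh labels for every if

// (figure: the syntax tree of an if statement, with children e, s1 and s2)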
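A minimal Python sketch of CS for assignment and if, with a counter that gives every if its own labels. The tuple encoding of statements and the names `compile_exp`, `compile_stm` and `_label_ids` are assumptions for illustration. The sketch also assumes that `jz` pops the top of the operand stack and jumps when it is zero, which the slides do not spell out.

```python
import itertools

_label_ids = itertools.count()  # source of fresh label numbers


def compile_exp(e):
    """C(e) for ("num", n), ("id", x) and (op, e1, e2)."""
    tag = e[0]
    if tag in ("num", "id"):
        return [f"push {e[1]}"]
    opcode = {"+": "add", "-": "sub", "*": "times", "/": "divide"}[tag]
    return compile_exp(e[1]) + compile_exp(e[2]) + [opcode]


def compile_stm(s):
    """CS(s) for ("assign", id, e) and ("if", e, s1, s2)."""
    if s[0] == "assign":
        return compile_exp(s[2]) + [f"pop {s[1]}"]
    if s[0] == "if":
        n = next(_label_ids)  # one number shared by the three labels
        l_true, l_false, l_end = f".Ltrue{n}", f".Lfalse{n}", f".Lend{n}"
        return (compile_exp(s[1])
                + [f"jz {l_false}", f"{l_true}:"]
                + compile_stm(s[2])
                + [f"jmp {l_end}", f"{l_false}:"]
                + compile_stm(s[3])
                + [f"{l_end}:"])
    raise ValueError(f"unknown statement: {s!r}")


# if (x) y = 1; else y = 2;
stm = ("if", ("id", "x"),
       ("assign", "y", ("num", 1)),
       ("assign", "y", ("num", 2)))
print("\n".join(compile_stm(stm)))
```

Without the counter, two if statements in one program would both define `.Lend`, and the assembler would reject the duplicate label.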

Moral

  • It’s also straightforward to translate other control structures in this style
    • while, for, switch, etc.
  • This kind of code generation is called recursive descent
    • may be done at parsing time
    • adopted in many compilers
  • Read the offered article on Borland Turbo Pascal 3.0
    • you may safely ignore the Pascal-specific features
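As an illustration of the first point, a while loop can be translated in the same style as if. The sketch below is hypothetical: the statement language in these slides has no while, and `compile_while` simply takes the already generated code for the condition and the body. It assumes, as for if, that `jz` pops the top of the operand stack and jumps when it is zero.

```python
import itertools

_label_ids = itertools.count()  # source of fresh label numbers


def compile_while(cond_code, body_code):
    """CS(while (e) s), given the generated code for e and for s."""
    n = next(_label_ids)
    l_start, l_end = f".Lstart{n}", f".Lend{n}"
    return ([f"{l_start}:"]
            + cond_code
            + [f"jz {l_end}"]       # leave the loop when e is zero
            + body_code
            + [f"jmp {l_start}",    # otherwise test e again
               f"{l_end}:"])


# while (x) x = x - 1;
code = compile_while(["push x"], ["push x", "push 1", "sub", "pop x"])
print("\n".join(code))
```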

From s86 to x86

  • How do we run the generated s86 code?
    • design a virtual machine, as we did in lab #1
      • this is also the way of the JVM and the CLR
    • translate to native code and then execute it
      • so-called just-in-time (JIT) compilation
      • the dominant approach for OO languages today…
  • Next, we discuss the 2nd method by mapping s86 to x86

Operand Stack

// x86 does not have a dedicated operand stack?
// Solution 1: use the control stack: ebp, esp
//   left to you.
// Solution 2: make a fake operand stack, as in:

        .set PAGE, 4096

        .data
opStack:
        .space PAGE, 0xcc
top:
        .int opStack+PAGE

// “top” points to the stack top, and the stack grows
// down to lower addresses
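The behaviour of this fake operand stack can be modelled in a few lines of Python. This is only a simulation under stated assumptions: the class `OperandStack` is hypothetical, `top` is an offset into the page rather than an absolute address, and values are 32-bit little-endian integers as on x86.

```python
import struct

PAGE = 4096


class OperandStack:
    def __init__(self, size=PAGE):
        self.memory = bytearray([0xCC] * size)  # like .space PAGE, 0xcc
        self.top = size                         # like opStack+PAGE: empty stack

    def push(self, value):
        """Move top down by 4 bytes, then store the value there."""
        if self.top < 4:
            raise OverflowError("operand stack overflow")
        self.top -= 4
        struct.pack_into("<i", self.memory, self.top, value)

    def pop(self):
        """Load the value at top, then move top up by 4 bytes."""
        if self.top >= len(self.memory):
            raise IndexError("operand stack underflow")
        (value,) = struct.unpack_from("<i", self.memory, self.top)
        self.top += 4
        return value


s = OperandStack()
s.push(8)
s.push(2)
print(s.top)             # 4088: two 4-byte slots below the end of the page
print(s.pop(), s.pop())  # 2 8
```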

Instructions

// map fake s86 instructions to x86’s:

        .macro s86push x
        sub dword ptr [top], 4
        mov ebx, [top]
        mov eax, \x
        mov [ebx], eax
        .endm

// others are similar.

// Care must be taken to account for the
// machine constraints. For instance, a mem-mem
// move is illegal on x86.
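One way the remaining instructions might be mapped is sketched below as a small Python function that returns x86 text in the same Intel syntax as the s86push macro. The function name `s86_to_x86` and the instruction sequences for pop, add and sub are assumptions, not taken from the slides, and have not been assembled or run. times and divide are left out because imul and idiv have extra operand constraints of their own.

```python
def s86_to_x86(instr):
    """Return x86 lines (Intel syntax) for one s86 instruction."""
    op, *args = instr.split()
    if op == "push":
        # same sequence as the s86push macro
        return ["sub dword ptr [top], 4",
                "mov ebx, [top]",
                f"mov eax, {args[0]}",
                "mov [ebx], eax"]
    if op == "pop":
        # go through eax: a direct memory-to-memory move is illegal
        return ["mov ebx, [top]",
                "mov eax, [ebx]",
                f"mov dword ptr [{args[0]}], eax",
                "add dword ptr [top], 4"]
    if op in ("add", "sub"):
        return ["mov ebx, [top]",          # ebx points at the right operand
                "mov eax, [ebx]",          # eax = right operand
                "add dword ptr [top], 4",  # drop it; the left operand is now on top
                f"{op} [ebx+4], eax"]      # left = left op right, in place
    raise NotImplementedError(f"no mapping sketched for: {instr}")


for line in s86_to_x86("pop x"):
    print(line)
```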