Introduction

25
D. M. Akbar Hussain: Department of Software & Media Technology 1 Compiler is tool: which translate notations from one system to another, usually from source code (high level code) to machine code (object code, target code, low level code). Introduction Introduction

description

Introduction. Compiler is tool: which translate notations from one system to another, usually from source code (high level code) to machine code (object code, target code, low level code). Compiler. Source Code. Compiler. Target Code. ?. Error Messages. What is Involved. - PowerPoint PPT Presentation

Transcript of Introduction

Page 1: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

1

Compiler is tool: which translate notations from one system to another, usually from source code (high level code) to machine code (object code, target code, low level code).

IntroductionIntroduction

Page 2: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

2

CompilerCompiler

CompilerCompiler

Error Messages

Source Code

Target Code

?

Page 3: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

3

What is InvolvedWhat is Involved

• Programming Languages

• Components of a Compiler

• Applications

• Formal Languages

Page 4: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

4

Programming LanguagesProgramming Languages

• We use natural languages to communicate

• We use programming languages to speak with computers

Page 5: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

5

Component of CompilersComponent of Compilers

• Analysis

• Lexical Analysis

• Syntax Analysis

• Semantic Analysis

• Synthesis

• Intermediate Code Generation

• Code Optimization

• Code Generation

Page 6: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

6

The InputThe Input

Read string input– sequence of characters– Character set:

• ASCII• ISO Latin-1• ISO 10646 (16-bit = unicode) Ada, Java• Others (EBCDIC, JIS, etc)

Page 7: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

7

The OutputThe Output

Tokens: kind, name

– Punctuation ( ) ; , [ ]

– Operators + - ** :=

– Keywords begin end if while try catch

– Identifiers Square_Root

– String literals “press Enter to continue”

– Character literals ‘x’

– Numeric literals • Integer • Floating_point

Page 8: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

8

Lexical AnalysisLexical Analysis

Free form languagesAll modern languages are free form – White space, tabs, new line, carriage return does not

matter just Ignore these.– Ordering of token is only important

Fixed format languages– Layout is critical

• Fortran, label in cols 1-6• COBOL, area A B• Lexical analyzer must know about layout to find tokens

Page 9: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

9

Relevant FormalismsRelevant Formalisms

Type 3 (Regular) Grammars Regular Expressions Finite State Machines

Useful for program construction, even if hand-written

Page 10: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

10

Interface to Lexical AnalyzerInterface to Lexical Analyzer

Either: Convert entire file to a file of tokens– Lexical analyzer is separate phase

Or: Parser calls lexical analyzer to supply next token– This approach avoids extra I/O– Parser builds tree incrementally, using successive tokens as tree

nodes

Page 11: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

11

Performance IssuesPerformance Issues

Speed– Lexical analysis can become bottleneck

– Minimize processing per character• Skip blanks fast

• I/O is also an issue (read large blocks)

– We compile frequently• Compilation time is important

– Especially during development

– Communicate with parser through global variables

Page 12: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

12

SyntaxSyntax Analysis Analysis (SA) (SA)

Page 13: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

13

SemanticSemantic Analyser Analyserzerzer

Page 14: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

14

Intermediate Code GenerationIntermediate Code Generation

Page 15: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

15

Code OptimizationCode Optimization

Page 16: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

16

Code GenerationCode Generation

Page 17: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

17

Data structure toolsData structure tools

Syntax tree: Literal table: Symbol table:

Page 18: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

18

Error handlerError handler

One of the difficult part of a compiler. Must handle a wide range of errors Must handle multiple errors. Must not get stuck. Must not get into an infinite loop (typical simple-minded

strategy:count errors, stop if count gets too high).

Page 19: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

19

Kinds of errorsKinds of errors

?

Page 20: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

20

Sample compilerSample compiler

TINY: Pages 22-26, 4-pass compiler for the TINY language based on Pascal.

C-Minus based on C : Pages 26-27 and Appendix A.

Page 21: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

21

TINY ExampleTINY Example

read x;

if x > 0 then

fact := 1;

repeat

fact := fact * x;

x := x - 1

until x = 0;

write fact

end

Page 22: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

22

C-Minus ExampleC-Minus Example

int fact( int x ){ if (x > 1) return x * fact(x-1); else return 1;}

void main( void ){ int x; x = read(); if (x > 0) write( fact(x) );}

Page 23: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

23

Structure of the TINY CompilerStructure of the TINY Compiler

globals.h main.c

util.h util.c

scan.h scan.c

parse.h parse.c

symtab.h symtab.c

analyze.h analyze.c

code.h code.c

cgen.h cgen.c

Page 24: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

24

Conditional Compilation OptionsConditional Compilation Options

NO_PARSE: Builds a scanner-only compiler.

NO_ANALYZE: Builds a compiler that parses and scans only.

NO_CODE: Builds a compiler that performs semantic analysis, but generates no code.

Page 25: Introduction

D. M. Akbar Hussain: Department of Software & Media Technology

25

Listing Options (built in - not flags)Listing Options (built in - not flags)

EchoSource: Echoes the TINY source program to the listing, together with line numbers.

TraceScan: Displays information on each token as the scanner recognizes it.

TraceParse: Displays the syntax tree in a linearlised format. TraceAnalyze: Displays summary information on the symbol table

and type checking. TraceCode: Prints code generation-tracing comments to the code

file.