
Assembly Language

Part 3

Symbols

• Symbols are assembler names for memory addresses

• Can be used to label data or instructions

• Syntax rules:
  – start with a letter
  – contain letters & digits
  – 8 characters max
  – CASE sensitive

• Define by placing symbol label at start of line, followed by colon

Symbol Table

• Assembler stores labels & corresponding addresses in lookup table called symbol table

• Value of a symbol is the memory address of the 1st byte (of the data or instruction it labels)

• Symbol table only stores label & address, not nature of what is stored

• Instruction can still be interpreted as data, & vice versa
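A minimal C++ sketch of the idea, not the Pep/8 assembler's actual implementation: a first pass walks the source lines, keeps a running address, and records each label against the address of its first byte. The `Line` type, the function name `buildSymbolTable`, and the byte sizes used in the usage note are all hypothetical.

```cpp
#include <map>
#include <string>
#include <vector>

// One source line as a first pass might see it: an optional label and the
// number of bytes the line occupies in memory. (Hypothetical type.)
struct Line {
    std::string label;  // empty if the line has no label
    int size;           // bytes occupied by the instruction or data
};

// Build a label -> address table. Only the address is stored; nothing
// records whether the bytes are data or an instruction.
std::map<std::string, int> buildSymbolTable(const std::vector<Line>& lines) {
    std::map<std::string, int> table;
    int address = 0;
    for (const Line& line : lines) {
        if (!line.label.empty()) {
            table[line.label] = address;  // address of the 1st byte
        }
        address += line.size;
    }
    return table;
}
```

For instance, an unlabeled 3-byte line followed by a 13-byte line labeled greeting and a 3-byte line labeled main would give greeting the value 3 and main the value 16.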

Example

Example program:

this:   deco  this, d
        stop
        .end

Output:

14592

What happened?
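One way to answer the question, as a sketch: the label this has the value 0 (the program starts at address 0x0000), so deco this, d prints whatever 16-bit word sits at address 0, which is the DECO instruction itself. Assuming DECO with direct addressing assembles to opcode 0x39 followed by the operand 0x0000 (an assumption worth checking against the Pep/8 instruction table), the first two bytes in memory are 0x39 0x00. The hypothetical helper `wordAt` below interprets two bytes as a signed 16-bit integer, which is what a decimal output instruction would print.

```cpp
#include <cstdint>

// Combine two consecutive bytes (high byte first) into a signed 16-bit value.
int wordAt(std::uint8_t high, std::uint8_t low) {
    int raw = (high << 8) | low;                 // 0..65535
    return raw >= 0x8000 ? raw - 0x10000 : raw;  // two's complement adjustment
}
```

Under that assumption, wordAt(0x39, 0x00) is 0x3900, which is 14592 in decimal: the instruction was interpreted as data.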

High Level Languages & Compilers

• Compilers translate high level language code into low level language; may be:
  – machine language
  – assembly language
  – for the latter an additional translation step is required to make the program executable

C++/Java example

// C++ code:
#include <iostream>
#include <string>
using namespace std;

string greeting = "Hello world";

int main () {
    cout << greeting << endl;
    return 0;
}

// Java code:
public class Hello {
    static String greeting = "Hello world";

    public static void main (String [] args) {
        System.out.print(greeting);
        System.out.print('\n');
    }
}

Assembly language (approximate) equivalent

          br    main
greeting: .ASCII "Hello world \x00"
main:     stro  greeting, d
          charo '\n', i
          stop
          .end

Data types

• In a high level language, such as Java or C++, variables have the following characteristics:
  – Name
  – Value
  – Data Type

• At a lower level (assembly or machine language), a variable is just a memory location

• The compiler generates a symbol table to keep track of high level language variables

Symbol table entries

The illustration above shows a snippet of output from the Pep/8 assembler. Each symbol table entry includes:

• the symbol

• the value (of the symbol's start address)

• the type (.ASCII in this case)

Pep/8 Branching instructions

• We have already seen the use of BR, the unconditional branch instruction

• Pep/8 also includes 8 conditional branch instructions; these are used to create assembly language control structures

• These instructions are described on the next couple of slides

Conditional branching instructions

• BRLE:
  – branch on less than or equal
  – how it works: if N or Z is 1, PC = operand

• BRLT:
  – branch on less than
  – how it works: if N is 1, PC = operand

• BREQ:
  – branch on equal
  – how it works: if Z is 1, PC = operand

• BRNE:
  – branch on not equal
  – how it works: if Z is 0, PC = operand

Conditional branching instructions

• BRGE:
  – branch on greater than or equal
  – if N is 0, PC = operand

• BRGT:
  – branch on greater than
  – if N and Z are 0, PC = operand

• BRV:
  – branch if overflow
  – if V is 1, PC = operand

• BRC:
  – branch if carry
  – if C is 1, PC = operand
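The eight conditions above can be restated as predicates on the status bits. This is only a C++ restatement of the slide text, not simulator code; the `Status` struct and the lowercase function names are hypothetical. Each function returns true when the branch would be taken (PC = operand).

```cpp
// The four status bits as described on the slides.
struct Status {
    bool n;  // negative
    bool z;  // zero
    bool v;  // overflow
    bool c;  // carry
};

bool brle(Status s) { return s.n || s.z; }    // less than or equal
bool brlt(Status s) { return s.n; }           // less than
bool breq(Status s) { return s.z; }           // equal
bool brne(Status s) { return !s.z; }          // not equal
bool brge(Status s) { return !s.n; }          // greater than or equal
bool brgt(Status s) { return !s.n && !s.z; }  // greater than
bool brv(Status s)  { return s.v; }           // overflow
bool brc(Status s)  { return s.c; }           // carry
```

Note that the predicates pair up as opposites: brle/brgt, brlt/brge, and breq/brne.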

Example

Pep/8 code:

        br    main
num:    .block 2
prompt: .ascii "Enter a number: \x00"
main:   stro  prompt, d
        deci  num, d
        lda   num, d
        brge  endif
        lda   num, d
        nega               ; negate value in a
        sta   num, d
endif:  deco  num, d
        stop
        .end

HLL code:

int num;
Scanner kb = new Scanner(System.in);
System.out.print("Enter a number: ");
num = kb.nextInt();
if (num < 0)
    num = -num;
System.out.print(num);

Analysis of example

Pep/8 code:

        br    main
num:    .block 2
prompt: .ascii "Enter a number: \x00"
main:   stro  prompt, d
        deci  num, d
        lda   num, d
        brge  endif
        lda   num, d
        nega               ; negate value in a
        sta   num, d
endif:  deco  num, d
        stop
        .end

The if statement, if translated back to Java, would now be more like:

if (num >= 0)
    ;
else
    num = -num;

The second lda num, d (the one after brge) requires a little more explanation; see next slide

Analysis continued

• A compiler must be programmed to translate assignment statements; a reasonable translation of x = 3 might be:
  – load a value into the accumulator
  – evaluate the expression
  – store result to variable

• In the case above (and in the assembly language code on the previous page), evaluation of the expression isn’t necessary, since the initial value loaded into A is the only value involved (the second load is really the evaluation of the expression)

Compiler types and efficiency

• An optimizing compiler would perform the necessary source code analysis to recognize that the second load is extraneous
  – advantage: end product (executable code) is shorter & faster
  – disadvantage: takes longer to compile

• So, an optimizing compiler is good for producing the end product, or a product that will be executed many times (for testing); a non-optimizing compiler, because it does the translation quickly, is better for mid-development

Another example

HLL code:

final int limit = 100;
int num;
System.out.print("Enter a #: ");
num = kb.nextInt();   // kb is a Scanner, as in the previous example
if (num >= limit)
    System.out.print("high");
else
    System.out.print("low");

Pep/8 code:

        br    main
limit:  .equate 100
num:    .block 2
high:   .ascii "high\x00"
low:    .ascii "low\x00"
prompt: .ascii "Enter a #: \x00"
main:   stro  prompt, d
        deci  num, d
if:     lda   num, d
        cpa   limit, i
        brlt  else
        stro  high, d
        br    endif
else:   stro  low, d
endif:  stop
        .end

Compare instruction: cpr, where r is a register (a or x). The action is the same as subr, except that the difference (result) isn't stored in the register; it just sets the status bits. If N or Z is 1, <= is true.
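A simplified C++ sketch of what the compare does, under stated assumptions: only the N and Z bits are modeled (the real instruction also sets V and C), and the subtraction is done in a wider integer so overflow never occurs. The names `Flags`, `compare`, and `lessOrEqual` are hypothetical.

```cpp
#include <cstdint>

// Only the two status bits this sketch models.
struct Flags {
    bool n;  // result was negative
    bool z;  // result was zero
};

// Like a subtract, except the difference is thrown away: only the
// status bits survive. Overflow is ignored in this simplified model.
Flags compare(std::int16_t reg, std::int16_t operand) {
    int diff = static_cast<int>(reg) - static_cast<int>(operand);
    return Flags{diff < 0, diff == 0};
}

// reg <= operand is true when N or Z is 1 after the compare.
bool lessOrEqual(Flags f) { return f.n || f.z; }
```

With limit equal to 100, comparing 99 sets N (so brlt would branch to else), while comparing 100 sets Z and leaves N clear (so brlt falls through to the "high" branch).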

Writing loops in assembly language

• As we have seen, an if or if/else structure in assembly language involves a comparison and then a (possible) branch forward to another section of code

• A loop structure is actually more like its high level language equivalent; for a while loop, the algorithm is:
  – perform comparison; branch forward if condition isn't met (loop ends)
  – otherwise, perform statements in loop body
  – perform unconditional branch back to comparison
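The three steps of the while loop algorithm can be mimicked in C++ with labels and goto, which makes the shape of the assembly version visible. This is an illustrative sketch only; the function `sumTo` and its labels are hypothetical and stand in for any count-controlled while loop.

```cpp
// Equivalent to: total = 0; i = 1; while (i <= n) { total += i; ++i; }
int sumTo(int n) {
    int total = 0;
    int i = 1;
test:
    if (!(i <= n)) goto done;  // comparison; branch forward if condition isn't met
    total += i;                // statements in the loop body
    ++i;
    goto test;                 // unconditional branch back to the comparison
done:
    return total;
}
```

The test is written with the opposite logic of the while condition, just as the assembly version branches forward when the condition fails.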

Example

• The following example shows a C++ program (because this is easier to demonstrate in C++ than in Java) that performs the following algorithm:
  – prompt for input (of a string)
  – read one character
  – while (character != end character ('*'))
      • write out character
      • read next character

C++ code

#include <iostream>
using namespace std;

int main ()
{
    char ch;
    cout << "Enter a line of text, ending with *" << endl;
    cin.get (ch);
    while (ch != '*')
    {
        cout << ch;
        cin.get(ch);
    }
    return 0;
}

Pep/8 code

;am3ex5
        br      main
ch:     .block  1
prompt: .ascii  "Enter a line of text ending with *\n\x00"
main:   stro    prompt, d
        chari   ch, d        ; initial read
        lda     0x0000, i    ; clear accumulator
while:  ldbytea ch, d        ; load ch into A
        cpa     '*', i
        breq    endW
        charo   ch, d
        chari   ch, d        ; read next letter
        br      while
endW:   stop
        .end

Do/while loop

• Post-test loop: condition test occurs after iteration

• Premise of sample program:
  – cop is sitting at a speed trap
  – speeder drives by
  – within 2 seconds, cop starts following, going 5 meters/second faster
  – how far does the cop travel before catching up with the speeder?

C++ version of speedtrap

#include <iostream>
#include <cstdlib>
using namespace std;

int main()
{
    int copDistance = 0;   // cop is sitting still
    int speeder;           // speeder's speed: entered by user
    int speederDistance;   // distance speeder travels from cop's position
    cout << "How fast is the driver going? (Enter whole #): ";
    cin >> speeder;
    speederDistance = speeder;
    do
    {
        copDistance += speeder + 5;
        speederDistance += speeder;
    } while (copDistance < speederDistance);
    cout << "Cop catches up to speeder in " << copDistance << " meters." << endl;
    return 0;
}

Pep/8 version

;speedTrap
        br    main
cop:    .block 2
drvspd: .block 2
drvpos: .block 2
prompt: .ascii "How fast is the driver going? (Enter whole #): \x00"
outpt1: .ascii "Cop catches up to speeder in \x00"
outpt2: .ascii " meters\n\x00"
main:   lda   0, i
        sta   cop, d
        stro  prompt, d
        deci  drvspd, d    ; cin >> speeder;
        ldx   drvspd, d    ; speederDistance = speeder;
        stx   drvpos, d

Pep/8 version continued

do:     lda   5, i         ; copDistance += speeder + 5;
        adda  drvspd, d
        adda  cop, d
        sta   cop, d
        addx  drvspd, d    ; speederDistance += speeder;
        stx   drvpos, d
while:  lda   cop, d       ; while (copDistance <
        cpa   drvpos, d    ;        speederDistance);
        brlt  do
        stro  outpt1, d
        deco  cop, d
        stro  outpt2, d
        stop
        .end

For loops

• For loop is just a count-controlled while loop

• Next example illustrates nested for loops

C++ version

#include <iostream>
#include <cstdlib>
using namespace std;

int main()
{
    int x, y;
    for (x = 0; x < 4; x++)
    {
        for (y = x; y > 0; y--)
            cout << "* ";
        cout << endl;
    }
    return 0;
}

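Pep/8 version

;nestfor
; note: as written, this prints rows of 1 to 4 stars, while the
; C++ version prints rows of 0 to 3 stars
        br    main
x:      .word 0x0000
y:      .word 0x0000
main:   sta   x, d         ; assumes A and X are 0 at program start
        stx   y, d
outer:  adda  1, i         ; A holds the outer loop counter
        cpa   5, i
        breq  endo
        sta   x, d
        ldx   x, d         ; X counts down the stars for this row
inner:  charo '*', i
        charo ' ', i
        subx  1, i
        cpx   0, i
        brne  inner
        charo '\n', i
        br    outer
endo:   stop
        .end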

Notes on control structures

• It’s possible to create “control structures” in assembly language that don’t exist at a higher level

• Your text describes such a structure, illustrated and explained on the next slide

A control structure not found in nature

• Condition C1 is tested; if true, branch to middle of loop (S3)

• After S3 (however you happen to get there – via branch from C1 or sequentially, from S2) test C2

• If C2 is true, branch to top of loop

• No way to do this in Java, and no way in C++ without the dreaded goto statement
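For illustration only, here is how the structure could look in C++ with goto. The statements S2 and S3 are placeholders that just record their names in a trace, and C2 is modeled as "fewer than `passes` passes through S3 so far"; the function name and both parameters are hypothetical.

```cpp
#include <string>

// c1 selects whether execution enters at the top (S2) or jumps to the middle (S3).
std::string runOddLoop(bool c1, int passes) {
    std::string trace;
    int done = 0;
    if (c1) goto middle;          // C1 true: branch into the middle of the loop
top:
    trace += "S2 ";               // reached sequentially or via the branch from C2
middle:
    trace += "S3 ";               // reached via the branch from C1 or from S2
    ++done;
    if (done < passes) goto top;  // C2 true: branch to the top of the loop
    return trace;
}
```

With two passes, entering with C1 true yields the trace "S3 S2 S3 ", while entering with C1 false yields "S2 S3 S2 S3 ".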

High level language programs vs. assembly language programs

• If you’re talking about pure speed, a program in assembly language will almost always beat one that originated in a high level language

• Assembly and machine language programs produced by a compiler are almost always longer and slower

• So why use high level languages (besides the fact that assembly language is a pain in the patoot)?

Why high level languages?

• Type checking:
  – data types sort of exist at low level, but the assembler doesn't check your syntax to ensure you're using them correctly
  – can attempt to DECO a string, for example

• Encourages structured programming

Structured programming

• Flow of control in program is limited to nestings of if/else, switch, while, do/while and for statements

• Overuse of branching instructions leads to spaghetti code

Unstructured branching

• Advantage: can lead to faster, smaller programs

• Disadvantage: Difficult to understand
  – and debug
  – and maintain
  – and modify

• Structured flow of control is a newer idea than branching; a form of branching with gotos by another name lives on in the Java/C++ switch/case structure

Evolution of structured programming

• First widespread high level language was FORTRAN; it introduced a new conditional branch statement:

    if (expression) GOTO new location

• Considered improvement over assembly language – combined CPr and BR statements

• Still used opposite logic:

    if (expression not true) branch else
        // if-related statements here
        branch past else
    else:
        // else-related statements here
    (destination for if branch)
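The same opposite-logic pattern can be written in C++ with goto to make the flow explicit. This is a sketch, not FORTRAN; the function `classify`, its parameters, and the label names are hypothetical, and the high/low test is borrowed from the earlier example.

```cpp
#include <string>

// Equivalent to: if (num >= limit) result = "high"; else result = "low";
std::string classify(int num, int limit) {
    std::string result;
    if (!(num >= limit)) goto else_part;  // opposite logic: branch when the test fails
    result = "high";                      // if-related statements
    goto end_if;                          // branch past else
else_part:
    result = "low";                       // else-related statements
end_if:
    return result;                        // destination for the branch past else
}
```

The condition in the source (num >= limit) is negated in the branch, exactly as brlt is used for a >= test in the Pep/8 version.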

Block-structured languages

• ALGOL-60 (introduced in 1960 – hey, me too) featured first use of program blocks for selection/iteration structures

• Descendants of ALGOL include C, C++, and Java

Structured Programming Theorem

• Any algorithm containing GOTOs can be written using only nested ifs and while loops (proven back in 1966)

• In 1968, Edsger Dijkstra wrote a famous letter to the editor of Communications of the ACM entitled "Go To Statement Considered Harmful" – considered the structured programming manifesto

• It turns out that structured code is less expensive to develop, debug and maintain than unstructured code – even factoring in the cost of additional memory requirements and execution time
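As an illustration of the theorem (not a proof), the jump-into-the-loop structure described earlier (C1, S2, S3, C2) can be rewritten with only an if and a while loop, at the cost of two extra boolean variables. As before, S2 and S3 are placeholders that record their names, C2 is modeled as "fewer than `passes` passes so far", and all names are hypothetical.

```cpp
#include <string>

std::string runStructured(bool c1, int passes) {
    std::string trace;
    int done = 0;
    bool skipS2 = c1;    // C1 true means the first pass starts at S3
    bool again = true;   // forces at least one pass, like a post-test loop
    while (again) {
        if (!skipS2) {
            trace += "S2 ";
        }
        skipS2 = false;  // only the first pass may skip S2
        trace += "S3 ";
        ++done;
        again = done < passes;  // C2
    }
    return trace;
}
```

The extra flags are the "additional memory requirements and execution time" the last bullet refers to: the structured version does slightly more work per pass in exchange for a single entry point and a single exit.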