80x87 Math Coprocessor · PDF file08/10/2012 · 80x87 Math Coprocessor. ... DW...

80x87 Math Coprocessor

Comparison of 8087 and 8086 Clock Times

Approximate Execution Time

Instruction                     8087    8086 Emulation
Multiply (single precision)       19     1,600
Multiply (double precision)       27     2,100
Add                               17     1,600
Divide (single precision)         39     3,200
Compare                            9     1,300
Load (single precision)            9     1,700
Store (single precision)          18     1,200
Square root                       36    19,600
Tangent                           90    13,000
Exponentiation                   100    17,100

IEEE Single-Precision Floating-Point Numbers

• IEEE SP FP (short real) numbers use only 32 bits of data to represent a real number.
• The range of SP FP numbers is 2^-126 to 2^128.
• This translates approximately to a range of 1.2 × 10^-38 to 3.4 × 10^+38 in decimal numbers, for both positive and negative values.
• To make the hardware design of the math processors much easier and to use fewer transistors, the exponent is stored in biased form: a constant of 7FH (127 decimal) is added to it.

Bit 31: sign (1-bit) | Bits 30-23: biased exponent (8-bit) | Bits 22-0: fraction (significand) (23-bit)

Conversion from real to floating point
• The real number is converted to its binary form.
• The binary number is represented in scientific form: 1.xxxxEyyyy
• Bit 31 is either 0 for positive or 1 for negative.
• The exponent portion, yyyy, is added to 7F to get the biased exponent, which is placed in bits 23 to 30.
• The significand, xxxx, is placed in bits 22 to 0.

• Example: Convert 9.75 (decimal) to single-precision floating point.
• Decimal 9.75 = binary 1001.11 = 1.00111 × 2^3 = scientific binary 1.00111E3
– sign bit 31 is 0 for positive
– exponent bits 30 to 23 are 1000 0010 (3 + 7F = 82H) after biasing
– significand bits 22 to 0 are 00111000000000000000000

Sign (bit 31) | Exponent (bits 30-23) | Significand (bits 22-0)
0             | 1000 0010             | 00111000000000000000000

In hex: 411C0000H

Example

Convert 0.078125 (decimal) to short real FP (single precision).
Solution:
• decimal 0.078125 = binary 0.000101 = 1.01 × 2^-4
• scientific binary 1.01E-4
• sign bit 31 is 0 for positive
• exponent bits 30 - 23 are 0111 1011 (-4 + 7F = 7B) after biasing
• significand bits 22 - 0 are 010 0000 0000 0000 0000 0000
• This number will be represented in binary and hex as 3DA00000

Sign (bit 31) | Exponent (bits 30-23) | Significand (bits 22-0)
0             | 0111 1011             | 010 0000 0000 0000 0000 0000

Grouped in nibbles: 0011 1101 1010 0000 0000 0000 0000 0000 = 3DA00000H
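
The conversion steps above can be sketched in C. This is only an illustration and not part of the original material: the function name pack_short_real and its parameters are hypothetical, and the sketch assumes a normalized number (it does not handle zero, denormals, infinity, or NaN).

```c
#include <stdint.h>

/* Hypothetical helper: assemble a single-precision word from its fields.
   sign       : 0 for positive, 1 for negative (bit 31)
   exponent   : the unbiased power of two, yyyy in 1.xxxxEyyyy
   fraction23 : the 23 bits after the binary point (bits 22 to 0) */
uint32_t pack_short_real(unsigned sign, int exponent, uint32_t fraction23)
{
    uint32_t biased = (uint32_t)(exponent + 0x7F);   /* add the bias 7FH */

    return ((uint32_t)(sign & 1u) << 31)             /* bit 31        */
         | ((biased & 0xFFu) << 23)                  /* bits 30 to 23 */
         | (fraction23 & 0x7FFFFFu);                 /* bits 22 to 0  */
}
```

For the 9.75 example, the call would be pack_short_real(0, 3, 0x1C0000), where 0x1C0000 is the fraction 00111 followed by 18 zero bits.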

IEEE double-precision floating-point numbers

• Double-precision FP (long real) can represent numbers in the range 2.3 × 10^-308 to 1.7 × 10^308, both positive and negative.
– 52 bits (bits 0 to 51) are for the significand,
– 11 bits (bits 52 to 62) are for the exponent,
– bit 63 is for the sign.
• The conversion process
– The real number must first be represented as 1.xxxxxxxEyyyy,
– yyyy is added to 3FF to get the biased exponent.

Bit 63: sign (1-bit) | Bits 62-52: biased exponent (11-bit) | Bits 51-0: fraction (significand) (52-bit)

Example

• Convert 152.1875 (decimal) to double-precision FP.
• Solution:
– decimal 152.1875 = binary 10011000.0011 = 1.00110000011 × 2^7
– scientific binary 1.00110000011E7
– bit 63 is 0 for positive
– exponent bits 62-52 are 10000000110 (7 + 3FF = 406) after biasing
– fraction bits 51 - 0 are 00110000011000 … 000

Sign (bit 63) | Exponent (bits 62-52) | Fraction (bits 51-0)
0             | 100 0000 0110         | 0011 0000 0110 0000 … 0000

Grouped in nibbles: 0100 0000 0110 0011 0000 0110 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 = 4063060000000000H
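
A hand conversion like the one above can be checked by reading back the bit pattern the compiler stores for a double. The following is a sketch under the assumption that the C implementation uses IEEE 754 doubles (true on x86); the function names are hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Return the raw 64-bit pattern of a double (assumes IEEE 754 doubles). */
uint64_t long_real_bits(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);   /* copy the bytes, no conversion */
    return bits;
}

/* Split the pattern into the three fields of the long real format. */
unsigned long_real_sign(uint64_t bits)
{
    return (unsigned)(bits >> 63);                    /* bit 63 */
}

unsigned long_real_exponent(uint64_t bits)
{
    return (unsigned)((bits >> 52) & 0x7FFu);         /* bits 62 to 52, still biased */
}

uint64_t long_real_fraction(uint64_t bits)
{
    return bits & UINT64_C(0xFFFFFFFFFFFFF);          /* bits 51 to 0 */
}
```

Subtracting 3FFH from the value returned by long_real_exponent gives back the unbiased exponent yyyy.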

Representation of special values
IEEE Single Precision

Exponent        Fraction    Represents
e = 0           f = 0       0
e = 0           f ≠ 0       0.f × 2^-126
1 ≤ e ≤ 254     any         1.f × 2^(e-127)
e = 255         f = 0       ∞
e = 255         f ≠ 0       NAN
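
The table of special values maps directly onto a test of the exponent and fraction fields. A minimal sketch in C follows; the enum and function names are hypothetical, and the sign bit is ignored, so +0 and -0 (or +∞ and -∞) fall in the same class.

```c
#include <stdint.h>

typedef enum {
    SR_ZERO,        /* e = 0,   f = 0  */
    SR_DENORMAL,    /* e = 0,   f != 0 */
    SR_NORMAL,      /* 1 <= e <= 254   */
    SR_INFINITY,    /* e = 255, f = 0  */
    SR_NAN          /* e = 255, f != 0 */
} short_real_class;

/* Classify a 32-bit single-precision pattern by its biased exponent e
   (bits 30 to 23) and fraction f (bits 22 to 0). */
short_real_class classify_short_real(uint32_t bits)
{
    uint32_t e = (bits >> 23) & 0xFFu;
    uint32_t f = bits & 0x7FFFFFu;

    if (e == 0)
        return (f == 0) ? SR_ZERO : SR_DENORMAL;
    if (e == 255)
        return (f == 0) ? SR_INFINITY : SR_NAN;
    return SR_NORMAL;
}
```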

Other Data Formats of the 8087

• In addition to short real (SP) and long real (DP) representations for real numbers, the 8087 also supports
– 16-bit (signed word) integers
– 32-bit (signed short) integers
– 64-bit (signed long) integers
• There are also two 80-bit data formats in the 8087 coprocessor
– packed decimal (18 packed BCD digits, with bit 79 for the sign)
– temporary real format, which is used internally by the 8087
• Conversion to temporary real goes through the same process, except that the biased exponent is calculated by adding the constant 3FFFH.

Bit 79: sign (1-bit) | Bits 78-64: biased exponent (15-bit) | Bit 63: integer bit, 1 (1-bit) | Bits 62-0: fraction (significand) (63-bit)

Directives for Coprocessor

• In MASM and compatible assemblers, there are different directives to define the different data types of the coprocessor:

DD    Define double word (32-bit) for short real (single precision)
DQ    Define quad word (64-bit) for long real (double precision)
DW    Define word (16-bit) for word integer
DD    Define double word (32-bit) for short integer
DQ    Define quad word (64-bit) for long integer
DT    Define ten bytes (80-bit) for packed decimal
DT    Define ten bytes (80-bit) for temporary real

80x87 registers

• There are only 8 general-purpose registers in the 80x87.
• All the registers of the 8087 are 80 bits wide.
• Every time the 8087 loads an operand, it automatically converts it to this 80-bit format.
• This makes programming, as well as 8087 hardware design, much easier.
• Although these 8 registers have been numbered from 0 to 7, they are accessed like a stack (last-in-first-out policy).
• At any given time, the top of the stack is referred to as ST(0), or simply ST.
• All other registers, regardless of their number, are referred to according to their positions compared to the top of the stack, ST.

Note That

• All 80x87 mnemonics start with the letter “f” to distinguish them from 80x86 instructions.
• The 80x87 must be initialized (finit) to make sure that the first value pushed onto the stack goes into register number 7.
• Whenever a register is not identified specifically, ST(0) is assumed automatically.
• ST, the stack top, also called ST(0),
• ST(1), the register just below the stack top,
• ST(2), the register just below ST(1),
• ST(3), ST(4), ST(5), ST(6), and
• ST(7), the register at the bottom of the stack.

Figure (a) FINIT: the eight registers, numbered 000 through 111.
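
The stack behaviour described above can be mimicked with a small C model. This is a teaching sketch only: the type fpu_model and the model_* functions are hypothetical, doubles stand in for the 80-bit format, and stack overflow, underflow, and empty-register tracking are not modelled.

```c
#include <assert.h>

/* Toy model of the 80x87 register stack. */
typedef struct {
    double   reg[8];   /* physical registers 0 to 7 */
    unsigned top;      /* physical number of the register currently called ST(0) */
} fpu_model;

/* FINIT: reset the stack pointer so that the first push lands in register 7. */
void model_finit(fpu_model *f)
{
    f->top = 0;
}

/* Address of ST(i), counted from the current top of the stack. */
double *model_st(fpu_model *f, unsigned i)
{
    assert(i < 8);
    return &f->reg[(f->top + i) & 7u];
}

/* FLD: decrement the stack pointer, then store the new value as ST(0). */
void model_fld(fpu_model *f, double value)
{
    f->top = (f->top - 1u) & 7u;
    f->reg[f->top] = value;
}

/* FSTP: read ST(0), then pop it by incrementing the stack pointer. */
double model_fstp(fpu_model *f)
{
    double value = f->reg[f->top];
    f->top = (f->top + 1u) & 7u;
    return value;
}

/* FADD ST(dst),ST(src): add the source to the destination, leave it in the destination. */
void model_fadd(fpu_model *f, unsigned dst, unsigned src)
{
    *model_st(f, dst) += *model_st(f, src);
}
```

After model_finit and three calls to model_fld, the values sit in physical registers 7, 6, and 5, while the program refers to them as ST(2), ST(1), and ST(0).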

Real transfers

X DD 9.75

• FLD src ; pushes the source operand onto the stack: the stack pointer is decremented and the operand becomes ST(0)
; source may be ST(i) or memory: FLD X, FLD ST(3)
; FLD ST(0) duplicates the stack top
• FST dest ; copies ST(0) to destination: FST X
; dest may be ST(i) or a short or long real variable: FST ST(2)
• FSTP dest ; copies ST(0) to dest, then pops ST(0): FSTP X
; dest may be ST(i) or short, long, or temporary real memory
; FSTP ST(0) pops the stack with no data transfer
• FXCH dest ; swaps contents of ST(0) and destination ST(i)
; FXCH with no operands swaps ST(0) and ST(1)
; FXCH is frequently used to move a register to the top before using an instruction which assumes ST(0)

Integer transfers

I DD 9
B DT 321

• FILD src ; converts integer source to temporary real and pushes onto ST(0): FILD I
• FIST dest ; rounds ST(0) to integer and copies to destination: FIST I
; dest may be a word or short integer
• FISTP dest ; functions the same as FIST but then pops ST(0)
; dest may be any binary integer data type
• FBLD src ; converts packed decimal (BCD) source to temporary real and pushes onto ST(0): FBLD B
• FBST dest ; XXXX (not available: the 8087 provides only the popping form, FBSTP)
• FBSTP dest ; converts ST(0) to BCD and stores at dest, then pops the stack

Addition

Mnemonic   Operand          Action
fadd       (none)           pops both ST(0) and ST(1); adds these values; pushes sum onto the stack
fadd       st(i), st(0)     adds ST(i) and ST(0); replaces ST(i) by the sum
fadd       st(0), st(i)     adds ST(0) and ST(i); replaces ST(0) by the sum
fadd       memory (real)    adds ST(0) and real number from memory; replaces ST(0) by the sum
fiadd      memory (int)     adds ST(0) and integer from memory; replaces ST(0) by the sum
faddp      st(i), st(0)     adds ST(i) and ST(0); replaces ST(i) by the sum; pops ST(0) from stack

Subtract

Mnemonic   Operand          Action
fsub       (none)           pops ST(0) and ST(1); calculates ST(1) - ST(0); pushes difference onto the stack
fsub       st(i), st        calculates ST(i) - ST(0); replaces ST(i) by the difference
fsub       st, st(i)        calculates ST(0) - ST(i); replaces ST(0) by the difference
fsub       memory (real)    calculates ST(0) - real number from memory; replaces ST(0) by the difference
fisub      memory (int)     calculates ST(0) - integer from memory; replaces ST(0) by the difference
fsubp      st(i), st        calculates ST(i) - ST(0); replaces ST(i) by the difference; pops ST(0) from the stack

Subtract Reverse

Mnemonic   Operand          Action
fsubr      (none)           pops ST(0) and ST(1); calculates ST(0) - ST(1); pushes difference onto the stack
fsubr      st(i), st        calculates ST(0) - ST(i); replaces ST(i) by the difference
fsubr      st, st(i)        calculates ST(i) - ST(0); replaces ST(0) by the difference
fsubr      memory (real)    calculates real number from memory - ST(0); replaces ST(0) by the difference
fisubr     memory (int)     calculates integer from memory - ST(0); replaces ST(0) by the difference
fsubrp     st(i), st        calculates ST(0) - ST(i); replaces ST(i) by the difference; pops ST(0) from the stack

Multiply

Mnemonic   Operand          Action
fmul       (none)           pops ST(0) and ST(1); multiplies these values; pushes product onto the stack
fmul       st(i), st        multiplies ST(i) and ST(0); replaces ST(i) by the product
fmul       st, st(i)        multiplies ST(0) and ST(i); replaces ST(0) by the product
fmul       memory (real)    multiplies ST(0) and real number from memory; replaces ST(0) by the product
fimul      memory (int)     multiplies ST(0) and integer from memory; replaces ST(0) by the product
fmulp      st(i), st        multiplies ST(i) and ST(0); replaces ST(i) by the product; pops ST(0) from stack

Division

Mnemonic   Operand          Action
fdiv       (none)           pops ST(0) and ST(1); calculates ST(1) / ST(0); pushes quotient onto the stack
fdiv       st(i), st        calculates ST(i) / ST(0); replaces ST(i) by the quotient
fdiv       st, st(i)        calculates ST(0) / ST(i); replaces ST(0) by the quotient
fdiv       memory (real)    calculates ST(0) / real number from memory; replaces ST(0) by the quotient
fidiv      memory (int)     calculates ST(0) / integer from memory; replaces ST(0) by the quotient
fdivp      st(i), st        calculates ST(i) / ST(0); replaces ST(i) by the quotient; pops ST(0) from the stack

Write an 8087 program that loads three values for X, Y, and Z, adds them, and stores the result.

;in the data segment
X   DD 9.75
Y   DD 13.09375
Z   DD 29.0390625
SUM DD ?
;in the code segment
finit               ;initialize the 8087 to start at the top of stack
fld X               ;load X into ST(0). now ST(0)=X
fld Y               ;load Y into ST(0). now ST(0)=Y and ST(1)=X
fld Z               ;load Z into ST(0). now ST(0)=Z, ST(1)=Y, ST(2)=X
fadd ST(0),ST(1)    ;add Y to Z and save the result in ST(0)
fadd ST(0),ST(2)    ;add X to (Y+Z) and save it in ST(0)
fst SUM             ;store ST(0) in memory location called SUM

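Figure: contents of the register stack (registers numbered 000 through 111) after each instruction
(a) FINIT: stack empty
(b) FLD X: ST(0)=X
(c) FLD Y: ST(0)=Y, ST(1)=X
(d) FLD Z: ST(0)=Z, ST(1)=Y, ST(2)=X
(e) FADD ST(0),ST(1): ST(0)=Y+Z, ST(1)=Y, ST(2)=X
(f) FADD ST(0),ST(2): ST(0)=X+Y+Z, ST(1)=Y, ST(2)=X
(g) FST SUM: stack unchanged, ST(0)=X+Y+Z, ST(1)=Y, ST(2)=X
(g) FSTP SUM, if used instead of FST SUM: ST(0)=Y, ST(1)=X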

Example

;Data Segment
R    DD 2.25
AREA DD ?
;Code Segment
FINIT
FLD R               ;ST(0) = R
FMUL ST(0),ST(0)    ;ST(0) = R*R
FLDPI               ;ST(0) = pi, ST(1) = R*R
FMUL ST(0),ST(1)    ;ST(0) = pi*R*R
FSTP AREA           ;store the area and pop

Additional floating-point instructions with no operands

• fabs      ST = |ST| (absolute value)
• fchs      ST = - ST (change sign)
• frndint   rounds ST to an integer value
• fsqrt     replaces the contents of ST by its square root

• fld1      1.0 pushed onto stack
• fldz      0.0 pushed onto stack
• fldpi     π pushed onto stack
• fldl2e    log2(e) pushed onto stack
• fldl2t    log2(10) pushed onto stack
• fldlg2    log10(2) pushed onto stack
• fldln2    loge(2) pushed onto stack

; from data segment
value1 DD 0.5
value2 DD 1.2
sqrt   DD ?
; from code segment
fld value1    ; value1 in ST
fld st        ; value1 in ST and ST(1)
fmul          ; value1*value1 in ST
fld value2    ; value2 in ST (value1*value1 in ST(1))
fld st        ; value2 in ST and ST(1)
fmul          ; value2*value2 in ST
fadd          ; sum of squares in ST
fsqrt         ; square root of sum of squares in ST
fstp sqrt     ; store result
END

sqrt = √(value1^2 + value2^2)

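Calculate Area of a Circle

.8087
;Data Segment
Red  DD 91.67
Area DD ?

;Code Segment
FINIT
FLD Red
FMUL st(0), st(0)    ;ST(0) = Red*Red
FLDPI                ;ST(0) = pi, ST(1) = Red*Red
FMUL st(0), st(1)    ;ST(0) = pi*Red*Red; Red*Red is still in ST(1)
FSTP Area            ;store the area; Red*Red is left on the stack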

Calculate Area of a Circle (C version)

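#include <stdio.h>

int main(void)
{
    float Red = 0.5, Area;

    __asm {
        FINIT
        FLD Red
        FMUL st(0), st(0)     ; ST(0) = Red*Red
        FLDPI                 ; ST(0) = pi, ST(1) = Red*Red
        FMULP st(1), st(0)    ; multiply and pop: ST(0) = pi*Red*Red
        FSTP Area             ; store the area, leaving the stack empty
    }
    printf("\n %f ", Area);
    return 0;
}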

FPTAN Instruction

• The instruction FPTAN (partial tangent) calculates Y/X = TAN Z,
– Z is the angle in radians and must be 0 < Z < π/4
• Z is stored in ST(0) prior to execution of FPTAN.
• After the execution ST(0) = X and ST(1) = Y.
• X and Y can be used to calculate the hypotenuse R = √(X^2 + Y^2).
• After that, it is easy to calculate the sine (Y/R), cosine (X/R), tangent (Y/X), and cotangent (X/Y).

Calculate SIN, COS, TAN, and COT

.8087
ANGLE DD 0.523598776    ;angle in radians (30 degrees)
X     DD 0
Y     DD 0
R     DD 0
SIN   DD 0
COS   DD 0
TAN   DD 0
COT   DD 0

START PROC FAR
    ASSUME CS:CODESG, DS:DATASG, SS:STACKSG
    MOV AX,DATASG
    MOV DS,AX
    CALL CALC_X_Y
    CALL CALC_R
    CALL CALC_SIN
    CALL CALC_COS
    CALL CALC_TAN
    CALL CALC_COT
    MOV AH,4CH          ;return to DOS
    INT 21H
START ENDP

CALC_X_Y PROC NEAR
    FINIT
    FLD ANGLE
    FPTAN               ;ST(0) = X, ST(1) = Y
    FSTP X
    FSTP Y
    RET
CALC_X_Y ENDP

CALC_R PROC NEAR
    FINIT
    FLD X
    FMUL ST(0),ST(0)    ;X*X
    FLD Y
    FMUL ST(0),ST(0)    ;Y*Y
    FADD ST(0),ST(1)    ;X*X + Y*Y
    FSQRT
    FST R
    RET
CALC_R ENDP

CALC_SIN PROC NEAR
    FINIT
    FLD R
    FLD Y
    FDIV ST(0),ST(1)    ;Y / R
    FST SIN
    RET
CALC_SIN ENDP

CALC_COS PROC NEAR
    FINIT
    FLD R
    FLD X
    FDIV ST(0),ST(1)    ;X / R
    FST COS
    RET
CALC_COS ENDP

CALC_TAN PROC NEAR
    FINIT
    FLD X
    FLD Y
    FDIV ST(0),ST(1)    ;Y / X
    FST TAN
    RET
CALC_TAN ENDP

CALC_COT PROC NEAR
    FINIT
    FLD Y
    FLD X
    FDIV ST(0),ST(1)    ;X / Y
    FST COT
    RET
CALC_COT ENDP

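#include <stdio.h>

int main(void)
{
    float ANGLE = 30, X, Y, R, SIN, COS, TAN, COT, C_180 = 180;

    __asm {
        FINIT
        FLDPI
        FLD ANGLE
        FMUL                  ; ST(0) = ANGLE * pi
        FLD C_180
        FDIVP ST(1), ST(0)    ; ST(0) = angle in radians
        FPTAN                 ; ST(0) = X, ST(1) = Y
        FST X
        FLD ST(0)
        FMUL ST(0), ST(0)     ; ST(0) = X*X
        FLD ST(2)             ; ST(0) = Y
        FST Y                 ; store Y while it is on top of the stack
        FMUL ST(0), ST(0)     ; ST(0) = Y*Y
        FADDP ST(1), ST(0)    ; ST(0) = X*X + Y*Y
        FSQRT                 ; ST(0) = R, ST(1) = X, ST(2) = Y
        FST R
        FLD ST(2)             ; ST(0) = Y
        FDIV ST(0), ST(1)     ; ST(0) = Y / R
        FSTP SIN
        FDIVR ST(0), ST(1)    ; ST(0) = X / R
        FSTP COS              ; leaves ST(0) = X, ST(1) = Y
        FDIVP ST(1), ST(0)    ; ST(0) = Y / X
        FST TAN
        FLD1
        FDIV ST(0), ST(1)     ; ST(0) = 1 / TAN
        FSTP COT
        FSTP ST(0)            ; empty the stack
    }
    return 0;
}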

DIRECT MEMORY ACCESS AND DMA CHANNELS IN x86 PC

15.1: CONCEPT OF DMA

• There is often a need to transfer many bytes between memory & peripherals like disk drives.
– Using the microprocessor to transfer the data is too slow.
• Data must be fetched to the CPU, then sent to its destination.
• The Intel 8237 DMAC (direct memory access controller) chip functions to bypass the CPU.
– It provides a direct connection between peripherals and memory, transferring the data as fast as possible.
• Where the 8237 can transfer a byte between a peripheral & memory in 4 clocks, the 8088 would take 39 clocks.

15.1: CONCEPT OF DMA
bus sharing

• The data bus, address bus, and control bus can be used either by the main x86 CPU or the 8237 DMA.
– Since x86 has primary control, it must give permission to DMA to use them.
• When DMA needs the buses, it sends a HOLD signal to the CPU, and the CPU responds with a HLDA (hold acknowledge) signal.
– Indicating the DMA can use the buses.
• While DMA uses the buses, the CPU is idle, and when the CPU uses the bus, DMA is sitting idle.
– After DMA finishes, it makes HOLD go low & the CPU will regain control over the buses.

Fig. 15-1 DMA Usage of System Bus

15.1: CONCEPT OF DMA
steps involved in a DMA transfer

• DMA can only transfer information.
– It cannot decode and execute instructions.
• When the CPU receives a HOLD request from DMA, it finishes the present bus cycle (but not necessarily the present instruction) before it hands over control of the buses to the DMA.
• To transfer a block of data from memory to I/O, DMA must know:
– The address of the beginning of the data block (address of the first byte of data).
– The number of bytes (count) it needs to transfer.


• DMA Transfer Steps:
– 1. A peripheral device (like the disk controller) will request DMA service by pulling DREQ (DMA request) high.
– 2. DMA puts a high on its HRQ (hold request), signaling the CPU through its HOLD pin that it needs to use the buses.
– 3. The CPU finishes the present bus cycle & responds to DMA by putting high on HLDA (hold acknowledge).
• Telling the 8237 DMA it can use the buses to perform its task.
• HOLD must remain active-high while DMA performs its task.
– 4. DMA will activate DACK (DMA acknowledge), which tells the peripheral device it will start to transfer the data.
– 5. DMA starts to transfer data from memory to the I/O peripheral by putting the address of the first byte of the block on the address bus and activating MEMR.
• This reads the byte from memory onto the data bus; DMA then activates IOW to write the data to the peripheral.
• DMA decrements the counter, increments the address pointer & repeats the process until the count reaches zero.
– 6. After the DMA has finished, it will deactivate HRQ, signaling the CPU that it can regain control over its buses.
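
The loop in step 5 can be pictured with a short C model. This is a software sketch of what the controller does in hardware, not 8237 code: the type dma_channel_model and the function dma_memory_to_io are hypothetical, the handshake signals (DREQ, HRQ, HLDA, DACK) are not modelled, and the loop simply stops when the count reaches zero, as described above.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA channel's working registers. */
typedef struct {
    uint16_t address;   /* address of the next byte in memory */
    uint16_t count;     /* number of bytes still to transfer  */
} dma_channel_model;

/* Move bytes from memory to a peripheral buffer until the count reaches zero.
   port_capacity guards the demonstration buffer against overrun.
   Returns the number of bytes moved. */
size_t dma_memory_to_io(dma_channel_model *ch, const uint8_t *memory,
                        uint8_t *port, size_t port_capacity)
{
    size_t moved = 0;

    while (ch->count != 0 && moved < port_capacity) {
        uint8_t data = memory[ch->address];   /* MEMR: byte appears on the data bus */
        port[moved++] = data;                 /* IOW: peripheral latches the byte   */
        ch->address++;                        /* increment the address pointer      */
        ch->count--;                          /* decrement the counter              */
    }
    return moved;
}
```

The caller is expected to supply a memory array that covers every address from ch->address to ch->address + ch->count - 1.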

15.2: 8237 DMA CHIP PROGRAMMING

• The 40-pin Intel 8237 DMA controller chip has four data transfer channels, each used for one device.
– Only one device at a time can use the DMA.
• With every channel are two associated signals:
– DREQ (DMA request) - an input to DMA from the peripheral device.
– DACK (DMA acknowledge) - an output signal from the 8237 going to the peripheral device.
• HRQ & HLDA of the 8237 connect to HOLD & HLDA of x86.
– Four channels from four different devices can request bus use, but DMA decides who gets control based on how its priority register has been programmed.
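
As an illustration of how one request is chosen when several DREQ lines are active, here is a small C sketch of a fixed-priority scheme. The function name and the bit-mask representation are hypothetical, and the ordering (channel 0 first) is an assumption made for the sketch; the actual order depends on how the priority register has been programmed.

```c
#include <stdint.h>

/* Fixed-priority arbitration sketch.
   Bit n of dreq_mask is 1 when channel n has DREQ active.
   Returns the winning channel, or -1 when no channel is requesting.
   Assumption: channel 0 has the highest priority. */
int dma_select_channel(uint8_t dreq_mask)
{
    for (int channel = 0; channel < 4; channel++) {
        if (dreq_mask & (1u << channel))
            return channel;       /* first active request wins */
    }
    return -1;
}
```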

• Every channel of the 8237 DMA must be initialized separately for the address of the data block and the count (the size of the block) before it can be used.
– After initialization, each channel can be enabled and controlled with the use of a control word.
• Many modes of operation can be programmed into the 8237's internal registers.
– These registers are accessed by four address pins, A0–A3, along with the CS (chip select) pin.

Internal 8237 register addresses for each channel, and how they are generated.


• Two sets of information needed to program a channel of the 8237 DMA to transfer data are:
– The address of the first byte of data to be transferred (base address).
– How many bytes of data are to be transferred (word count).
• For set 1, the channel's memory address register must be programmed.
– The 8237 memory address register is 16 bits, and the data bus is 8 bits.
• One byte at a time, consecutively, is sent in to the same port address.


• For set 2, the channel count register is programmed.
– The count can go as high as FFFFH.
– Since the count register is 16 bits and the DMA data bus is only 8 bits, it takes two consecutive writes to program that register, as shown in Example 15-2.

See the entire example on page 406 of your textbook.
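
The two consecutive writes can be sketched in C as follows. This is not the textbook's Example 15-2: the port_log type and log_out function are hypothetical stand-ins for an OUT instruction so that the sequence can be inspected, and sending the low byte before the high byte is an assumption of the sketch.

```c
#include <stdint.h>

/* Hypothetical stand-in for an OUT instruction: records each byte written. */
typedef struct {
    uint8_t port[4];
    uint8_t value[4];
    int     writes;
} port_log;

void log_out(port_log *log, uint8_t port, uint8_t value)
{
    if (log->writes < 4) {
        log->port[log->writes]  = port;
        log->value[log->writes] = value;
        log->writes++;
    }
}

/* Program a 16-bit register through an 8-bit data bus:
   two consecutive writes to the same port address.
   Assumption: the low byte goes first, then the high byte. */
void write_dma_register16(port_log *log, uint8_t port, uint16_t value)
{
    log_out(log, port, (uint8_t)(value & 0xFFu));   /* first write: low byte   */
    log_out(log, port, (uint8_t)(value >> 8));      /* second write: high byte */
}
```

Both the memory address register and the count register of a channel would be loaded this way, each through its own port address.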

15.2: 8237 DMA CHIP PROGRAMMING
internal control registers

• One set of control/command registers is used by all 8237 channels.
– These registers are shown in Table 15-1 on page 405.
– To understand how to access those registers, review Example 15-3 on page 407.


15.2: 8237 DMA CHIP PROGRAMMING
command register

• An 8-bit register used for controlling the operation of the 8237.
– It must be programmed (written into) by the CPU.
– It is cleared by the RESET signal from the CPU.
• Or the master clear instruction of the DMA.

The 8237 is capable of transferring data:
• From a peripheral device to memory (reading from disk).
• From memory to a peripheral device (writing a file to disk).
• From memory to memory (shadow RAM).

Fig. 15-3 8237 Command Register Format

15.3: 8237 DMA INTERFACING IN THE IBM PC

• The 8237 DMA has eight address pins, A0–A7.
– A bidirectional address bus is formed by A0–A3, which sends addresses to the 8237 to select one of the 16 possible registers.

Fig. 15-8 8237 DMA Pin Layout


Figure 15-9 Chip Selection of the 8237A in the PC

In the IBM PC, chip select is activated by Y0 of the 74LS138.

See page 414 of your textbook.

• Address selection of the registers inside the 8237.
– Summarized, assuming zero for each x.
– Port addresses 0–7 are assigned to the four channels.
– Addresses 08–0F are assigned to control registers commonly used by all channels.
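
The address split above can be expressed as a small decoding function in C. The type and function names are hypothetical. The sketch assumes that within ports 0–7 each channel owns two consecutive addresses, the even one for its memory address register and the odd one for its count register; that pairing is an assumption not spelled out in the text above.

```c
#include <stdint.h>

typedef enum { DMA_REG_ADDRESS, DMA_REG_COUNT, DMA_REG_CONTROL } dma_register_kind;

typedef struct {
    dma_register_kind kind;
    int channel;          /* 0 to 3 for channel registers, -1 for shared control registers */
} dma_register;

/* Decode the low four address bits (A3 to A0) of a port address. */
dma_register decode_dma_port(uint8_t port)
{
    dma_register r;
    uint8_t a = port & 0x0Fu;

    if (a >= 0x08u) {                 /* 08-0F: control registers shared by all channels */
        r.kind = DMA_REG_CONTROL;
        r.channel = -1;
    } else {                          /* 0-7: two registers per channel (assumed pairing) */
        r.channel = a >> 1;
        r.kind = (a & 1u) ? DMA_REG_COUNT : DMA_REG_ADDRESS;
    }
    return r;
}
```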

15.3: 8237 DMA INTERFACING IN THE IBM PC
connections in the IBM PC

• DMA must be capable of transferring data between I/O & memory without interference from the CPU.
– It must have all required control, data & address buses.
– It has four control lines: IOR; IOW; MEMR; MEMW.
• 8237 has its own bidirectional data bus, D0–D7, which is connected to x86 system bus D0–D7.
– The address bus, A0–A7, is only 8 bits.
• ADSTB (address strobe) activates the latch when 8237 provides the upper 8-bit address through the data bus.

AEN = 0 - x86 controls system bus.
AEN = 1 - 8237 DMA controls system bus.

Fig. 15-10 Block Diagram of the 8237A DMA

See the entire circuit on page 415 of your textbook.

Figure 15-11a DMA Circuit Connection in the PC

Figure 15-11b DMA Circuit Connection in the PC
