80x87 Math Coprocessor
PDF file, 08/10/2012
Comparison of 8087 and 8086 Clock Times

Instruction                    Approximate Execution Time
                               8087      8086 Emulation
Multiply (single precision)    19        1,600
Multiply (double precision)    27        2,100
Add                            17        1,600
Divide (single precision)      39        3,200
Compare                        9         1,300
Load (single precision)        9         1,700
Store (single precision)       18        1,200
Square root                    36        19,600
Tangent                        90        13,000
Exponentiation                 100       17,100
IEEE Single-Precision Floating-Point Numbers
• IEEE SP FP (short real) numbers use only 32 bits of data to represent any real number.
• The range of SP FP numbers is 2^-126 to 2^128.
• This translates approximately to a range of 1.2 × 10^-38 to 3.4 × 10^+38 in decimal numbers, for both positive and negative values.
• To make the hardware design of the math processors much easier and less transistor consuming, the exponent part is added to a constant of 7FH (127 decimal).

Bit 31    Bits 30-23     Bits 22-0
Sign      Biased Exp.    Fraction (Significand)
1-bit     8-bit          23-bit
Conversion from real to floating point
• The real number is converted to its binary form.
• The binary number is represented in scientific form: 1.xxxxEyyyy
• Bit 31 is either 0 for positive or 1 for negative.
• The exponent portion, yyyy, is added to 7F to get the biased exponent, which is placed in bits 23 to 30.
• The significand, xxxx, is placed in bits 22 to 0.

• Example: Convert 9.75 (decimal) to single-precision floating point.
• Decimal 9.75 = binary 1001.11 = 1.00111 × 2^3 = scientific binary 1.00111E3
– sign bit 31 is 0 for positive
– exponent bits 30 to 23 are 1000 0010 (3 + 7F = 82H) after biasing
– significand bits 22 to 0 are 00111000000000000000000

Bit 31    Bits 30-23    Bits 22-0
0         1000 0010     00111000000000000000000    = 411C0000H
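The conversion steps above can be checked with a small C sketch. It assumes the compiler stores `float` in IEEE 754 single-precision format (true on x86 compilers, but an assumption here). The helper names `float_bits`, `pack_single`, and `check_single_example` are hypothetical and are not part of any 8087 toolchain.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the raw 32-bit pattern of a float.
   Assumes float is IEEE 754 single precision. */
uint32_t float_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Hypothetical helper: builds the pattern from its three fields.
   sign     -> bit 31
   exponent -> unbiased exponent; 7FH is added and placed in bits 30-23
   fraction -> the 23 bits after the binary point, placed in bits 22-0 */
uint32_t pack_single(unsigned sign, int exponent, uint32_t fraction)
{
    uint32_t biased = (uint32_t)(exponent + 0x7F) & 0xFFu;
    return ((uint32_t)(sign & 1u) << 31) | (biased << 23) | (fraction & 0x7FFFFFu);
}

/* 9.75 = 1.00111 x 2^3: sign 0, exponent 3, fraction 0011100...0 (1C0000H). */
void check_single_example(void)
{
    assert(pack_single(0, 3, 0x1C0000u) == 0x411C0000u);
    assert(float_bits(9.75f) == 0x411C0000u);
}
```

Calling `check_single_example()` confirms that the hand-assembled fields and the compiler's own encoding of 9.75 agree on 411C0000H.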
Example
Convert 0.078125 (decimal) to short real FP (single precision).
Solution:
• decimal 0.078125 = binary 0.000101 = 1.01 × 2^-4
• scientific binary 1.01E-4
• sign bit 31 is 0 for positive
• exponent bits 30 - 23 are 0111 1011 (-4 + 7F = 7B) after biasing
• significand bits 22 - 0 are 010 0000 0000 0000 0000 0000
• This number is represented in hex as 3DA00000H

Bit 31    Bits 30-23    Bits 22-0
0         0111 1011     010 0000 0000 0000 0000 0000
IEEE double-precision floating-point numbers
• Double-precision FP (long real) can represent numbers in the range 2.3 × 10^-308 to 1.7 × 10^308, both positive and negative.
– 52 bits (bits 0 to 51) are for the significand,
– 11 bits (bits 52 to 62) are for the exponent,
– bit 63 is for the sign.
• The conversion process
– The real number must first be represented as 1.xxxxxxxEyyyy,
– yyyy is added to 3FF to get the biased exponent.

Bit 63    Bits 62-52     Bits 51-0
Sign      Biased Exp.    Fraction (Significand)
1-bit     11-bit         52-bit
Example
• Convert 152.1875 (decimal) to double-precision FP.
• Solution:
– decimal 152.1875 = binary 10011000.0011 = 1.00110000011 × 2^7
– scientific binary 1.00110000011E7
– bit 63 is 0 for positive
– exponent bits 62-52 are 10000000110 (7 + 3FF = 406) after biasing
– fraction bits 51 - 0 are 00110000011000 … 000

Bit 63    Bits 62-52       Bits 51-0
0         100 0000 0110    0011 0000 0110 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000

In hex: 4063060000000000H
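The double-precision conversion can be sketched the same way in C. This assumes `double` is stored as IEEE 754 double precision; the names `double_bits`, `pack_double`, and `check_double_example` are hypothetical. Packing the fields for 152.1875 (exponent 7, fraction 00110000011 followed by zeros) gives 4063060000000000H.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the raw 64-bit pattern of a double.
   Assumes double is IEEE 754 double precision. */
uint64_t double_bits(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Hypothetical helper: builds the pattern from its three fields.
   sign     -> bit 63
   exponent -> unbiased exponent; 3FFH is added and placed in bits 62-52
   fraction -> the 52 bits after the binary point, placed in bits 51-0 */
uint64_t pack_double(unsigned sign, int exponent, uint64_t fraction)
{
    uint64_t biased = (uint64_t)(exponent + 0x3FF) & 0x7FFu;
    return ((uint64_t)(sign & 1u) << 63) | (biased << 52)
           | (fraction & 0xFFFFFFFFFFFFFULL);
}

/* 152.1875 = 1.00110000011 x 2^7: fraction bits are 0011 0000 0110 0000 ... */
void check_double_example(void)
{
    assert(pack_double(0, 7, 0x3060000000000ULL) == 0x4063060000000000ULL);
    assert(double_bits(152.1875) == 0x4063060000000000ULL);
}
```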
Representation of special values (IEEE Single Precision)

Exponent        Fraction    Represents
e = 0           f = 0       0
e = 0           f ≠ 0       0.f × 2^-126
1 ≤ e ≤ 254     any         1.f × 2^(e-127)
e = 255         f = 0       ∞
e = 255         f ≠ 0       NAN
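The table can be read as a decision procedure on the exponent field e and the fraction field f. A minimal C sketch follows; the enum and the function name `classify_single` are hypothetical, and the sign bit is ignored because it does not affect the class.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { SP_ZERO, SP_DENORMAL, SP_NORMAL, SP_INFINITY, SP_NAN } sp_class;

/* Hypothetical helper: classifies a single-precision bit pattern
   using only the 8-bit exponent field e and the 23-bit fraction field f. */
sp_class classify_single(uint32_t bits)
{
    uint32_t e = (bits >> 23) & 0xFFu;   /* bits 30-23 */
    uint32_t f = bits & 0x7FFFFFu;       /* bits 22-0  */

    if (e == 0)
        return (f == 0) ? SP_ZERO : SP_DENORMAL;   /* 0 or 0.f x 2^-126 */
    if (e == 255)
        return (f == 0) ? SP_INFINITY : SP_NAN;
    return SP_NORMAL;                              /* 1 <= e <= 254 */
}

void check_classify(void)
{
    assert(classify_single(0x00000000u) == SP_ZERO);
    assert(classify_single(0x411C0000u) == SP_NORMAL);    /* 9.75 */
    assert(classify_single(0x7F800000u) == SP_INFINITY);
}
```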
Other Data Formats of the 8087
• In addition to short real (SP) and long real (DP) representations for real numbers, the 8087 also supports
– 16-bit (signed word) integers
– 32-bit (signed short) integers
– 64-bit (signed long) integers
• There are also two 80-bit data formats in the 8087 coprocessor
– packed decimal (18 packed BCD digits) and bit 79 for the sign
– temporary real format is used internally by the 8087
• The conversion to temporary real goes through the same process, except that the biased exponent is calculated by adding the constant 3FFFH.

Bit 79    Bits 78-64     Bit 63    Bits 62-0
Sign      Biased Exp.    1         Fraction (Significand)
1-bit     15-bit         1-bit     63-bit
Directives for Coprocessor
• In MASM and compatible assemblers, there are different directives to define the different data types of the coprocessor:

DD    Define double word (32-bit) for short real (single precision)
DQ    Define quad word (64-bit) for long real (double precision)
DW    Define word (16-bit) for word integer
DD    Define double word (32-bit) for short integer
DQ    Define quad word (64-bit) for long integer
DT    Define ten bytes (80-bit) for packed decimal
DT    Define ten bytes (80-bit) for temporary real
80x87 registers
• There are only 8 general-purpose registers in the 80x87.
• All the registers of the 8087 are 80 bits wide.
• Every time the 8087 loads an operand, it automatically converts it to this 80-bit format.
• This makes programming, as well as 8087 hardware design, much easier.
• Although these 8 registers have been numbered from 0 to 7, they are accessed like a stack (last-in-first-out policy).
• At any given time, the top of the stack is referred to as ST(0), or simply ST.
• All other registers, regardless of their number, are referred to according to their positions compared to the top of the stack, ST.
Note That
• All 80x87 mnemonics start with the letter “f” to distinguish them from 80x86 instructions.
• The 80x87 must be initialized (finit) to make sure that the first value loaded goes into register number 7: FINIT resets the stack-top pointer to 0, and every load decrements it before storing.
• Whenever a register is not identified specifically, ST(0) is assumed automatically.
• ST, the stack top, also called ST(0),
• ST(1), the register just below the stack top,
• ST(2), the register just below ST(1),
• ST(3), ST(4), ST(5), ST(6), and
• ST(7), the register at the bottom of the stack.
Figure: the eight stack registers, numbered 000 to 111, all empty after (a) FINIT.
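The stack-style addressing can be pictured with a small C model. This is a simplified sketch, not the 8087's actual implementation: it uses `double` in place of the 80-bit temporary real format, omits the tag bits that mark registers empty, and all names (`fpu_model`, `model_finit`, `model_st`, `model_fld`, `model_fstp`) are hypothetical.

```c
#include <assert.h>

/* Simplified, hypothetical model of the 8087 register stack. */
typedef struct {
    double   reg[8];  /* physical registers 0..7 (double stands in for 80-bit) */
    unsigned top;     /* 3-bit stack-top pointer */
} fpu_model;

/* FINIT: stack-top pointer back to 0, registers cleared. */
void model_finit(fpu_model *fpu)
{
    fpu->top = 0;
    for (unsigned i = 0; i < 8; i++)
        fpu->reg[i] = 0.0;
}

/* ST(i) is the register i positions below the top, wrapping modulo 8. */
double *model_st(fpu_model *fpu, unsigned i)
{
    return &fpu->reg[(fpu->top + i) & 7u];
}

/* FLD: decrement the stack-top pointer, then store the value there. */
void model_fld(fpu_model *fpu, double value)
{
    fpu->top = (fpu->top - 1u) & 7u;
    fpu->reg[fpu->top] = value;
}

/* FSTP: read ST(0), then increment the stack-top pointer (pop). */
double model_fstp(fpu_model *fpu)
{
    double value = fpu->reg[fpu->top];
    fpu->top = (fpu->top + 1u) & 7u;
    return value;
}

void check_stack_model(void)
{
    fpu_model fpu;
    model_finit(&fpu);
    model_fld(&fpu, 9.75);
    assert(fpu.top == 7);               /* first load lands in register 7 */
    model_fld(&fpu, 13.09375);
    assert(*model_st(&fpu, 0) == 13.09375);
    assert(*model_st(&fpu, 1) == 9.75);
}
```

The model shows why register numbers and ST(i) names differ: the same physical register is ST(0) right after it is loaded and becomes ST(1) after the next load.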
Real transfers
X DD 9.75
• FLD src     ; pushes source operand onto ST(0): decrements the stack pointer
              ; source may be ST(i) or memory: FLD X    FLD ST(3)
              ; FLD ST(0) duplicates stack top: FLD ST(0)
• FST dest    ; copies ST(0) to destination: FST X
              ; dest may be ST(i) or short or long real variable: FST ST(2)
• FSTP dest   ; copies ST(0) to dest then pops ST(0): FSTP X
              ; dest may be ST(i) or short or long or temporary real memory
              ; FSTP ST(0) pops the stack with no data transfer
• FXCH dest   ; swaps contents of ST(0) and destination ST(i)
              ; FXCH with no operands swaps ST(0) and ST(1)
              ; FXCH is frequently used to move a register to the top before
              ; using an instruction which assumes ST(0)
Integer transfers
i DD 9
B DT 321
• FILD src    ; converts source to temporary real and pushes onto ST(0)
              ; FILD i
• FIST dest   ; rounds ST(0) to integer and copies to destination: FIST i
              ; dest may be a word or short integer
• FISTP dest  ; functions the same as FIST but then pops ST(0)
              ; dest may be any binary integer data type
• FBLD src    ; converts source to temporary real and pushes onto ST(0)
              ; FBLD B
• FBST dest   ; XXXX
• FBSTP dest  ; converts ST(0) to BCD and stores at dest then pops stack
Addition

Mnemonic  Operand          Action
fadd      (none)           pops both ST(0) and ST(1); adds these values; pushes sum onto the stack
fadd      st(i), st(0)     adds ST(i) and ST(0); replaces ST(i) by the sum
fadd      st(0), st(i)     adds ST(0) and ST(i); replaces ST(0) by the sum
fadd      memory (real)    adds ST(0) and real number from memory; replaces ST(0) by the sum
fiadd     memory (int)     adds ST(0) and integer from memory; replaces ST(0) by the sum
faddp     st(i), st(0)     adds ST(i) and ST(0); replaces ST(i) by the sum; pops ST(0) from stack
Subtract

Mnemonic  Operand          Action
fsub      (none)           pops ST(0) and ST(1); calculates ST(1) - ST(0); pushes difference onto the stack
fsub      st(i), st        calculates ST(i) - ST(0); replaces ST(i) by the difference
fsub      st, st(i)        calculates ST(0) - ST(i); replaces ST(0) by the difference
fsub      memory (real)    calculates ST(0) - real number from memory; replaces ST(0) by the difference
fisub     memory (int)     calculates ST(0) - integer from memory; replaces ST(0) by the difference
fsubp     st(i), st        calculates ST(i) - ST(0); replaces ST(i) by the difference; pops ST(0) from the stack
Subtract Reverse

Mnemonic  Operand          Action
fsubr     (none)           pops ST(0) and ST(1); calculates ST(0) - ST(1); pushes difference onto the stack
fsubr     st(i), st        calculates ST(0) - ST(i); replaces ST(i) by the difference
fsubr     st, st(i)        calculates ST(i) - ST(0); replaces ST(0) by the difference
fsubr     memory (real)    calculates real number from memory - ST(0); replaces ST(0) by the difference
fisubr    memory (int)     calculates integer from memory - ST(0); replaces ST(0) by the difference
fsubrp    st(i), st        calculates ST(0) - ST(i); replaces ST(i) by the difference; pops ST(0) from the stack
Multiply

Mnemonic  Operand          Action
fmul      (none)           pops ST(0) and ST(1); multiplies these values; pushes product onto the stack
fmul      st(i), st        multiplies ST(i) and ST(0); replaces ST(i) by the product
fmul      st, st(i)        multiplies ST(0) and ST(i); replaces ST(0) by the product
fmul      memory (real)    multiplies ST(0) and real number from memory; replaces ST(0) by the product
fimul     memory (int)     multiplies ST(0) and integer from memory; replaces ST(0) by the product
fmulp     st(i), st        multiplies ST(i) and ST(0); replaces ST(i) by the product; pops ST(0) from stack
Division

Mnemonic  Operand          Action
fdiv      (none)           pops ST(0) and ST(1); calculates ST(1) / ST(0); pushes quotient onto the stack
fdiv      st(i), st        calculates ST(i) / ST(0); replaces ST(i) by the quotient
fdiv      st, st(i)        calculates ST(0) / ST(i); replaces ST(0) by the quotient
fdiv      memory (real)    calculates ST(0) / real number from memory; replaces ST(0) by the quotient
fidiv     memory (int)     calculates ST(0) / integer from memory; replaces ST(0) by the quotient
fdivp     st(i), st        calculates ST(i) / ST(0); replaces ST(i) by the quotient; pops ST(0) from the stack
Write an 8087 program that loads three values for X, Y, and Z, adds them, and stores the result.

;in the data segment
X   DD 9.75
Y   DD 13.09375
Z   DD 29.0390625
SUM DD ?
;in the code segment
finit               ;initialize the 8087 to start at the top of stack
fld X               ;load X into ST(0). now ST(0)=X
fld Y               ;load Y into ST(0). now ST(0)=Y and ST(1)=X
fld Z               ;load Z into ST(0). now ST(0)=Z, ST(1)=Y, ST(2)=X
fadd ST(0),ST(1)    ;add Y to Z and save the result in ST(0)
fadd ST(0),ST(2)    ;add X to (Y+Z) and save it in ST(0)
fst sum             ;store ST(0) in memory location called sum
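Stack contents after each instruction (registers numbered 000 to 111):
(a) FINIT               stack empty
(b) FLD X               ST(0) = X
(c) FLD Y               ST(0) = Y, ST(1) = X
(d) FLD Z               ST(0) = Z, ST(1) = Y, ST(2) = X
(e) FADD ST(0),ST(1)    ST(0) = Y+Z, ST(1) = Y, ST(2) = X
(f) FADD ST(0),ST(2)    ST(0) = X+Y+Z, ST(1) = Y, ST(2) = X
(g) FST SUM             stack unchanged: ST(0) = X+Y+Z, ST(1) = Y, ST(2) = X
(g) FSTP SUM            if used instead of FST, the top is popped: ST(0) = Y, ST(1) = X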
Example

;Data Segment
R    DD 2.25
AREA DD ?
;Code Segment
FINIT
FLD R               ;ST(0) = R
FMUL ST(0),ST(0)    ;ST(0) = R*R
FLDPI               ;ST(0) = pi, ST(1) = R*R
FMUL ST(0),ST(1)    ;ST(0) = pi*R*R
FSTP AREA           ;store the area and pop
Additional floating-point instructions with no operands
• fabs      ST = |ST| (absolute value)
• fchs      ST = - ST (change sign)
• frndint   rounds ST to an integer value
• fsqrt     replaces the contents of ST by its square root

• fld1      1.0 pushed onto stack
• fldz      0.0 pushed onto stack
• fldpi     π pushed onto stack
• fldl2e    log2(e) pushed onto stack
• fldl2t    log2(10) pushed onto stack
• fldlg2    log10(2) pushed onto stack
• fldln2    loge(2) pushed onto stack
Example: sqrt = square root of (value1^2 + value2^2)

; from data segment
value1 DD 0.5
value2 DD 1.2
sqrt   DD ?
; from code segment
fld value1    ; value1 in ST
fld st        ; value1 in ST and ST(1)
fmul          ; value1*value1 in ST
fld value2    ; value2 in ST (value1*value1 in ST(1))
fld st        ; value2 in ST and ST(1)
fmul          ; value2*value2 in ST
fadd          ; sum of squares in ST
fsqrt         ; square root of sum of squares in ST
fstp sqrt     ; store result
END
Calculate Area of a Circle

.8087
;Data Segment
Red  DD 91.67
Area DD ?

;Code Segment
FINIT
FLD Red
FMUL st(0), st(0)
FLDPI
FMUL st(0), st(1)    ;!!!!!!!!
FSTP Area
Calculate Area of a Circle (C version)

#include<stdio.h>
int main(void)
{
    float Red=0.5, Area;
    __asm{
        FINIT
        FLD Red
        FMUL st(0), st(0)
        FLDPI
        FMULP st(1), st(0)
        FSTP Area
    }
    printf("\n %f ",Area);
    return 0;
}
FPTAN Instruction
• The instruction FPTAN (partial tangent) calculates Y/X = TAN Z,
– Z is the angle in radians and must be 0 < Z < π/4
• Z is stored in ST(0) prior to execution of FPTAN.
• After the execution ST(0) = X and ST(1) = Y.
• X and Y can be used to calculate the hypotenuse R.
• After that, it is easy to calculate the sine, cosine, tangent, and cotangent.
Calculate SIN, COS, TAN, and CTAN

.8087
ANGLE DD 0.523598776
X     DD 0
Y     DD 0
R     DD 0
SIN   DD 0
COS   DD 0
TAN   DD 0
COT   DD 0

START PROC FAR
    ASSUME CS:CODESG, DS:DATASG, SS:STACKSG
    MOV AX,DATASG
    MOV DS, AX
    CALL CALC_X_Y
    CALL CALC_R
    CALL CALC_SIN
    CALL CALC_COS
    CALL CALC_TAN
    CALL CALC_COT
    MOV AH, 4CH
    INT 21H
START ENDP

CALC_X_Y PROC NEAR
    FINIT
    FLD ANGLE
    FPTAN               ;ST(0) = X, ST(1) = Y
    FSTP X
    FSTP Y
    RET
CALC_X_Y ENDP

CALC_R PROC NEAR
    FINIT
    FLD X
    FMUL ST(0), ST(0)   ;X*X
    FLD Y
    FMUL ST(0),ST(0)    ;Y*Y
    FADD ST(0), ST(1)   ;X*X + Y*Y
    FSQRT
    FST R
    RET
CALC_R ENDP

CALC_SIN PROC NEAR
    FINIT
    FLD R
    FLD Y
    FDIV ST(0),ST(1)    ;Y / R
    FST SIN
    RET
CALC_SIN ENDP

CALC_COS PROC NEAR
    FINIT
    FLD R
    FLD X
    FDIV ST(0),ST(1)    ;X / R
    FST COS
    RET
CALC_COS ENDP

CALC_TAN PROC NEAR
    FINIT
    FLD X
    FLD Y
    FDIV ST(0),ST(1)    ;Y / X
    FST TAN
    RET
CALC_TAN ENDP

CALC_COT PROC NEAR
    FINIT
    FLD Y
    FLD X
    FDIV ST(0),ST(1)    ;X / Y
    FST COT
    RET
CALC_COT ENDP
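#include<stdio.h>
int main(void)
{
    float ANGLE=30, X, Y, R, SIN, COS, TAN, COT, C_180=180;
    __asm{
        FINIT
        FLDPI
        FLD ANGLE
        FMUL                  ; ANGLE*pi
        FLD C_180
        FDIVP ST(1),ST(0)     ; angle in radians
        FPTAN                 ; ST(0) = X, ST(1) = Y
        FST X
        FXCH                  ; bring Y to the top so it can be stored
        FST Y
        FXCH                  ; restore ST(0) = X, ST(1) = Y
        FLD ST(0)
        FMUL ST(0), ST(0)     ; X*X
        FLD ST(2)
        FMUL ST(0), ST(0)     ; Y*Y
        FADDP ST(1), ST(0)    ; X*X + Y*Y
        FSQRT
        FST R                 ; ST(0) = R, ST(1) = X, ST(2) = Y
        FLD ST(2)
        FDIV ST(0),ST(1)      ; Y / R
        FSTP SIN
        FDIVR ST(0),ST(1)     ; X / R
        FSTP COS
        FDIVP ST(1),ST(0)     ; Y / X
        FST TAN
        FLD1
        FDIV ST(0),ST(1)      ; 1 / TAN
        FSTP COT
        FSTP ST(0)
    }
    return 0;
}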
DIRECT MEMORY ACCESS AND DMA CHANNELS IN x86 PC
15.1: CONCEPT OF DMA
• There is often a need to transfer many bytes between memory & peripherals like disk drives.
– Using the microprocessor to transfer the data is too slow.
• Data must be fetched to the CPU, then sent to its destination.
• The Intel 8237 DMAC (direct memory access controller) chip functions to bypass the CPU.
– It provides a direct connection between peripherals and memory, transferring the data as fast as possible.
• Where the 8237 can transfer a byte between a peripheral & memory in 4 clocks, the 8088 would take 39 clocks.
15.1: CONCEPT OF DMA (bus sharing)
• The data bus, address bus, and control bus can be used either by the main x86 CPU or the 8237 DMA.
– Since x86 has primary control, it must give permission to DMA to use them.
• When DMA needs the buses, it sends a HOLD signal to the CPU, and the CPU responds with a HLDA (hold acknowledge) signal.
– Indicating the DMA can use the buses.
• While DMA uses the buses, the CPU is idle, and when the CPU uses the bus, DMA is sitting idle.
– After DMA finishes, it makes HOLD go low & the CPU will regain control over the buses.

Fig. 15-1 DMA Usage of System Bus
15.1: CONCEPT OF DMA (steps involved in a DMA transfer)
• DMA can only transfer information.
– It cannot decode and execute instructions.
• When the CPU receives a HOLD request from DMA, it finishes the present bus cycle (but not necessarily the present instruction) before it hands over control of the buses to the DMA.
• To transfer a block of data from memory to I/O, DMA must know:
– The address of the beginning of the data block (address of the first byte of data).
– The number of bytes (count) it needs to transfer.
• DMA Transfer Steps:
– 1. A peripheral device (like the disk controller) will request DMA service by pulling DREQ (DMA request) high.
– 2. DMA puts a high on its HRQ (hold request), signaling the CPU through its HOLD pin that it needs to use the buses.
– 3. The CPU finishes the present bus cycle & responds to DMA by putting high on HLDA (hold acknowledge).
• Telling the 8237 DMA it can use the buses to perform its task.
• HOLD must remain active-high while DMA performs its task.
– 4. DMA will activate DACK (DMA acknowledge), which tells the peripheral device it will start to transfer the data.
– 5. DMA starts to transfer data from memory to the I/O peripheral by putting the address of the first byte of the block on the address bus and activating MEMR.
• Reading the byte from memory into the data bus; it then activates IOW to write the data to the peripheral.
• DMA decrements the counter, increments the address pointer & repeats the process until the count reaches zero.
– 6. After the DMA has finished, it will deactivate HRQ, signaling the CPU that it can regain control over its buses.
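The loop in step 5 can be sketched as a small C model. It only mirrors the description above (read a byte, write it, decrement the count, increment the address, stop at zero); bus arbitration and the real 8237's terminal-count behavior are not modeled. The type `dma_channel_model` and the function `dma_memory_to_io` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one channel: where to read next, and how much is left. */
typedef struct {
    uint16_t address;  /* address pointer: next byte to read from memory */
    uint16_t count;    /* number of bytes still to transfer */
} dma_channel_model;

/* Moves bytes from memory to a peripheral buffer, one byte per iteration.
   Returns the number of bytes transferred. */
size_t dma_memory_to_io(dma_channel_model *ch,
                        const uint8_t *memory,
                        uint8_t *peripheral)
{
    size_t moved = 0;
    while (ch->count != 0) {
        uint8_t data = memory[ch->address]; /* MEMR: byte placed on the data bus */
        peripheral[moved++] = data;         /* IOW: byte written to the device   */
        ch->address++;                      /* increment the address pointer     */
        ch->count--;                        /* decrement the counter             */
    }
    return moved;
}

void check_dma_model(void)
{
    const uint8_t memory[8] = {10, 11, 12, 13, 14, 15, 16, 17};
    uint8_t device[3] = {0, 0, 0};
    dma_channel_model ch = {2, 3};          /* start at address 2, move 3 bytes */

    assert(dma_memory_to_io(&ch, memory, device) == 3);
    assert(device[0] == 12 && device[2] == 14);
    assert(ch.count == 0);
}
```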
15.2: 8237 DMA CHIP PROGRAMMING
• The 40-pin Intel 8237 DMA controller chip has four data transfer channels, each used for one device.
– Only one device at a time can use the DMA.
• With every channel are two associated signals:
– DREQ (DMA request) - an input to DMA from the peripheral device.
– DACK (DMA acknowledge) - an output signal from the 8237 going to the peripheral device.
• HOLD & HLDA connect to HOLD & HLDA of x86.
– Four channels from four different devices can request bus use, but DMA decides who gets control based on how its priority register has been programmed.
• Every channel of the 8237 DMA must be initialized separately for the address of the data block and the count (the size of the block) before it can be used.
– After initialization, each channel can be enabled and controlled with the use of a control word.
• Many modes of operation can be programmed into the 8237's internal registers.
– Accessed by four address pins, A0–A3.
• Along with the CS (chip select) pin.
Figure: Internal 8237 register addresses for each channel, and how they are generated.
• Two sets of information needed to program a channel of the 8237 DMA to transfer data are:
– The address of the first byte of data to be transferred (base address).
– How many bytes of data are to be transferred (word count).
• For set 1, the channel's memory address register must be programmed.
– The 8237 memory address register is 16 bits, and the data bus is 8 bits.
• One byte at a time, consecutively, is sent in to the same port address.
• For set 2, the channel count register is programmed.
– The count can go as high as FFFFH.
– Since the count register is 16 bits and the DMA data bus is only 8 bits, it takes two consecutive writes to program that register, as shown in Example 15-2.
See the entire example on page 406 of your textbook.
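The two consecutive writes to one port can be modeled in C. This is a sketch, not the textbook's Example 15-2: it assumes the low byte is written first and that an internal toggle selects which half of the 16-bit register the next write updates. The names `reg16_port_model`, `port_write`, and `program_reg16` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 16-bit register reached through one 8-bit port. */
typedef struct {
    uint16_t value;          /* the 16-bit address or count register        */
    int      high_byte_next; /* 0: next write is the low byte, 1: high byte */
} reg16_port_model;

/* One 8-bit write to the port; the toggle flips after every write. */
void port_write(reg16_port_model *port, uint8_t byte)
{
    if (!port->high_byte_next)
        port->value = (uint16_t)((port->value & 0xFF00u) | byte);
    else
        port->value = (uint16_t)((port->value & 0x00FFu) | ((uint16_t)byte << 8));
    port->high_byte_next = !port->high_byte_next;
}

/* Programs the full 16-bit value with two consecutive writes (low, then high). */
void program_reg16(reg16_port_model *port, uint16_t value)
{
    port_write(port, (uint8_t)(value & 0xFFu));
    port_write(port, (uint8_t)(value >> 8));
}

void check_reg16_model(void)
{
    reg16_port_model count = {0, 0};
    program_reg16(&count, 0xFFFFu);   /* the largest count, FFFFH */
    assert(count.value == 0xFFFFu);
    assert(count.high_byte_next == 0);
}
```

The same pattern would apply to the memory address register, since it is also 16 bits wide behind an 8-bit data bus.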
15.2: 8237 DMA CHIP PROGRAMMING (internal control registers)
• One set of control/command registers is used by all 8237 channels.
– These registers are shown in Table 15-1 on page 405.
– To understand how to access those registers, review Example 15-3 on page 407.

15.2: 8237 DMA CHIP PROGRAMMING (command register)
• An 8-bit register used for controlling the operation of the 8237.
– It must be programmed (written into) by the CPU.
– It is cleared by the RESET signal from the CPU.
• Or the master clear instruction of the DMA.
• The 8237 is capable of transferring data:
– From a peripheral device to memory (reading from disk).
– From memory to a peripheral device (writing a file to disk).
– From memory to memory (Shadow RAM).

Fig. 15-3 8237 Command Register Format
15.3: 8237 DMA INTERFACING IN THE IBM PC
• The 8237 DMA has eight addresses, A0–A7.
– A bidirectional address bus is formed by A0–A3, which sends addresses to the 8237 to select one of the 16 possible registers.
• In the IBM PC, chip select is activated by Y0 of the 74LS138.

Fig. 15-8 8237 DMA Pin Layout
Figure 15-9 Chip Selection of the 8237A in the PC
• Address selection of the registers inside the 8237.
– Summarized, assuming zero for each x.
– Port addresses 0–7 are assigned to the four channels.
– Addresses 08–0F are assigned to control registers commonly used by all channels.
See page 414 of your textbook.
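The port assignment can be written as a small lookup in C. The sketch assumes the usual 8237 register order, in which each channel owns two consecutive ports (memory address register first, count register second); the textbook page cited above is the authority for the exact map. The function names are hypothetical.

```c
#include <assert.h>

/* Assumed layout: channel n uses port 2n for its memory address register
   and port 2n+1 for its count register (ports 0-7 for channels 0-3). */
unsigned dma_address_port(unsigned channel)
{
    return (channel & 3u) * 2u;
}

unsigned dma_count_port(unsigned channel)
{
    return (channel & 3u) * 2u + 1u;
}

/* Ports 08H-0FH hold the control registers shared by all channels. */
int dma_is_control_port(unsigned port)
{
    return port >= 0x08u && port <= 0x0Fu;
}

void check_port_map(void)
{
    assert(dma_address_port(0) == 0x00u);
    assert(dma_count_port(3) == 0x07u);
    assert(dma_is_control_port(0x08u));
    assert(!dma_is_control_port(0x07u));
}
```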
15.3: 8237 DMA INTERFACING IN THE IBM PC (connections in the IBM PC)
• DMA must be capable of transferring data between I/O & memory without interference from the CPU.
– It must have all required control, data & address buses.
– It has four control signals: IOR; IOW; MEMR; MEMW.
• 8237 has its own bidirectional data bus, D0–D7, which is connected to x86 system bus D0–D7.
– The address bus, A0–A7, is only 8 bits.
• ADSTB (address strobe) activates the latch when the 8237 provides the upper 8-bit address through the data bus.
• AEN = 0 - x86 controls system bus.
• AEN = 1 - 8237 DMA controls system bus.

Fig. 15-10 Block Diagram of the 8237A DMA
Figure 15-11a DMA Circuit Connection in the PC
Figure 15-11b DMA Circuit Connection in the PC
See the entire circuit on page 415 of your textbook.