Arithmetic Coprocessor: Coprocessor Basics
Uploaded by prachi-pandey
Advanced Microprocessor
Arithmetic Coprocessor
Coprocessor basics:
• The 80x87 can multiply, divide, add, subtract, find square roots, and calculate transcendental functions and logarithms.
• Data types include:
- 16-, 32- and 64-bit signed integers
- 18-digit BCD data
- 32-, 64- and 80-bit (extended-precision) floating-point numbers
• An operation performed by the 80x87 generally executes much faster than the equivalent operation written with the microprocessor's normal instructions.
Data Formats for the Arithmetic Coprocessor:
Signed integers:
• 16-bit (word): range -32768 to +32767
• 32-bit (short integer): range about -2 x 10^9 to +2 x 10^9
• 64-bit (long integer): range about -9 x 10^18 to +9 x 10^18
Three forms of signed integers (figure): each holds a sign bit S in the most significant bit, followed by the magnitude.
- word: bits 15-0
- short integer: bits 31-0
- long integer: bits 63-0
• The directives dw, dd and dq are used for declaring signed-integer storage:
- dw to define a word
- dd to define a short integer
- dq to define a long integer
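The three integer sizes can be illustrated with a short Python sketch (Python is used here only for illustration; the slides themselves use assembler). It assumes two's-complement, little-endian storage as on x86, and the helper names `pack_signed` and `signed_range` are made up for this example.

```python
import struct

# struct format codes for the three integer sizes (little-endian, signed)
FORMATS = {"dw": "<h", "dd": "<i", "dq": "<q"}

def pack_signed(directive, value):
    """Return the bytes a dw/dd/dq-sized signed integer would occupy in memory."""
    return struct.pack(FORMATS[directive], value)

def signed_range(bits):
    """Exact two's-complement range for a given width in bits."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1

print(pack_signed("dw", -2).hex(" "))   # fe ff
print(signed_range(16))                 # (-32768, 32767)
print(signed_range(32))                 # (-2147483648, 2147483647)
```

The exact 32-bit and 64-bit limits agree with the rounded ranges quoted for the short and long integers.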
For every microprocessor in the family there is a companion coprocessor:
- 8086: 8087
- 8088: 8087
- 80186: 80187
and so on.
Binary Coded Decimal (BCD):
• BCD form requires 80 bits of memory.
• Each number is stored as an 18-digit packed integer in 9 bytes of memory, 2 digits per byte; the 10th byte holds the sign bit.
• Both positive and negative numbers are stored in true form.
Ex:
DATA1 DT 20      ; 20 as BCD:    00 00 00 00 00 00 00 00 00 20
DATA2 DT -220    ; -220 as BCD:  80 00 00 00 00 00 00 00 02 20
DATA3 DT 50000   ; 50000 as BCD: 00 00 00 00 00 00 00 05 00 00
Layout (figure), most significant byte first: sign byte S, then digits D17 D16 D15 ... D1 D0.
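As a rough check of the DT examples, the packing can be sketched in Python. The function names `pack_bcd` and `show` are hypothetical, and the sketch assumes the bytes sit in memory least significant first, with the sign byte at the highest address.

```python
def pack_bcd(value):
    """Encode an integer as a 10-byte packed BCD number (memory order, low byte first)."""
    if abs(value) >= 10**18:
        raise ValueError("packed BCD holds at most 18 digits")
    digits = f"{abs(value):018d}"                      # 18 digits, most significant first
    # Two decimal digits per byte, starting from the least significant pair.
    body = bytes(int(digits[i:i + 2], 16) for i in range(16, -2, -2))
    sign = b"\x80" if value < 0 else b"\x00"           # 10th byte carries the sign bit
    return body + sign

def show(value):
    """Format the bytes highest address first, the way the DT comments list them."""
    return pack_bcd(value)[::-1].hex(" ").upper()

print(show(20))      # 00 00 00 00 00 00 00 00 00 20
print(show(-220))    # 80 00 00 00 00 00 00 00 02 20
print(show(50000))   # 00 00 00 00 00 00 00 05 00 00
```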
Floating point:
• Floating-point numbers hold signed integers, fractions and mixed numbers.
• A floating-point number has 3 parts:
- Sign bit
- Biased exponent
- Significand
• The Intel family of arithmetic coprocessors supports 3 types of floating-point numbers:
- Short (32-bit): single precision, with a bias of 7FH
- Long (64-bit): double precision, with a bias of 3FFH
- Temporary (80-bit): extended precision, with a bias of 3FFFH
Floating-point formats (figure):
- Short: sign S in bit 31, exponent in bits 30-23, fraction in bits 22-0
- Long: sign S in bit 63, exponent in bits 62-52, fraction in bits 51-0
- Temporary: sign S in bit 79, exponent in bits 78-64, significand in bits 63-0 (the leading 1 is stored explicitly)
Converting Decimal to Floating-point form:
- Convert the decimal number into binary.
- Normalize the binary number.
- Calculate the biased exponent.
- Store the number in the floating-point format.
Ex: convert decimal 100.25 to floating-point form
1 - convert to binary
    100 -> 1100100, 0.25 -> 0.01, so 100.25 -> 1100100.01
2 - normalize the binary number
    1100100.01 = 1.10010001 x 2^6
3 - calculate the biased exponent
    the bias is 7FH (127) for single precision
    add the exponent to the bias: 110 + 01111111 (6 + 127) = 10000101
4 - floating-point number
    sign -> 0
    exponent -> 10000101
    significand -> 10010001000000000000000
Ex: convert decimal -100.25 to floating-point form
1 - convert to binary
    100 -> 1100100, 0.25 -> 0.01, so -100.25 -> -1100100.01
2 - normalize the binary number
    -1100100.01 = -1.10010001 x 2^6
3 - calculate the biased exponent
    the bias is 7FH (127) for single precision
    add the exponent to the bias: 110 + 01111111 (6 + 127) = 10000101
4 - floating-point number
    sign -> 1
    exponent -> 10000101
    significand -> 10010001000000000000000 (the sign is carried only by the sign bit)
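The two worked examples can be cross-checked with a small Python sketch. It relies on Python's `struct` module, whose `f` format is IEEE 754 single precision and uses the same 1/8/23-bit layout as the short format above; the helper name `float32_fields` is made up for this example.

```python
import struct

def float32_fields(x):
    """Split a number into (sign, biased exponent, fraction) bit strings."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))   # raw 32-bit pattern
    s = f"{bits:032b}"
    return s[0], s[1:9], s[9:]

print(float32_fields(100.25))    # ('0', '10000101', '10010001000000000000000')
print(float32_fields(-100.25))   # ('1', '10000101', '10010001000000000000000')
```

Only the sign bit differs between the two results, which is why the significand is written without a minus sign.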
Special Rules:
- The number 0 is stored as all 0s (except for the sign bit).
- +/- infinity is stored as all 1s in the exponent with a significand of all 0s. The sign bit distinguishes + and - infinity.
- A NAN (not-a-number) is an invalid floating-point result that has all 1s in the exponent with a significand that is NOT all zeros.
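A minimal Python sketch of these rules for the 32-bit short format follows. The function name `classify32` is hypothetical, and the "denormal" case (exponent all 0s with a non-zero fraction) is added here for completeness; it is not one of the special rules listed above.

```python
def classify32(bits):
    """Classify a 32-bit short-real bit pattern according to the special rules."""
    exponent = (bits >> 23) & 0xFF       # 8-bit biased exponent
    fraction = bits & 0x7FFFFF           # 23-bit fraction
    sign = "-" if bits >> 31 else "+"
    if exponent == 0:
        return sign + "zero" if fraction == 0 else "denormal"
    if exponent == 0xFF:
        return sign + "infinity" if fraction == 0 else "NaN"
    return "ordinary number"

print(classify32(0x7F800000))   # +infinity
print(classify32(0x80000000))   # -zero
print(classify32(0x7FC00000))   # NaN
print(classify32(0x42C88000))   # ordinary number (this pattern is 100.25)
```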
Converting Floating-point to Decimal:
- Separate the sign bit, biased exponent and significand.
- Convert the biased exponent into a true exponent by subtracting the bias.
- Write the number as a normalized binary number.
- Convert it to a de-normalized binary number.
- Convert the de-normalized binary number to decimal.
Ex: convert floating-point to decimal
1 - separate the fields
    sign = 0
    exponent = 10000011
    significand = 10010010000000000000000
2 - convert the biased exponent to the true exponent
    100 <- 10000011 - 01111111 (bias 7FH, 127, for single precision)
3 - normalized binary number
    1.1001001 x 2^4
4 - convert to a de-normalized binary number
    11001.001
5 - convert into decimal
    25.125
Ex: the same pattern with the sign bit set
1 - separate the fields
    sign = 1
    exponent = 10000011
    significand = 10010010000000000000000
2 - convert the biased exponent to the true exponent
    100 <- 10000011 - 01111111 (bias 7FH, 127, for single precision)
3 - normalized binary number
    1.1001001 x 2^4
4 - convert to a de-normalized binary number
    11001.001
5 - convert into decimal (the sign bit of 1 makes the result negative)
    -25.125
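The five decoding steps can be written out as a short Python sketch. It handles only normal single-precision numbers (no zeros, infinities or NANs), and the function name `decode_short_real` is made up for this illustration.

```python
def decode_short_real(sign, exponent, significand):
    """Turn the three bit-string fields of a short real into a decimal value."""
    true_exp = int(exponent, 2) - 0x7F                           # step 2: subtract the bias
    mantissa = 1 + int(significand, 2) / 2**len(significand)     # step 3: 1.fraction
    value = mantissa * 2**true_exp                               # steps 4-5: shift, convert
    return -value if sign == "1" else value

print(decode_short_real("0", "10000011", "10010010000000000000000"))   # 25.125
print(decode_short_real("1", "10000011", "10010010000000000000000"))   # -25.125
```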
The 8087 Architecture:
• The 8087 is designed to operate concurrently with the microprocessor.
• The 8087 executes 68 different instructions.
• The microprocessor and the coprocessor can execute their respective instructions simultaneously (concurrently).
• The numeric or arithmetic coprocessor is a special-purpose microprocessor designed specifically to execute arithmetic and transcendental operations.
• The microprocessor intercepts and executes the normal instruction set; the coprocessor intercepts and executes only the coprocessor instructions.
Internal Structure of the 80x87:
(Figure: block diagram of the 80x87; the only labels that survive are Status and Address.)
• Control unit (CU): interfaces the coprocessor to the microprocessor's data bus. If an instruction is an ESC (coprocessor) instruction, the coprocessor executes it; if not, the microprocessor executes it.
• Numeric execution unit (NEU):
- Responsible for executing all coprocessor instructions
- Has an 8-register stack that holds the operands for arithmetic instructions and their results
- Also contains the status, tag and control registers and the exception pointers
- Each of the 8 stack registers is 80 bits wide and holds an 80-bit extended-precision floating-point number
- Data are converted as they move between memory and the coprocessor register stack
Status register:
• Reflects the overall operation of the coprocessor.
• The status register is read by executing the FSTSW instruction, which stores its contents into a word of memory.
• Coprocessor/microprocessor communications are carried out through I/O ports.
• B (busy bit): indicates that the coprocessor is busy; it can be checked by testing the status register or by using the FWAIT instruction.
• C3 to C0 (condition code bits): indicate conditions about the coprocessor
• TOP (top of stack, ST): indicates which register is currently addressed as the top of the stack
• ES (error summary): set if any unmasked error bit (PE, UE, OE, ZE, DE, IE) is set. In the 8087, the error summary also causes a coprocessor interrupt
• PE (precision error): the result exceeds the selected precision
• UE (underflow error): a non-zero result is too small to represent with the currently selected precision
• OE (overflow error): the result is too large to represent. If this error is masked, the coprocessor generates infinity as the result
• ZE (zero error): the divisor is zero while the dividend is a non-infinity, non-zero number
• DE (denormalized error): at least one of the operands is denormalized
• IE (invalid error): indicates a stack overflow or underflow, an indeterminate form, or the use of a NAN as an operand; for example, the square root of a negative number
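A status word copied out with FSTSW can be picked apart as in the Python sketch below. The bit positions are an assumption taken from Intel's published status-word layout (IE in bit 0 up to PE in bit 5, ES in bit 7, C0-C2 in bits 8-10, TOP in bits 11-13, C3 in bit 14, B in bit 15); the slides do not give them. The names `STATUS_FLAGS` and `decode_status` are made up for this example.

```python
# Assumed bit positions of the single-bit flags in the 16-bit status word.
STATUS_FLAGS = {0: "IE", 1: "DE", 2: "ZE", 3: "OE", 4: "UE", 5: "PE", 7: "ES", 15: "B"}

def decode_status(word):
    """Break a 16-bit status word into (set flags, TOP, condition codes)."""
    flags = [name for bit, name in STATUS_FLAGS.items() if (word >> bit) & 1]
    top = (word >> 11) & 0b111
    cc = {"C0": (word >> 8) & 1, "C1": (word >> 9) & 1,
          "C2": (word >> 10) & 1, "C3": (word >> 14) & 1}
    return flags, top, cc

flags, top, cc = decode_status(0x3884)
print(flags, top)   # ['ZE', 'ES'] 7
```

With this layout, a divide-by-zero check only needs to test the word against the value 4, which is bit position 2.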
Control register:
- selects precision, rounding control and infinity control
- masks and unmasks the exception bits that correspond to the rightmost 6 bits of the status register
- the FLDCW instruction is used to load a value into the control register
Control register fields (figure):
- Exception masks shown: invalid operation mask, denormalized operand mask, division by zero mask
- Precision control: 00 = single, 01 = reserved, 10 = double, 11 = extended
- Rounding control: 00 = round to nearest or even, 01 = round down toward minus infinity, 10 = round up toward plus infinity, 11 = chop (truncate toward zero)
- Infinity control: 0 = projective, 1 = affine
• IC (infinity control): affine allows positive and negative infinity; projective assumes infinity is unsigned
• RC (rounding control): determines the type of rounding
• PC (precision control): sets the precision of the result
• Exception masks: determine whether the error indicated by an exception affects the error bit in the status register. If a logic 1 is placed in one of the exception mask bits, the corresponding status register bit is masked off
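The control-word fields can be decoded with a few shifts, as in this Python sketch. The field positions are an assumption based on Intel's documented layout (mask bits 0-5, PC in bits 8-9, RC in bits 10-11, IC in bit 12); the names `PRECISION`, `ROUNDING` and `decode_control` are hypothetical.

```python
PRECISION = {0b00: "single", 0b01: "reserved", 0b10: "double", 0b11: "extended"}
ROUNDING = {0b00: "nearest", 0b01: "down", 0b10: "up", 0b11: "chop"}

def decode_control(word):
    """Split a 16-bit control word into its mask, PC, RC and IC fields."""
    return {
        "masks": word & 0x3F,                        # six exception mask bits
        "precision": PRECISION[(word >> 8) & 0b11],
        "rounding": ROUNDING[(word >> 10) & 0b11],
        "infinity": "affine" if (word >> 12) & 1 else "projective",
    }

print(decode_control(0x037F))
# {'masks': 63, 'precision': 'extended', 'rounding': 'nearest', 'infinity': 'projective'}
```

A value built this way would be written to memory and loaded with FLDCW.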
        fdiv  DATA1
        fstsw ax              ;Copy status reg to AX
        test  ax, 4           ;Test bit position 2
        jnz   DIVIDE_ERROR
        fcom  DATA1           ;Compare DATA1 to ST0 and set status.
        fstsw ax
        sahf                  ;Copy status bits to flags.
        je    ST_EQUAL
        jb    ST_BELOW
        ja    ST_ABOVE
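Why the ordinary conditional jumps work after the FSTSW AX and SAHF pair can be modelled in Python. The sketch assumes the standard x86 flag positions (SAHF copies AH into the low flags byte, so C0 lands in CF, C2 in PF and C3 in ZF); the function names are made up for this illustration.

```python
def flags_after_sahf(status_word):
    """Model FSTSW AX followed by SAHF: AH (status bits 15-8) goes to the low flags byte."""
    ah = (status_word >> 8) & 0xFF
    return {"CF": ah & 1,           # C0 -> carry flag
            "PF": (ah >> 2) & 1,    # C2 -> parity flag
            "ZF": (ah >> 6) & 1}    # C3 -> zero flag

def branch_taken(status_word):
    """Return the label the JE / JB / JA sequence above would jump to."""
    f = flags_after_sahf(status_word)
    if f["ZF"]:
        return "ST_EQUAL"      # je: ZF = 1
    if f["CF"]:
        return "ST_BELOW"      # jb: CF = 1
    return "ST_ABOVE"          # ja: CF = 0 and ZF = 0

print(branch_taken(0x4000))   # ST_EQUAL  (C3 set)
print(branch_taken(0x0100))   # ST_BELOW  (C0 set)
print(branch_taken(0x0000))   # ST_ABOVE
```

An unordered comparison sets C3, C2 and C0 together, so this jump sequence would treat it as equal; checking PF first is one way to catch that case.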
Tag register:
| TAG 7 | TAG 6 | TAG 5 | TAG 4 | TAG 3 | TAG 2 | TAG 1 | TAG 0 |
- Indicates the contents of each location in the coprocessor stack
- A program can view the tag register by storing the coprocessor environment with FSTENV or FSAVE (FRSTOR reloads a saved state)
- Tag values: 00 = VALID, 01 = ZERO, 10 = INVALID or INFINITY, 11 = EMPTY
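Once the tag word has been stored to memory, each 2-bit tag can be extracted as in this Python sketch. It assumes TAG 0 occupies the two lowest bits, matching the right-to-left order of the figure; `TAG_NAMES` and `decode_tags` are hypothetical names.

```python
TAG_NAMES = {0b00: "VALID", 0b01: "ZERO", 0b10: "INVALID or INFINITY", 0b11: "EMPTY"}

def decode_tags(tag_word):
    """Return the tag of each register; index 0 is TAG 0 (the lowest two bits)."""
    return [TAG_NAMES[(tag_word >> (2 * i)) & 0b11] for i in range(8)]

print(decode_tags(0xFFFC)[:2])   # ['VALID', 'EMPTY']
```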
Instruction Set:
• The coprocessor executes over 68 different instructions.
• The coprocessor uses the data bus for data transfers during coprocessor instructions; the microprocessor uses it during normal instructions.
Types of instructions:
- data transfer instructions
- arithmetic instructions
- comparison instructions
- transcendental operations
- constant operations
- coprocessor control instructions
i) Data transfer instructions:
- floating-point
- signed-integer
- BCD
- the FCMOV instruction (Pentium Pro through Pentium 4)
The coprocessor stores data internally as 80-bit extended-precision floating-point numbers.
Floating-point data transfer
FLD (load real):
- Loads floating-point data to the stack top (ST).
- The stack pointer is decremented by 1, so the loaded value becomes the new ST.
- Data can be loaded from memory or from another stack position.
Ex:
FLD ST(2)    ; copies the contents of register 2 to ST
             ; (the top of the stack is register 0 when the coprocessor is reset or initialized)
FLD data7    ; copies the contents of memory location data7 to the top of the stack
The size of the transfer is determined automatically by the assembler from the directive used to define the data.
FST (store real):
- Stores a copy of the top of the stack into memory or another coprocessor register.
- Rounding occurs when the store completes, according to the rounding control in the control register.
- It is a copy instruction: the top of the stack is left unchanged.
FSTP (floating-point store and pop):
- Stores a copy of the top of the stack into memory or another coprocessor register
- Then pops the data from the top of the stack, so it is a removal instruction
FXCH (exchange):
- Exchanges the contents of a register with the top of the stack
Ex: FXCH ST(2)   ; exchanges the top of the stack with register 2
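The way FLD, FST, FSTP and FXCH move data around the eight-register stack can be imitated with a toy Python model. This is only a sketch: the class name `FpuStack` is invented, register contents are plain Python floats rather than 80-bit values, and tags, rounding and stack-fault checks are left out.

```python
class FpuStack:
    """Toy model of the eight-register coprocessor stack."""

    def __init__(self):
        self.regs = [None] * 8      # physical registers 0-7
        self.top = 0                # TOP is register 0 after reset

    def _phys(self, i):
        return (self.top + i) % 8   # map ST(i) to a physical register

    def st(self, i=0):
        return self.regs[self._phys(i)]

    def fld(self, value):
        self.top = (self.top - 1) % 8         # push: TOP moves down by 1
        self.regs[self.top] = value           # the value becomes the new ST

    def fst(self, i):
        self.regs[self._phys(i)] = self.st()  # copy ST to ST(i), no pop

    def fstp(self):
        value = self.st()
        self.regs[self.top] = None            # the old top is now empty
        self.top = (self.top + 1) % 8         # pop
        return value

    def fxch(self, i=1):
        a, b = self._phys(0), self._phys(i)
        self.regs[a], self.regs[b] = self.regs[b], self.regs[a]

s = FpuStack()
for v in (1.0, 2.0, 3.0):
    s.fld(v)
s.fxch(2)                  # ST and ST(2) trade places
print(s.st(), s.st(2))     # 1.0 3.0
print(s.fstp(), s.st())    # 1.0 2.0
```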
Integer data transfer instructions
- FILD (load integer)
- FIST (store integer)
- FISTP (store integer and pop)
While transferring the data, the coprocessor automatically converts between integer data and the extended-precision floating-point form.
BCD data transfer instructions
- FBLD: loads the top of the stack with BCD memory data
- FBSTP: stores the top of the stack as BCD and does a pop
Pentium Pro through Pentium 4 instruction: FCMOV
- contains a condition
- if the condition is true, copies the source to the destination
- conditions are checked for either an ordered or an unordered result
- NAN and denormalized numbers are not tested for
FCMOVB   - move if below             FCMOVNB  - move if not below
FCMOVE   - move if equal             FCMOVNE  - move if not equal
FCMOVBE  - move if below or equal    FCMOVNBE - move if not below or equal
FCMOVU   - move if unordered         FCMOVNU  - move if not unordered
ii) Arithmetic instructions:
- addition, subtraction, multiplication, division, calculating square roots
- arithmetic-related: scaling, rounding, absolute value, changing sign
Addressing modes
Mode           Form        Example
Stack          ST(1),ST    FADD
Register       ST,ST(n)    FADD ST,ST(2)
               ST(n),ST    FADD ST(2),ST
Register pop   ST(n),ST    FADDP ST(3),ST
Memory         operand     FADD data2
- Stack addressing mode is restricted to ST (stack top) and ST(1).
- The source operand is ST and the destination operand is ST(1).
- After the operation, the source is popped, leaving the destination at ST.
Stack addressing mode,
• Uses the top of the stack (ST) as the source operand and the next-to-top register, ST(1), as the destination.
• Afterward the top is popped, so the result becomes the new top of the stack.
Ex:
FADD: adds ST to ST(1) and stores the sum in ST(1); after the pop the sum is at ST
FSUB: subtracts ST from ST(1); after the pop the difference is at ST
FSUBR (reverse instruction): subtracts ST(1) from ST; the result is at ST
FDIVR: reverse division, useful for computing a reciprocal; the result is stored in ST
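The operand order of the stack forms is easy to get backwards, so here is a small Python sketch of it. The stack is modelled as a plain list with ST at index 0, and the names `STACK_OPS` and `stack_op` are made up; FDIV is included by analogy with FSUB, assuming it divides ST(1) by ST.

```python
# Each entry takes (st, st1) and returns the value left at the new top of stack.
STACK_OPS = {
    "FADD":  lambda st, st1: st1 + st,
    "FSUB":  lambda st, st1: st1 - st,   # subtract ST from ST(1)
    "FSUBR": lambda st, st1: st - st1,   # reversed: subtract ST(1) from ST
    "FDIV":  lambda st, st1: st1 / st,   # divide ST(1) by ST
    "FDIVR": lambda st, st1: st / st1,   # reversed: divide ST by ST(1)
}

def stack_op(stack, op):
    """Apply a stack-form instruction; stack[0] is ST and stack[1] is ST(1)."""
    st, st1 = stack[0], stack[1]
    return [STACK_OPS[op](st, st1)] + stack[2:]   # source popped, result is the new ST

print(stack_op([2.0, 5.0, 9.0], "FSUB"))    # [3.0, 9.0]
print(stack_op([2.0, 5.0, 9.0], "FSUBR"))   # [-3.0, 9.0]
print(stack_op([1.0, 4.0], "FDIVR"))        # [0.25]
```

The last line is the reciprocal idiom: with X at ST(1) and 1.0 loaded on top of it, FDIVR leaves 1/X at ST.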
Register addressing mode,
• MUST use ST as one of the operands.
• The other operand can be any register, including ST(0), which is ST. The destination can be either ST or ST(n).
• Unlike stack addressing, non-popping versions can be used.
Memory addressing mode,
• Always uses ST as the destination, because the coprocessor is a stack-oriented machine.
Arithmetic operations,
The following letters additionally qualify the operation:
• P: perform a register pop after the operation (FADDP versus FADD).
• R: reverse mode, for subtraction and division.
• I: the memory operand is an integer. I appears as the second letter of the instruction, e.g., FIADD, FISUB, FIMUL, FIDIV.
Arithmetic-related operations,
• FSQRT: finds the square root of the operand at ST and leaves the result there. Check the IE bit for an invalid result (e.g., the operand was negative) using FSTSW AX and TEST AX,1.
• FSCALE: adds the contents of ST(1) (interpreted as an integer) to the exponent of ST. The value in ST(1) must be between -2^15 and +2^15.
• FPREM1: performs modulo division of ST by ST(1). The remainder is left at ST.
• FRNDINT: rounds ST to an integer.
• FXTRACT: decomposes ST into an unbiased exponent and a significand. The extracted significand is at ST and the unbiased exponent at ST(1).
• FABS: changes the sign of ST to positive.
• FCHS: inverts the sign of ST.
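What FXTRACT and FSCALE compute can be approximated in Python with `math.frexp` and `math.ldexp`. This is a sketch for non-zero finite values only; the function names `fxtract` and `fscale` are chosen here to mirror the mnemonics and are not library calls.

```python
import math

def fxtract(x):
    """Return (significand, exponent) with 1 <= |significand| < 2, for non-zero finite x."""
    m, e = math.frexp(x)       # x = m * 2**e with 0.5 <= |m| < 1
    return m * 2, e - 1        # rescale so the significand has the form 1.xxx

def fscale(st, st1):
    """Add the integer part of st1 to the exponent of st, i.e. st * 2**trunc(st1)."""
    return math.ldexp(st, math.trunc(st1))

print(fxtract(100.25))     # (1.56640625, 6)
print(fscale(1.5, 3.0))    # 12.0
```

The first result matches the earlier normalization of 100.25 as 1.10010001 x 2^6.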
iii) Comparison instructions:
- These instructions compare the data at the top of the stack with another operand and return the result of the comparison in the status register condition code bits C3 to C0.
• FCOM: compares ST with a memory or register operand. FCOM by itself compares ST and ST(1).
• FCOMP/FCOMPP: compare and pop once or twice.
• FICOM/FICOMP: compare ST with an integer memory operand and optionally pop the stack.
• FTST: compares ST with 0.0.
• FXAM: examines ST and modifies the condition code bits to indicate whether the contents are positive, negative, normalized, etc.
• FCOMI/FUCOMI (Pentium Pro and later): same as FCOM, with one additional feature: the floating-point condition flags are moved to the flags register, which otherwise requires FNSTSW AX and SAHF.
iv) Transcendental operations
• FPTAN: finds the partial tangent, Y/X = tan θ. The value of θ at the top of the stack must be between 0 and π/4 for the 8087 and 80287, and must be less than 2^63 for the 80387 through the Pentium 4.
• FPATAN: finds the partial arctangent θ.
• F2XM1: computes 2^X - 1.
• FSIN/FCOS: sine or cosine of ST; the result is found in ST.
• FSINCOS: sine and cosine of ST. The sine replaces ST and the cosine is then pushed, so the cosine ends up in ST and the sine in ST(1).
• FYL2X: computes Y log2 X, with X in ST and Y in ST(1); the result is on top of the stack. X ranges between 0 and infinity, and Y between -infinity and +infinity.
• FYL2XP1: computes Y log2 (X + 1).
Function    Equation
10^Y        2^(Y x log2 10)
e^Y         2^(Y x log2 e)
X^Y         2^(Y x log2 X)
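The three identities in the table all have the same shape, which a few lines of Python can confirm numerically. The helper name `pow_via_exp2` is made up; on the coprocessor the product Y x log2(base) would come from FYL2X and the power of two from F2XM1 and FSCALE, which is only hinted at here.

```python
import math

def pow_via_exp2(base, y):
    """Evaluate base**y as 2**(y * log2(base)), the form used in the table."""
    return 2.0 ** (y * math.log2(base))

print(math.isclose(pow_via_exp2(10, 3), 1000.0))             # True
print(math.isclose(pow_via_exp2(math.e, 2), math.exp(2)))    # True
print(math.isclose(pow_via_exp2(3, 4), 81.0))                # True
```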
v) Constant operations
• The coprocessor instruction set includes opcodes that return constants to the top of the stack.
- FLDZ: loads +0.0 into ST.
- FLD1: loads +1.0 into ST.
- FLDPI: loads pi into ST.
- FLDL2T: loads log2(10) into ST.
- FLDL2E: loads log2(e) into ST.
- FLDLG2: loads log10(2) into ST.
- FLDLN2: loads loge(2) into ST.
vi) Coprocessor control instructions
- Control instructions are used for initialization, exception handling and task switching.
FINIT/FNINIT: performs a reset operation and sets register 0 as the top of the stack; it also resets other state such as the rounding control and the busy bit
FSETPM: changes the addressing mode of the coprocessor to the protected addressing mode
FLDCW: loads the control register with the word addressed by the operand
FSTCW/FNSTCW: stores the control register into the word-sized memory operand
• FSTSW AX/FNSTSW AX: copies the contents of the status register to AX (not available on the 8087)
• FCLEX/FNCLEX: clears the error flags in the status register and also the busy flag
• FSAVE/FNSAVE: writes the entire state of the machine to memory
• FRSTOR: restores the state of the machine from memory
• FSTENV/FNSTENV: stores the environment of the coprocessor (real mode or protected mode)
• FLDENV: reloads the environment
• FINCSTP: increments the stack pointer
• FDECSTP: decrements the stack pointer
• FFREE: frees a register by marking it as empty
• FNOP: floating-point coprocessor NOP
• FWAIT: causes the microprocessor to wait for the coprocessor to finish an operation. It should be used before the microprocessor accesses memory data that are affected by the coprocessor
Coprocessor instructions:
- The following lists the instructions for all coprocessors from the 8087 through the Pentium 4, with the number of clock periods required to execute each instruction.
General:
reg = floating-point register, st(0), st(1) ... st(7)
mem = memory address
mem32 = memory address of 32-bit item
mem64 = memory address of 64-bit item
mem80 = memory address of 80-bit item
FX = pairs with FXCH
NP = no pairing
+EA = plus the time needed to calculate the effective address
Instruction clock cycles
• F2XM1: compute 2^x - 1
  8087      287       387       486       Pentium
  310-630   310-630   211-476   140-279   13-57 NP
• FABS: absolute value
  8087      287       387       486       Pentium
  10-17     10-17     22        3         1 FX
• FADD: floating-point add
• FADDP: floating-point add and pop
  variations/operand   8087           287       387       486      Pentium
  fadd                 70-100         70-100    23-34     8-20     3/1 FX
  fadd mem32           (90-120)+EA    90-120    24-32     8-20     3/1 FX
  fadd mem64           (95-125)+EA    95-125    29-37     8-20     3/1 FX
  faddp                75-105         75-105    23-31     8-20     3/1 FX
• FBLD: load BCD
  operand              8087           287       387       486      Pentium
  mem                  (290-310)+EA   290-310   266-275   70-103   48-58 NP
• FBSTP: store BCD and pop
  8087           287       387       486       Pentium
  (520-540)+EA   520-540   512-534   172-176   148-154 NP
• FCHS: change sign
  8087    287     387     486   Pentium
  10-17   10-17   24-25   6     1 FX
• FCLEX: clear exceptions
• FNCLEX: clear exceptions, no wait
  variations   8087   287   387   486   Pentium
  fclex        2-8    2-8   11    7     9 NP
  fnclex       2-8    2-8   11    7     9 NP
  The wait version may take additional cycles
• FCOM: floating-point compare
• FCOMP: floating-point compare and pop
• FCOMPP: floating-point compare and pop twice
  variations/operand   8087         287     387   486   Pentium
  fcom reg             40-50        40-50   24    4     4/1 FX
  fcom mem32           (60-70)+EA   60-70   26    4     4/1 FX
  fcom mem64           (65-75)+EA   65-75   31    4     4/1 FX
  fcomp                42-52        42-52   26    4     4/1 FX
  fcompp               45-55        45-55   26    5     4/1 FX
• FCOS: floating-point cosine (387+)
  8087   287   387       486       Pentium
  -      -     123-772   257-354   18-124 NP
  Additional cycles required if operand > pi/4 (~3.141/4 = ~.785)
• FDISI: disable interrupts (8087 only, others do fnop)
• FNDISI: disable interrupts, no wait (8087 only, others do fnop)
  variations   8087   287   387   486   Pentium
  fdisi        2-8    2     2     3     1 NP
  fndisi       2-8    2     2     3     1 NP
  The wait version may take additional cycles
• FDIV: floating divide
• FDIVP: floating divide and pop
  variations/operand   8087           287       387     486   Pentium
  fdiv reg             193-203        193-203   88-91   73    39 FX
  fdiv mem32           (215-225)+EA   215-225   89      73    39 FX
  fdiv mem64           (220-230)+EA   220-230   94      73    39 FX
  fdivp                197-207        197-207   91      73    39 FX
• FDIVR: floating divide reversed
• FDIVRP: floating divide reversed and pop
  variations/operand   8087           287       387     486   Pentium
  fdivr reg            194-204        194-204   88-91   73    39 FX
  fdivr mem32          (216-226)+EA   216-226   89      73    39 FX
  fdivr mem64          (221-231)+EA   221-231   94      73    39 FX
  fdivrp               198-208        198-208   91      73    39 FX
• FENI: enable interrupts (8087 only, others do fnop)
• FNENI: enable interrupts, no wait (8087 only, others do fnop)
  variations   8087   287   387   486   Pentium
  feni         2-8    2     2     3     1 NP
  fneni        2-8    2     2     3     1 NP
• FFREE: free register
  8087   287    387   486   Pentium
  9-16   9-16   18    3     1 NP
• FIADD: integer add
  operand   8087           287       387     486     Pentium
  mem16     (102-137)+EA   102-137   71-85   20-35   7/4 NP
  mem32     (108-143)+EA   108-143   57-72   19-32   7/4 NP
• FINIT: initialize floating-point processor
• FNINIT: initialize floating-point processor, no wait
  variations   8087   287   387   486   Pentium
  finit        2-8    2-8   33    17    16 NP
  fninit       2-8    2-8   33    17    12 NP
  The wait version may take additional cycles
• FICOM: integer compare
• FICOMP: integer compare and pop
  variations/operand   8087         287     387     486     Pentium
  ficom mem16          (72-86)+EA   72-86   71-75   16-20   8/4 NP
  ficom mem32          (78-91)+EA   78-91   56-63   15-17   8/4 NP
  ficomp mem16         (74-88)+EA   74-88   71-75   16-20   8/4 NP
  ficomp mem32         (80-93)+EA   80-93   56-63   15-17   8/4 NP
• FIMUL: integer multiply
  operand   8087           287       387     486     Pentium
  mem16     (124-138)+EA   124-138   76-87   23-27   7/4 NP
  mem32     (130-144)+EA   130-144   61-82   22-24   7/4 NP
• FIDIV: integer divide
• FIDIVR: integer divide reversed
  variations/operand   8087           287       387       486     Pentium
  fidiv mem16          (224-238)+EA   224-238   136-140   85-89   42 NP
  fidiv mem32          (230-243)+EA   230-243   120-127   84-86   42 NP
  fidivr mem16         (225-239)+EA   225-239   135-141   85-89   42 NP
  fidivr mem32         (231-245)+EA   231-245   121-128   84-86   42 NP
• FILD: load integer
  operand   8087         287     387     486     Pentium
  mem16     (46-54)+EA   46-54   61-65   13-16   3/1 NP
  mem32     (52-60)+EA   52-60   45-52   9-12    3/1 NP
  mem64     (60-68)+EA   60-68   56-67   10-18   3/1 NP
• FIST: store integer
• FISTP: store integer and pop
  variations/operand   8087           287       387     486     Pentium
  fist mem16           (80-90)+EA     80-90     82-95   29-34   6 NP
  fist mem32           (82-92)+EA     82-92     79-93   28-34   6 NP
  fistp mem16          (82-92)+EA     82-92     82-95   29-34   6 NP
  fistp mem32          (84-94)+EA     84-94     79-93   28-34   6 NP
  fistp mem64          (94-105)+EA    94-105    80-97   28-34   6 NP
• FISUB: integer subtract
• FISUBR: integer subtract reversed
  variations/operand   8087           287       387     486     Pentium
  fisub mem16          (102-137)+EA   102-137   71-85   20-35   7/4 NP
  fisubr mem32         (108-143)+EA   108-143   57-82   19-32   7/4 NP
• FINCSTP: increment floating-point stack pointer
  8087   287    387   486   Pentium
  6-12   6-12   21    3     1 NP
• FLD: floating-point load
  operand   8087         287     387   486   Pentium
  reg       17-22        17-22   14    4     1 FX
  mem32     (38-56)+EA   38-56   20    3     1 FX
  mem64     (40-60)+EA   40-60   25    3     1 FX
  mem80     (53-65)+EA   53-65   44    6     3 NP
• FLDCW: load control word
  operand   8087        287    387   486   Pentium
  mem16     (7-14)+EA   7-14   19    4     7 NP
Load floating-point constants
• FLDZ: load constant onto stack, 0.0
• FLD1: load constant onto stack, 1.0
• FLDL2E: load constant onto stack, logarithm base 2 (e)
• FLDL2T: load constant onto stack, logarithm base 2 (10)
• FLDLG2: load constant onto stack, logarithm base 10 (2)
• FLDLN2: load constant onto stack, natural logarithm (2)
• FLDPI: load constant onto stack, pi (3.14159...)
  variations   8087    287     387   486   Pentium
  fldz         11-17   11-17   20    4     2 NP
  fld1         15-21   15-21   24    4     2 NP
  fldl2e       15-21   15-21   40    8     5/3 NP
  fldl2t       16-22   16-22   40    8     5/3 NP
  fldlg2       18-24   18-24   41    8     5/3 NP
  fldln2       17-23   17-23   41    8     5/3 NP
  fldpi        16-22   16-22   40    8     5/3 NP
• FLDENV: load environment state
  operand   8087         287     387   486     Pentium
  mem       (35-45)+EA   35-45   71    44/34   37/32-33 NP
  Cycles for real mode/protected mode
• FMUL: floating-point multiply
• FMULP: floating-point multiply and pop
  variations/operand   8087           287       387     486   Pentium
  fmul reg s           90-105         90-105    29-52   16    3/1 FX
  fmul reg             130-145        130-145   46-57   16    3/1 FX
  fmul mem32           (110-125)+EA   110-125   27-35   11    3/1 FX
  fmul mem64           (154-168)+EA   154-168   32-57   14    3/1 FX
  fmulp reg s          94-108         94-108    29-52   16    3/1 FX
  fmulp reg            134-148        134-148   29-57   16    3/1 FX
  s = register with 40 trailing zeros in fraction
• FNOP: no operation
  8087    287     387   486   Pentium
  10-16   10-16   12    3     1 NP
• FPATAN: partial arctangent
  8087      287       387       486       Pentium
  250-800   250-800   314-487   218-303   17-173
• FPREM: partial remainder
• FPREM1: partial remainder (IEEE compatible, 387+)
  variations   8087     287      387      486      Pentium
  fprem        15-190   15-190   74-155   70-138   16-64 NP
  fprem1       -        -        95-185   72-167   20-70 NP
• FPTAN: partial tangent
  8087     287      387       486       Pentium
  30-540   30-540   191-497   200-273   17-173 NP
  Additional cycles required if operand > pi/4 (~3.141/4 = ~.785)
• FRNDINT: round to integer
  8087    287     387     486     Pentium
  16-50   16-50   66-80   21-30   9-20 NP
• FRSTOR: restore saved state
  variations/operand   8087           287       387   486       Pentium
  frstor mem           (197-207)+EA   197-207   308   131/120   75-95/70 NP
  frstorw mem          -              -         308   131/120   75-95/70 NP
  frstord mem          -              -         308   131/120   75-95/70 NP
  Cycles for real mode/protected mode
• FSAVE: save FPU state
• FSAVEW: save FPU state, 16-bit format (387+)
• FSAVED: save FPU state, 32-bit format (387+)
• FNSAVE: save FPU state, no wait
• FNSAVEW: save FPU state, no wait, 16-bit format (387+)
• FNSAVED: save FPU state, no wait, 32-bit format (387+)
  variations   8087           287       387       486       Pentium
  fsave        (197-207)+EA   197-207   375-376   154/143   127-151/124 NP
  fsavew       -              -         375-376   154/143   127-151/124 NP
  fsaved       -              -         375-376   154/143   127-151/124 NP
  fnsave       (197-207)+EA   197-207   375-376   154/143   127-151/124 NP
  fnsavew      -              -         375-376   154/143   127-151/124 NP
  fnsaved      -              -         375-376   154/143   127-151/124 NP
  Cycles for real mode/protected mode
  The wait version may take additional cycles
• FSCALE: scale by factor of 2
  8087    287     387     486     Pentium
  32-38   32-38   67-86   30-32   20-31 NP
• FSETPM: set protected mode (287 only, 387+ = fnop)
  8087   287   387   486   Pentium
  -      2-8   12    3     1 NP
• FSIN: sine (387+)
• FSINCOS: sine and cosine (387+)
  variations   8087   287   387       486       Pentium
  fsin         -      -     122-771   257-354   16-126 NP
  fsincos      -      -     194-809   292-365   17-137 NP
  Additional cycles required if operand > pi/4 (~3.141/4 = ~.785)
• FSQRT: square root
  8087      287       387       486     Pentium
  180-186   180-186   122-129   83-87   70 NP
• FST: floating-point store
• FSTP: floating-point store and pop
  variations/operand   8087          287      387   486   Pentium
  fst reg              15-22         15-22    11    3     1 NP
  fst mem32            (84-90)+EA    84-90    44    7     2 NP
  fst mem64            (96-104)+EA   96-104   45    8     2 NP
  fstp reg             17-24         17-24    12    3     1 NP
  fstp mem32           (86-92)+EA    86-92    44    7     2 NP
  fstp mem64           (98-106)+EA   98-106   45    8     2 NP
  fstp mem80           (52-58)+EA    52-58    53    6     3 NP
• FSTCW: store control word
• FNSTCW: store control word, no wait
  variations/operand   8087    287     387   486   Pentium
  fstcw mem            12-18   12-18   15    3     2 NP
  fnstcw mem           12-18   12-18   15    3     2 NP
  The wait version may take additional cycles
• FSTENV: store FPU environment
• FSTENVW: store FPU environment, 16-bit format (387+)
• FSTENVD: store FPU environment, 32-bit format (387+)
• FNSTENV: store FPU environment, no wait
• FNSTENVW: store FPU environment, no wait, 16-bit format (387+)
• FNSTENVD: store FPU environment, no wait, 32-bit format (387+)
  variations/operand   8087         287     387       486     Pentium
  fstenv mem           (40-50)+EA   40-50   103-104   67/56   48-50 NP
  fstenvw mem          -            -       103-104   67/56   48-50 NP
  fstenvd mem          -            -       103-104   67/56   48-50 NP
  fnstenv mem          (40-50)+EA   40-50   103-104   67/56   48-50 NP
  fnstenvw mem         -            -       103-104   67/56   48-50 NP
  fnstenvd mem         -            -       103-104   67/56   48-50 NP
  Cycles for real mode/protected mode
  The wait version may take additional cycles
• FSTSW: store status word
• FNSTSW: store status word, no wait
  variations/operand   8087    287     387   486   Pentium
  fstsw mem            12-18   12-18   15    3     2 NP
  fstsw ax             -       10-16   13    3     2 NP
  fnstsw mem           12-18   12-18   15    3     2 NP
  fnstsw ax            -       10-16   13    3     2 NP
  The wait version may take additional cycles
• FSUB: floating-point subtract
• FSUBP: floating-point subtract and pop
  variations/operand   8087          287      387     486    Pentium
  fsub reg             70-100        70-100   26-37   8-20   3/1 FX
  fsub mem32           (90-120)+EA   90-120   24-32   8-20   3/1 FX
  fsub mem64           (95-125)+EA   95-125   28-36   8-20   3/1 FX
  fsubp reg            75-105        75-105   26-34   8-20   3/1 FX
• FSUBR: floating-point reverse subtract
• FSUBRP: floating-point reverse subtract and pop
  variations/operand   8087          287      387     486    Pentium
  fsubr reg            70-100        70-100   26-37   8-20   3/1 FX
  fsubr mem32          (90-120)+EA   90-120   24-32   8-20   3/1 FX
  fsubr mem64          (95-125)+EA   95-125   28-36   8-20   3/1 FX
  fsubrp reg           75-105        75-105   26-34   8-20   3/1 FX
• FTST: floating-point test for zero
  8087    287     387   486   Pentium
  38-48   38-48   28    4     4/1 FX
• FWAIT: wait while FPU is executing
  8087   287   387   486   Pentium
  4      3     6     1-3   1-3 NP
• FXAM  Examine condition flags
8087    287     387     486   Pentium
12-23   12-23   30-38   8     21 NP

• FXCH  Exchange floating point registers
8087    287     387   486   Pentium
10-15   10-15   18    4     0-1 *
* FXCH is pairable in the V pipe with all FX pairable instructions

• FXTRACT  Extract exponent and significand
8087    287     387     486     Pentium
27-55   27-55   70-76   16-20   13 NP

• FYL2X    Compute Y * log2(x)
• FYL2XP1  Compute Y * log2(x+1)
variations   8087       287        387       486       Pentium
fyl2x        900-1100   900-1100   120-538   196-329   22-111 NP
fyl2xp1      700-1000   700-1000   257-547   171-326   22-103 NP
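The arithmetic behind FYL2X and FYL2XP1 can be sketched in Python. This is only a numerical model under stated assumptions, not coprocessor code: the function names `fyl2x` and `fyl2xp1` are hypothetical, and `x` and `y` stand for the values the coprocessor takes from ST(0) and ST(1).

```python
import math

def fyl2x(x, y):
    """Model of FYL2X: y * log2(x), defined for x > 0."""
    return y * math.log2(x)

def fyl2xp1(x, y):
    """Model of FYL2XP1: y * log2(x + 1); log1p keeps precision for x near 0."""
    return y * (math.log1p(x) / math.log(2.0))

# Choosing y = log10(2) turns the base-2 logarithm into a base-10 logarithm.
print(fyl2x(1000.0, math.log10(2.0)))
print(fyl2xp1(1.0, 2.0))
```

The scale factor Y is what makes the instruction useful for other bases: a logarithm to any base is a constant multiple of log2(x).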
MMX Technology
• Multi Media eXtensions ( MMX )
• Designed to accelerate multimedia and communication applications
- motion video, image processing, audio synthesis, speech synthesis and compression, video conferencing, 2D and 3D graphics
• Includes new instructions and data types to significantly improve application performance
• Exploits the parallelism inherent in many multimedia and communications algorithms
• Maintains full compatibility with existing operating systems and applications
Data Types:-
• packed data types
- 8 packed, consecutive 8-bit bytes
- 4 packed, consecutive 16-bit words
- 2 packed, consecutive 32-bit doublewords
- the formats occupy consecutive memory addresses & use little endian form
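Bit positions within the 64-bit quantity:
packed bytes        63-56 | 55-48 | 47-40 | 39-32 | 31-24 | 23-16 | 15-8 | 7-0
packed words        63-48 | 47-32 | 31-16 | 15-0
packed doublewords  63-32 | 31-0
quadword            63-0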
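The packed layouts can be illustrated with a short Python sketch. It is a model only: `split_lanes` and `memory_image` are hypothetical helper names, and a Python integer stands in for a 64-bit MMX quantity.

```python
import struct

def split_lanes(value, bits):
    """Split a 64-bit integer into elements of `bits` bits, least significant first."""
    mask = (1 << bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, bits)]

def memory_image(value):
    """Bytes of the 64-bit quantity as stored in memory (little endian)."""
    return struct.pack("<Q", value)

quad = 0x0807060504030201
print([hex(b) for b in split_lanes(quad, 8)])    # 8 packed bytes
print([hex(w) for w in split_lanes(quad, 16)])   # 4 packed words
print([hex(d) for d in split_lanes(quad, 32)])   # 2 packed doublewords
print(memory_image(quad).hex())                  # lowest address first
```

Because the format is little endian, the element in bits 7-0 is the one stored at the lowest memory address.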
• MMX Technology registers have the same format as a 64-bit quantity in memory
• there are 2 data access modes
- 64-bit access mode: used for 64-bit memory and register transfers, which involve the floating point coprocessor registers (the MMX registers are mapped onto them)
- 32-bit access mode: used for 32-bit memory and register transfers, which involve the microprocessor registers
Register set: MM7, MM6, MM5, MM4, MM3, MM2, MM1, MM0, together with the TAGs
• MMX adds 57 new instructions to the instruction set of the Pentium through Pentium 4
Instruction Set :-
- arithmetic
- comparison
- conversion
- logical
- shift
- data transfer
• the instruction types are similar to those of the microprocessor, but MMX instructions operate on packed data types
Arithmetic instruction: addition, subtraction, multiplication & a special multiplication with an addition (multiply and add).
• additions are performed on
- packed signed or unsigned bytes ( B )
- packed words ( W )
- packed doubleword data ( D )
• in the plain (truncating) forms, any carry or borrow that is generated is dropped
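A minimal Python model of the byte-wise add shows what "dropped" means: each element wraps around on its own and no carry moves into the neighbouring element. The helper names are hypothetical, and a Python integer stands in for an MMX register.

```python
def split_lanes(value, bits):
    """Split a 64-bit integer into elements of `bits` bits, least significant first."""
    mask = (1 << bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, bits)]

def join_lanes(lanes, bits):
    """Reassemble elements (least significant first) into one 64-bit integer."""
    mask = (1 << bits) - 1
    value = 0
    for index, lane in enumerate(lanes):
        value |= (lane & mask) << (index * bits)
    return value

def paddb(a, b):
    """Model of a packed byte add: every byte wraps modulo 256."""
    sums = [(x + y) & 0xFF for x, y in zip(split_lanes(a, 8), split_lanes(b, 8))]
    return join_lanes(sums, 8)

print(hex(paddb(0x01FF, 0x0101)))   # packed result
print(hex(0x01FF + 0x0101))         # ordinary add, carry crosses the byte boundary
```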
Comparison instruction:
• 2 comparisons: PCMPEQ (equal) & PCMPGT (greater than)
• they compare bytes, words or doublewords
• they do not change the microprocessor flag bits; each result element is all 1's for true & all 0's for false
• e.g. if MM2 is compared byte-wise with MM1 and the least significant bytes are equal, the least significant byte of MM2 contains FFH, otherwise 00H
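The mask-style result of the packed comparisons can be modelled as follows. This is a sketch with hypothetical helper names; it assumes the byte forms, with the greater-than comparison treating the bytes as signed values.

```python
def split_lanes(value, bits):
    """Split a 64-bit integer into elements of `bits` bits, least significant first."""
    mask = (1 << bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, bits)]

def join_lanes(lanes, bits):
    """Reassemble elements (least significant first) into one 64-bit integer."""
    mask = (1 << bits) - 1
    value = 0
    for index, lane in enumerate(lanes):
        value |= (lane & mask) << (index * bits)
    return value

def to_signed(value, bits):
    """Interpret an unsigned element as a two's complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def pcmpeqb(dest, src):
    """Byte-wise equality: FFH where the bytes match, 00H elsewhere."""
    pairs = zip(split_lanes(dest, 8), split_lanes(src, 8))
    return join_lanes([0xFF if d == s else 0x00 for d, s in pairs], 8)

def pcmpgtb(dest, src):
    """Byte-wise signed greater-than: FFH where dest > src, 00H elsewhere."""
    pairs = zip(split_lanes(dest, 8), split_lanes(src, 8))
    return join_lanes(
        [0xFF if to_signed(d, 8) > to_signed(s, 8) else 0x00 for d, s in pairs], 8)

print(hex(pcmpeqb(0x1122334455667788, 0x1100330055007788)))
```

The resulting mask is typically combined with the logical instructions to select elements without branching.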
Conversion Instruction:
• 2 conversion instructions: PACK, as signed and unsigned, & PUNPCK, as unpack high data and unpack low data
• they operate on
- packed bytes ( B )
- packed words ( W )
- packed doubleword data ( D )
• B, W & D must be used in combination
- WB word to byte
- DW doubleword to word
• in an unsigned conversion, if the word does not fit into a byte, the destination byte saturates: it becomes FFH when the value is too large (and 00H when the value is negative)
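The saturation rule for word-to-byte packing can be sketched in Python. The helper names are hypothetical; the model assumes the usual operand order, in which the four words of the destination fill the low four result bytes and the four words of the source fill the high four.

```python
def split_lanes(value, bits):
    """Split a 64-bit integer into elements of `bits` bits, least significant first."""
    mask = (1 << bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, bits)]

def join_lanes(lanes, bits):
    """Reassemble elements (least significant first) into one 64-bit integer."""
    mask = (1 << bits) - 1
    value = 0
    for index, lane in enumerate(lanes):
        value |= (lane & mask) << (index * bits)
    return value

def to_signed(value, bits):
    """Interpret an unsigned element as a two's complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def pack_words_to_bytes(dest, src, low, high):
    """Clamp each signed word to [low, high] and store it as one byte."""
    words = split_lanes(dest, 16) + split_lanes(src, 16)
    return join_lanes([min(max(to_signed(w, 16), low), high) for w in words], 8)

def packuswb(dest, src):
    """Unsigned saturation: results lie in 00H..FFH."""
    return pack_words_to_bytes(dest, src, 0, 255)

def packsswb(dest, src):
    """Signed saturation: results lie in 80H..7FH (-128..+127)."""
    return pack_words_to_bytes(dest, src, -128, 127)

words = 0x00010100FFFF007F   # words, low to high: 127, -1, 256, 1
print(hex(packuswb(words, 0)))
print(hex(packsswb(words, 0)))
```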
Logical instruction:
• AND, AND NOT (PANDN), OR & XOR
• the instructions do not have size extensions
• they perform bitwise operations on all 64 bits of the data
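A small Python model makes the one non-obvious case concrete: the AND NOT instruction inverts the destination before ANDing it with the source, which is not the same as a NAND. Function names are hypothetical and Python integers stand in for the 64-bit registers.

```python
MASK64 = (1 << 64) - 1

def pand(dest, src):
    """Bitwise AND over all 64 bits."""
    return dest & src

def pandn(dest, src):
    """AND NOT: (NOT dest) AND src."""
    return (~dest & MASK64) & src

def por(dest, src):
    """Bitwise OR over all 64 bits."""
    return dest | src

def pxor(dest, src):
    """Bitwise exclusive OR over all 64 bits."""
    return dest ^ src

mask = 0x00FF00FF00FF00FF
data = 0x1234567890ABCDEF
print(hex(pand(mask, data)))    # keeps the bytes selected by the mask
print(hex(pandn(mask, data)))   # keeps the bytes NOT selected by the mask
```

Used together with a comparison mask, AND and AND NOT select between two sets of packed elements.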
Shift instruction:
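• logical shift left, logical shift right & arithmetic shift right instructions
• logical shifts are performed on word (W), doubleword (D) & quadword (Q); the arithmetic shift right is performed on word & doubleword only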
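The difference between the logical and the arithmetic right shift can be modelled for packed words as below. This is a sketch with hypothetical helper names; it assumes that an oversized count clears the elements in the logical form and fills them with the sign bit in the arithmetic form.

```python
def split_lanes(value, bits):
    """Split a 64-bit integer into elements of `bits` bits, least significant first."""
    mask = (1 << bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, bits)]

def join_lanes(lanes, bits):
    """Reassemble elements (least significant first) into one 64-bit integer."""
    mask = (1 << bits) - 1
    value = 0
    for index, lane in enumerate(lanes):
        value |= (lane & mask) << (index * bits)
    return value

def to_signed(value, bits):
    """Interpret an unsigned element as a two's complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def psrlw(value, count):
    """Logical right shift of each word: zeros enter from the left."""
    if count > 15:
        return 0
    return join_lanes([w >> count for w in split_lanes(value, 16)], 16)

def psraw(value, count):
    """Arithmetic right shift of each word: the sign bit is replicated."""
    count = min(count, 15)
    return join_lanes([to_signed(w, 16) >> count for w in split_lanes(value, 16)], 16)

words = 0x80000010FFF04000
print(hex(psrlw(words, 4)))
print(hex(psraw(words, 4)))
```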
Data transfer instruction:
• data transfers are done register to register or between a register and memory
• a 32-bit transfer copies only the rightmost 32 bits; there is no instruction to transfer the leftmost 32 bits directly
• to transfer the leftmost 32 bits, first shift them right by 32 bits
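The shift-then-move idiom for reading the upper half of an MMX register can be sketched as follows. The function names are hypothetical models of a doubleword move and a quadword logical right shift; Python integers stand in for the registers.

```python
def move_dword_to_reg(mm):
    """Model of a 32-bit move from an MMX register: only bits 31-0 are copied."""
    return mm & 0xFFFFFFFF

def move_dword_to_mm(reg):
    """Model of a 32-bit move into an MMX register: the upper 32 bits become zero."""
    return reg & 0xFFFFFFFF

def shift_quad_right(mm, count):
    """Model of a logical right shift of the whole 64-bit quantity."""
    return mm >> count if count < 64 else 0

mm0 = 0x1122334455667788
low = move_dword_to_reg(mm0)                         # rightmost 32 bits
high = move_dword_to_reg(shift_quad_right(mm0, 32))  # leftmost 32 bits, after the shift
print(hex(low), hex(high))
```

Note that the shift destroys the low half of the register, so the low doubleword has to be copied out first if both halves are needed.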
EMMS instruction:
• EMMS (empty MMX state) sets all the tags in the floating point unit so that the floating point registers are listed as empty
• this instruction should be executed before the return instruction at the end of any MMX procedure; otherwise a subsequent floating point operation will cause a floating point interrupt error, crashing Windows or the application
• EMMS – empty MMX state
Ex : EMMS
• MOVD – move doubleword
Ex : MOVD MM3,EAX       reg to xreg
     MOVD EAX,MM4       xreg to reg
     MOVD MM3,DATA      mem to xreg
     MOVD DATA1,MM3     xreg to mem
• MOVQ – move quadword
Ex : MOVQ MM3,MM1       xreg to xreg
     MOVQ MM3,DATA      mem to xreg
     MOVQ DATA1,MM3     xreg to mem
• PACKSSDW – pack signed doubleword to word
Ex : PACKSSDW MM1,MM2   xreg to xreg
     PACKSSDW MM1,DATA  mem to xreg
• PACKSSWB – pack signed word to byte
Ex : PACKSSWB MM1,MM2   xreg to xreg
     PACKSSWB MM1,DATA  mem to xreg
• PACKUSWB – pack unsigned word to byte
Ex : PACKUSWB MM1,MM2   xreg to xreg
     PACKUSWB MM1,DATA  mem to xreg
• PADD – add with truncation : byte, word & doubleword
Ex : PADDB MM1,MM3      xreg to xreg
     PADDW MM1,MM3
     PADDD MM1,MM3
     PADDB MM1,DATA     mem to xreg
     PADDW MM1,DATA
     PADDD MM1,DATA
• PADDS – add with signed saturation : byte & word
Ex : PADDSB MM1,MM3     xreg to xreg
     PADDSW MM1,MM3
     PADDSB MM1,DATA    mem to xreg
     PADDSW MM1,DATA
• PADDUS – add with unsigned saturation : byte & word
Ex : PADDUSB MM1,MM3    xreg to xreg
     PADDUSW MM1,MM3
     PADDUSB MM1,DATA   mem to xreg
     PADDUSW MM1,DATA
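Saturation, as opposed to truncation, can be illustrated with a Python model of the byte forms. The helper names are hypothetical: a sum that leaves the representable range is clamped to the nearest limit instead of wrapping around.

```python
def split_lanes(value, bits):
    """Split a 64-bit integer into elements of `bits` bits, least significant first."""
    mask = (1 << bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, bits)]

def join_lanes(lanes, bits):
    """Reassemble elements (least significant first) into one 64-bit integer."""
    mask = (1 << bits) - 1
    value = 0
    for index, lane in enumerate(lanes):
        value |= (lane & mask) << (index * bits)
    return value

def to_signed(value, bits):
    """Interpret an unsigned element as a two's complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def paddusb(a, b):
    """Unsigned saturating byte add: sums above 255 become FFH."""
    pairs = zip(split_lanes(a, 8), split_lanes(b, 8))
    return join_lanes([min(x + y, 255) for x, y in pairs], 8)

def paddsb(a, b):
    """Signed saturating byte add: sums are clamped to -128..+127."""
    pairs = zip(split_lanes(a, 8), split_lanes(b, 8))
    sums = [to_signed(x, 8) + to_signed(y, 8) for x, y in pairs]
    return join_lanes([max(-128, min(127, s)) for s in sums], 8)

print(hex(paddusb(0xF0, 0x20)))   # clamps instead of wrapping to 10H
print(hex(paddsb(0x7F, 0x01)))    # +127 + 1 stays at +127
```

Saturation is the behaviour usually wanted for pixel and audio samples, where wrapping from bright to dark or from loud to silent would be a visible or audible glitch.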
• PAND – AND
Ex : PAND MM1,MM2       xreg to xreg
     PAND MM1,DATA      mem to xreg
• PANDN – AND NOT
Ex : PANDN MM1,MM2      xreg to xreg
     PANDN MM1,DATA     mem to xreg
• PCMPEQ – compare for equality
Ex : PCMPEQB MM1,MM2    xreg to xreg
     PCMPEQW MM1,MM2
     PCMPEQD MM1,MM2
     PCMPEQB MM1,DATA   mem to xreg
     PCMPEQW MM1,DATA
     PCMPEQD MM1,DATA
• PCMPGT – compare for greater than
Ex : PCMPGTB MM1,MM2    xreg to xreg
     PCMPGTW MM1,MM2
     PCMPGTD MM1,MM2
     PCMPGTB MM1,DATA   mem to xreg
     PCMPGTW MM1,DATA
     PCMPGTD MM1,DATA
• PMADDWD – multiply and add
Ex : PMADDWD MM1,MM4    xreg to xreg
     PMADDWD MM1,DATA   mem to xreg
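The multiply-and-add operation (PMADDWD in Intel's mnemonics) can be modelled in Python. This is a sketch with hypothetical helper names: four signed word products are formed, and each adjacent pair of products is summed into one signed doubleword.

```python
def split_lanes(value, bits):
    """Split a 64-bit integer into elements of `bits` bits, least significant first."""
    mask = (1 << bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, bits)]

def join_lanes(lanes, bits):
    """Reassemble elements (least significant first) into one 64-bit integer."""
    mask = (1 << bits) - 1
    value = 0
    for index, lane in enumerate(lanes):
        value |= (lane & mask) << (index * bits)
    return value

def to_signed(value, bits):
    """Interpret an unsigned element as a two's complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def pmaddwd(dest, src):
    """Multiply signed words pairwise, then add neighbouring products."""
    d = [to_signed(w, 16) for w in split_lanes(dest, 16)]
    s = [to_signed(w, 16) for w in split_lanes(src, 16)]
    sums = [d[0] * s[0] + d[1] * s[1], d[2] * s[2] + d[3] * s[3]]
    return join_lanes(sums, 32)

# words, low to high: (1, 2, 3, 4) and (5, 6, 7, 8)
print(hex(pmaddwd(0x0004000300020001, 0x0008000700060005)))
```

This is the building block of a dot product, which is why it shows up in filtering and transform code.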
• PMULHW – multiplication - high
Ex : PMULHW MM1,MM4     xreg to xreg
     PMULHW MM1,DATA    mem to xreg
• PMULLW – multiplication - low
Ex : PMULLW MM1,MM4     xreg to xreg
     PMULLW MM1,DATA    mem to xreg
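The split between the "high" and "low" multiplications can be sketched in Python. The helper names are hypothetical: each pair of signed words yields a 32-bit product, and the two instructions keep its upper or its lower 16 bits.

```python
def split_lanes(value, bits):
    """Split a 64-bit integer into elements of `bits` bits, least significant first."""
    mask = (1 << bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, bits)]

def join_lanes(lanes, bits):
    """Reassemble elements (least significant first) into one 64-bit integer."""
    mask = (1 << bits) - 1
    value = 0
    for index, lane in enumerate(lanes):
        value |= (lane & mask) << (index * bits)
    return value

def to_signed(value, bits):
    """Interpret an unsigned element as a two's complement signed number."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value

def word_products(dest, src):
    """Signed 32-bit products of corresponding words."""
    pairs = zip(split_lanes(dest, 16), split_lanes(src, 16))
    return [to_signed(d, 16) * to_signed(s, 16) for d, s in pairs]

def pmullw(dest, src):
    """Keep bits 15-0 of every product."""
    return join_lanes([p & 0xFFFF for p in word_products(dest, src)], 16)

def pmulhw(dest, src):
    """Keep bits 31-16 of every product."""
    return join_lanes([(p >> 16) & 0xFFFF for p in word_products(dest, src)], 16)

a = 0x00000000FFFE4000   # words, low to high: 16384, -2, 0, 0
b = 0x0000000000030004   # words, low to high: 4, 3, 0, 0
print(hex(pmullw(a, b)), hex(pmulhw(a, b)))
```

Running both instructions on the same operands and interleaving the results with an unpack recovers the full 32-bit products.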
• POR – OR
Ex : POR MM1,MM4        xreg to xreg
     POR MM1,DATA       mem to xreg
• PSLL – shift left : word, doubleword and quadword
Ex : PSLLW MM1,MM3      xreg to xreg
     PSLLD MM1,MM3
     PSLLQ MM1,MM3
     PSLLW MM1,DATA     mem to xreg
     PSLLD MM1,DATA
     PSLLQ MM1,DATA
     PSLLW MM1,5        xreg by count
     PSLLD MM1,4
     PSLLQ MM1,7
• PSRA – shift arithmetic right : word and doubleword (MMX has no quadword form)
Ex : PSRAW MM1,MM3      xreg to xreg
     PSRAD MM1,MM3
     PSRAW MM1,DATA     mem to xreg
     PSRAD MM1,DATA
     PSRAW MM1,5        xreg by count
     PSRAD MM1,4
• PSRL – shift logical right : word, doubleword and quadword
Ex : PSRLW MM1,MM3      xreg to xreg
     PSRLD MM1,MM3
     PSRLQ MM1,MM3
     PSRLW MM1,DATA     mem to xreg
     PSRLD MM1,DATA
     PSRLQ MM1,DATA
     PSRLW MM1,5        xreg by count
     PSRLD MM1,4
     PSRLQ MM1,7
• PSUB – subtraction with truncation : byte, word & doubleword
Ex : PSUBB MM1,MM3      xreg to xreg
     PSUBW MM1,MM3
     PSUBD MM1,MM3
     PSUBB MM1,DATA     mem to xreg
     PSUBW MM1,DATA
     PSUBD MM1,DATA
• PSUBS – subtraction with signed saturation : byte & word
Ex : PSUBSB MM1,MM3     xreg to xreg
     PSUBSW MM1,MM3
     PSUBSB MM1,DATA    mem to xreg
     PSUBSW MM1,DATA
• PSUBUS – subtraction with unsigned saturation : byte & word
Ex : PSUBUSB MM1,MM3    xreg to xreg
     PSUBUSW MM1,MM3
     PSUBUSB MM1,DATA   mem to xreg
     PSUBUSW MM1,DATA
• PXOR – exclusive OR
Ex : PXOR MM1,MM3       xreg to xreg
     PXOR MM4,DATA      mem to xreg
• PUNPCKH – unpack high : byte to word, word to doubleword & doubleword to quadword
Ex : PUNPCKHBW MM1,MM3   xreg to xreg
     PUNPCKHWD MM1,MM3
     PUNPCKHDQ MM1,MM3
     PUNPCKHBW MM1,DATA  mem to xreg
     PUNPCKHWD MM1,DATA
     PUNPCKHDQ MM1,DATA
• PUNPCKL – unpack low : byte to word, word to doubleword & doubleword to quadword
Ex : PUNPCKLBW MM1,MM3   xreg to xreg
     PUNPCKLWD MM1,MM3
     PUNPCKLDQ MM1,MM3
     PUNPCKLBW MM1,DATA  mem to xreg
     PUNPCKLWD MM1,DATA
     PUNPCKLDQ MM1,DATA
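The unpack operations interleave elements from the two operands. The Python sketch below models the byte forms (PUNPCKLBW and PUNPCKHBW in Intel's mnemonics); the helper names are hypothetical, and the destination is assumed to supply the even-numbered result bytes.

```python
def split_lanes(value, bits):
    """Split a 64-bit integer into elements of `bits` bits, least significant first."""
    mask = (1 << bits) - 1
    return [(value >> shift) & mask for shift in range(0, 64, bits)]

def join_lanes(lanes, bits):
    """Reassemble elements (least significant first) into one 64-bit integer."""
    mask = (1 << bits) - 1
    value = 0
    for index, lane in enumerate(lanes):
        value |= (lane & mask) << (index * bits)
    return value

def interleave_bytes(dest, src, first):
    """Interleave four bytes of dest and src, starting at byte index `first`."""
    d, s = split_lanes(dest, 8), split_lanes(src, 8)
    out = []
    for i in range(first, first + 4):
        out += [d[i], s[i]]
    return join_lanes(out, 8)

def punpcklbw(dest, src):
    """Unpack low: interleave bytes 0-3 of both operands."""
    return interleave_bytes(dest, src, 0)

def punpckhbw(dest, src):
    """Unpack high: interleave bytes 4-7 of both operands."""
    return interleave_bytes(dest, src, 4)

data = 0x0807060504030201
print(hex(punpcklbw(data, 0)))   # low four bytes widened to words
print(hex(punpckhbw(data, 0)))   # high four bytes widened to words
```

Unpacking against a register of zeros, as in the usage lines, is a common way to zero-extend packed bytes to words.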