CSN221_Lec_10.pdf
Course Website: http://faculty.iitr.ac.in/~sudiproy.fcs/csn221_2015.html
Piazza Site: https://piazza.com/iitr.ac.in/fall2015/csn221
Dr. Sudip Roy
CSN‐221: COMPUTER ARCHITECTURE AND MICROPROCESSORS
Computer Arithmetic
(Lecture - 10)
Real Numbers:
Floating-Point Number Formats:
The term "floating-point number" refers to the representation of real binary numbers in computers.
The IEEE 754 standard defines the floating-point representations.
General format: $\pm 1.bbbbb_{two} \times 2^{eeee}$, or $(-1)^S \times (1+F) \times 2^E$
where
S = sign, 0 for positive, 1 for negative
F = fraction (or mantissa), stored as a binary integer and read as the bits after the binary point; 1+F is called the significand
E = exponent as a binary integer, positive or negative (two's complement)
MIPS Single Precision:
bit 31: S (sign) | bits 23-30: E (8-bit exponent) | bits 0-22: F (23-bit fraction)
$-127 \le E \le 128$, max $|E| \approx 128$
Overflow: the exponent requires more than 8 bits. The number can be positive or negative.
Underflow: the number is nonzero but too small in magnitude, so its negative exponent does not fit in the 8-bit exponent field. The number can be positive or negative.
MIPS: Microprocessor without Interlocked Pipeline Stages
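The field layout above can be checked with a minimal sketch that splits a 32-bit word into its three fields. The helper name `float32_fields` is hypothetical, and Python's standard `struct` module is assumed here only as a convenient way to obtain the single-precision bit pattern.

```python
import struct

def float32_fields(x):
    """Return (sign, biased exponent, fraction) of x stored in single precision."""
    # Pack as a big-endian 32-bit float, then reread the same bytes as an unsigned integer.
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    sign = word >> 31                 # bit 31
    exponent = (word >> 23) & 0xFF    # bits 23-30
    fraction = word & 0x7FFFFF        # bits 0-22
    return sign, exponent, fraction

print(float32_fields(-5.0))  # (1, 129, 2097152)
```

The fraction comes back as a 23-bit integer; dividing it by 2^23 gives F, here 0.25.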
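MIPS Double Precision:
First word: bit 63: S (sign) | bits 52-62: E (11-bit exponent) | bits 32-51: F (upper 20 bits of the 52-bit fraction)
Second word: bits 0-31: continuation of the 52-bit fraction
$-1023 \le E \le 1024$, max $|E| \approx 1024$
Overflow: the exponent requires more than 11 bits. The number can be positive or negative.
Underflow: the number is nonzero but too small in magnitude, so its negative exponent does not fit in the 11-bit exponent field. The number can be positive or negative.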
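A similar sketch for double precision is shown here, including the split into two 32-bit words. The function names are hypothetical, and `struct` is again assumed only as a way to read the bit pattern.

```python
import struct

def float64_fields(x):
    """Return (sign, biased exponent, fraction) of x stored in double precision."""
    (word,) = struct.unpack(">Q", struct.pack(">d", x))
    sign = word >> 63                  # bit 63
    exponent = (word >> 52) & 0x7FF    # bits 52-62
    fraction = word & ((1 << 52) - 1)  # bits 0-51
    return sign, exponent, fraction

def float64_words(x):
    """Return the (high, low) 32-bit words holding bits 32-63 and bits 0-31."""
    (word,) = struct.unpack(">Q", struct.pack(">d", x))
    return word >> 32, word & 0xFFFFFFFF

print(float64_fields(-5.0))                    # (1, 1025, 1125899906842624)
print([hex(w) for w in float64_words(-5.0)])   # ['0xc0140000', '0x0']
```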
Alternative Representation of 4‐bit Integers:
IEEE 754 Floating Point Standard:
Biased exponent: the true exponent range [-127, 128] is mapped to [0, 255].
The biased exponent is an 8-bit positive binary integer.
The true exponent is obtained by subtracting $127_{ten}$ or $01111111_{two}$.
In general, bias = $2^{k-1} - 1$ for a k-bit exponent field.
True exponent = biased exponent - 127, for the 32-bit representation.
The first bit of the significand is always 1: $\pm 1.bbbb \ldots b \times 2^E$
The 1 before the binary point is implicitly assumed.
The significand field represents the 23-bit fraction after the binary point.
The significand range is [1, 2), to be exact $[1, 2 - 2^{-23}]$.
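The bias rule can be written out as a short sketch. The function names `bias` and `true_exponent` are hypothetical, and k is taken to be the width of the exponent field in bits.

```python
def bias(k):
    """Bias for a k-bit exponent field: 2^(k-1) - 1."""
    return 2 ** (k - 1) - 1

def true_exponent(biased, k=8):
    """Recover the true exponent from the stored (biased) exponent."""
    return biased - bias(k)

print(bias(8), bias(11))      # 127 1023
print(true_exponent(129))     # 2
print(true_exponent(0), true_exponent(255))  # -127 128
```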
IEEE 754 Floating Point Standard:
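Conversion to Decimal:
1 10000001 01000000000000000000000
Sign bit S (bit 31) | biased exponent E (bits 23-30) | normalized fraction F (bits 0-22)
The sign bit is 1, so the number is negative.
The biased exponent is $2^7 + 2^0 = 129$.
The fraction is $F = 0.01_{two} = 0.25$.
The number is
$(-1)^S \times (1 + F) \times 2^{(\text{exponent} - \text{bias})} = (-1)^1 \times (1 + 0.25) \times 2^{(129 - 127)}$
$= -1 \times 1.25 \times 2^2$
$= -1.25 \times 4 = -5.0$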
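The conversion above can be sketched as a small decoder for normalized single-precision patterns. The function name `decode_single` is hypothetical, and the sketch deliberately ignores the reserved exponent values 0 and 255.

```python
def decode_single(bit_string):
    """Decode a normalized single-precision pattern given as a string of 0s and 1s."""
    bits = bit_string.replace(" ", "")
    if len(bits) != 32:
        raise ValueError("expected 32 bits")
    sign = int(bits[0], 2)              # bit 31
    biased = int(bits[1:9], 2)          # bits 23-30
    fraction = int(bits[9:], 2) / 2**23 # bits 0-22, scaled to a value in [0, 1)
    # (-1)^S x (1 + F) x 2^(biased exponent - 127)
    return (-1) ** sign * (1 + fraction) * 2.0 ** (biased - 127)

print(decode_single("1 10000001 01000000000000000000000"))  # -5.0
```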
Positive Zero in IEEE 754:
0 00000000 00000000000000000000000
Sign | biased exponent | fraction
Read with the normalized formula, this pattern would be $+1.0 \times 2^{-127}$, the smallest positive number in the single-precision IEEE 754 standard. It is interpreted as positive zero.
A true exponent less than -126 is positive underflow; the result can be regarded as zero.
Negative Zero in IEEE 754:
1 00000000 00000000000000000000000
Sign | biased exponent | fraction
Read with the normalized formula, this pattern would be $-1.0 \times 2^{-127}$, the negative number of smallest magnitude in the single-precision IEEE 754 standard. It is interpreted as negative zero.
A true exponent less than -126 is negative underflow; the result may be regarded as 0.
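The two zero patterns can be inspected with a brief sketch. The helper name `float32_bits` is hypothetical, and `struct` is assumed only as a way to read the stored bits.

```python
import math
import struct

def float32_bits(x):
    """Return the 32-bit single-precision pattern of x as a string of 0s and 1s."""
    (word,) = struct.unpack(">I", struct.pack(">f", x))
    return f"{word:032b}"

print(float32_bits(0.0))    # 32 zeros
print(float32_bits(-0.0))   # a 1 followed by 31 zeros

# The two zeros compare equal, but the sign bit is still observable.
print(0.0 == -0.0)                  # True
print(math.copysign(1.0, -0.0))     # -1.0
```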
Positive Infinity in IEEE 754:
0 11111111 00000000000000000000000
Sign | biased exponent | fraction
Read with the normalized formula, this pattern would be $+1.0 \times 2^{128}$, the largest positive number in the single-precision IEEE 754 standard. It is interpreted as $+\infty$.
If the true exponent is 128 (biased exponent 255) and the fraction is not 0, the pattern does not represent a number. It is called "not a number" or NaN.
Negative Infinity in IEEE 754:
1 11111111 00000000000000000000000
Sign | biased exponent | fraction
Read with the normalized formula, this pattern would be $-1.0 \times 2^{128}$, the most negative number in the single-precision IEEE 754 standard. It is interpreted as $-\infty$.
If the true exponent is 128 (biased exponent 255) and the fraction is not 0, the pattern does not represent a number. It is called "not a number" or NaN.
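The reserved patterns with biased exponent 255 can be examined with a short sketch. The helper name `float32_from_bits` is hypothetical, and the nonzero fraction chosen for the NaN example is just one of many possible values.

```python
import math
import struct

def float32_from_bits(bit_string):
    """Interpret a 32-bit pattern (spaces allowed) as a single-precision value."""
    word = int(bit_string.replace(" ", ""), 2)
    (value,) = struct.unpack(">f", struct.pack(">I", word))
    return value

pos_inf = float32_from_bits("0 11111111 00000000000000000000000")
neg_inf = float32_from_bits("1 11111111 00000000000000000000000")
nan = float32_from_bits("0 11111111 10000000000000000000000")  # any nonzero fraction

print(pos_inf, neg_inf, nan)    # inf -inf nan
print(math.isnan(nan))          # True
print(nan == nan)               # False: NaN is not equal to anything, itself included
```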
IEEE 754 Floating Point Standard:
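[Figure: number line of single-precision IEEE 754 values, left to right]
$-\infty$ | negative overflow | $-(2 - 2^{-23}) \times 2^{127}$ | expressible negative numbers | $-2^{-126}$ | negative underflow | 0 | positive underflow | $2^{-126}$ | expressible positive numbers | $(2 - 2^{-23}) \times 2^{127}$ | positive overflow | $+\infty$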
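The boundaries of the expressible range can be cross-checked against the bit patterns with a minimal sketch. The helper name `float32_from_word` and the variable names are hypothetical; the two hexadecimal words are the patterns with biased exponent 254 and an all-ones fraction, and with biased exponent 1 and a zero fraction.

```python
import struct

def float32_from_word(word):
    """Interpret a 32-bit unsigned integer as a single-precision value."""
    (value,) = struct.unpack(">f", struct.pack(">I", word))
    return value

largest_normal = (2 - 2**-23) * 2.0**127   # upper edge of the expressible range
smallest_normal = 2.0**-126                # lower edge of the normalized range

print(float32_from_word(0x7F7FFFFF) == largest_normal)    # True
print(float32_from_word(0x00800000) == smallest_normal)   # True
```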
Floating Point Arithmetic:
Floating Point Addition and Subtraction:
0. Zero check
   - Change the sign of the subtrahend, i.e., convert to summation.
   - If either operand is 0, the other is the result.
1. Significand alignment: right shift the significand of the smaller exponent until the two exponents match.
2. Addition: add the significands and report an exception if overflow occurs. If the significand = 0, return the result as 0.
3. Normalization
   - Shift the significand bits to normalize.
   - Report overflow or underflow if the exponent goes out of range.
4. Rounding
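The steps above can be sketched on small integer significands. This is an illustration under stated assumptions, not the hardware algorithm: a number is a hypothetical tuple (sign, significand, exponent), the significand is an integer holding 1.fff scaled by 2^frac_bits (0 stands for zero), bits shifted out are simply dropped instead of being rounded, and the exception types are arbitrary choices.

```python
def fp_add(a, b, frac_bits=3, e_min=-126, e_max=127):
    """Add two (sign, significand, exponent) tuples; significand 0 means zero."""
    # Step 0: zero check.
    if a[1] == 0:
        return b
    if b[1] == 0:
        return a
    (sa, ma, ea), (sb, mb, eb) = a, b
    # Step 1: align; make b the operand with the smaller exponent and shift it right.
    if ea < eb:
        sa, ma, ea, sb, mb, eb = sb, mb, eb, sa, ma, ea
    mb >>= ea - eb          # assumption: shifted-out bits are dropped
    # Step 2: add the signed significands.
    total = (-ma if sa else ma) + (-mb if sb else mb)
    if total == 0:
        return (0, 0, 0)
    sign, mag, exp = int(total < 0), abs(total), ea
    # Step 3: normalize to the form 1.fff and check the exponent range.
    while mag >= (2 << frac_bits):
        mag >>= 1
        exp += 1
    while mag < (1 << frac_bits):
        mag <<= 1
        exp -= 1
    if exp > e_max:
        raise OverflowError("exponent overflow")
    if exp < e_min:
        raise ArithmeticError("exponent underflow")
    # Step 4: rounding is truncation in this sketch, so nothing is left to do.
    return (sign, mag, exp)

def fp_sub(a, b, **kwargs):
    """Subtract by changing the sign of the subtrahend and adding."""
    sb, mb, eb = b
    return fp_add(a, (sb ^ 1, mb, eb), **kwargs)

def to_float(x, frac_bits=3):
    """Convert a (sign, significand, exponent) tuple to a Python float."""
    s, m, e = x
    return (-1) ** s * (m / 2**frac_bits) * 2.0**e

# 0.5 - 0.4375 with 3 fraction bits: 1.000 x 2^-1 minus 1.110 x 2^-2
result = fp_sub((0, 0b1000, -1), (0, 0b1110, -2))
print(result, to_float(result))   # (0, 8, -4) 0.0625
```

A real floating-point adder keeps extra guard bits during alignment so that rounding can be done correctly; dropping them here keeps the sketch short.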
Example (4 Significant Fraction Bits):
Subtraction: $0.5_{ten} - 0.4375_{ten}$
Step 0: Floating-point numbers to be added:
$1.000_{two} \times 2^{-1}$ and $-1.110_{two} \times 2^{-2}$
Step 1: The significand of the lesser exponent is shifted right until the exponents match:
$-1.110_{two} \times 2^{-2} \rightarrow -0.111_{two} \times 2^{-1}$
Step 2: Add the significands, $1.000_{two} + (-0.111_{two})$. The result is $0.001_{two} \times 2^{-1}$.
2's complement addition, one bit added for sign (the carry out of the sign bit is discarded):
  01000
+ 11001
-------
  00001
Example (Continued):
Step 3: Normalize: $0.001_{two} \times 2^{-1} = 1.000_{two} \times 2^{-4}$
No overflow/underflow, since $127 \ge \text{exponent} \ge -126$.
Step 4: Rounding: no change, since the sum fits in 4 bits.
$1.000_{two} \times 2^{-4} = (1+0)/16 = 0.0625_{ten}$
Floating Point Multiplication (Basic Idea):
1. Separate sign
2. Add exponents
3. Multiply significands
4. Normalize, round, check overflow/underflow
5. Replace sign
Floating Point Multiplication (Example):
Multiply $0.5_{ten}$ and $-0.4375_{ten}$ (answer = $-0.21875_{ten}$), or
multiply $1.000_{two} \times 2^{-1}$ and $-1.110_{two} \times 2^{-2}$.
Step 1: Add exponents:
$-1 + (-2) = -3$
Step 2: Multiply significands:
     1.000
   × 1.110
   -------
      0000
     1000
    1000
   1000
   -------
   1110000
Product is 1.110000
Floating Point Multiplication (Example):
Step 3: Normalization: if necessary, shift the significand right and increment the exponent.
Normalized product is $1.110000 \times 2^{-3}$
Check overflow/underflow: $127 \ge \text{exponent} \ge -126$
Step 4: Rounding: $1.110 \times 2^{-3}$
Step 5: Sign: the operands have opposite signs, so the product is $-1.110 \times 2^{-3}$
(Decimal value = $-(1 + 0.5 + 0.25)/8 = -0.21875_{ten}$)
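The multiplication steps can be sketched in the same style. As before, this is an illustration under assumptions: a number is a hypothetical tuple (sign, significand, exponent) with the significand an integer holding 1.fff scaled by 2^frac_bits, rounding is replaced by truncation, and the exception types are arbitrary choices.

```python
def fp_mul(a, b, frac_bits=3, e_min=-126, e_max=127):
    """Multiply two (sign, significand, exponent) tuples; significand 0 means zero."""
    (sa, ma, ea), (sb, mb, eb) = a, b
    sign = sa ^ sb                      # signs are handled separately
    if ma == 0 or mb == 0:
        return (sign, 0, 0)
    exp = ea + eb                       # add exponents
    prod = ma * mb                      # product has 2 * frac_bits fraction bits
    # Normalize: two significands in [1, 2) give a product in [1, 4).
    if prod >= (2 << (2 * frac_bits)):
        shift = frac_bits + 1           # product >= 2: shift right once more
        exp += 1
    else:
        shift = frac_bits
    mag = prod >> shift                 # assumption: round by truncation
    if exp > e_max:
        raise OverflowError("exponent overflow")
    if exp < e_min:
        raise ArithmeticError("exponent underflow")
    return (sign, mag, exp)

def to_float(x, frac_bits=3):
    """Convert a (sign, significand, exponent) tuple to a Python float."""
    s, m, e = x
    return (-1) ** s * (m / 2**frac_bits) * 2.0**e

# 0.5 x (-0.4375): 1.000 x 2^-1 times -1.110 x 2^-2
product = fp_mul((0, 0b1000, -1), (1, 0b1110, -2))
print(product, to_float(product))   # (1, 14, -3) -0.21875
```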
Floating Point Addition and Subtraction Flowchart:
Floating Point Multiplication Flowchart:
That’s all from Computer Arithmetic!