A Fast Hardware Approach for Approximate, Efficient Logarithm and Anti-logarithm Computation
Suganth Paul
Nikhil Jayakumar
Sunil P. Khatri
Department of Electrical and Computer Engineering
Texas A&M University, College Station
Introduction
• The fast generation of functions such as the logarithm and antilogarithm is important in areas such as DSP, computer graphics, scientific computing, artificial neural networks, and logarithmic number systems.
• Various hardware approaches have been proposed to approximate the logarithm and antilogarithm functions accurately.
• Of these approaches, lookup table (LUT) based methods such as Brubaker, Maenner, Kmetz, and SBTM are widely used.
• Some hardware approaches combine LUTs with polynomial approximations, but these need multiplications or divisions.
• Our approach combines an LUT with linear interpolation implemented in an area- and delay-efficient manner.
• The novelty of our approach is that we do not need a multiplier or divider to perform the interpolation. We also use the same hardware structure to implement both log and antilog.
• The number format used for the computation is $N = 2^{e}(1 + m)$, where $0 \le m < 1$ is the mantissa and $e$ is the exponent.
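As a software illustration of this number format, the following minimal sketch splits a positive number into its exponent and mantissa. The function name `decompose` is hypothetical and not part of the slides; in hardware the two fields would simply be read from the floating-point word.

```python
import math

def decompose(N):
    """Split a positive N into (e, m) with N = 2**e * (1 + m) and 0 <= m < 1."""
    if N <= 0:
        raise ValueError("N must be positive")
    frac, exp = math.frexp(N)   # N = frac * 2**exp with 0.5 <= frac < 1
    e = exp - 1                 # rescale so the significand lies in [1, 2)
    m = 2.0 * frac - 1.0        # mantissa m in [0, 1)
    return e, m

e, m = decompose(10.0)
print(e, m)   # 3 0.25, since 10 = 2**3 * (1 + 0.25)
```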
Mitchell Approximation
The logarithm of a number $N = 2^{e}(1 + m)$ is found as
$\log_2(N) = e + \log_2(1 + m)$
Mitchell's approximation is given by
$\log_2(N) \approx e + m$
where
$\log_2(1 + m) \approx m$
The error due to this approximation is
$E_m = \log_2(1 + m) - m$
The error is plotted on the right.
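To make the size of the Mitchell error concrete, here is a small sketch that evaluates the approximation and scans the error $E_m$ over the mantissa range. The helper names are illustrative only, and floating-point arithmetic stands in for the fixed-point datapath.

```python
import math

def mitchell_log2(N):
    """Mitchell's approximation: log2(N) ~ e + m for N = 2**e * (1 + m)."""
    frac, exp = math.frexp(N)            # N = frac * 2**exp, 0.5 <= frac < 1
    e, m = exp - 1, 2.0 * frac - 1.0
    return e + m

def mitchell_error(m):
    """E_m = log2(1 + m) - m for a mantissa 0 <= m < 1."""
    return math.log2(1.0 + m) - m

# Scan the mantissa range on a fine grid to locate the worst-case error.
worst = max(mitchell_error(i / 4096) for i in range(4096))
print(round(worst, 4))   # 0.0861, reached near m = 1/ln(2) - 1
```

The error is zero at both ends of the mantissa range and never negative, which is why a small correction table indexed by the leading mantissa bits is effective.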
Kmetz Approximation
• In the Kmetz method, the Mitchell error curve shown above is sampled at $2^{t}$ points and stored in an LUT.
• The LUT is indexed by the first $t$ bits of the mantissa $m$.
• If the error value looked up from the LUT is $a$, the logarithm is found as $\log_2(N) = e + \log_2(1 + m)$, where $\log_2(1 + m) \approx m + a$.
• The error in this case, due to approximating the logarithm of the mantissa portion, is given by $E_k = \log_2(1 + m) - m - a$.
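The table lookup can be sketched in a few lines. This is only an illustration under stated assumptions: the index width `T = 6` is a hypothetical choice, and each entry is assumed to sample the Mitchell error at the left edge of its interval, which the slides do not specify.

```python
import math

T = 6  # hypothetical index width: the table holds 2**T = 64 entries

# Assumption: entry i samples E_m at the left edge m = i / 2**T of its interval.
ERROR_LUT = [math.log2(1.0 + i / 2**T) - i / 2**T for i in range(2**T)]

def kmetz_log2_mantissa(m):
    """Approximate log2(1 + m) as m + a, with a read from the LUT."""
    i = int(m * 2**T)            # first T bits of the mantissa
    return m + ERROR_LUT[i]

def kmetz_error(m):
    """E_k = log2(1 + m) - m - a."""
    return math.log2(1.0 + m) - kmetz_log2_mantissa(m)

# Worst-case magnitude of E_k on a fine grid of mantissa values.
worst = max(abs(kmetz_error(j / 8192)) for j in range(8192))
```

With these assumptions the residual error is bounded by the slope of the Mitchell error curve times the interval width, so it shrinks roughly in proportion to the table size.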
Our Approach
• In our method we interpolate between values stored in the LUT to get a more accurate result.
• The logarithm of the mantissa part of the number is obtained as
$\log_2(1 + m) \approx m + a + \dfrac{(b - a)\,n}{2^{k - t}}$
• where $a$ is the error value from the LUT at location $i$
$t$ is the number of leading bits in the mantissa indexing the table
$b$ is the next value in the LUT, at location $i + 1$
$k$ is the total number of bits used to represent the mantissa
$n$ is the decimal value of the last $k - t$ bits of the mantissa
• The multiplication step is found as $(b - a)\,n = \mathrm{antilog}_2(\log_2(b - a) + \log_2(n))$
• $\log_2(n)$ is found by using the same LUT as above.
• We consider the following approximations to find $\log_2(b - a)$ and $\mathrm{antilog}_2(\log_2(b - a) + \log_2(n))$.
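The interpolation formula can be checked in software with the sketch below. Several things here are assumptions rather than details from the slides: the index width `T = 6`, the extra table entry that supplies $b$ for the last interval, and the use of an ordinary multiply as a reference in place of the log/antilog product used in hardware.

```python
import math

K = 23  # total mantissa bits
T = 6   # hypothetical index width

# Assumption: samples of E_m at the interval edges i / 2**T, with one extra
# entry (equal to 0 at m = 1) so that b exists for the last interval.
ERROR_LUT = [math.log2(1.0 + i / 2**T) - i / 2**T for i in range(2**T + 1)]

def interp_log2_mantissa(mant_bits):
    """Approximate log2(1 + m) for a K-bit integer mantissa, m = mant_bits / 2**K."""
    i = mant_bits >> (K - T)                  # leading T bits: table location
    n = mant_bits & ((1 << (K - T)) - 1)      # last K - T bits as an integer
    a, b = ERROR_LUT[i], ERROR_LUT[i + 1]     # adjacent table values
    m = mant_bits / 2**K
    # Reference product only; the hardware avoids this multiplier.
    return m + a + (b - a) * n / 2**(K - T)

# Worst-case error over a coarse sweep of mantissa values.
worst = max(
    abs(math.log2(1.0 + x / 2**K) - interp_log2_mantissa(x))
    for x in range(0, 2**K, 997)
)
```

Note that $b - a$ changes sign where the Mitchell error curve peaks, so a multiplier-free version has to take the logarithm of the magnitude and handle the sign separately.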
Errors for Various Interpolation Methods and Table Sizes
1. $\log_2(b - a)$ is found by
a) Mitchell approximation
b) Kmetz approximation using another LUT
2. $\mathrm{antilog}_2(\log_2(b - a) + \log_2(n))$ is found by
a) Mitchell approximation
b) Kmetz approximation using another LUT
We find from the table below that the combination 1.b) with 2.b) has the best error performance, and hence we use LUTs to approximate the multiplication.
Max error is given in units of $10^{-3}$.
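For intuition about why the LUT-corrected variants win, here is a sketch of the simplest combination, 1.a) with 2.a), where both the logarithm and the antilogarithm in the product use the plain Mitchell approximation. The function names are illustrative, the operands are assumed positive, and floating point stands in for the fixed-point hardware.

```python
import math

def mitchell_log2(x):
    """log2(x) ~ e + m for x = 2**e * (1 + m)."""
    frac, exp = math.frexp(x)
    return (exp - 1) + (2.0 * frac - 1.0)

def mitchell_antilog2(y):
    """2**y ~ 2**e * (1 + m) with e = floor(y) and m = y - e."""
    e = math.floor(y)
    return math.ldexp(1.0 + (y - e), e)   # a shift by e in hardware

def approx_product(d, n):
    """Approximate d * n for d, n > 0 using only adds and shifts."""
    return mitchell_antilog2(mitchell_log2(d) + mitchell_log2(n))

print(approx_product(4.0, 8.0))   # 32.0, exact for powers of two
print(approx_product(3.0, 3.0))   # 8.0, versus the true product 9
```

The second call shows the worst case for this variant, an underestimate of about 11%, which is the error the additional Kmetz-style LUTs in 1.b) and 2.b) are meant to reduce.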
Block Diagram of the Log Engine
• The block diagram shows the implementation of $\log_2(1 + m)$, where $m$ is the 23-bit mantissa.
• The number of leading bits of the mantissa going to the Interpolator depends on the size of the LUTs used in the Interpolator.
• In this case we are using an LUT that holds 64 values, and 13 bits of the mantissa are required.
• The Interpolator block is shown below.
Interpolator Block Diagram
• The implementation can be pipelined to get better throughput.
• The COMPARE block determines whether the final stage does an add or a subtract.
• The LOD (leading one detector) block finds the position of the leading one, and the rest of the bits are used to access the LUT.
• The LUT used to find $a$ and $\log_2(n)$ is the same and is implemented as a dual-port ROM.
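The role of the LOD block can be mimicked in software as follows. This is a behavioural sketch only: the function names and the way the remaining bits are aligned to a `t`-bit table index are assumptions, not details taken from the block diagram.

```python
def leading_one_detector(n):
    """Return (p, rest) with n = 2**p + rest and 0 <= rest < 2**p."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    p = n.bit_length() - 1      # position of the leading one
    rest = n - (1 << p)         # bits below the leading one
    return p, rest

def lut_index(n, t):
    """Align the bits below the leading one to a t-bit table index (assumed scheme)."""
    p, rest = leading_one_detector(n)
    return rest >> (p - t) if p >= t else rest << (t - p)

print(leading_one_detector(0b101100))   # (5, 12)
print(lut_index(0b101100, 3))           # 3, the bits 011 after the leading one
```

Here `p` plays the role of the integer part of $\log_2(n)$, and the index selects the table correction for its fractional part.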
Antilog Computation
• Let $M = \log_2(N) = e + m$. The antilogarithm of this number is found as $\mathrm{antilog}_2(M) = 2^{M} = 2^{e} \cdot 2^{m}$. Using Mitchell's method we make the following approximation: $2^{m} \approx 1 + m$
• A Kmetz approximation can be made by storing the error due to this approximation in an LUT and adding the error value to the above equation for the antilogarithm.
• In our approach, we compute the antilogarithm by interpolating efficiently between two adjacent table values stored in the LUT without needing a multiplier.
• We follow the same flow used for computing the logarithm. The error incurred while using different table sizes for computing the antilogarithm is shown below.
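The Mitchell step for the antilogarithm can be sketched as below, splitting $M$ into its integer part $e$ and fractional part $m$. The function name is illustrative, and the relative-error scan is included only to show the size of the error that the LUT correction has to absorb.

```python
import math

def mitchell_antilog2(M):
    """Approximate 2**M as 2**e * (1 + m), with e = floor(M) and m = M - e."""
    e = math.floor(M)                 # integer part of M
    m = M - e                         # fractional part, 0 <= m < 1
    return math.ldexp(1.0 + m, e)     # (1 + m) * 2**e, a shift in hardware

# Worst-case relative error of 1 + m against 2**m over the fraction range.
worst = max((1.0 + i / 4096) / 2.0 ** (i / 4096) - 1.0 for i in range(4096))

print(mitchell_antilog2(3.0))   # 8.0
print(mitchell_antilog2(3.5))   # 12.0, versus 2**3.5 which is about 11.31
```

The approximation never underestimates $2^{m}$ on this range, and the scan puts the worst relative error at roughly 6%.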
Comparison of FPGA Resources used by the Log Engine
• We implemented our method and the Symmetric Bipartite Table Method (SBTM) on a Virtex2P FPGA.
• Our method requires smaller on-chip Block RAMs.
• Both methods occupied less than 1% of the FPGA resources.
• Both methods were able to support clock speeds of a little over 350 MHz.
Comparison of LUT Size used and Accuracy of the Log Computation
Conclusion
• Our approach has a low memory requirement compared with other methods, while providing better accuracy.
• When compared to the SBTM, for every two extra bits of accuracy,
– we need a factor of 2 increase in the LUT size
– the SBTM needs a factor of 3 increase in the LUT size
Hence our method scales well for higher accuracy in bits.
• We are area efficient compared with polynomial interpolation methods, as we do not need a multiplier or divider to perform interpolation.
• The implementation can be pipelined, and the number of stages in the pipeline can be varied depending on the throughput required.
• We have presented an approach to efficiently compute the logarithm and antilogarithm of a number in hardware.