4. High Speed of Complex Multiplier Based on Urdhva-tiryakbyham Sutra Verilog HDL


Abstract

Complex multiplication is of immense importance in Digital Signal Processing (DSP) and Image Processing (IP). Hardware implementations of the Discrete Fourier Transform (DFT), Discrete Cosine Transform (DCT), Discrete Sine Transform (DST) and modern broadband communications require large numbers of complex multipliers. A complex multiplication is performed using four real multiplications and two additions/subtractions. In real-number processing, the carry has to propagate from the least significant bit (LSB) to the most significant bit (MSB) when binary partial products are added. The additions and subtractions that follow the binary multiplications therefore limit the overall speed. Many alternative methods have been proposed for complex multiplication, such as algebraic-transformation-based implementation, bit-serial multiplication using offset binary and distributed arithmetic (DA), the CORDIC (coordinate rotation digital computer) algorithm, the quadratic residue number system (QRNS) and, more recently, the redundant complex number system (RCNS).

Vedic Mathematics is the ancient methodology of Indian mathematics, which has a unique technique of calculation based on 16 Sutras (formulae). A high-speed complex multiplier design (ASIC) using Vedic Mathematics is presented in this paper. The idea for the multiplier and the adder/subtractor unit is adopted from the ancient Indian mathematics of the Vedas. With these formulae, the partial products and sums are generated in one step, which reduces carry propagation from LSB to MSB. Applying Vedic mathematics to the complex multiplier gives a substantial reduction in propagation delay compared with the DA-based and parallel-adder-based architectures, which are the most commonly used.

The functionality of these circuits was checked, and performance parameters such as propagation delay and dynamic power consumption were calculated, by SPICE (Spectre) simulation in a standard 90 nm CMOS technology. The propagation delay of the resulting (16, 16) x (16, 16) complex multiplier is only 4 ns, and it consumes 6.5 mW of power. This is an improvement in speed of almost 25% over earlier reported complex multipliers, e.g. parallel-adder and DA-based architectures.

This project proposes a hardware implementation of a VLSI architecture for a high-speed complex multiplier using Vedic Mathematics, modified to improve performance. The design is coded in Verilog HDL, and the FPGA synthesis is done using the Xilinx Spartan library. The main advantage of the hardware-based implementation is its inherent speed over software-based methods. This speed is attributed to the flexibility, reconfigurability and reprogrammability of the FPGA. The speed advantage makes the hardware-based high-speed complex multiplier a prime candidate for real-time applications.
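The statement that one complex multiplication takes four real multiplications and two additions/subtractions can be illustrated with a minimal Python sketch. This is only a behavioural model for illustration, not the Verilog design described in this report; the function name complex_multiply and its argument names are assumptions introduced here.

```python
def complex_multiply(a_re, a_im, b_re, b_im):
    """Multiply (a_re + j*a_im) by (b_re + j*b_im).

    Uses four real multiplications and two additions/subtractions,
    as stated in the abstract. Names are illustrative only.
    """
    p1 = a_re * b_re  # real * real
    p2 = a_im * b_im  # imag * imag
    p3 = a_re * b_im  # real * imag
    p4 = a_im * b_re  # imag * real
    real_part = p1 - p2  # one subtraction
    imag_part = p3 + p4  # one addition
    return real_part, imag_part


# Example: (3 + 2j) * (1 + 4j)
result = complex_multiply(3, 2, 1, 4)
```

In a hardware version each of the four products would be a separate real multiplier block, so the speed of the real multiplier and of the final adder/subtractor sets the overall delay.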

CHAPTER 1
INTRODUCTION

Multiplication is a fundamental function in arithmetic operations. Multiplication-based operations such as Multiply and Accumulate (MAC) and inner product are among the frequently used Computation-Intensive Arithmetic Functions (CIAF) implemented in many Digital Signal Processing (DSP) applications such as convolution, the Fast Fourier Transform (FFT) and filtering, and in the arithmetic and logic units of microprocessors. Since multiplication dominates the execution time of most DSP algorithms, there is a need for high-speed multipliers. Multiplication time is still the dominant factor in determining the instruction cycle time of a DSP chip.

The demand for high-speed processing has been increasing as a result of expanding computer and signal processing applications. Higher-throughput arithmetic operations are important for achieving the desired performance in many real-time signal and image processing applications. One of the key arithmetic operations in such applications is multiplication, and the development of fast multiplier circuits has been a subject of interest for decades. Reducing the time delay and power consumption is an essential requirement for many applications. This work presents different multiplier architectures. The multiplier based on Vedic Mathematics is one of the fast, low-power multipliers.

Minimizing power consumption in digital systems involves optimization at all levels of the design. This optimization includes the technology used to implement the digital circuits, the circuit style and topology, the architecture for implementing the circuits and, at the highest level, the algorithms being implemented. Digital multipliers are among the most commonly used components in digital circuit design. They are fast, reliable and efficient components used to implement many operations. Depending on the arrangement of the components, different types of multipliers are available.
A particular multiplier architecture is chosen based on the application.

In many DSP algorithms, the multiplier lies in the critical delay path and ultimately determines the performance of the algorithm. The speed of the multiplication operation is of great importance in DSPs as well as in general-purpose processors. In the past, multiplication was generally implemented with a sequence of addition, subtraction and shift operations. Many algorithms for multiplication have been proposed in the literature, each offering different advantages and trade-offs in terms of speed, circuit complexity, area and power consumption.

The multiplier is a fairly large block of a computing system. The amount of circuitry involved is proportional to the square of its resolution, i.e. a multiplier of size n bits has on the order of n^2 gates. For multiplication algorithms used in DSP applications, latency and throughput are the two major concerns from the delay perspective. Latency is the real delay of computing a function: a measure of how long after the inputs to a device are stable the final result is available on the outputs. Throughput is the measure of how many multiplications can be performed in a given period of time. The multiplier is not only a high-delay block but also a major source of power dissipation. That is why, if one also aims to minimize power consumption, it is of great interest to reduce the delay by using various delay optimizations.

Digital multipliers are the core components of all digital signal processors (DSPs), and the speed of a DSP is largely determined by the speed of its multipliers. The two most common multiplication algorithms used in digital hardware are the array multiplication algorithm and the Booth multiplication algorithm. The computation time taken by the array multiplier is comparatively small because the partial products are calculated independently in parallel.
The delay associated with the array multiplier is the time taken by the signals to propagate through the gates that form the multiplication array. Booth multiplication is another important multiplication algorithm. Large Booth arrays are required for high-speed multiplication and exponential operations, which in turn require large partial-sum and partial-carry registers. Multiplication of two n-bit operands using a radix-4 Booth recoding multiplier requires approximately n / (2m) clock cycles to generate the least significant half of the final product, where m is the number of Booth recoder adder stages. Thus, a large propagation delay is associated with this case. Because of the importance of digital multipliers in DSP, this has always been an active area of research, and a number of interesting multiplication algorithms have been reported in the literature.

In this thesis work, the Urdhva tiryakbhyam Sutra is first applied to the binary number system and is used to develop a digital multiplier architecture. This is shown to be very similar to the popular array multiplier architecture. The Sutra is also shown to be effective in reducing an N x N multiplier structure to efficient 4 x 4 multiplier structures. The Nikhilam Sutra is then discussed and is shown to be much more efficient for the multiplication of large numbers, as it reduces the multiplication of two large numbers to that of two smaller ones. The proposed multiplication algorithm is then illustrated to show its computational efficiency, taking as an example the reduction of a 4 x 4-bit multiplication to a single 2 x 2-bit multiplication operation. This work presents a systematic design methodology for a fast and area-efficient digital multiplier based on Vedic mathematics. The multiplier architecture is based on the Vertical and Crosswise algorithm of ancient Indian Vedic Mathematics.

1. What is Vedic Mathematics?

To obtain a historical perspective on Vedic Mathematics, we briefly discuss the mathematical developments in India (Boyer, 1968; Katz, 1992). Archeological excavations have documented an old and highly cultured civilization in India during the third millennium B.C.E., but no Indian mathematical documents were found from that period. India, like Egypt, had its geometrical measurement in the form of a body of knowledge known as the Sulvasutras (rules of the cords). The word sutra (thread of knowledge) means rules expressed by aphorisms relating to rituals or science. This primitive account, dating perhaps from before the time of Pythagoras (6th century B.C.E.), contained rules for the construction of right angles by means of triple cords. The period of the Sulvasutras was followed by the age of the Siddhantas (systems of astronomy), starting around 500 C.E., which contributed to trigonometry the notion of the sine, namely the correspondence between half a chord of a circle and half the corresponding central angle. The trigonometry of the sine function is one of the noted contributions of India to modern mathematics. Another marked development is our system of numeration for integers, called the Hindu-Arabic system (about 700 C.E.). The use of the zero symbol in India existed from the ninth century. Other developments from the same period were indeterminate analysis (Brahmagupta) and algebraic techniques. In the first centuries of the second millennium, spherical trigonometry and Pell equations (Bhaskara) were developed. The discovery of power series for trigonometric functions took place around 1500. Hindu mathematicians were inclined to further develop topics in number theory, and indeterminate analysis in particular. However, these aspects did not contribute to later developments in modern mathematics such as analytic geometry, calculus and algebra.

The term Vedic Mathematics refers to a set of sixteen mathematical sutras and their corollaries derived from the Vedas. The Vedas are ancient texts written in Sanskrit; the word Veda means knowledge, both within and among the senses. It is conjectured that the Vedic mathematical system was part of the Sulvasutras. It is a system of calculation based on easy-to-follow rules and principles that can be used effectively to solve problems in arithmetic, algebra, geometry and trigonometry. The Vedic system was rediscovered between 1911 and 1918 by Sri Swami Bharati Krisna Tirthaji and has been restructured for use in schools. It is taught in some schools in several countries and is also used for scientific applications (see references to Web sites).

Vedic methods

Several Vedic methods were studied during the course. Here we describe two methods, each of which is used for the multiplication of two natural numbers. The workshop described later included the use of these methods. For each method we describe its goal, rules of use, correctness justification, the matching algorithmic problem, and its solution. The description of both methods uses the symbol | to separate the prefix and the suffix of a number. The term length of a number indicates the number of its digits.

Method 1

Multiplication of two numbers x (the multiplicand) and y (the multiplier), using the Urdhva-Tiryagbhyam Sutra, which means: Vertically and Cross-Wise.

The Vedic instructions are represented visually. The method works for numbers of any length. Here we demonstrate two cases: the multiplication of two-digit numbers, illustrated in Figure 5, and the multiplication of three-digit numbers, illustrated in Figure 6.

Case I: The multiplication of two-digit numbers ab*cd:

Figure 5: The multiplication of two-digit numbers

1. Compute the intermediate products: a*c | a*d + b*c | b*d.
2. Deal with carry over if necessary.

Examples:

1. 12*32 = 384
In the first number (the multiplicand) a = 1, b = 2, and in the second number (the multiplier) c = 3, d = 2. The intermediate products are:
1*3 | 1*2 + 2*3 | 2*2
3 | 8 | 4
Since there is no carry over, this is the final result.

2. 82*57 = 4674
In the first number (the multiplicand) a = 8, b = 2, and in the second number (the multiplier) c = 5, d = 7. The computation of the intermediate products:
5*8 | 8*7 + 2*5 | 2*7
40 | 66 | 14
Taking care of the carry over, we get the final answer:
40 | 66 + 1 | 4
40 + 6 | 7 | 4
46 | 7 | 4
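The two-digit case can be sketched in a few lines of Python. This is an illustrative model only; the function name multiply_two_digit is an assumption, and the carry step is expressed by weighting the three intermediate products by 100, 10 and 1, which is equivalent to passing the carries leftwards by hand.

```python
def multiply_two_digit(x, y):
    """Vertically and Cross-Wise product of two two-digit numbers.

    Returns the three intermediate products (left, middle, right)
    and the final product after carries are absorbed.
    """
    a, b = divmod(x, 10)  # digits of the multiplicand: x = ab
    c, d = divmod(y, 10)  # digits of the multiplier:   y = cd
    left = a * c            # vertical, tens digits
    middle = a * d + b * c  # cross-wise
    right = b * d           # vertical, units digits
    # Weighting by powers of ten absorbs any carry over.
    product = left * 100 + middle * 10 + right
    return (left, middle, right), product


intermediate, product = multiply_two_digit(82, 57)
```

For 82*57 the intermediate products come out as 40 | 66 | 14, matching the second example above.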

Case II: The multiplication of three-digit numbers abc*def:

Figure 6: The multiplication of three-digit numbers

1. Compute the intermediate products: a*d | a*e + b*d | a*f + b*e + c*d | b*f + c*e | c*f.
2. Deal with carry over if necessary.

Examples:

3. 194*285 = 55290
1*2 | 1*8 + 9*2 | 1*5 + 9*8 + 4*2 | 9*5 + 8*4 | 4*5
2 | 26 | 85 | 77 | 20
2 | 26 | 85 | 77 + 2 | 0
2 | 26 | 85 + 7 | 9 | 0
2 | 26 + 9 | 2 | 9 | 0
2 + 3 | 5 | 2 | 9 | 0
5 | 5 | 2 | 9 | 0

If the lengths of the two numbers being multiplied are not the same, we add zeroes at the beginning of the shorter number.

The development of a suitable algorithm: The main difficulty is in forming a general rule for any two numbers, based on the cases for two-digit and three-digit numbers presented above. The mathematical justification for the Vedic rules, using a polynomial representation of the numbers where x = 10, can be a big help:

(ax + b)*(cx + d) = acx^2 + (ad + bc)x + bd
(ax^2 + bx + c)*(dx^2 + ex + f) = adx^4 + (ae + bd)x^3 + (af + be + cd)x^2 + (bf + ce)x + cf

Figure 7 presents the intermediate products (obtained in the first step of the Vedic method Vertically and Cross-Wise) according to the digits of the resulting number formed by these products.

Digit position (k)    4      3            2                  1            0
x (i)                                     a                  b            c
y (j)                                     d                  e            f
x*y                   a*d    a*e + b*d    a*f + b*e + c*d    b*f + c*e    c*f

Figure 7: The components of each digit in the multiplicand x, multiplier y, and the result x*y

The relation between the digit positions and the power values in the polynomial representation leads to a general representation of the multiplication: every digit Z in position k of the resulting number is calculated by summing the products of pairs of digits Xi and Yj whose positions i and j sum to k, as in the following formula:

Zk = sum of Xi*Yj over all pairs (i, j) with i + j = k.

In the following algorithm (Figure 8), m and n are the lengths of x and y, respectively. The number of digits in the result is the sum of the lengths of the two numbers, m + n. However, since digits are referred to by their degree in the polynomial representation, the least significant digit is in degree 0, whereas the most significant digit of the result is in degree m+n-1. The loop boundaries were set accordingly.

for each k between 0 and m+n-1 do
    Z[k] <- 0
    i <- 0
    for each i < m do
        j <- 0
        for each j < n do
            if k = j + i then
                Z[k] <- Z[k] + x[i] * y[j]

Figure 8: The algorithm for calculating the intermediate products for method 1

The program is presented in Figure 9.
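The algorithm of Figure 8 can be sketched in Python as follows. This is a hedged illustration, not the Derive program referred to in the text. The function names are assumptions, digits are stored least significant first so that the list index equals the degree, and the triple loop with the test k = i + j is collapsed into a double loop that adds directly into Z[i + j], which computes the same sums. A second helper resolves the carries, a step the figure leaves to the reader.

```python
def intermediate_products(x_digits, y_digits):
    """Intermediate products Z[k] = sum of x[i]*y[j] with i + j = k.

    x_digits and y_digits hold decimal digits, least significant first.
    """
    m, n = len(x_digits), len(y_digits)
    z = [0] * (m + n)  # degrees 0 .. m+n-1
    for i in range(m):
        for j in range(n):
            z[i + j] += x_digits[i] * y_digits[j]
    return z


def resolve_carries(z, base=10):
    """Turn intermediate products into single digits, least significant first."""
    digits = []
    carry = 0
    for value in z:
        carry, digit = divmod(value + carry, base)
        digits.append(digit)
    return digits


def digits_to_int(digits, base=10):
    """Rebuild an integer from digits stored least significant first."""
    return sum(d * base ** k for k, d in enumerate(digits))


# Example 3 from the text: 194 * 285
z = intermediate_products([4, 9, 1], [5, 8, 2])
product = digits_to_int(resolve_carries(z))
```

Read from the most significant end, z holds 2 | 26 | 85 | 77 | 20 for this example, with an extra leading 0 for degree 5.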

Figure 9: The program that implements method 1

Note: The two Derive programs can be downloaded from http://stwww.weizmann.ac.il/g-math/mathcomp/vedic

1. 31 x 12
Multiply vertically on the left: 3 x 1 = 3 (first figure)
Multiply cross-wise and add: (3 x 2) + (1 x 1) = 7 (middle figure)
Multiply vertically on the right: 2 x 1 = 2 (last figure)
So the answer is 372.

2. 275 x 513
Multiply vertically: 5 x 3 = 15; write 5 and carry 1 ==> 5
Multiply and add crosswise the last two digits: (3 x 7) + (1 x 5) = 26, + carry 1 = 27; write 7 and carry 2 ==> 75
Multiply and add vertically and crosswise all digits: (2 x 3) + (5 x 5) + (7 x 1) = 38, + carry 2 = 40; write 0 and carry 4 ==> 075
Multiply and add crosswise the first two digits: (1 x 2) + (5 x 7) = 37, + carry 4 = 41; write 1 and carry 4 ==> 1075
Multiply vertically: 2 x 5 = 10, + carry 4 = 14; write 14 ==> 141075
So 275 x 513 = 141075.

3. Divide (12X^2 - 8X - 32) by (X - 2) using Urdhva-Tiryak
The quotient can be written as (12X + k). We also know that -8X = kX - 24X, hence k = 16. Cross-check: -2k = -32, the third term.

Some rules of thumb

1. Multiplying a number by 11
To multiply any two-digit number by 11, we just put the total of the two figures between the two digits.
36 x 11 = 3 (3+6) 6 = 396
74 x 11 = 7 (7+4) 4 = 7 + carry 1 (1) 4 = 814
234 x 11 = 2 (2+3) (3+4) 4 = 2574

2. Dividing by 9
23 / 9 = 2 remainder 5. The first digit of 23, i.e. 2, is the answer, and the remainder is the sum of 2 and 3.
134 / 9 = 14 remainder 8. The answer is (1st digit of 134)(1+3), remainder (1+3+4).

3.1 Binary Multiplication - Urdhva-tiryagbhyam

3.1.1 Algorithm

In the binary system only 0 and 1 are used, hence multiplication in the Urdhva-tiryagbhyam (vertically-crosswise) formula is replaced by AND logic. Each AND output is one bit wide, and these bits are added together to generate a cross-product. The rules for vertically-crosswise multiplication remain the same: starting from the MSBs (most significant bits) of both multiplicands, which are used for the first cross-product, one more bit of each multiplicand is included in each further cross-product until all bits are used. Bits are then dropped from the MSB end, and the cross-product process continues until only the LSBs are used. In the binary number system the maximum width of a cross-product depends on the width of the multiplicands. For example, in 8-bit multiplication the maximum cross-product width is log2(8) + 1 = 4. For 16 bits it is 5, and for 23 bits it is 5 again.
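A small Python model can make the cross-product idea concrete. It is a sketch under stated assumptions: the function names are introduced here, columns are indexed from the LSB end (the text walks from the MSB end, which gives the same set of cross-products), and the width formula is taken as floor(log2 N) + 1, which matches the values 4, 5 and 5 quoted for 8, 16 and 23 bits.

```python
def binary_cross_products(a, b, n):
    """Return the 2n-1 cross-products of two n-bit operands.

    Each term is the AND of one bit of a with one bit of b; terms whose
    bit positions sum to k are added to form cross-product k.
    """
    a_bits = [(a >> i) & 1 for i in range(n)]
    b_bits = [(b >> i) & 1 for i in range(n)]
    columns = [0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            columns[i + j] += a_bits[i] & b_bits[j]
    return columns


def max_cross_product_width(n):
    """Bits needed for the largest cross-product value n: floor(log2 n) + 1."""
    return n.bit_length()


columns = binary_cross_products(0xFF, 0xFF, 8)
```

With all operand bits set, the middle cross-product reaches its maximum value of N, which is why its width grows only logarithmically with the operand width.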

Figure 3.1 - Urdhva-tiryagbhyam Complete Example

3.1.2 Comparison of Vedic and Conventional Multiplier

In this section a comparison is made between the Vedic multiplier based on Urdhva-tiryagbhyam and the conventional multiplier. A conventional multiplier of width N x N generates N partial products, each containing N bits with 0 to N - 1 zeros appended at the end (0 to 7 for N = 8). The Vedic method generates (2N - 1) cross-products of different widths which, when combined, form (log2 N + 1) partial products for an N-bit multiplier. In terms of the number of partial products there is therefore a significant decrease for Vedic Mathematics. However, the partial products of a conventional multiplier are obtained simply by ANDing one operand with the individual bits of the other, whereas in the Vedic case the partial products are obtained only after the cross-products have been generated, which requires some logic. Hence in Vedic mathematics the delay for the partial products is equal to an adder delay. The critical path consists of the adders that add the maximum number of bits in a cross-product. In all cases this is the cross-product in which all bits of the operands are considered. Different techniques are used to combine these partial products efficiently and so reduce the total time required for multiplication. One such technique, Wallace tree addition, is discussed in the next section.
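The difference in partial-product count can be checked with a short Python sketch. It follows one reading of the text, in which bit b of every cross-product is collected into partial product b; this grouping, and all function names, are assumptions made for illustration rather than the exact circuit of the report.

```python
def conventional_partial_products(a, b, n):
    """N shifted copies of a, one per bit of b (zero where the bit is 0)."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(n)]


def vedic_partial_products(a, b, n):
    """Group the bits of the 2n-1 cross-products into floor(log2 n)+1 rows."""
    columns = [0] * (2 * n - 1)
    for i in range(n):
        for j in range(n):
            columns[i + j] += ((a >> i) & 1) & ((b >> j) & 1)
    rows = []
    for bit in range(n.bit_length()):
        row = 0
        for k, value in enumerate(columns):
            # Bit `bit` of cross-product k has weight 2**(k + bit).
            row |= ((value >> bit) & 1) << (k + bit)
        rows.append(row)
    return rows


a, b, n = 0xB7, 0x5D, 8
conventional = conventional_partial_products(a, b, n)
vedic = vedic_partial_products(a, b, n)
```

For N = 8 the conventional scheme yields eight rows and the Vedic grouping four, and both sets sum to the same product.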

3.2 Combining Partial Products

Combining partial products is the most critical part of any multiplier and decides its performance. Different types of adders, such as carry-save adders (CSA) or carry-lookahead adders (CLA), are frequently used. To improve performance further, more parallelism is achieved by combining three or more partial products at a time until two are left, and then adding these two partial products to get the final answer. This technique is called a Wallace tree adder.

3.2.1 Wallace tree

A Wallace tree in its simplest sense is a full adder that combines 3 bits to produce a carry and a sum, which can be seen as a 3-bit to 2-bit reduction. The 3 bits added are at the same power of 2, whereas of the two output bits one is at the same power of 2 as the inputs and the other is one power above. When combining large partial products, this technique is used so that 3 bits at the same power of 2 from different partial products are combined in parallel. In this way 3 partial products can be combined to form 2. If there are more than 3 partial products, a multi-stage Wallace adder has to be used.
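The 3-bit to 2-bit reduction performed by a full adder can be written out as a minimal Python sketch. The function name full_adder is an assumption; the logic is the standard sum and majority-carry form.

```python
def full_adder(a, b, c):
    """Combine three bits of equal weight into a sum bit and a carry bit.

    The sum bit keeps the weight of the inputs; the carry bit has
    twice that weight, i.e. it sits one power of 2 above.
    """
    sum_bit = a ^ b ^ c
    carry_bit = (a & b) | (b & c) | (a & c)
    return sum_bit, carry_bit


# Three 1-bits of weight 1 become sum = 1 (weight 1) and carry = 1 (weight 2).
sum_bit, carry_bit = full_adder(1, 1, 1)
```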

Figure 3.3 - Full Adder

Consider a 4x4 multiplier, which has 4 four-bit-wide partial products. In the diagram below we can see that 3 partial products are combined to form 2, and these are later combined with the 4th to make 2 partial products. It can also be observed that a full adder cannot be used in every position, so in some positions a half adder has to be used to combine the partial products. Full adders (3:2 reduction), half adders and 4:2 compressors are common in different configurations of the Wallace tree.
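The staged reduction described above can be modelled in Python at word level. This is a simplified sketch with assumed function names: it applies the 3:2 reduction to whole rows at once using bitwise operations, so it does not distinguish the positions where hardware would place a half adder instead of a full adder.

```python
def carry_save_add(x, y, z):
    """3:2 reduction of three rows into a sum row and a shifted carry row."""
    sum_row = x ^ y ^ z
    carry_row = ((x & y) | (y & z) | (x & z)) << 1
    return sum_row, carry_row


def wallace_reduce(partials):
    """Repeat 3:2 reductions until at most two rows remain."""
    rows = list(partials)
    while len(rows) > 2:
        next_rows = []
        # Reduce every complete group of three rows to two rows.
        for i in range(0, len(rows) - 2, 3):
            next_rows.extend(carry_save_add(rows[i], rows[i + 1], rows[i + 2]))
        # Rows that did not fit into a group pass through unchanged.
        next_rows.extend(rows[len(rows) - len(rows) % 3:])
        rows = next_rows
    return rows


# 4x4 example: 13 * 11 gives four partial products.
a, b = 13, 11
partials = [(a << i) if (b >> i) & 1 else 0 for i in range(4)]
rows = wallace_reduce(partials)
final_product = sum(rows)  # the last two rows go to a conventional adder
```

For four partial products the loop runs twice: three rows become two, the fourth passes through, and the resulting three rows are reduced to two, as in the description above.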

Figure 3.4 - Wallace tree reduction

In the Vedic mathematics sutra Urdhva-tiryagbhyam, 2N - 1 cross-products are generated for an N x N multiplier. As seen earlier, forming a cross-product amounts to first applying AND logic and then adding the resulting bits. Starting from one, the number of bits in a cross-product increases to N and then decreases again to 1. Hence these cross-products have widths from 1 to (log2 N + 1). Partial products are formed by combining the bits at the same power of 2 of all cross-products. This forms (log2 N + 1) partial products. There are two ways in which these partial products can be combined.

3.2.2 Vedic Wallace

If a Wallace tree structure with full and half adders is used, the class of multiplier is named the Vedic Wallace multiplier. This type of multiplier essentially differs from the conventional multiplier in the way the different bits are combined to form the final answer. 8-, 16- and 23-bit multipliers are designed in Verilog with the Vedic Wallace structure, each customized to its particular width of multiplication to achieve high performance. These multipliers are designed as combinational blocks similar to the equivalent DesignWare blocks. Each of these designs is synthesized with Synopsys in two ways: 1. to achieve the highest possible clock frequency at which it works; 2. with a relaxed clock frequency, to check the area generated. Power analysis is done in both cases. Results are as follows.

3.2.3 Vedic Vedic

Another way to combine the cross-products is to continue the methodology described in Urdhva-tiryagbhyam. Here all the bits that are at the same power of 2 are combined until each addition of bits results in a two-bit number, which gives two partial products. This class of multipliers is named Vedic-Vedic multipliers. As with the previous class, 8-, 16- and 23-bit multipliers are designed in Verilog and synthesized with Synopsys. Each of these designs is synthesized in two ways: 1. to achieve the highest possible clock frequency at which it works; 2. with a relaxed clock frequency, to check the area generated. Power analysis is done in both cases. Results are discussed in the next sections.

CHAPTER 3
VEDIC MULTIPLICATION ALGORITHMS

3.1 HISTORY OF VEDIC MATHEMATICS

Vedic mathematics is part of the four Vedas (books of wisdom). It is part of the Sthapatya-Veda (book on civil engineering and architecture), which is an upa-veda (supplement) of the Atharva Veda. It covers explanations of several modern mathematical topics including arithmetic, geometry (plane, co-ordinate), trigonometry, quadratic equations, factorization and even calculus. His Holiness Jagadguru Shankaracharya Bharati Krishna Teerthaji Maharaja (1884-1960) compiled all this work and gave its mathematical explanation while discussing it for various applications. Swamiji constructed 16 sutras (formulae) and 16 Upa sutras (sub-formulae) after extensive research in the Atharva Veda. These formulae are not to be found in the present text of the Atharva Veda, because they were constructed by Swamiji himself. Vedic mathematics is not only a mathematical wonder, it is also logical. That is why VM has a degree of eminence which cannot be disproved. Owing to these characteristics, VM has already crossed the boundaries of India and has become a topic of research abroad. VM deals with several basic as well as complex mathematical operations. In particular, its methods of basic arithmetic are extremely simple and powerful.

The word Vedic is derived from the word veda, which means the store-house of all knowledge. Vedic mathematics is mainly based on 16 Sutras (or aphorisms) dealing with various branches of mathematics such as arithmetic, algebra and geometry. These Sutras, along with their brief meanings, are listed below alphabetically.

1) (Anurupye) Shunyamanyat: If one is in ratio, the other is zero.
2) Chalana-Kalanabyham: Differences and similarities.
3) Ekadhikina Purvena: By one more than the previous one.
4) Ekanyunena Purvena: By one less than the previous one.
5) Gunakasamuchyah: The factors of the sum are equal to the sum of the factors.
6) Gunitasamuchyah: The product of the sum is equal to the sum of the product.
7) Nikhilam Navatashcaramam Dashatah: All from 9 and last from 10.
8) Paraavartya Yojayet: Transpose and adjust.
9) Puranapuranabyham: By the completion or non-completion.
10) Sankalana-vyavakalanabhyam: By addition and by subtraction.
11) Shesanyankena Charamena: The remainders by the last digit.
12) Shunyam Saamyasamuccaye: When the sum is the same, that sum is zero.
13) Sopaantyadvayamantyam: The ultimate and twice the penultimate.
14) Urdhva-tiryakbhyam: Vertically and crosswise.
15) Vyashtisamanstih: Part and whole.
16) Yaavadunam: Whatever the extent of its deficiency.

These methods and ideas can be directly applied to trigonometry, plane and spherical geometry, conics, calculus (both differential and integral), and applied mathematics of various kinds. As mentioned earlier, all these Sutras were reconstructed from ancient Vedic texts early in the last century. Many Sub-sutras were also discovered at the same time; they are not discussed here.

The beauty of Vedic mathematics lies in the fact that it reduces otherwise cumbersome-looking calculations in conventional mathematics to very simple ones. This is so because the Vedic formulae are claimed to be based on the natural principles on which the human mind works. This is a very interesting field and presents some effective algorithms which can be applied to various branches of engineering such as computing and digital signal processing.

Multiplier architectures can generally be classified into three categories. The first is the serial multiplier, which emphasizes minimum hardware and chip area. The second is the parallel multiplier (array and tree), which carries out high-speed mathematical operations; its drawback is the relatively larger chip area. The third is the serial-parallel multiplier, which serves as a good trade-off between the time-consuming serial multiplier and the area-consuming parallel multipliers.
3.2 ALGORITHMS OF VEDIC MATHEMATICS

3.2.1 VEDIC MULTIPLICATION

The proposed Vedic multiplier is based on the Vedic multiplication formulae (Sutras). These Sutras have traditionally been used for the multiplication of two numbers in the decimal number system. In this work, we apply the same ideas to the binary number system to make the proposed algorithm compatible with digital hardware. Vedic multiplication is based on several algorithms, some of which are discussed below.

Urdhva Tiryakbhyam sutra

The multiplier is based on the Urdhva Tiryakbhyam (Vertical and Crosswise) algorithm of ancient Indian Vedic Mathematics. The Urdhva Tiryakbhyam Sutra is a general multiplication formula applicable to all cases of multiplication. It literally means "Vertically and crosswise". It is based on a novel concept through which all partial products are generated concurrently with the addition of these partial products. The parallelism in the generation of partial products and their summation obtained using Urdhva Tiryakbhyam is explained in Fig. 2.1. The algorithm can be generalized to n x n bit numbers. Since the partial products and their sums are calculated in parallel, the multiplier requires the same amount of time to calculate the product regardless of the clock frequency of the processor. The net advantage is that it reduces the need for microprocessors to operate at increasingly high clock frequencies. While a higher clock frequency generally results in increased processing power, its disadvantage is that it also increases power dissipation, which results in higher device operating temperatures. By adopting the Vedic multiplier, microprocessor designers can circumvent these problems and avoid catastrophic device failures. The processing power of the multiplier can easily be increased by increasing the input and output data bus widths, since it has quite a regular structure.
Owing to its regular structure, it can easily be laid out on a silicon chip. The multiplier has the advantage that, as the number of bits increases, gate delay and area increase very slowly compared with other multipliers. It is therefore time, space and power efficient. It is demonstrated that this architecture is quite efficient in terms of silicon area and speed.

1) Multiplication of two decimal numbers: 325 * 738

To illustrate this multiplication scheme, let us consider the multiplication of two decimal numbers (325 * 738). The line diagram for the multiplication is shown in Fig. 2.1. The digits on both sides of a line are multiplied and added to the carry from the previous step. This generates one digit of the result and a carry. This carry is added in the next step, and so the process goes on. If there is more than one line in a step, all the results are added to the previous carry. In each step, the least significant digit acts as the result digit and all other digits act as the carry for the next step. Initially the carry is taken to be zero. To make the methodology clearer, an alternate illustration is given with the help of line diagrams in Figure 2.2, where the dots represent bit 0 or 1.

Figure 2.1: Multiplication of two decimal numbers by Urdhva Tiryakbhyam.

2) Algorithm for 4 x 4 bit Vedic multiplier using Urdhva Tiryakbhyam (Vertically and crosswise) for two binary numbers [10]

CP = Cross Product (Vertically and Crosswise)

            X3 X2 X1 X0    Multiplicand
            Y3 Y2 Y1 Y0    Multiplier
----------------------------------------
 H  G  F  E  D  C  B  A
----------------------------------------
P7 P6 P5 P4 P3 P2 P1 P0    Product
----------------------------------------

PARALLEL COMPUTATION METHODOLOGY

1. CP of X0 and Y0 = X0 * Y0 = A
2. CP of X1 X0 and Y1 Y0 = X1 * Y0 + X0 * Y1 = B
3. CP of X2 X1 X0 and Y2 Y1 Y0 = X2 * Y0 + X0 * Y2 + X1 * Y1 = C
4. CP of X3 X2 X1 X0 and Y3 Y2 Y1 Y0 = X3 * Y0 + X0 * Y3 + X2 * Y1 + X1 * Y2 = D
5. CP of X3 X2 X1 and Y3 Y2 Y1 = X3 * Y1 + X1 * Y3 + X2 * Y2 = E
6. CP of X3 X2 and Y3 Y2 = X3 * Y2 + X2 * Y3 = F
7. CP of X3 and Y3 = X3 * Y3 = G

3) Algorithm for 8 x 8 bit multiplication using Urdhva Tiryakbhyam (Vertically and crosswise) for two binary numbers [11]

A = A7A6A5A4 A3A2A1A0, with X1 = A7A6A5A4 and X0 = A3A2A1A0
B = B7B6B5B4 B3B2B1B0, with Y1 = B7B6B5B4 and Y0 = B3B2B1B0

      X1 X0
    * Y1 Y0
-------------
 F  E  D  C

CP = X0 * Y0 = C
CP = X1 * Y0 + X0 * Y1 = D
CP = X1 * Y1 = E

where CP = Cross Product.

Note: Each multiplication operation is an embedded parallel 4x4 multiply module.
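The 8 x 8 decomposition into 4 x 4 blocks can be checked with a short Python model. It is a sketch under assumptions: multiply_4x4 is a stand-in for the embedded 4x4 module and simply uses Python's multiplication, the names are introduced here, and the recombination weights (C at bit 0, D at bit 4, E at bit 8) follow from X1 and Y1 being the upper 4-bit halves. The source's term F is not defined in the text and is not modelled.

```python
def multiply_4x4(x, y):
    """Stand-in for one embedded 4x4 multiply module (behavioural only)."""
    if not (0 <= x < 16 and 0 <= y < 16):
        raise ValueError("operands must be 4-bit values")
    return x * y


def multiply_8x8(a, b):
    """8x8 product built from four 4x4 products, vertically and crosswise."""
    x1, x0 = a >> 4, a & 0xF  # upper and lower halves of A
    y1, y0 = b >> 4, b & 0xF  # upper and lower halves of B
    c = multiply_4x4(x0, y0)                         # vertical, low halves
    d = multiply_4x4(x1, y0) + multiply_4x4(x0, y1)  # crosswise
    e = multiply_4x4(x1, y1)                         # vertical, high halves
    return (e << 8) + (d << 4) + c


product = multiply_8x8(0xA7, 0x3C)
```

The same pattern can be applied again to build a 16 x 16 multiplier from 8 x 8 blocks, which is the regular structure the text refers to.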

To illustrate the multiplication algorithm, let us consider the multiplication of two binary numbers a3a2a1a0 and b3b2b1b0. As the result of this multiplication would be more than 4 bits, we express it as ... r3r2r1r0. The line diagram for the multiplication of two 4-bit numbers is shown in Fig. 2.2, which is simply the mapping of Fig. 2.1 onto the binary system. For simplicity, each bit is represented by a circle. The least significant bit r0 is obtained by multiplying the least significant bits of the multiplicand and the multiplier. The process follows the steps shown in Fig. 2.1.

Figure 2.2: Line diagram for multiplication of two 4-bit numbers.

Firstly, the least significant bits are multiplied, which gives the least significant bit of the product (vertical). Then, the LSB of the multiplicand is multiplied with the next higher bit of the multiplier and added to the product of the LSB of the multiplier and the next higher bit of the multiplicand (crosswise). The sum gives the second bit of the product, and the carry is added to the output of the next stage, which is the sum obtained by the crosswise and vertical multiplication and addition of three bits of the two numbers from the least significant position. Next, all four bits are processed with crosswise multiplication and addition to give the sum and carry. The sum is the corresponding bit of the product, and the carry is again added to the next stage of multiplication and addition of three bits, excluding the LSB. The same operation continues until the multiplication of the two MSBs gives the MSB of the product. For example, if in some intermediate step we get 110, then 0 will act as the result bit (referred to as rn) and 11 as the carry (referred to as cn). It should be clearly noted that cn may be a multi-bit number. Thus we get the following expressions:

r0 = a0b0                                  (1)
c1r1 = a1b0 + a0b1                         (2)
c2r2 = c1 + a2b0 + a1b1 + a0b2             (3)
c3r3 = c2 + a3b0 + a2b1 + a1b2 + a0b3      (4)
c4r4 = c3 + a3b1 + a2b2 + a1b3             (5)
c5r5 = c4 + a3b2 + a2b3                    (6)
c6r6 = c5 + a3b3                           (7)

with c6r6r5r4r3r2r1r0 being the final product. Hence this is the general mathematical formula applicable to all cases of multiplication.
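Expressions (1) to (7) can be checked with a small bit-level model. The sketch below is written in Python for illustration only; the function name urdhva_4bit is an assumption, and the loop simply evaluates each step as the previous carry plus the crosswise AND terms.

```python
def urdhva_4bit(a: int, b: int) -> int:
    """Bit-level model of expressions (1)-(7) for two 4-bit operands."""
    assert 0 <= a < 16 and 0 <= b < 16
    abits = [(a >> i) & 1 for i in range(4)]  # a0..a3
    bbits = [(b >> i) & 1 for i in range(4)]  # b0..b3
    product = 0
    carry = 0  # no carry enters the first step
    for n in range(7):  # one step per result bit r0..r6
        # Previous carry plus all partial products a_i * b_j with i + j == n.
        total = carry + sum(
            abits[i] & bbits[n - i]
            for i in range(4)
            if 0 <= n - i < 4
        )
        product |= (total & 1) << n  # r_n is the least significant bit
        carry = total >> 1           # c_n may be a multi-bit number
    product |= carry << 7  # c6 becomes the most significant product bit
    return product


print(urdhva_4bit(15, 15))  # 225
```

For 15 * 15 the intermediate totals are 1, 2, 4, 6, 6, 5 and 3, which shows the multi-bit carries (for example c3 = 3) mentioned in the text.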

Figure 2.3: Hardware architecture of the Urdhva tiryakbhyam multiplier [4]

The hardware realization of a 4-bit multiplier is shown in Figure 2.3. This hardware design is very similar to that of the famous array multiplier, where an array of adders is required to arrive at the final product. All the partial products are calculated in parallel, and the delay associated is mainly the time taken by the carry to propagate through the adders which form the multiplication array. Clearly, this is not an efficient algorithm for the multiplication of large numbers, as a lot of propagation delay is involved in such cases. To deal with this problem, we now discuss Nikhilam Sutra, which presents an efficient method of multiplying two large numbers.

CHAPTER-3 MATHEMATICAL FORMULATION OF VEDIC SUTRAS:

The gifts of ancient Indian mathematics to the world history of mathematical science are not well recognized. The contributions of the saint and mathematician in the field of number theory, 'Sri Bharati Krsna Thirthaji Maharaja', in the form of Vedic Sutras (formulas) are significant for calculations. He explored the mathematical potential of the Vedic primers and showed that mathematical operations can be carried out mentally to produce fast answers using the Sutras.

A. "Urdhva-tiryakbyham" Sutra:

The meaning of this sutra is "Vertically and crosswise" and it is applicable to all multiplication operations.

Fig. 1 represents the general procedure for 4x4 multiplication. This procedure is simply known as the array multiplication technique. It is an efficient multiplication technique when the multiplier and multiplicand lengths are small, but it is not suitable for longer operands because a large amount of carry propagation delay is involved in those cases.

CHAPTER 4

Very-large-scale integration (VLSI) is the process of creating integrated circuits by combining thousands of transistor-based circuits into a single chip. VLSI began in the 1970s when complex semiconductor and communication technologies were being developed. The microprocessor is a VLSI device. The term is no longer as common as it once was, as chips have increased in complexity into the hundreds of millions of transistors.

Overview

The first semiconductor chips held one transistor each. Subsequent advances added more and more transistors, and, as a consequence, more individual functions or systems were integrated over time. The first integrated circuits held only a few devices, perhaps as many as ten diodes, transistors, resistors and capacitors, making it possible to fabricate one or more logic gates on a single device. This stage is now known retrospectively as "small-scale integration" (SSI). Improvements in technique led to devices with hundreds of logic gates, known as medium-scale integration (MSI), and then to large-scale integration (LSI), i.e. systems with at least a thousand logic gates. Current technology has moved far past this mark, and today's microprocessors have many millions of gates and hundreds of millions of individual transistors.

At one time, there was an effort to name and calibrate various levels of large-scale integration above VLSI. Terms like Ultra-large-scale Integration (ULSI) were used. But the huge number of gates and transistors available on common devices has rendered such fine distinctions moot. Terms suggesting greater than VLSI levels of integration are no longer in widespread use. Even VLSI is now somewhat quaint, given the common assumption that all microprocessors are VLSI or better.

As of early 2008, billion-transistor processors are commercially available, an example of which is Intel's Montecito Itanium chip.
This is expected to become more commonplace as semiconductor fabrication moves from the current generation of 65 nm processes to the next 45 nm generations (while experiencing new challenges such as increased variation across process corners). Another notable example is NVIDIA's 280 series GPU. This microprocessor is unique in that its 1.4 billion transistor count, capable of a teraflop of performance, is almost entirely dedicated to logic (Itanium's transistor count is largely due to the 24 MB L3 cache).

Current designs, as opposed to the earliest devices, use extensive design automation and automated logic synthesis to lay out the transistors, enabling higher levels of complexity in the resulting logic functionality. Certain high-performance logic blocks like the SRAM cell, however, are still designed by hand to ensure the highest efficiency (sometimes by bending or breaking established design rules to obtain the last bit of performance by trading stability).

What is VLSI?

VLSI stands for "Very Large Scale Integration".
This is the field which involves packing more and more logic devices into smaller and smaller areas.

VLSI
Put simply, an integrated circuit is many transistors on one chip.
Design/manufacturing of extremely small, complex circuitry using modified semiconductor material.
An integrated circuit (IC) may contain millions of transistors, each a few µm in size.
Applications are wide ranging: most electronic logic devices.

History of Scale Integration
late 40s: Transistor invented at Bell Labs
late 50s: First IC (JK-FF by Jack Kilby at TI)
early 60s: Small Scale Integration (SSI), 10s of transistors on a chip
late 60s: Medium Scale Integration (MSI), 100s of transistors on a chip
early 70s: Large Scale Integration (LSI), 1000s of transistors on a chip
early 80s: VLSI, 10,000s of transistors on a chip (later 100,000s and now 1,000,000s)
Ultra LSI is sometimes used for 1,000,000s

SSI - Small-Scale Integration (0 to 10^2)
MSI - Medium-Scale Integration (10^2 to 10^3)
LSI - Large-Scale Integration (10^3 to 10^5)
VLSI - Very Large-Scale Integration (10^5 to 10^7)
ULSI - Ultra Large-Scale Integration (>= 10^7)

Advantages of ICs over discrete components

While we will concentrate on integrated circuits, the properties of integrated circuits (what we can and cannot efficiently put in an integrated circuit) largely determine the architecture of the entire system. Integrated circuits improve system characteristics in several critical ways. ICs have three key advantages over digital circuits built from discrete components:

Size. Integrated circuits are much smaller: both transistors and wires are shrunk to micrometer sizes, compared to the millimeter or centimeter scales of discrete components. Small size leads to advantages in speed and power consumption, since smaller components have smaller parasitic resistances, capacitances, and inductances.

Speed. Signals can be switched between logic 0 and logic 1 much more quickly within a chip than they can between chips. Communication within a chip can occur hundreds of times faster than communication between chips on a printed circuit board. The high speed of circuits on-chip is due to their small size: smaller components and wires have smaller parasitic capacitances to slow down the signal.

Power consumption. Logic operations within a chip also take much less power. Once again, lower power consumption is largely due to the small size of circuits on the chip: smaller parasitic capacitances and resistances require less power to drive them.

VLSI and systems

These advantages of integrated circuits translate into advantages at the system level:

Smaller physical size. Smallness is often an advantage in itself; consider portable televisions or handheld cellular telephones.

Lower power consumption. Replacing a handful of standard parts with a single chip reduces total power consumption. Reducing power consumption has a ripple effect on the rest of the system: a smaller, cheaper power supply can be used; since less power consumption means less heat, a fan may no longer be necessary; a simpler cabinet with less electromagnetic shielding may be feasible, too.

Reduced cost. Reducing the number of components, the power supply requirements, cabinet costs, and so on, will inevitably reduce system cost. The ripple effect of integration is such that the cost of a system built from custom ICs can be less, even though the individual ICs cost more than the standard parts they replace.

Understanding why integrated circuit technology has such profound influence on the design of digital systems requires understanding both the technology of IC manufacturing and the economics of ICs and digital systems.

Applications
Electronic systems in cars.
Digital electronics that control VCRs.
Transaction processing systems, ATMs.
Personal computers and workstations.
Medical electronic systems.
Etc.

Applications of VLSI

Electronic systems now perform a wide variety of tasks in daily life. Electronic systems in some cases have replaced mechanisms that operated mechanically, hydraulically, or by other means; electronics are usually smaller, more flexible, and easier to service. In other cases electronic systems have created totally new applications. Electronic systems perform a variety of tasks, some of them visible, some more hidden:

Personal entertainment systems such as portable MP3 players and DVD players perform sophisticated algorithms with remarkably little energy.
Electronic systems in cars operate stereo systems and displays; they also control fuel injection systems, adjust suspensions to varying terrain, and perform the control functions required for anti-lock braking (ABS) systems.
Digital electronics compress and decompress video, even at high-definition data rates, on-the-fly in consumer electronics.
Low-cost terminals for Web browsing still require sophisticated electronics, despite their dedicated function.
Personal computers and workstations provide word-processing, financial analysis, and games. Computers include both central processing units (CPUs) and special-purpose hardware for disk access, faster screen display, etc.
Medical electronic systems measure bodily functions and perform complex processing algorithms to warn about unusual conditions.

The availability of these complex systems, far from overwhelming consumers, only creates demand for even more complex systems.

The growing sophistication of applications continually pushes the design and manufacturing of integrated circuits and electronic systems to new levels of complexity. And perhaps the most amazing characteristic of this collection of systems is its variety: as systems become more complex, we build not a few general-purpose computers but an ever wider range of special-purpose systems. Our ability to do so is a testament to our growing mastery of both integrated circuit manufacturing and design, but the increasing demands of customers continue to test the limits of design and manufacturing.

ASIC

An Application-Specific Integrated Circuit (ASIC) is an integrated circuit (IC) customized for a particular use, rather than intended for general-purpose use. For example, a chip designed solely to run a cell phone is an ASIC. Intermediate between ASICs and industry standard integrated circuits, like the 7400 or the 4000 series, are application specific standard products (ASSPs).

As feature sizes have shrunk and design tools improved over the years, the maximum complexity (and hence functionality) possible in an ASIC has grown from 5,000 gates to over 100 million. Modern ASICs often include entire 32-bit processors, memory blocks including ROM, RAM, EEPROM, Flash and other large building blocks. Such an ASIC is often termed a SoC (system-on-a-chip). Designers of digital ASICs use a hardware description language (HDL), such as Verilog or VHDL, to describe the functionality of ASICs.

Field-programmable gate arrays (FPGAs) are the modern-day technology for building a breadboard or prototype from standard parts; programmable logic blocks and programmable interconnects allow the same FPGA to be used in many different applications. For smaller designs and/or lower production volumes, FPGAs may be more cost effective than an ASIC design even in production.

A structured ASIC falls between an FPGA and a standard cell-based ASIC.
Structured ASICs are used mainly for mid-volume level designs.
The design task for structured ASICs is to map the circuit into a fixed arrangement of known cells.

CHAPTER 5
XILINX

Migrating Projects from Previous ISE Software Releases

When you open a project file from a previous release, the ISE software prompts you to migrate your project. If you click Backup and Migrate or Migrate Only, the software automatically converts your project file to the current release. If you click Cancel, the software does not convert your project and, instead, opens Project Navigator with no project loaded.

Note: After you convert your project, you cannot open it in previous versions of the ISE software, such as the ISE 11 software. However, you can optionally create a backup of the original project as part of project migration, as described below.

To Migrate a Project
1. In the ISE 12 Project Navigator, select File > Open Project.
2. In the Open Project dialog box, select the .xise file to migrate.
Note: You may need to change the extension in the Files of type field to display .npl (ISE 5 and ISE 6 software) or .ise (ISE 7 through ISE 10 software) project files.
3. In the dialog box that appears, select Backup and Migrate or Migrate Only.
4. The ISE software automatically converts your project to an ISE 12 project.
Note: If you chose to Backup and Migrate, a backup of the original project is created at project_name_ise12migration.zip.
5. Implement the design using the new version of the software.
Note: Implementation status is not maintained after migration.

Properties
For information on properties that have changed in the ISE 12 software, see ISE 11 to ISE 12 Properties Conversion.

IP Modules
If your design includes IP modules that were created using CORE Generator software or Xilinx Platform Studio (XPS) and you need to modify these modules, you may be required to update the core.
However, if the core netlist is present and you do not need to modify the core, updates are not required and the existing netlist is used during implementation.

Obsolete Source File Types
The ISE 12 software supports all of the source types that were supported in the ISE 11 software.

If you are working with projects from previous releases, state diagram source files (.dia), ABEL source files (.abl), and test bench waveform source files (.tbw) are no longer supported. For state diagram and ABEL source files, the software finds an associated HDL file and adds it to the project, if possible. For test bench waveform files, the software automatically converts the TBW file to an HDL test bench and adds it to the project. To convert a TBW file after project migration, see Converting a TBW File to an HDL Test Bench.

Using ISE Example Projects

To help familiarize you with the ISE software and with FPGA and CPLD designs, a set of example designs is provided with Project Navigator. The examples show different design techniques and source types, such as VHDL, Verilog, schematic, or EDIF, and include different constraints and IP.

To Open an Example
1. Select File > Open Example.
2. In the Open Example dialog box, select the Sample Project Name.
Note: To help you choose an example project, the Project Description field describes each project.
In addition, you can scroll to the right to see additional fields, which provide details about the project.
3. In the Destination Directory field, enter a directory name or browse to the directory.
4. Click OK.

The example project is extracted to the directory you specified in the Destination Directory field and is automatically opened in Project Navigator. You can then run processes on the example project and save any changes.

Note: If you modified an example project and want to overwrite it with the original example project, select File > Open Example, select the Sample Project Name, and specify the same Destination Directory you originally used. In the dialog box that appears, select Overwrite the existing project and click OK.

Creating a Project

Project Navigator allows you to manage your FPGA and CPLD designs using an ISE project, which contains all the source files and settings specific to your design. First, you must create a project and then add source files and set process properties. After you create a project, you can run processes to implement, constrain, and analyze your design. Project Navigator provides a wizard to help you create a project as follows.

Note: If you prefer, you can create a project using the New Project dialog box instead of the New Project Wizard. To use the New Project dialog box, deselect the Use New Project wizard option in the ISE General page of the Preferences dialog box.

To Create a Project
1. Select File > New Project to launch the New Project Wizard.
2. In the Create New Project page, set the name, location, and project type, and click Next.
3. For EDIF or NGC/NGO projects only: In the Import EDIF/NGC Project page, select the input and constraint file for the project, and click Next.
4. In the Project Settings page, set the device and project properties, and click Next.
5. In the Project Summary page, review the information, and click Finish to create the project.

Project Navigator creates the project file (project_name.xise) in the directory you specified. After you add source files to the project, the files appear in the Hierarchy pane of the Design panel. Project Navigator manages your project based on the design properties (top-level module type, device type, synthesis tool, and language) you selected when you created the project. It organizes all the parts of your design and keeps track of the processes necessary to move the design from design entry through implementation to programming the targeted Xilinx device.

Note: For information on changing design properties, see Changing Design Properties.

You can now perform any of the following:
Create new source files for your project.
Add existing source files to your project.
Run processes on your source files.
Modify process properties.

Creating a Copy of a Project

You can create a copy of a project to experiment with different source options and implementations. Depending on your needs, the design source files for the copied project and their location can vary as follows:
Design source files are left in their existing location, and the copied project points to these files.
Design source files, including generated files, are copied and placed in a specified directory.
Design source files, excluding generated files, are copied and placed in a specified directory.

Copied projects are the same as other projects in both form and function. For example, you can do the following with copied projects:
Open the copied project using the File > Open Project menu command.
View, modify, and implement the copied project.
Use the Project Browser to view key summary data for the copied project and then open the copied project for further analysis and implementation, as described in Using the Project Browser.

Note: Alternatively, you can create an archive of your project, which puts all of the project contents into a ZIP file. Archived projects must be unzipped before being opened in Project Navigator. For information on archiving, see Creating a Project Archive.

To Create a Copy of a Project
1. Select File > Copy Project.
2. In the Copy Project dialog box, enter the Name for the copy.
Note: The name for the copy can be the same as the name for the project, as long as you specify a different location.
3. Enter a directory Location to store the copied project.
4. Optionally, enter a Working directory.
By default, this is blank, and the working directory is the same as the project directory. However, you can specify a working directory if you want to keep your ISE project file (.xise extension) separate from your working area.
5. Optionally, enter a Description for the copy.
The description can be useful in identifying key traits of the project for reference later.
6. In the Source options area, do the following:
Select one of the following options:
Keep sources in their current locations - to leave the design source files in their existing location. If you select this option, the copied project points to the files in their existing location. If you edit the files in the copied project, the changes also appear in the original project, because the source files are shared between the two projects.
Copy sources to the new location - to make a copy of all the design source files and place them in the specified Location directory. If you select this option, the copied project points to the files in the specified directory. If you edit the files in the copied project, the changes do not appear in the original project, because the source files are not shared between the two projects.
Optionally, select Copy files from Macro Search Path directories to copy files from the directories you specify in the Macro Search Path property in the Translate Properties dialog box. All files from the specified directories are copied, not just the files used by the design.
Note: If you added a netlist source file directly to the project as described in Working with Netlist-Based IP, the file is automatically copied as part of Copy Project because it is a project source file. Adding netlist source files to the project is the preferred method for incorporating netlist modules into your design, because the files are managed automatically by Project Navigator.
Optionally, click Copy Additional Files to copy files that were not included in the original project. In the Copy Additional Files dialog box, use the Add Files and Remove Files buttons to update the list of additional files to copy. Additional files are copied to the copied project location after all other files are copied.
To exclude generated files from the copy, such as implementation results and reports, select Exclude generated files from the copy. When you select this option, the copied project opens in a state in which processes have not yet been run.
7. To automatically open the copy after creating it, select Open the copied project.
Note: By default, this option is disabled. If you leave this option disabled, the original project remains open after the copy is made.
Click OK.

Creating a Project Archive

A project archive is a single, compressed ZIP file with a .zip extension. By default, it contains all project files, source files, and generated files, including the following:
User-added sources and associated files
Remote sources
Verilog `include files
Files in the macro search path
Generated files
Non-project files

To Archive a Project
1. Select Project > Archive.
2. In the Project Archive dialog box, specify a file name and directory for the ZIP file.
3. Optionally, select Exclude generated files from the archive to exclude generated files and non-project files from the archive.
4. Click OK.

A ZIP file is created in the specified directory. To open the archived project, you must first unzip the ZIP file, and then you can open the project.

Note: Sources that reside outside of the project directory are copied into a remote_sources subdirectory in the project archive. When the archive is unzipped and opened, you must either specify the location of these files in the remote_sources subdirectory for the unzipped project, or manually copy the sources into their original location.



CONCLUSION

A configurable Booth multiplier has been designed which provides flexible arithmetic capacity and a tradeoff between output precision and power consumption. Moreover, the ineffective circuitry can be efficiently deactivated, thereby reducing power consumption and increasing speed of operation. The experimental results have shown that the proposed multiplier outperforms both the conventional Radix-2 and Radix-4 Booth multipliers in terms of power and speed of operation, with sufficient accuracy, at the expense of extra area.

REFERENCES

[1]. J. Choi, J. Jeon, and K. Choi, "Power minimization of function units by partially guarded computation," in Proc. Int. Symp. Low Power Electron. Des., Jul. 2000, pp. 131-136.
[2]. A. Fayed and M. A. Bayoumi, "A novel architecture for low-power design of parallel multipliers," in Proc. IEEE Comput. Soc. Annu. Workshop VLSI, Apr. 2001, pp. 149-154.
[3]. N. Honarmand and A. A. Kusha, "Low power minimization combinational multipliers using data-driven signal gating," in Proc. IEEE Int. Conf. Asia-Pacific Circuits Syst., Dec. 2006, pp. 1430-1433.
[4]. K.-H. Chen and Y.-S. Chu, "A spurious-power suppression technique for multimedia/DSP applications," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 56, no. 1, pp. 132-143, Jan. 2009.
[5]. T. Yamanaka and V. G. Moshnyaga, "Reducing energy of digital multiplier by adjusting voltage supply to multiplicand variation," in Proc. 46th IEEE Midwest Symp. Circuits Syst., Dec. 2003, pp. 1423-1426.
[6]. N.-Y. Shen and O. T.-C. Chen, "Low-power multipliers by minimizing switching activities of partial products," in Proc. IEEE Int. Symp. Circuits Syst., May 2002, vol. 4, pp. 93-96.
[7]. M. J. Schulte and E. E. Swartzlander Jr., "Truncated multiplication with correction constant," in Proc. Workshop VLSI Signal Process., Oct. 1993, pp. 388-396.
[8]. S. J. Jou, M. H. Tsai, and Y. L. Tsao, "Low-error reduced-width Booth multipliers for DSP applications," IEEE Trans. Circuits Syst. I, Fundam. Theory Appl., vol. 50, no. 11, pp. 1470-1474, Nov. 2003.
[9]. K. J. Cho, K. C. Lee, J. G. Chung, and K. K. Parhi, "Design of low-error fixed-width modified Booth multiplier," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 12, no. 5, pp. 522-531, May 2004.
[10]. T.-B. Juang and S.-F. Hsiao, "Low-power carry-free fixed-width multipliers with low-cost compensation circuit," IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process., vol. 52, no. 6, pp. 299-303, Jun. 2005.
[11]. Shiann-Rong Kuang and Jiun-Ping Wang, "Design of power-efficient configurable Booth multiplier," IEEE Trans. Circuits Syst. I, Reg. Papers, vol. 57, no. 3, pp. 568-580, Mar. 2010.
[12]. T. Kitahara, F. Minami, T. Ueda, K. Usami, S. Nishio, M. Murakata, and T. Mitsuhashi, "A clock-gating method for low-power LSI design," in Proc. Int. Symp. Low Power Electron. Des., Feb. 1998, pp. 307-312.
[13]. X. Chang, M. Zhang, G. Zhang, Z. Zhang, and J. Wang, "Adaptive clock gating technique for low power IP core in SOC design," in Proc. IEEE Int. Symp. Circuits Syst., May 2007, pp. 2120-2123.
[14]. L. Dadda, "Some schemes for parallel multipliers," Alta Freq., vol. 34, pp. 349-356, May 1965.
[15]. C. Wallace, "A suggestion for a fast multiplier," IEEE Trans. Electron. Comput., vol. EC-13, no. 1, pp. 14-17, Feb. 1964.