REVIEW ON OPTIMIZATION TECHNIQUES OF WALLACE TREE …

9
32 REVIEW ON OPTIMIZATION TECHNIQUES OF WALLACE TREE MULTIPLIER K Padma Priya 1 *, P Sai Arun Kumar 1 , B Mounika 1 , P Lavanya 1 , D Sivananda Das 1 and B Buchi Babu 1 *Corresponding Author: K Padma Priya, [email protected] In present generation, VLSI systems and their design became so much important in the Electronic Engineering. As the VLSI design is the basis for electronic components, one should optimise the constraints. For high performance systems such as digital signal processors and microprocessors, Multiplier is the key hardware block. In designing VLSI systems, the main criteria of interest are high speed, low power, less area, low cost. Many researchers have tried and are trying to design multipliers which offer either of the following design targets-high speed, low power consumption, regularity of layout and hence less area or even combination of them in one multiplier, thus making them suitable for various compact VLSI implementations. Improving speed results always in larger areas. So, this paper provides the techniques that optimise the area and speed up the multiplication process. Keywords: Multiplier, Wallace tree structure, Compressors, Sklansky adder INTRODUCTION Most digital signal processing algorithms involves arithmetic operations which dominates the execution time. The dominating factor in determining instruction cycle time of Reduced Instruction Set Computers (RISC) and DSP chip is the time to execute multiplication operation (Lakshmanan et al., 2002). Among the many methods of implementing parallel multipliers, there are two basic approaches namely Booth algorithm and Wallace tree compressors. In Wallace tree ISSN 2319 – 2518 www.ijeetc.com Vol. 3, No. 2, April 2014 © 2014 IJEETC. All Rights Reserved Int. J. Elec&Electr.Eng&Telecoms. 2014 1 ECE Department, JNTUKUCEV, Vizianagaram, Andhra Pradesh, India. structure multipliers, the multiplier array becomes large and requires more number of logic gates and interconnecting wires. This makes the chip design complex and slow down the operating speed (Lakshmanan et al. , 2002). This paper describes about the pitfalls of conventional Wallace tree structure and designing techniques of high speed Wallace tree multiplier. It will describe the speeding techniques of conventional Wallace Tree multiplication structure which focus mainly on the acceleration of the process (Wallace, Review Article

Transcript of REVIEW ON OPTIMIZATION TECHNIQUES OF WALLACE TREE …

Page 1: REVIEW ON OPTIMIZATION TECHNIQUES OF WALLACE TREE …

32

Int. J. Elec&Electr.Eng&Telecoms. 2014 K Padma Priya et al., 2014

REVIEW ON OPTIMIZATION TECHNIQUES OFWALLACE TREE MULTIPLIER

K Padma Priya1*, P Sai Arun Kumar1, B Mounika1, P Lavanya1,D Sivananda Das1 and B Buchi Babu1

*Corresponding Author: K Padma Priya,[email protected]

In present generation, VLSI systems and their design became so much important in theElectronic Engineering. As the VLSI design is the basis for electronic components, one shouldoptimise the constraints. For high performance systems such as digital signal processors andmicroprocessors, Multiplier is the key hardware block. In designing VLSI systems, the maincriteria of interest are high speed, low power, less area, low cost. Many researchers have triedand are trying to design multipliers which offer either of the following design targets-high speed,low power consumption, regularity of layout and hence less area or even combination of them inone multiplier, thus making them suitable for various compact VLSI implementations. Improvingspeed results always in larger areas. So, this paper provides the techniques that optimise thearea and speed up the multiplication process.

Keywords: Multiplier, Wallace tree structure, Compressors, Sklansky adder

INTRODUCTIONMost digital signal processing algorithmsinvolves arithmetic operations whichdominates the execution time. The dominatingfactor in determining instruction cycle time ofReduced Instruction Set Computers (RISC)and DSP chip is the time to executemultiplication operation (Lakshmanan et al.,2002). Among the many methods ofimplementing parallel multipliers, there are twobasic approaches namely Booth algorithm andWallace tree compressors. In Wallace tree

ISSN 2319 – 2518 www.ijeetc.comVol. 3, No. 2, April 2014

© 2014 IJEETC. All Rights Reserved

Int. J. Elec&Electr.Eng&Telecoms. 2014

1 ECE Department, JNTUKUCEV, Vizianagaram, Andhra Pradesh, India.

structure multipliers, the multiplier arraybecomes large and requires more number oflogic gates and interconnecting wires. Thismakes the chip design complex and slow downthe operating speed (Lakshmanan et al.,2002). This paper describes about the pitfallsof conventional Wallace tree structure anddesigning techniques of high speed Wallacetree multiplier. It will describe the speedingtechniques of conventional Wallace Treemultiplication structure which focus mainly onthe acceleration of the process (Wallace,

Review Article

Page 2: REVIEW ON OPTIMIZATION TECHNIQUES OF WALLACE TREE …

33

Int. J. Elec&Electr.Eng&Telecoms. 2014 K Padma Priya et al., 2014

1964). Acceleration of the process basedpurely on the following aspects of normalmultiplication:

• Reduction in the number of summands;

• Acceleration of formation of summands;

• Acceleration of the addition of summands(Wallace, 1964).

It consists of sections which describe aboutmultiplier, Wallace Tree multiplier, pitfalls ofconventional Wallace Tree multiplier anddesigning techniques of high speed WallaceTree multiplier.

MULTIPLIERMultiplication is hardware intensive and fastmultipliers are essential for digital signalprocessing systems. General purpose DSPscan afford to have customized highperformance multipliers. Application specificDSPs implement algorithms as directrealization of their floe graphs, thus havingseveral multipliers, each one with differentspecifications (Jalil Fadavi-Ardekani, 1993).Multiplication is essential for computers,process controllers, microprocessors, digitalsignal processing and graphic engines.System performance is determined byperformance of multiplier because it is slowestelement and area consuming. Multiplicationoperation is basically a shift and addoperation. Single bit multiplication isequivalent to logical AND operation. Multi bitmultiplication is of AND, ADD, SHIFToperations. There are number of techniquesthat can be used to perform multiplication. Ingeneral, the factors such as latency, throughput,area and design complexity are the basis ofthe choice (Neil Weste et al., 2005).

Digital multipliers can be classified intoserial, parallel and serial-parallel multipliers.Parallel multipliers are again of two types, Arraymultipliers and tree multipliers. The selectionof multiplier depends upon the application. Thebelow Figure 1 describes the classification ofmultipliers.

Figure 1: Classification of Multipliers

WALLACE TREE MULTIPLIERProfessor Christopher Stewart Wallacedevised the Wallace Tree Multiplier algorithmin 1964. The Wallace tree multiplier isconsiderably faster than a simple arraymultiplier and is an efficient implementation ofa digital circuit that multiplies two integers. TheFigure 2 shows the Wallace proposed treearchitecture (Wallace, 1964) and Figure 3shows structural representation of 4 x 4multiplier.

Features of Wallace Tree Multiplier• Optimized column adder tree.

• Combines all partial products into twovectors (carry and sum).

• Carry and sum outputs combined using aconventional adder.

Page 3: REVIEW ON OPTIMIZATION TECHNIQUES OF WALLACE TREE …

34

Int. J. Elec&Electr.Eng&Telecoms. 2014 K Padma Priya et al., 2014

• Wallace tree multiplier uses Wallace tree tocombine 1 X M partial products.

• Irregular routing.

Components• Half adder

• Full adder

• Ripple carry adder

• Carry propagating adder

Operation Used in Wallace TreeMultiplierAs an example for the description of operationand logic used in the Wallace Treemanipulation, the below Figure 4 shows 8-bitmultiplier constructed using Wallace Treearchitecture. Partial product are added in 6steps. In the Wallace Tree Method shownbelow, the stages are reduced by usingcomponents like half adders and full adders(Naveen Gahlan et al., 2012).

PITFALLS INCONVENTIONAL WALLACETREE MULTIPLIER• Wallace trees do not provide any advantage

over ripple adder trees in many FPGAs.

• Due to the irregular routing, they may actuallybe slower and are certainly more difficult toroute.

• Adder structure increases for increased bitmultiplication.

DESIGNING TECHNIQUES OFHIGH SPEED WALLACE TREEMULTIPLIERRealizing high speed multiplier can be doneby enhancing the parallelism so that the

Figure 2: Wallace Tree Structure

Figure 3: 4 x 4 Multiplier-StructuralRepresentation

Page 4: REVIEW ON OPTIMIZATION TECHNIQUES OF WALLACE TREE …

35

Int. J. Elec&Electr.Eng&Telecoms. 2014 K Padma Priya et al., 2014

subsequent calculation stages will decrease.To improve the functionality of the conventionalWallace tree multiplier, we here studied anddescribed some methods to reduce powerand increase the speed to the maximum extentand one can implement the below methodsand can experience the improvement in theconstraints related to the conventional Wallacetree multiplier. They can be done in two waysmainly. They are:

1. Design by using compressors and sklanskyadder (Novel Architecture) (Vinoth et al.,2011).

2. Design using pipelined architecture (TajulHamimi Harun, 2007).

Design of High Speed and LowPower Wallace Tree StructureUsing Compressors and SklanskyAdder (Novel Structure)

Conventional Wallace TreeStructure vs. Novel StructureIn conventional 8 bit Wallace tree multiplierdesign, more number of addition operationsis required. Three partial product terms canbe added at a time to form the carry and sumusing the carry save adder. At the next level,the sum signal is used by the full adder. Thecarry signal is used by the adder involved inthe next output bit generation with a resultingoverall delay which is proportional to log 3/2 n,for n number of rows. In the first and secondstages of the Wallace structure, the partialproducts do not depend upon any values otherthan the inputs obtained from the AND array.However, the final value depends on the carryout value of previous stage for the intermediatehigher stages. This operation is repeatedsequentially for the consecutive stages. Finally,it is observed that major cause of delay ispropagation of the carry out from the previousstage to the next stage. In conventionalWallace Tree multiplier structure, the totalnumber of stages in the critical path sums upto 13. Each full adder used accounts for alatency of 2. Total latency of the whole structuregives a total of 26. The latency count getsincreased by 1, when considered the andarray, which results the total latency.

Where the novel structure aims to reducethe overall latency. This increases the speedand reduces power consumption. This designmakes use of compressors instead of the fulladders and the final stage is replaced by asklansky tree adder. In first stage, consists of

Figure 4: Wallace Tree Multiplier ReductionStages for 8 x 8 Multiplication

Page 5: REVIEW ON OPTIMIZATION TECHNIQUES OF WALLACE TREE …

36

Int. J. Elec&Electr.Eng&Telecoms. 2014 K Padma Priya et al., 2014

full adder. In the second stage, two full addersare grouped using 4:2 compressors andimplemented. Then, in third stage, using a 5:2compressor, which is a combination of 3 fulladders and so on. In this way, the individualblocks of full adders in the original structureare grouped and implemented with the help ofcompressors. The use of sklansky adder in thestructure further results in a reduced latency of

6 with a latency of 1 for the AND array. Thisnovel structure, at a whole bring down thelatency count to 15. Thus, a significant latencyreduction of 44.4% is realized thanconventional structure. The figures (Figure 5shows critical path of conventional treestructure (Vinoth et al., 2011) and Figure 6shows novel architecture (Vinoth et al., 2011)).

CompressorsThe half adder and full adder contributes moredelay in circuit and this total delay is linear tothe number of addition stages. Now a days,for high speed multipliers, compressors arewidely used which lower the latency of partialproduct accumulation stage. The proposedarchitecture of Wallace tree multiplier isfocused on minimization of delay in partialproduct reduction stage using compressors.

A compressor is single bit adder circuit. Ahigh speed compressor can perform more thanthree bit addition in parallel. Realization andreduction in number of partial product additionstages can be done using multi bitcompressors. 3:2 is combinational circuitwhich sum up three binary inputs of one bit andgives sum and carry output of one bit. It isequivalent to full adder.

321 XXXSum

Carry = X1X2 + X2X3 + X3X1

4:2 compressor has 4 inputs (X1, X2, X3and X4) and 2 outputs (sum and carry) alongwith carry-in and carry-out. It consists of two3:2 compressors cascaded in series. Thisinvolves a critical path delay of three X-ORgates and hence reducing the critical path byone X-OR delay. 5:2 compressor is widely usedbuilding block in high speed and highprecision multipliers. For an example of the

Figure 5: Critical Path in ConventionalWallace Tree

Figure 6: Proposed Wallace Tree UnderNovel Architecture

Page 6: REVIEW ON OPTIMIZATION TECHNIQUES OF WALLACE TREE …

37

Int. J. Elec&Electr.Eng&Telecoms. 2014 K Padma Priya et al., 2014

compressors, the figures Figure 7 shows theblock diagram (Karthick et al., 2012) andFigure 8 shows architecture of 5:2 compressor(Karthick et al., 2012).

is required, tree structures like parallel prefixadders such as Kogge-stone, Sklansky, Brent-kung, Han-carlson and Kogge-stone with lingdamadders (Dakupati Ravi Sankar and ShaikAshraf Ali, 2013). The final stage for thesummation of the partial products can befurther minimised by the usage of tree addersin the Wallace tree multiplier, which reducesthe power consumption preliminarily and alsoincreases the speed of the operation. Thefundamental concept of tree adders extendfrom the idea of carry look-ahead computationand the class of parallel carry look-aheadschemes. These architectures give so manyhigh-performance applications. Because of thelower power consumption than incurred by theother tree structures, here the sklansky addertree is preferred (Vinoth et al., 2011). TheFigure 9 shows sklansky tree adder structure(Vinoth et al., 2011).

Figure 7: Block Diagram of 5:2 Compressor

Figure 8: High Speed 5:2 Compressor

Sklansky AdderBinary addition is one of the most primitive andmost commonly used applications in computerarithmetic. A large variety of algorithms andimplementations have been proposed forbinary addition. When high speed operation

Figure 9: Sklansky Tree Adder Structure

In the above figure, the binary trees of cellsfirst generate all the carry input bitssimultaneously. This architecture follows aregular pattern. The delay is ultimately reducedto log2N by computing intermediate prefixesalong with the large group prefixes. Here, Nrepresents the number of bits. This also results

Page 7: REVIEW ON OPTIMIZATION TECHNIQUES OF WALLACE TREE …

38

Int. J. Elec&Electr.Eng&Telecoms. 2014 K Padma Priya et al., 2014

in an increased number of fan-outs at eachlevel. In Figure 10 the black squarescorrespond to AND-OR-AND logic and in theFigure 11 grey squares correspond to AND-OR logic. The triangular box represents buffers(Vinoth et al., 2011).

multiplier, it has been proven that the pipeliningmethod could increased the speed of themultiplier. However, the area of the multiplierdesigned is also increased because of the 4-stages D flip-flop that has been implementedin the design. Basically, the objective of thisproject is mainly focus on the speed of theWallace Tree multiplier, so, the increased areaof the multiplier does not take into count.Otherwise, the high speed 8-bits x 8-bits (TajulHamimi Harun, 2007).

The conventional circuit produces themaximum speed of 14.99 MHz or maximumdelay of 66.7 nanoseconds to complete oneprocess of 8-bits x 8-bits multiplication.However, after upgrading the conventionalWallace Tree into pipeline Wallace Tree, byadding D flip-flop stages at every input of the

Figure 10: Gray Cell

Source: Dakupati Ravi Sankar and Shaik Ashraf Ali (2013)

Figure 11: Blackcell

Source: Dakupati Ravi Sankar and Shaik Ashraf Ali (2013)

Design Using Pipelined StructureThe Wallace tree architecture under pipelinedstructure is studied and compared with theconventional. As results are observed, aftercomplete analysis of the Wallace Tree

Figure 12: Conventional 8 x 8 WTArchitecture

Page 8: REVIEW ON OPTIMIZATION TECHNIQUES OF WALLACE TREE …

39

Int. J. Elec&Electr.Eng&Telecoms. 2014 K Padma Priya et al., 2014

half adder and the full adder, the speed hasimprove much, increased to 54.05 MHz andthe maximum delay has decreased to 18.3nanoseconds. It was three times faster thanthe conventional design. The Figures 12 and13 shows conventional and high speedarchitectures.

REFERENCES1. Chip-Hong Chang, Jiangmin Gu and

Mingyan Zhang (2004), “Ultra Low VoltageLow Power CMOS 4-2 and 5-2Compressors for Fast ArithmeticCircuits”, IEEE Transactions on Circuitsand Systems-I: Regular Papers, Vol. 51,No. 10.

2. Dakupati Ravi Sankar and Shaik AshrafAli (2013), “Design of Wallace Tree

Figure 13: 8 x 8 High Speed WT UnderPipelined Structure

Multiplier by Sklansky Adder”,International Journal of EngineeringResearch and Applications (IJERA),Vol. 3, No. 1, pp. 1036-1040.

3. “Implementation of 8 x 8 Wallace TreeMultiplier Using VHDL”, A Project Reportof Students of Bapatla EngineeringCollege (available on internet).

4. Jalil Fadavi-Ardekani (1993), “Mxn BoothEncode Multiplier Generator UsingOptimised Wallace Trees”, IEEETransactions on Very Large ScaleIntegrated Systems, Vol. 1, No. 2.

5. Jasbir Kaur and Kavita (2013), “StructuralVHDL Implementation of Wallace TreeMultiplier”, Vol. 4, No. 4, ISSN: 2229-5518.

6. Karthick S, Karthika S and Valarmathy S(2012), “Design and Analysis of LowPower Compressors”, Vol. 1, No. 6,ISSN: 2278-8875.

7. Lakshmanan, Masuri Othman andMohamad Alauddin Mohd. Ali (2002),“High Performance Parallel MultiplierUsing Wallace-Booth Algorithm”.

8. Naveen Kr. Gahlan et al. (2012),“Implementation of Wallace Tree MultiplierUsing Compressor”, Int. J. ComputerTechnology & Applications, Vol. 3,No. 3, pp. 1194-1199.

9. Neil H E Weste, David Harris and AyanBanerjee (2005), “CMOS VLSI Design”,A Circuits and Systems Perspective, 3rd

Edition.

10. Paul F Stelling, Vojin G Oklobdzija andRavi R (1996), “Optimal Circuits forParallel Multiplier”, IEEE Transactions onComputers, Vol. 47, No. 3.

Page 9: REVIEW ON OPTIMIZATION TECHNIQUES OF WALLACE TREE …

40

Int. J. Elec&Electr.Eng&Telecoms. 2014 K Padma Priya et al., 2014

11. Tajul Hamimi Harun (2007), “High Speed8 x 8 Wallace Tree Multiplier”, A ReportSubmitted in 2007 (from internet).

12. Vinoth C, Kanchana Bhaaskaran V S,Brindha B, Sakthikumaran S, KavinilavuV, Bhaskar B, Kanagasabapathy M andSharath B (2011), “A Novel Low Power

and High Speed Wallace Tree Multiplierfor RISC Processor”, 978-1-4244-8679-3/11.

13. Wallace C S (1964), “A Suggestion forFast Multipliers”, IEEE Transactions onElectronic Computers, Vol. EC-13,February, pp. 14-17.