
A Novel Time and Energy Efficient Cubing Circuit using Vedic Mathematics for Finite Field Arithmetic

Ramalatha M@1, Thanushkodi K&2

@Electronics and Communication Dept, Karpaga Vinayaga College of Engineering,

&Principal, Akshaya College of Engineering Chennai, India

[email protected]

Deena Dayalan K#3, Dharani P#4

#Electronics and Communication Dept, Easwari Engineering College,

Chennai, India [email protected], [email protected]

Abstract- Cubing plays a vital role in secure communication systems, signal processing applications, finite field arithmetic, etc. As the radix of the number to be cubed increases, the process becomes more complicated, which in turn increases delay and power consumption. Vedic mathematics is an ancient system of mathematics that provides a fast and reliable approach to arithmetic operations using sixteen Sutras, or word-formulae. In this paper the Anurupya Vedic sutra is used for cubing operations with two different multiplier architectures, one array structured and one tree structured. The performance of these multipliers for cubing applications is compared on the basis of delay, power consumption and area utilization, and the results show that the Anurupya sutra reduces delay and area utilization in the designs compared.

Keywords- Cubing, Vedic mathematics, Wallace tree multipliers, Carry save multipliers.

I. INTRODUCTION

Vedic mathematics has a unique technique of calculation based on simple rules and principles with which any mathematical problem can be solved, be it arithmetic, algebra, geometry or trigonometry. The system is based on 16 Vedic Sutras, or aphorisms, which are word formulae describing natural ways of solving a whole range of mathematical problems.

The cubing operation is one of the most important arithmetic operations, and it becomes complicated for higher radix numbers. Cubing can be performed using ordinary multipliers, which are scalable but have a larger delay. Structure-based array implementations are faster, but scaling them increases design complexity as well as expense.

Moreover, multipliers occupy a large area, have long latency and consume considerable power. Therefore, multipliers that offer any one of the following design targets (scalability, reconfigurability, high speed, low power consumption, regularity of layout and small area), or a combination of some of them, are welcome. The Anurupya sutra of Vedic mathematics provides an efficient way of constructing a direct cubing system without using conventional multiplication methods.

II. VEDIC MATHEMATICS

For the purpose of cubing, there are two different sutras available in Vedic mathematics. One is the Yavadunam sutra and the other is the Anurupya sutra. In this paper we deal mainly with the Anurupya sutra.

‘Yavadunam’: Squaring, cubing, etc. of numbers have an individuality of their own, and they derive additional importance from their close connection with square roots, cube roots, etc. Viewed more broadly, the Yavadunam sutra is similar to the binomial theorem.

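‘Anurupya’: According to this sutra, $(ab)^3 = a^3 + 3a^2b + 3ab^2 + b^3$, where $a$ and $b$ are the digits of the number. Here (+) does not indicate ordinary addition: the terms are shifted relative to one another according to digit position before being summed. Hence the cube of a number can be found by applying the sutra.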
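One way to read the remark about (+) is as a positional sum. The symbols $N$, $r$ and $k$ below are introduced here for illustration and are not part of the original notation: $r$ is the radix and $k$ is the number of digit positions occupied by $b$. Under that assumption the sutra is the ordinary binomial expansion with each term weighted by its position:

```latex
\begin{aligned}
N   &= a\,r^{k} + b, \\
N^{3} &= a^{3}\,r^{3k} + 3a^{2}b\,r^{2k} + 3ab^{2}\,r^{k} + b^{3}.
\end{aligned}
```

With $r = 10$, $k = 1$ this corresponds to shifting each partial product by one decimal digit, and with $r = 2$, $k = 2$ to shifting by two binary places, as in the two examples that follow.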

Ex 1: Let us consider the two-digit decimal number 43. According to the sutra, $(ab)^3 = a^3 + 3a^2b + 3ab^2 + b^3$.

Step 1: Let $a = 4$ and $b = 3$.

Step 2: Applying the sutra, $(43)^3 = (4)^3 + 3(4)^2(3) + 3(4)(3)^2 + (3)^3$.

Step 3: Now add the partial products from the right, shifting each by one digit, i.e.

$b^3 = 3 \times 3 \times 3 = 27$
$3ab^2 = 3 \times 3 \times 3 \times 4 = 108$
$3a^2b = 3 \times 4 \times 3 \times 4 = 144$
$a^3 = 4 \times 4 \times 4 = 64$
$(ab)^3 = 27 + 108 \times 10 + 144 \times 10^2 + 64 \times 10^3 = 79507$

Ex 2: Let us consider the four-digit binary number 1111. According to the sutra, $(ab)^3 = a^3 + 3a^2b + 3ab^2 + b^3$.

Step 1: Let $a = 11$ and $b = 11$.

Step 2: Applying the sutra, with the coefficient 3 written as 11 in binary, $(1111)^3 = (11)^3 + 11(11)^2(11) + 11(11)(11)^2 + (11)^3$.

2009 International Conference on Advances in Recent Technologies in Communication and Computing

978-0-7695-3845-7/09 $25.00 © 2009 IEEE

DOI 10.1109/ARTCom.2009.227

873



[Figure 1 panels: Partial Product Matrix (Array structure); Reorganized Matrix (Wallace Tree Structure)]

Step 3: Now add the partial products from the right, shifting each by two places, i.e.

$b^3 = 11 \times 11 \times 11 = 11011$
$3ab^2 = 11 \times 11 \times 11 \times 11 = 1010001$
$3a^2b = 11 \times 11 \times 11 \times 11 = 1010001$
$a^3 = 11 \times 11 \times 11 = 11011$
$(ab)^3 = 110100101111$
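The two worked examples can be checked with a short software model. This is a minimal sketch and not the authors' VHDL design; the function name `anurupya_cube` and the parameters `base` and `shift` are hypothetical names chosen for illustration, where `shift` is the number of digit positions occupied by the low part `b`.

```python
def anurupya_cube(a: int, b: int, base: int, shift: int) -> int:
    """Cube the number a * base**shift + b from its four partial products.

    Assumption for illustration: `a` is the high part, `b` is the low part,
    and `b` occupies `shift` digit positions in the given `base`.
    """
    weight = base ** shift
    # Partial products listed from right (least significant) to left.
    partial_products = [b ** 3, 3 * a * b * b, 3 * a * a * b, a ** 3]
    total = 0
    for position, term in enumerate(partial_products):
        # Each term is shifted `shift` digit positions further than the previous one.
        total += term * weight ** position
    return total


# Decimal example: a = 4, b = 3, shifted by one decimal digit.
print(anurupya_cube(4, 3, base=10, shift=1))            # 79507
# Binary example: a = 11, b = 11 (binary), shifted by two binary places.
print(bin(anurupya_cube(0b11, 0b11, base=2, shift=2)))  # 0b110100101111
```

The four products inside `partial_products` are the points at which a hardware version needs multipliers, which is what motivates the comparison of multiplier structures in the next section.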

These operations show that multipliers are required. Numerous multiplication techniques are employed in processors and other applications; the choice among them is based on structure, ease of application and performance.

III. TYPES OF MULTIPLIERS

Multipliers can be broadly classified into structure-based and algorithm-based multipliers. Both regular and irregular structured multipliers have been identified, such as the carry save array multiplier, the Wallace tree multiplier and the Dadda tree multiplier. Here both regular and irregular layout multipliers were tried for our cubing circuit and their performance was analyzed.

A. Array Multipliers

Array multipliers can be implemented by directly mapping manual multiplication into hardware. The partial products are accumulated by an array of adder circuits. An $n \times n$ array multiplier requires $n(n-1)$ adders and $n^2$ AND gates. Array multipliers are slow because their critical path is long. Their main advantage is the regular structure, which leads to ease of layout and design.
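The mapping from manual multiplication to an array can be illustrated with a bit-level software model. This is only a sketch under stated assumptions: `array_multiply` is a hypothetical name, operands are unsigned `n`-bit integers, and the row-by-row accumulation is modeled with ordinary integer addition rather than with individual adder cells.

```python
from typing import Tuple


def array_multiply(x: int, y: int, n: int) -> Tuple[int, int]:
    """Multiply two unsigned n-bit integers row by row.

    Returns (product, and_gate_count). Each bit of each partial-product row
    is formed by one AND operation, so the count equals n * n.
    """
    and_gate_count = 0
    product = 0
    for i in range(n):                       # one row per multiplier bit
        y_bit = (y >> i) & 1
        row = 0
        for j in range(n):                   # one AND gate per bit of the row
            x_bit = (x >> j) & 1
            row |= (x_bit & y_bit) << j
            and_gate_count += 1
        product += row << i                  # row i is shifted i places, then accumulated
    return product, and_gate_count


print(array_multiply(13, 11, 4))             # (143, 16)
```

Because each row is added to the running total in sequence, the carries of one row must settle before the next row is complete, which is the software analogue of the long critical path noted above.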

B. Carry save array multipliers

The carry-save array multiplier uses an array of carry-save adders to accumulate the partial products and a carry-propagate adder to generate the final product. This reduces the critical path delay of the multiplier, since the carry-save adders pass the carry to the next level of adders and not to the adjacent ones.
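The carry-save idea can be sketched at word level as follows. The function names are hypothetical and the model works on Python integers rather than on gate-level signals; it assumes unsigned `n`-bit operands. Each carry-save step keeps the sum bits and the carry bits as two separate words, and carries are only propagated once, in the final addition.

```python
from typing import Tuple


def carry_save_add(a: int, b: int, c: int) -> Tuple[int, int]:
    """Add three words without propagating carries.

    Returns (sum_word, carry_word) such that sum_word + carry_word == a + b + c.
    """
    sum_word = a ^ b ^ c                               # per-bit sum
    carry_word = ((a & b) | (b & c) | (a & c)) << 1    # per-bit carry, moved to the next weight
    return sum_word, carry_word


def carry_save_multiply(x: int, y: int, n: int) -> int:
    """Accumulate the n partial products of x * y in carry-save form."""
    sum_word, carry_word = 0, 0
    for i in range(n):
        partial = (x << i) if (y >> i) & 1 else 0
        sum_word, carry_word = carry_save_add(sum_word, carry_word, partial)
    # Single carry-propagate addition at the end produces the final product.
    return sum_word + carry_word


print(carry_save_multiply(200, 150, 8))                # 30000
```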

C. Wallace Tree Multiplier

C. S. Wallace proposed a fast multiplication technique in 1964 [4]. A Wallace tree multiplier offers faster performance for large operands. Unlike in an array multiplier, the partial product matrix of a tree multiplier is rearranged in a tree-like format, reducing both the critical path and the number of adder cells needed. Fig. 1 shows the tree structure of the partial product matrix.

The Wallace tree multiplier belongs to a family of multipliers called column compression multipliers. The underlying principle in this family of multipliers is to achieve partial product accumulation by successively reducing the number of bits of information in each column using full adders or half adders. The full adder is known as a (3:2) compressor because of its ability to add 3 bits from a single column of the partial product matrix and output 2 bits, 1 bit in the same column and 1 bit in the next column of the output matrix. The half adder is known as a (2:2) compressor because of its ability to take 2 bits from a single column of the partial product matrix and output 2 bits, 1 bit in the same column and 1 bit in the next column of the output matrix.

Figure 1. Tree structure of partial product matrix.

The Wallace tree consists of numerous levels of such column compression structures, applied until only two full-width operands remain. These two operands can then be added using a fast carry-propagate adder to obtain the product. What differentiates the Wallace tree multiplier from other column compression multipliers is that in the Wallace tree every possible bit in every column is covered by the (3:2) or (2:2) compressors repeatedly until the partial product matrix has a depth of only 2. Thus the Wallace tree multiplier uses as much hardware as possible to compress the partial product matrix as quickly as possible into the final product.
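The column compression described above can be modeled in software as follows. This is a sketch in the spirit of the description, not a gate-accurate Wallace tree: all names are hypothetical, operands are assumed to be unsigned `n`-bit integers, and the final carry-propagate adder is modeled by integer addition. At each level, every column is reduced with (3:2) compressors, a (2:2) compressor handles a remaining pair, and a single leftover bit passes through unchanged.

```python
from typing import List, Tuple


def full_adder(a: int, b: int, c: int) -> Tuple[int, int]:
    """(3:2) compressor: three bits in, (sum, carry) out."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def half_adder(a: int, b: int) -> Tuple[int, int]:
    """(2:2) compressor: two bits in, (sum, carry) out."""
    return a ^ b, a & b


def compress_once(columns: List[List[int]]) -> List[List[int]]:
    """Apply one level of (3:2) and (2:2) compressors to every column."""
    reduced: List[List[int]] = [[] for _ in range(len(columns) + 1)]
    for k, column in enumerate(columns):
        bits = list(column)
        while len(bits) >= 3:
            s, c = full_adder(bits.pop(), bits.pop(), bits.pop())
            reduced[k].append(s)          # sum stays in the same column
            reduced[k + 1].append(c)      # carry moves to the next column
        if len(bits) == 2:
            s, c = half_adder(bits.pop(), bits.pop())
            reduced[k].append(s)
            reduced[k + 1].append(c)
        reduced[k].extend(bits)           # a single leftover bit passes through
    while len(reduced) > 1 and not reduced[-1]:
        reduced.pop()                     # drop unused high columns
    return reduced


def wallace_style_multiply(x: int, y: int, n: int) -> int:
    """Multiply by compressing the partial product matrix to depth 2."""
    # columns[k] holds every partial-product bit of weight 2**k.
    columns: List[List[int]] = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            columns[i + j].append(((x >> j) & 1) & ((y >> i) & 1))
    while max(len(column) for column in columns) > 2:
        columns = compress_once(columns)
    # Two rows remain; integer addition stands in for the carry-propagate adder.
    row_a = sum(column[0] << k for k, column in enumerate(columns) if len(column) > 0)
    row_b = sum(column[1] << k for k, column in enumerate(columns) if len(column) > 1)
    return row_a + row_b


print(wallace_style_multiply(200, 150, 8))   # 30000
```

Each pass through `compress_once` corresponds to one level of the tree, so the number of loop iterations gives a rough idea of the compression depth for a given operand width.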

D. 4:2 Column compression technique

A (4:2) compressor based multiplier forms a binary tree topology, which leads to a fast, symmetric and regular design. Due to this regularity, spurious transitions can also be reduced.

The use of a 4:2 compressor introduces regularity into the Wallace tree multiplier. This results in ease of layout, reduced power consumption and savings in the number of components used, since the 4:2 compressor compresses 4 bits to 2 bits.
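A common way to build a 4:2 compressor is from two cascaded full adders; the sketch below assumes that construction, which the paper does not spell out, and uses hypothetical signal names (`cin`, `cout`, `carry`). The compressor takes four bits of one column plus a carry-in from the neighbouring compressor, and produces a sum bit in the same column and two bits of the next higher weight.

```python
from itertools import product
from typing import Tuple


def full_adder(a: int, b: int, c: int) -> Tuple[int, int]:
    """Three bits in, (sum, carry) out."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def compressor_4_2(x1: int, x2: int, x3: int, x4: int, cin: int) -> Tuple[int, int, int]:
    """4:2 compressor assumed to be built from two cascaded full adders.

    Returns (sum_bit, carry, cout) with
    x1 + x2 + x3 + x4 + cin == sum_bit + 2 * (carry + cout).
    """
    intermediate, cout = full_adder(x1, x2, x3)   # cout does not depend on cin
    sum_bit, carry = full_adder(intermediate, x4, cin)
    return sum_bit, carry, cout


# Exhaustive check of the defining identity over all 32 input combinations.
for bits in product((0, 1), repeat=5):
    sum_bit, carry, cout = compressor_4_2(*bits)
    assert sum(bits) == sum_bit + 2 * (carry + cout)
```

Because `cout` is computed without `cin`, a row of such compressors does not form a ripple chain, which is one reason the structure stays fast while remaining regular.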

IV. IMPLEMENTATION

The multiplier algorithms were first implemented along with the Anurupya sutra to obtain a cubing module using the hardware description language VHDL.

The logic simulation was done using the MODELSIM simulator, and the synthesis and FPGA implementation were done using XILINX navigator. The design is optimized for speed and area using XILINX with the following target: device family Spartan-II, device xc2s100, package tq144, speed grade 5. The device consists of multiplexers and LUTs. The synthesis results are shown in Table I.

TABLE I. COMPARISON TABLE

| Multiplier | No. of Slices | Delay (ns) | Power (mW) |
| 8-bit Wallace | 206/1200 | 66.496 | 140.29 |
| 8-bit csm | 196/1200 | 58.477 | 140.29 |
| 8-bit Vedic Wallace | 181/1200 | 56.237 | 112.39 |
| 8-bit Vedic csm | 165/1200 | 29.016 | 282.69 |
| 8-bit 4:2 compression Vedic Wallace | 181/1200 | 56.237 | 108.80 |
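The relative changes implied by Table I can be computed directly from the tabulated values. The sketch below only restates the figures from the table; the dictionary layout and the helper `percent_reduction` are hypothetical names introduced for illustration.

```python
# Values copied from Table I: (slices used, delay in ns, power in mW).
results = {
    "8-bit Wallace": (206, 66.496, 140.29),
    "8-bit csm": (196, 58.477, 140.29),
    "8-bit Vedic Wallace": (181, 56.237, 112.39),
    "8-bit Vedic csm": (165, 29.016, 282.69),
    "8-bit 4:2 compression Vedic Wallace": (181, 56.237, 108.80),
}


def percent_reduction(baseline: float, value: float) -> float:
    """Reduction of `value` relative to `baseline`, in percent (negative means an increase)."""
    return 100.0 * (baseline - value) / baseline


pairs = [("8-bit Wallace", "8-bit Vedic Wallace"), ("8-bit csm", "8-bit Vedic csm")]
for conventional, vedic in pairs:
    for index, metric in enumerate(("slices", "delay", "power")):
        change = percent_reduction(results[conventional][index], results[vedic][index])
        print(f"{vedic} vs {conventional}: {metric} reduced by {change:.1f}%")
```

A negative percentage in the output indicates that the Vedic variant is higher than its conventional counterpart for that metric.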

Figure 2. Chart representing the result.

V. CONCLUSION

Figure 2 shows the synthesis results of the implementation of the Vedic multipliers. The tabulated values show that Vedic carry-save cubing is better than Vedic Wallace and Vedic 4:2 column compression in terms of delay (29.016 ns against 56.237 ns) and number of slices (165 against 181), although Table I lists a higher power figure for it (282.69 mW). For the Wallace designs, power consumption is also reduced when the Vedic algorithm is used (112.39 mW and 108.80 mW against 140.29 mW). Overall, cubing using the Vedic sutra performs better than conventional cubing in delay and area. This suggests that the cubing technique using Vedic mathematics is suitable for wide-bit multiplication, such as in public key cryptosystems.

REFERENCES

[1] Jagadguru Swami Sri Bharati Krishna Tirthaji Maharaja, “Vedic Mathematics”, Motilal Banarsidas, Varanasi, India, 1986.
[2] A. P. Nicholas, K. R. Williams, J. Pickles, “Lecture on Vedic Mathematics”, Spiritual Study Group, Roorkee, India, 1984.
[3] Bong-Il Park, In-Cheol Park and Chong Min Kyung, “A Regular Layout Structured Multiplier Based on Weighted Carry Save Adders”, Proceedings of the IEEE International Conference on Computer Design, 1999.
[4] C. S. Wallace, “A Suggestion for a Fast Multiplier”, IEEE Transactions on Electronic Computers, vol. EC-13, pp. 14-17, 1964.
[5] Himanshu Thapliyal and M. B. Srinivas, “High Speed Efficient NxN bit parallel Hierarchical Overlay Multiplier Architecture based on Ancient Indian Vedic Mathematics”, Enformatica V2, 2004.
[6] Himanshu Thapliyal and Hamid R. Arabnia, “A Time-Area-Power Efficient Multiplier and Square Architecture Based on Ancient Indian Vedic Mathematics”, Proceedings of VLSI04, Las Vegas, U.S.A., June 2004.
[7] Zhongde Wang, G. A. Jullien and W. C. Miller, “A New Design Technique for Column Compression Multipliers”, VLSI Research Group, University of Windsor, Ontario, Canada.
[8] Albert A. Liddicoat and Michael J. Flynn, “Parallel Square and Cube Computations”, Proceedings of the 34th Asilomar Conference on Signals, Systems, and Computers, California, October 2000.
[9] Himanshu Thapliyal, S. Kotiyal and M. B. Srinivas, “Design and Analysis of a Novel Parallel Square and Cube Architecture Based on Ancient Indian Vedic Mathematics”, Proceedings of the 48th IEEE International Midwest Symposium on Circuits and Systems (MWSCAS 2005), Cincinnati, Ohio, August 2005.

