Alliance for Aging | Answers for Aging€¦ · Alliance for Aging | Answers for Aging
Efficient Adaptive Hold Logic Aging-Aware Reliable NR4SD...
Transcript of Efficient Adaptive Hold Logic Aging-Aware Reliable NR4SD...
Efficient Adaptive Hold Logic Aging-Aware Reliable NR4SD
Multiplier Design
P.Vishnuteja and Dr. CH. Ravi kumar
M.Tech (VLSI & ES),Department of ECE, Prakasam Engineering college
Professor, Department of ECE, Prakasam Engineering college
Abstaract
Digital multipliers area unit among the most essential arithmetic purposeful devices.
the common performance of those systems depends upon at the turnout of the multiplier
factor. within the meantime, the negative bias temperature instability impact happens
whereas a pMOS junction transistor is beneath negative bias (Vgs = −Vdd), increasing the
edge voltage of the pMOS junction transistor, and reducing multiplier factor pace. an
identical development, positive bias temperature instability, happens once associate
degree nMOS junction transistor is beneath positive bias. every impact degrades junction
transistor pace and within the future, the device may additionally fail because of temporal
arrangement violations. Therefore, it's essential to style dependable high-overall
performance multipliers. during this paper, counsel associate degree aging-aware
multiplier factor model with a novel adaptive hold logic (AHL) circuit. The multiplier
factor is ready to supply higher turnout through the variable latency and should modify the
AHL circuit to mitigate overall performance degradation this is often thanks to the aging
impact. what is more, the planned structure is often applied to a Pre-Encoded NR4SD
multiplier factor.
Keywords: Adaptive hold logic (AHL), Positive bias temperature instability (PBTI),
Negative bias temperature instability (NBTI), Reliable multiplier.
1. Introduction Digital multipliers area unit amongst the most essential arithmetic purposeful units in
several applications, beside the separate cos transforms, digital filtering and Fourier
transform. The turnout of these applications depends on multipliers, and if the multipliers
area unit too gradual, the performance of complete circuits are reduced. Moreover,
negative bias temperature instability (NBTI) happens whereas a pMOS junction transistor
is to a lower place negative bias (Vgs = −Vdd). On this case, the interaction between
inversion layer holes and hydrogen-passivated Si atoms breaks the Si–H bond generated
at some stage within the oxidation method, generating H or H2 molecules. once those
molecules diffuse away, interface traps area unit left. The increased interface traps
between the gate compound interface and also silicon material end in improved threshold
voltage (Vth), decreasing the circuit change speed. once the biased voltage is eliminated,
the alternative response happens, reducing the NBTI impact. However, the reverse
response doesn't get obviate all the interface traps generated for the period of the strain
phase, and Vth is extended within the future. Therefore, it's necessary to style a
dependable superior multiplier factor. The corresponding impact on associate degree
nMOS junction transistor is positive bias temperature instability (PBTI), that happens
while associate degree nMOS junction transistor is to a lower place positive bias. As
compared with the NBTI impact, the PBTI impact is way smaller on oxide/polygate
transistors, and consequently is generally unseen.
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2371
However, for high-k/metal-gate nMOS transistors with huge rate saddlery, the PBTI
impact cannot be neglected. In reality, it's been shown that the PBTI impact is further
monumental than the NBTI impact on 32-nm high-k/metal-gate processes [1] – [4].
A traditional methodology to mitigate the aging impact is overdesign [5], [6], together
with such matters as guard-banding and gate oversizing; but this approach is often terribly
pessimistic and space and power inefficient. to stay off from this drawback, several NBTI-
aware methodologies were planned. associate degree NBTI-aware generation mapping
technique become planned in [7] to ensure the performance of the circuit throughout its
period.
In [8], associate degree NBTI-aware sleep junction transistor became designed to
reduce the ageing results on pMOS sleep-transistors, and also the period balance of the
power-gated circuits into consideration become improved. Wu and Marculescu [9] planned
a joint logic restructuring and pin rearrangement approach, that is based mostly all on
police investigation useful symmetries and junction transistor stacking effects. in addition,
they planned associate degree NBTI optimization technique that taken into thought path
sensitization [12]. In [10] and [11], dynamic voltage scaling and body-basing techniques
were planned to scale back power or extend circuit life. Those techniques, however, need
circuit amendment or don't provide optimization of distinctive circuits.
Traditional circuits use necessary path delay because the overall circuit clock cycle
with the intention to perform effectively. But the likelihood that the vital ways area unit
activated is low. In most cases, the trail delay is shorter than the vital route. For these
noncritical ways, the employment of the crucial path delay because the general cycle length
can bring on smart sized temporal arrangement waste. Therefore, the variable-latency style
becomes planned to scale back the temporal arrangement waste of ancient circuits. The
variable latency style divides the circuit into 2 parts: 1) shorter ways and 2) longer ways.
Shorter ways will execute effectively in one cycle, whereas longer ways want 2 cycles to
execute. while shorter ways area unit activated typically, the common latency of variable-
latency styles is more than that of ancient styles. As associate degree instance, various
variable-latency adders were projected victimization the hypothesis approach with
blunders detection and healing [13] – [15]. a brief path activation operates set of rules
become projected in [16] to boost the accuracy of the hold logic and to optimize the
performance of the variable-latency circuit. A coaching planning set of rules turned into
projected in [17] to schedule the operations on non-uniform latency sensible devices and
enhance the performance of terribly long instruction word processors. In [18], variable
latency pipelined multiplier factor design with a booth algorithmic rule became projected.
In [19], process-variant tolerant structure for arithmetic units was projected, wherever the
impact of method variation is taken into account to growth the circuit yield. Similarly, the
crucial ways area unit divided into 2 shorter ways that might be unequal and also the clock
cycle is close to the delay of the longer one. Those analysis styles are capable of reduce the
temporal arrangement waste of standard circuits to enhance performance, however they
failed to take into consideration the aging impact and could not alter themselves for the
period of the runtime. A variable-latency adder layout that considers the aging impact
become projected in [20] and [21]. However, no variable-latency multiplier factor style
that considers the aging impact and might alter dynamically has been accomplished.
1.1. Paper Contribution
In this paper, advise associate degree aging aware reliable multiplier factor style with
novel adaptive hold logic (AHL) circuit. The multiplier factor relies at the variable-latency
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2372
technique and would possibly alter the AHL circuit to realize reliable operation below the
have a control on of NBTI and PBTI results. To be precise, the contributions of this paper
area unit summarized as follows:
1) Novel variable-latency multiplier factor structure with associate degree AHL circuit.
The AHL circuit will verify whether or not or not the enter designs need one or 2 cycles
and might regulate the deciding criteria to form sure that there's minimum according
performance degradation when sizable aging occurs;
2) Complete analysis and distinction of the multiplier’s overall performance underneath
totally different skip numbers to reveal the effectiveness of our projected structure;
3) associate degree aging-aware reliable multiplier factor technique that's acceptable for
large multipliers. Despite the very fact that the take a look at is accomplished in 4-, 8-, 16-
and 32-bit multipliers, our projected design will be effortlessly prolonged to massive
designs;
4) The experimental outcomes show that our projected structure with the sixteen × sixteen
and 32 × 32 Non-Redundant radix-4 Signed Digit (NR4SD) multipliers will acquire
exceptional overall performance development compared with the sixteen ×16 and thirty
two × thirty two Non-Redundant radix-four Signed-Digit (NR4SD) multipliers.
The paper is ready as follows. phase a pair of introduces the overture of the Non-Redundant
radix-four Signed-Digit (NR4SD) multiplier factor and NBTI/PBTI models. phase three
details the aging-aware reliable multiplier factor based totally at the Non-Redundant radix-
four Signed-Digit (NR4SD) multiplier factor. The experimental results and comparisons
area unit equipped in phase four. phase five concludes this paper.
2. Overture
2.1. Modified Booth Algorithm
Modified Booth (MB) may be a redundant radix-4 secret writing technique [22], [23].
Considering the multiplication of the 2’s complement numbers A, B, all consisting of n=2k
bits, B is painted in MB sort as:
B = < bn-1 . . . b0 >2’s= -b2k-122k-1+
22
0
k
i
bi2i
= <bk-1MB…b0
MB>MB=
1
0
k
j
bjMB22j (1)
Digits bjMB ∈ {- 2,-1, 0, +1, +2}, 0 ≤ j ≤ k-1, are formed as follows:
bjMB = -2b2j+1+b2j+b2j-1 (2)
In which b-1 = 0. each MB digit is delineated by victimization the bits s, one and two
(table 1). The bit s suggests if the digit is negative (s=1) or positive (s=0). One indicates if
absolutely the fee of a digit equals 1 (one=1) or not (one=0). two suggests if fully the fee
of a digit equals 2 (two=1) or not (two=0). the employment of those bits, to calculate the
MB digits bjMB as follows:
bjMB= (-1)sj.(onej + 2twoj). (3)
Equations (4) form the MB encoding signals.
sj = b2j+1; onej = b2j-1 ⊕ b2j;
twoj = (b2j+1 ⊕ b2j) ^ ~(onej):
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2373
Table 1. Modified Booth Encoding
b2j+1 b2j
b2j-1
bjMB sj onej Twoj
0 0 0 0 0 0 0 0 0 1 +1 0 1 0 0 1 0 +1 0 1 0 0 1 1 +2 0 0 1 1 0 0 -2 1 0 1 1 0 1 -1 1 1 0 1 1 0 -1 1 1 0 1 1 1 0 1 0 0
2.2. Non-Redundant Radix-4 Signed Digit Algorithm
In this section have a bent to gift the Non-Redundant radix-4 Signed-Digit (NR4SD)
technique. As in MB form, the vary of partial merchandise is reduced to half. whereas
secret writing the 2’s complement amount B, digits bjNR - take thought of one in all four
values: ∈{-2, -1, 0, +1} or bjNR+ take one in all four values: ∈ {-1, 0, +1, +2} at the
NR4SD- or NR4SD+ formula, severally. solely four specific values ar used and now not 5
as in MB set of rules, which ends up in 0 ≤ j ≤ k-2. As have to be compelled to cowl the
dynamic vary of the 2’s complement kind, the most Brobdingnagian digit is MB encoded
(i.e.,bk-1MB∈{-2,-1, 0, +1, +2).The NR4SD- and NR4SD+ algorithms ar illustrated well
in Figure 1 and 2, severally
(a)
(b)
Figure 1. Block Diagram of the NR4SD-
Encoding Scheme at the (a) Digit and
(b) Word Level.
(a)
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2374
(b)
Figure 2. Block Diagram of the NR4SD+
Encoding Scheme at the Word Level.
2.2.1 NR4SD- Algorithm
Step 1: ponder the initial values j = 0 and c0=0.
Step 2: Calculate the convey c2j+1 and also the add n+2j of a half of Adder (HA) with inputs
b2j and c2j (Figure 1a).
c2j+1 = b2j ^ c2j; n+2j = b2j ⊕ c2j (4)
Step3: Calculate the completely signed carry c2j+2 (+) and so the negatively signed add n-
2j+1 (-) of a Half Adder* (HA*) with inputs b2j+1 (+) and c2j+1 (+) (Figure 1a). The
outputs c2j+2 and n-2j+1 of the HA* relate to its inputs as follows:
2c2j+2 - n-2j+1 = b2j+1 + c2j+1:
The following Boolean equations summarize the HA*
operation:
c2j+2 = b2j+1 ^ c-2j+1, n-
2j+1 = b2j+1 ⊕ c2j+1.
Step 4: Calculate the value of the bjNR - digit.
bjNR- = -2n-
2j+1 + n+2j. (5)
Equation (5) shows results from the n-2j+1 is negatively signed and n+
2j is completely signed.
Step 5: j: = j + 1.
Step 6: If (j < k-1), then forward to Step two. If (j = k-1), inscribe that the foremost
important worth supported the MB technique and considering the 3 consecutive bits to be
b2k-1, b2k-2 and c2k-2 (Figure 1b). If (j = k), stop.
Table two shows however the NR4SD- digits area unit fashioned. Equations (6) show
however the NR4SD- encryption signals one+j, one-j and two-j of Table two area unit
generated.
one+j = ~ (n-
2j+1) ^ n+2j, one-
j = n-2j+1 ^ n+
2j,
two-j = n-
2j+1 ^ n+2j (6)
The minimum and most limits of the dynamic direct the NR4SD- kind area unit -2n-1 - 2n-
3 - 2n-5 -…. - 2 < -2n-1 and 2n-1 + 2n-4 + 2n-6 +…… + 1 >2n-1-1. The NR4SD- kind has
larger dynamic vary than the 2’s complement kind.
2.2.2 NR4SD+ Algorithm
Step 1: ponder the initial values j = 0 and c0=0.
Step 2: Calculate the carry completely signed worth c2j+1 (+) and also the negatively
signed worth add n-2j (-) of a HA* with inputs b2j (+) and c2j (+) (Figure 2a). The carry
c2j+1 and also the add n-2j of the HA* relate to its inputs as follows:
2c2j+1 - n-2j = b2j + c2j ,
The outputs of the HA* area unit will calculate at gate level within the following equations
as:
c2j+1 = b2j ˅ c2j , n-2j = b2j ⊕ c2j ,
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2375
Step 3: Calculate the carry c2j+2 and also the add n+2j+1 of a HA with inputs b2j+1 and
c2j+1
c2j+2 = b2j+1 ^ c2j+1 , n+2j+1 = b2j+1 ˅ c2j+1 ,
Step 4: Calculate the worth of the bjNR+ digit.
bjNR+ = 2n+
2j+1 - n-2j (7)
Equation (7) results from the n+2j+1 is completely signed and n-
2j is negatively signed.
Step 5: j := j + 1.
Step 6: If (j < k-1), move to Step two. If (j = k-1), inscribe the foremost important worth
per MB technique and considering the 3 consecutive bits to be b2k-1, b2k-2 and c2k-2
(Figure 2b). If (j = k), stop.
Table three shows however the NR4SD+ digits area unit fashioned. Equations (8) show
however the NR4SD+ encryption signals one+j, one-j and two+j of Table four area unit
generated.
one+j = n+
2j+1 ^ n-2j;
one-j = n+
2j+1 ^ n-2j;
two+j = n+
2j+1 ^ n-2j: (8)
The minimum and most values of the dynamic direct the NR4SD+ kind area unit -2n-1 - 2n-
4 - 2n-6 -……. -1 < -2n-1 and 2n-1+2n-3+2n-5+…. +2 > 2n-1-1.
Table 2. Numerical Examples of the Encoding Techniques
2’s
Complement
10000000 10011010 01011001 01111111
Integer -128 -102 +89 +127
NR4SD− 2̄000 1̄1 2̄¯̄11 ̄2 2 2̄ 2̄1 200 1̄1
NR4SD+ 2̄000 2̄122 1121 200 1̄1
As determined within the NR4SD- encryption technique, the NR4SD+ kind has giant
dynamic selection than the two’s complement kind. pondering the eight-bit 2’s
complement selection N, Table two shows the restriction values -28 = -128, 28 - 1 =127,
and 2 normal values of N, and presents the MB, NR4SD- and NR4SD+ digits that end
product once creating use of the corresponding encryption ways to each value of N taken
into thought. A bar on top of the negatively signed digits so as to tell apart them from the
completely signed ones.
2.3. Pre-Encoded NR4SD Multipliers Design
The device style for the pre-encoded NR4SD multipliers is conferred in Figure 6. 2
bits at the instant area unit keep in ROM: n-2j+1, n+2j (Table 3) for the NR4SD- or n+2j+1,
n-2j (Table 4) for the NR4SD+ kind. On this way, will scale back the storage demand to
n+1 bits according to constant whilst the corresponding memory needed for the pre-
encoded MB theme is 3n/2 bits per constant. Consequently, the number of saved bits is the
image of that of the standard MB style, besides for the utmost widespread digit that desires
an additional bit as its so much MB encoded. Compared to the pre-encoded MB number,
within which the MB encryption blocks area unit unheeded, the pre-encoded NR4SD
multipliers would like extra hardware to get the values of (6) and (8) for the NR4SD- and
NR4SD+ kind, severally. The NR4SD encryption blocks of Figure four places into impact
the electronic equipment of Figure 5.
Partial product is currently given by the relation:
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2376
P =A.B =COR+
1
1
PPk
j
J22J (9)
COR =
1
0
k
j
Cin,j22J + 2n(1 +
1
0
k
j
22j+1) (10)
Every partial product of the pre-encoded NR4SD- and NR4SD+ multipliers is
dispensed based totally on Figure 3b and 3c, severally, apart from the PPk-1 that
corresponds to the vastest digit. As this digit is in MB type, will use the PPG of Figure 3a
creating use of the sj bit. The partial merchandise, well weighted, and also the correction
period (COR) of (10) square measure fed right into a CSA tree. The input carry cinj of (10)
is calculated as cinj = two-j ^ one-j and cinj = one-j for the NR4SD- and NR4SD+ pre-
encoded multipliers, severally, based totally on Tables two and three. The carry save output
of the CSA tree is eventually summed the usage of a fast CLA adder.
(a)
(b)
(c)
(d)
(e)
Figure 3. Generation of the ith Bit pj,i of PPj for a) Pre-Encoded MB Multipliers, b) NR4SD−,c) NR4SD+Pre-
Encoded Multipliers, and d ) NR4SD−, e) NR4SD+Pre-Encoded Multipliers after reconstruction.
Figure 4. System Architecture of the NR4SD Multipliers.
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2377
(a)
(b)
Figure 5 Extra Circuit Needed in the NR4SD Multipliers to
Complete the (a) NR4SD−and (b) NR4SD+ Encoding.
Table 3. NR4SD− Encoding
2’s
complement NR4SD−
form
Digit NR4SD-Encoding
b2j+1 b2j c2j c2j+2 n-2j+1
n+ 2j
bjNR- one+
j
one-
j
two- j
0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 1 +1 1 0 0
0 1 0 0 0 1 +1 1 0 0
0 1 1 1 1 0 -2 0 0 1
1 0 0 1 1 0 -2 0 0 1
1 0 1 1 1 1 -1 0 1 0
1 1 0 1 1 1 -1 0 1 0
1 1 1 1 0 0 0 0 0 0
Table 4. NR4SD+ Encoding
2’s complement NR4SD+ form Digit NR4SD+ Encoding
b2j+1 b2j c2j c2j+2 n+2j+1 n-
2j bjNR+ one+
j one- j Two+
j
0 0 0 0 0 0 0 0 0 0
0 0 1 0 1 1 +1 1 0 0
0 1 0 0 1 1 +1 1 0 0
0 1 1 0 1 0 +2 0 0 1
1 0 0 0 1 0 +2 0 0 1
1 0 1 1 0 1 -1 0 1 0
1 1 0 1 0 1 -1 0 1 0
1 1 1 1 0 0 0 0 0 0
3. Proposed Aging-Aware Multiplier Here advocates an aging-aware reliable multiplier factor layout with a unique adaptive
hold logic (AHL) circuit. The multiplier factor relies wholly at the variable-latency
approach and should regulate the AHL circuit to reap reliable operation beneath the have
an effect on of NBTI and PBTI effects.
To be distinctive, the contributions of this enterprise square measure summarized as
follows:
1) Novel variable-latency multiplier factor design with an AHL circuit. The AHL circuit
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2378
will confirm whether or not the input designs need one or 2 cycles and may modify the
judgement criteria to confirm that there is minimum performance degradation when nice
obtaining aging happens;
2) Comprehensive analysis and comparison of the multiplier’s performance beneath one-
of-a-kind cycle intervals to reveal the effectiveness of our planned structure;
3)AN aging-aware dependable multiplier factor style methodology this can be applicable
for large multipliers. Despite the very fact that the experiment is completed in 16- and 32-
bit multipliers, our planned structure is effortlessly extended to giant designs;
3.1. Proposed Architecture
Figure 6 indicates our planned aging-aware multiplier factor design, which includes 2
m-bit inputs (m may be a positive number), one 2m-bit output, one NR4SD- or NR4SD+
multiplier factor, 2m 1-bit Razor flip-flops [21], and an AHL circuit.
Inside the planned design, the NR4SD multiplier factors could also be tested by the vary
of zero’s in either the multiplicand or multiplicator to expect whether or not the operation
needs one cycle or 2 cycles to complete. whereas enter patterns square measure random,
the quantity of zero’s and one’s within the multiplicator and multiplicand follows a
traditional distribution. Consequently, the employment of the amount of zero’s or one’s
because the judgement criteria leads to similar results. later on, the 2 aging-aware
multipliers could also be applied the employment of comparable structure, and also the
distinction between the two multipliers lies inside the enter signals of the AHL. Razor flip-
flops is wont to find whether or not or not temporal arrangement violations arise before
subsequent input pattern arrives.
Figure 6. planned design (md suggests that
multiplicand; mr suggests that multiplicator)
Figure 7 suggests the data of Razor flip-flops. A 1-bit Razor flip-flop includes a main
flip-flop, shadow latch, XOR gate, and mux. the most flip-flop catches the execution final
result for the mix circuit employing a standard clock signal, and also the shadow latch
catches the execution result the usage of a delayed clock signal, that's slower than the
regular clock signal. If the barred little of the shadow latch is not the same as that of the
most flip-flop, this means the course delay of the present operation exceeds the cycle
amount, and also the main flip-flop catches an incorrect result. If errors arise, the Razor
flip-flop can set the error signal to a minimum of one to give notice the system to re-execute
the operation and give notice the AHL circuit that a mistake has befell. Here use Razor
flip-flops to find whether or not an operation this is thought of to be a one-cycle sample
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2379
can sure enough finish during a cycle. If now not, the operation is re-executed with a pair
of cycles. Despite the very fact that the re-execution could appear steeply-priced, the
general worth is low as a result of the re-execution frequency is low. additional data for the
Razor flip-flop could also be discovered in [21].
Figure 7. Razor flip flops
The AHL circuit is that the key facet within the ageing-aware variable-latency
multiplier factor. Figure 8 shows the main points of the AHL circuit. The AHL circuit
incorporates an aging indicator, judgement blocks, one mux, and one D turn-flop. The
aging indicator suggests whether or not or not the circuit has suffered tremendous overall
performance degradation thanks to the aging impact. The aging indicator is distributed
during a simple counter that counts quantity of errors over a particular amount of operations
and is reset to zero at the surrender of these operations. If the cycle amount is just too low,
the NR4SD multiplier factor is not able to end those operations effectively, inflicting
temporal arrangement violations. These temporal arrangement violations may well be
stuck by the Razor turn-flops, that generate error alerts. If errors happen frequently and
exceed a predefined threshold, it method the circuit has suffered vital temporal arrangement
degradation because of the aging impact, and also the aging indicator can output sign 1; in
the other case, it's getting to output zero to point the aging impact remains now not
intensive, and no moves are wanted.
Figure 8 Diagram of AHL (md means multiplicand,
mr means multiplicator)
The primary judgement block within the AHL circuit can output one if the quantity of
zeros within the multiplicand (multiplicator) is larger than n, and also the second judgement
block within the AHL circuit can output one if the number of zeros within the number
(multiplicator) is larger than n + 1. They’re every employed to make a decision whether or
not an enter sample needs one or 2 cycles, however handiest one in every of them is chosen
at a time. within the starting, the ageing impact is not vital, and also the aging indicator
produces zero, therefore the first judgement block is employed. when a time, frame
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2380
whereas the aging impact can become vital, the second judgement block is chosen.
compared with the first judgement block, the second judgement block permits a smaller
variety of patterns to show bent on be one-cycle designs as a result of it needs a lot of zero’s
inside the multiplicand (multiplicator) the information of the operation of the AHL circuit
square measure as follows: whereas an input pattern arrives, each judgement blocks can
confirm whether or not or not the sample needs for one cycle or 2 cycles to finish and pass
each outcomes to the multiplexer.
The multiplexer device selects thought of one among each result supported the output
of the obtaining older indicator. Then an OR operation is accomplished between the top
results of the multiplexer device, and also the Q¯ signal is employed to make a decision
the input of the D flip-flop. once the pattern needs one cycle, the output of the multiplexer
device is one. The !(gating) sign turns into one, and also the input flip flops can latch new
information inside subsequent cycle. on the opposite hand, whereas the output of the
multiplexer device is zero, which suggests that the input sample needs for two cycles to
complete, the logic gate can output zero to the D flip-flop. Consequently, the !(gating)
signal is zero to disable the clock signal of the input flip-flops within the following cycle.
bear in mind that best a cycle of the input flip-flop could also be disabled as a result of the
D flip-flop can latch one within the future cycle.
The overall flow of our planned design is as follows: once input patterns arrive, the
NR4SD multiplier, and also the AHL circuit execute at the same time. In line with the
quantity of zero’s inside the number (multiplicator), the AHL circuit involves a call if the
enter patterns need one or 2 cycles. If the input pattern needs 2 cycles to complete, the AHL
can output zero to disable the clock signal of the flip-flops. Otherwise, the AHL can output
one for normal operations. once the NR4SD multiplier factor finishes the operation, the
result could also be passed to the Razor flip-flops. The Razor flip-flops check whether or
not or not there could also be the course shelve temporal arrangement violation. If temporal
arrangement violations occur, it approaches the cycle amount is not prolonged enough for
the present operation to complete which the execution final result of the multiplier factor
is wrong. consequently, the Razor flip-flops can output a blunder to inform the device that
the trendy operation needs to here execute the employment of cycles to create sure the
operation is correct. during this case, the additional re-execution cycles as a results of
temporal arrangement violation incurs a penalty to universal average latency.
But, our planned AHL circuit will accurately expect whether or not the input patterns
need one or 2 cycles in most instances. solely a couple of input designs could in addition
purpose temporal arrangement variations once the AHL circuit judges incorrectly. during
this scenario, a lot of re-execution cycles did now not turn out sensible sized temporal
arrangement degradation.
In précis, our planned multiplier factor style has three key capabilities. First, its miles
a variable-latency style that minimize the temporal arrangement waste of the noncritical
methods. 2nd, it's able to provide dependable operations even when the aging impact
happens. The Razor flip-flops hit upon the temporal arrangement violations and re-execute
the operations exploitation 2 cycles. Ultimately, our design will modify the share of 1-cycle
patterns to cut back performance degradation because of the aging impact. whereas the
circuit is aged, and lots of errors occur, the AHL circuit uses the second judgement block
to make a decision if an input is one cycle or a pair of cycles.
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2381
4. Results A simulation conclusion for NR4SD multiplier factor is simulated in a very Xilinx ISE
14.1. These tools can facilitate to analysis its performance and calculate the ability, delay
and space. Snapshot is nothing however every and each moment of the appliance whereas
running shown in Figs. 9 to 10. These snapshots offer the clear read of application
developed. it'll be most helpful to the new peoples to grasp for the longer-term steps.
In the below Table 5-6 4*4,8*8,16*16,32*32 NR4SD multipliers are compared for space
of NR4SD multiplier factor. By observant the Table-5-6 it shows the clear read relating to
range of elements needed to develop explicit NR4SD multiplier factor supported input vary
with and while not AHL circuit.
Table 5. Device Utilization (area) Summary of NR4SD Multiplier with
AHL circuit
Logic Utilization 32-bit 16-bit 8-bit 4-bit
Number of Slice
Flip Flops
226 130 87 58
Number of 4
input LUTs
4608 1,050 272 106
Number of
occupied Slices
2,304 607 180 72
Average Fanout
of Non-Clock
Nets
3.97 3.41 3.15 2.86
Table 6. Device Utilization(area) Summary of NR4SD Multiplier
without AHL circuit
Logic Utilization 32-bit 16-bit 8-bit 4-bit
Number of 4 input
LUTs
2094 553 155 36
Number of
occupied Slices
1246 315 84 19
Number of bonded
IOBs
128 64 32 16
Average Fan-out of
Non-Clock Nets
4.44 4.04 3.60 3.26
Here in Table 7 it shows the time delay intense of NR4SD multiplier with and without
AHL circuit. By perceptive this it shows the vary of delay increasing because of increasing
variety of input ranges. Here the time intense is within the ranges of Nano seconds.
Table 7. Time Delay consuming of NR4SD Multiplier
Input
range
Delay with
AHL circuit
Delay without
AHL circuit
4-bit 8.068ns 12.063ns
8-bit 13.379ns 22.326ns
16-bit 26.755ns 43.790ns
32-bit 40.537ns 77.345ns
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2382
Figure 9 shows the simulation result of NR4SD multiplier with AHL circuit. Here
Figure 9 (a) to (d) shows 4*4, 8*8, 16*16, 32*32 bit NR4SD multiplier response and Figs
10 shows the simulation result of NR4SD multiplier without AHL circuit. Here Figs 10 (a)
to (d) shows 4*4, 8*8, 16*16, 32*32 bit NR4SD multiplier response. In these figures shows
the input and output responses. Table 8 shows the 4*4, 8*8, 16*16, 32*32 NR4SD
multipliers power analysis with and without AHL circuit.
Table 8. Power analysis of NR4SD Multiplier
Input
range
Power(W)
without AHL
circuit
Power(W) with
AHL circuit
4-bit 0.271 0.209
8-bit 0.343 0.219
16-bit 0.508 0.233
32-bit 0.883 0.306
In the below Table 9-12 the 4*4 ,8*8,16*16,32*32 NR4SD multipliers square measure
compared for Error count supported no.of i/p and no.of Zero’s in number. Here applying
the various variety of inputs that square measure every which way generated and shows
{the variety|the amount|the quantity} of error counts supported number of zero’s gift within
every which way generated input bits.
Table 9. Error count based on no.of i/p and no.of Zero’s in
multiplicand of NR4SD 4 bit multiplier
No of i/p No of 0’s-2 No of 0’s-1
13000 5237 2891
11000 4429 2440
9000 3625 2007
5000 2000 1131
Table 10. Error count based on no.of i/p and no.of Zero’s in
multiplicand NR4SD 8 bit multiplier
No.of i/p 0’s-5 0’s-3 0’s-7
5000 2337 1333 3461
9000 4185 2374 4436
11000 5113 2896 5421
13000 6047 3436 6406
Table 11. Error count based on no.of i/p and no.of Zero’s in
multiplicand NR4SD 16 bit multiplier
No.of i/p 0’s-3 0’s-5 0’s-7 0’s-9 0’s-11 0’s-13
5000 40 467 1432 2162 2416 2422
9000 78 835 2569 3999 4379 4420
11000 102 1030 3151 4761 5359 5420
13000 119 1220 3723 5631 6339 6419
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2383
Table 12. Error count based on no.of i/p and no.of Zero’s in
multiplicand of NR4SD 32 bit multiplier
No.of i/p 0’s-10 0’s-15 0’s-20 0’s-25
5000 189 1490 2402 2382
9000 334 2730 4322 4381
11000 413 3339 5284 5381
13000 494 3952 6246 6380
5. Conclusions This paper projected associate degree aging-aware reliable multiplier style with the
AHL. The multiplier is in a position to switch the AHL to mitigate overall performance
degradation because of increased delay. remember that additionally to the BTI impact that
will increase semiconductor delay, interconnect to boot has its aging issue, that's remarked
as electromigration. Electromigration happens while this density is high enough to cause
the drift of metal ions on the direction of electron flow. The metal atoms is frequently
displaced when a fundamental measure, and also the geometry of the wires can
modification. If a wire becomes narrower, the resistance and delay of the wire are going to
be dilated, and within the finish, electromigration may additionally cause open circuits.
This drawback is additionally a lot of severe in advanced manner technology as a result of
metal wires square measure narrower, and changes within the wire breadth can cause larger
resistance variations. If the aging results as a results of the BTI effect and electromigration
square measure thought-about put together, the delay and overall performance degradation
are going to be a lot of important. luckily, our projected variable latency multipliers is used
underneath the influence of each the BTI result and electromigration. Similarly, our
projected variable latency multipliers have a lot of less performance degradation as a result
of variable latency multipliers have a lot of less temporal order waste, however typical
multipliers ought to think about the degradation ensuing from each the BTI result and
electromigration and use the worst case delay because the cycle amount.
6. REFERENCES
[1] Ing-Chao Lin, Member, IEEE, Yu-Hung Cho, and YiMing Yang, ”Aging-Aware
Reliable Multiplier Design With Adaptive Hold Logic” IEEE Transactions On Very Large
Scale Integration (VLSI) Systems.
[2] S. Zafar et al., “A comparative study of NBTI and PBTI (charge trapping) in SiO2/HfO2
stacks with FUSI, TiN, Re gates,” in Proc.IEEE Symp. VLSI Technol. Dig. Tech. Papers,
2006, pp. 23–25.
[3] S. Zafar, A. Kumar, E. Gusev, and E. Cartier, “Threshold voltage instabilities in high-k
gate dielectric stacks,” IEEE Trans. Device Mater.Rel., vol5, no.1, pp.45–64, Mar.2005.
[4] H.-I. Yang, S.-C. Yang, W. Hwang, and C.-T. Chuang, “Impacts of NBTI/PBTI on
timing control circuits and degradation tolerant design in nanoscale CMOS SRAM,” IEEE
Trans. Circuit Syst., vol. 58, no. 6, pp. 1239–1251, Jun. 2011.
[5] R.Vattikonda, W.Wang, and Y. Cao, “Modeling and minimization of pMOS NBTI
effect for robust naometer design,” in Proc. ACM/IEEE DAC, Jun. 2004, pp. 1047–1052.
[6] H. Abrishami, S. Hatami, B. Amelifard, and M. Pedram, “NBTI-aware flip-flop
characterization and design,” in Proc. 44th ACM GLSVLSI, 2008, pp. 29–34.
[7] S. V. Kumar, C. H. Kim, and S. S. Sapatnekar, “NBTI-aware synthesis of digital
circuits,” in Proc. ACM/IEEE DAC, Jun. 2007, pp. 370–375.
[8] A. Calimera, E. Macii, and M. Poncino, “Design techniqures for NBTItolerant power-
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2384
gating architecture,” IEEE Trans. Circuits Syst., Exp. Briefs, vol. 59, no. 4, pp. 249–253,
Apr. 2012.
[9] K.-C. Wu and D. Marculescu, “Joint logic restructuring and pin reordering against
NBTI-induced performance degradation,” in Proc. DATE, 2009, pp. 75–80.
[10] M. Basoglu, M. Orshansky, and M. Erez, “NBTI-aware DVFS: A new approach to
saving energy and increasing processor lifetime,” in Proc.ACM/IEEE ISLPED, Aug. 2010,
pp. 253–258.
[11] Y. Lee and T. Kim, “A fine-grained technique of NBTI-aware voltage scaling and body
biasing for standard cell based designs,” in Proc. ASPDAC, 2011, pp. 603–608.
[12] K.-C. Wu and D. Marculescu, “Aging-aware timing analysis and optimization
considering path sensitization,” in Proc. DATE, 2011, pp. 1–6.
[13] K. Du, P. Varman, and K. Mohanram, “High performance reliable variable latency
carry select addition,” in Proc. DATE, 2012, pp. 1257–1262.
[14] A. K. Verma, P. Brisk, and P. Ienne, “Variable latency speculative addition: A new
paradigm for arithmetic circuit design,” in Proc. DATE, 2008, pp. 1250–1255.
[15] D. Baneres, J. Cortadella, and M. Kishinevsky, “Variable-latency design by function
speculation,” in Proc. DATE, 2009, pp. 1704–1709.
[16] Y.-S. Su, D.-C. Wang, S.-C. Chang, and M. Marek-Sadowska, “Performance”
optimization using variable-latency design style,” IEEE Trans. Very Large Scale Integr.
(VLSI) Syst., vol. 19, no. 10, pp. 1874–1883, Oct. 2011.
[17] N. V. Mujadiya, “Instruction scheduling on variable latency functional units of VLIW
processors,” in Proc. ACM/IEEE ISED, Dec. 2011, pp. 307–312.
[18] M. Olivieri, “Design of synchronous and asynchronous variable-latency pipelined
multipliers,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 9, no. 4, pp. 365–376,
Aug. 2001.
[19] D. Mohapatra, G. Karakonstantis, and K. Roy, “Low-power processvariation tolerant
arithmetic units using input-based elastic clocking,” in Proc. ACM/IEEE ISLPED, Aug.
2007, pp. 74–79.
[20] Y. Chen, H. Li, J. Li, and C.-K. Koh, “Variable-latency adder (VL-Adder): New
arithmetic circuit design practice to overcome NBTI,” in Proc. ACM/IEEE ISLPED, Aug.
2007, pp. 195–200.
[21] Y. Chen et al., “Variable-latency adder (VL-Adder) designs for low power and NBTI
tolerance,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 18, no. 11, pp. 1621–
1624, Nov. 2010.
[22] Z. Huang, “High-level optimization techniques for low-power multiplier design,”
Ph.D. dissertation, Department of Computer Science, University of California, Los
Angeles, CA, 2003.
[23] Z. Huang and M. Ercegovac, “High-performance low-power left-to-right array
multiplier design,” IEEE Trans. Comput., vol. 54, no. 3, pp. 272–283, Mar. 2005.
International Journal of Research
Volume VIII, Issue V, MAY/2019
ISSN NO:2236-6124
Page No:2385