Adaptive Supply and Threshold Circuits and Applications

Elad Alon, Kevin Nowka (IBM Research), Vladimir Stojanović, Mark Horowitz

Why Adaptive Vdd/Vth?

• No one transistor meets all needs
  • This transistor is too leaky…
  • This transistor is too slow…

• Modern processes usually have lots of device options, but they still have a set of fixed characteristics
  • Optimum characteristics often environment dependent and hence vary with time, workload, etc.

• May want to tune both Vdd and Vth on a block-by-block basis to minimize total energy

• Supply can be set/controlled with regulators - how about Vth?

Body Biasing

• Usual approach for adjusting Vth: body bias

[Figure: gate schematic with supply Vdd and body-bias terminals Vbp and Vbn]

[Figure: Simulated Power vs. Frequency; axes: Normalized Frequency (x), Normalized Power (y); curves: RBB (-1V), RBB (-0.5V), ZBB, FBB (0.25V), FBB (0.5V)]

• Unfortunately, body bias is not very effective in modern technologies
  • Less than 100mV shift in Vth across full range of bias

• Hardly any effect on power vs. frequency (traced by sweeping Vdd)

• Not very promising…

Adjusting Vth with Skewed Supplies

• Assume that delay (and power) is dominated by edges in a particular direction
  • We’ll come back to the other edges shortly

• “Vth” can be adjusted by skewing supplies of pos. edge gates (PMOS) vs. neg. edge gates (NMOS)
  • Notation: ΔVth>0 means device’s Vth reduced by ΔVth

[Figure: gate schematics with supply rails labeled Vss, Vdd, Vss+ΔVth, and Vdd+ΔVth]
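
A minimal sketch of the rail bookkeeping behind this slide, not the authors' implementation. It assumes the rail names used later in the talk (nVss as the 0 V reference, p-gate rails sitting ΔVth above the n-gate rails) and a hypothetical nominal Vth of 0.3 V. It tracks gate overdrive only and is not a delay model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rails:
    """Supply rails in volts; names follow the talk (n-gates vs. p-gates)."""
    n_vss: float
    n_vdd: float
    p_vss: float
    p_vdd: float


def skewed_rails(vdd: float, dvth: float) -> Rails:
    # Assumption: nVss is the 0 V reference and the p-gate rails sit dvth
    # above the n-gate rails, so nVdd = pVdd - pVss.
    return Rails(n_vss=0.0, n_vdd=vdd, p_vss=dvth, p_vdd=vdd + dvth)


def critical_overdrive(r: Rails, vth: float) -> float:
    # Critical NMOS (source at nVss) is driven high to pVdd by a p-gate;
    # critical PMOS (source at pVdd) is driven low to nVss by an n-gate.
    # Both see |Vgs| = pVdd - nVss.
    return (r.p_vdd - r.n_vss) - vth


def noncritical_overdrive(r: Rails, vth: float) -> float:
    # Non-critical devices are driven only between nVdd and pVss,
    # so they see |Vgs| = nVdd - pVss.
    return (r.n_vdd - r.p_vss) - vth


VTH_NOMINAL = 0.3  # hypothetical device threshold, volts
rails = skewed_rails(vdd=1.0, dvth=0.1)
print(critical_overdrive(rails, VTH_NOMINAL))     # behaves like Vth lowered by 0.1 V
print(noncritical_overdrive(rails, VTH_NOMINAL))  # behaves like Vth raised by 0.1 V
```

With ΔVth > 0 the critical devices gain overdrive while the non-critical devices lose the same amount, which is why the other edge needs separate treatment.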

Power vs. Frequency Preview

[Figure, panel 1: Ring Oscillator w/Body Bias. Simulated Power vs. Frequency; axes: Normalized Frequency (x), Normalized Power (y); curves: RBB (-1V), RBB (-0.5V), ZBB, FBB (0.25V), FBB (0.5V)]

[Figure, panel 2: Skewed Supply Ring Oscillator; axes: Normalized Frequency (x), Normalized Power (y); curves: ΔVth=-0.1V, ΔVth=0V, ΔVth=0.1V, ΔVth=0.2V]

Outline

• Skewed Supply Logic Circuits

• Adaptive Implementation

• Application to Minimum Energy Systems

• Conclusions

What About the Non-Critical Edge?

• Skewed supply directly shifts Vth of non-critical devices in the opposite direction from the critical devices

[Figure: gate chains on the pVss/pVdd and nVss/nVdd rails, shown for the ΔVth > 0 and ΔVth < 0 cases]

• Performance benefit negated if need to wait for the other (slow) edge

• Need to return to default state so that leakage isn’t set by (reduced threshold) non-critical devices

Self-Resetting Skewed Supply Gates

• Keep gates in default state most of the time: self-reset

• Can be extended to use delayed self-reset (i.e. interlock mechanism to guarantee input pulses overlap)
  • Another option: use self-resetting critical path replica to generate en/en_b signals for every level of logic gates

Self-Resetting Gates Challenges

• Pulse-width needs to (at least somewhat) track delay across Vdd and ΔVth
  • Don’t want pulses to disappear
  • Don’t want reset, re-enable delay to become critical path

• Maintain proper operation at high |ΔVth|
  • ΔVth>0: P-stack in subthreshold, N-stack Vth≈0
  • Gate could fire even when inputs unasserted
  • ΔVth<0: subthreshold N-stack vs. Vth≈0 P-stack leakage

[Figure: self-resetting N-gate with N-Stack and P-Stack between nVdd and nVss, enable input en, output outn]

Self-Resetting N-Gate: Keeper

• Keeper structure helps boost voltage margin for ΔVth>0 (as opposed to a weakened P-stack connected to inputs)
  • Unfortunately, can’t really make keeper strength track because gate needs to reset to nVdd.
  • (Unless use yet another supply nnVss…)

• For ΔVth<0, need to make sure N-stack can always overpower the keeper

• May want to go back to P-stack connected to inputs as “keepers”

Self-Resetting N-Gate: Reset Path

• Pulse width tracks delay by alternating n/p supplies on reset path
  • Falling edge outn traverses “critical” edge through reset gates

• However, the re-enabling edge will then have the “non-critical” delay
  • For ΔVth>0, re-enable path will be slow – NOR gate cuts path in half
  • For high ΔVth even this might not be enough – another option next
  • For ΔVth<0, re-enable edge will be fast – need to make sure the gate fully resets (or at least turns on the keeper).

Self-Resetting N-Gate: Reset Path (#2)

• Even with NOR gate, at high ΔVth re-enable delay can be VERY slow
  • Devices in that direction can easily be in subthreshold

• Break the “rules” and have gates on re-enable path swing from nVss to pVdd

• Often costs less power than reducing the fanout

• Also allows evaluate device at bottom of N-stack to see ΔVth

Self-Resetting P-Gate

• Could be mirrored version of N-gate, but because of higher NMOS drive current (and sometimes lower Vth) input “keeper” can still be relatively effective (even when ΔVth>0)

[Figure: self-resetting P-gate with P-Stack and N-Stack; rails pVdd, pVss, nVdd, nVss; output outp]

Skewed Supply Oscillator

• Use ring oscillator as a test structure to characterize the gates
  • Helps find issues that arise at various operating points

• Pulse “chases its own tail”

Operating Range and Power vs. Frequency

[Figure: power vs. frequency; axes: Normalized Frequency (x), Normalized Power (y); curves: ΔVth=-0.1V, ΔVth=0V, ΔVth=0.1V, ΔVth=0.2V; annotated points: Vdd = 1V, Vdd = 600mV, Vdd = 500mV]

• Simulation results from a 90nm triple well technology
  • Since supplies are skewed for threshold adjustment, triple well isn’t a requirement

• Gates designed for ΔVth>0 operation

• Without heavy optimization, covers ~300mV range in ΔVth

Outline

• Skewed Supply Logic Circuits

• Adaptive Implementation

• Application to Minimum Energy Systems

• Conclusions

Adaptive System Block Diagram

• “High” bandwidth FLL enforces frequency constraint by setting Vdd

• “Low” bandwidth threshold loop attempts to minimize power through Vth

[Figure: block diagram with Frequency-Locked Loop, Power Minimization Loop, Logic Block, and Critical Path Replica Oscillator; signals: fref, flogic, Plogic, Vdd, Vth]

• Roles/bandwidths of Vdd/Vth loops can be flipped
  • Just want separated bandwidths to minimize stability issues
  • More complicated algorithms can do both at same speed, but probably not needed (environment changes usually slow)

Generating the Power Supplies: Switching DC-DC Converters

• Switching DC-DC converters desirable for efficiency
  • But hardest to integrate
  • Want external inductors or efficiency may suffer

• Power measurement (for Vth loop) can be tricky
  • Could use extra series resistor, but again costs efficiency
  • May get that resistance from an on-chip inductor anyways

[Figure: switching converter schematic with labels Vsup, Vdd, Vss]

Generating the Power Supplies: On-Chip Linear Regulators

• On-chip linear regulators most desirable for integration
  • High bandwidths easy to achieve

• Easy to measure power
  • External supply fixed, just mirror output device current

• Efficiency could greatly suffer however
  • Especially if get only one Vsup to generate both n and p supplies

[Figure: two linear regulators; references Vref_dd and Vref_ss, external supplies Vsup_dd and Vsup_ss, outputs Vdd and Vss]

Generating the Power Supplies: Hybrid Architecture

• “Best of both worlds”
  • High bandwidth, easy to integrate on-chip linear regulators
  • Adjust external switching regulators to just meet linear regulators’ dropout (and minimize loss)

• To minimize external component count could share external supplies across multiple blocks
  • Of course at some cost in efficiency however

[Figure: Switching Regulators fed from Vsup generate Vsup_dd and Vsup_ss; Linear Regulators with references Vref_dd and Vref_ss generate Vdd and Vss; Vdropout marked across the linear regulators]
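
A rough sketch of why tracking the dropout matters, assuming the standard first-order estimate that a linear regulator's efficiency is Vout/Vsup (quiescent current and the switching regulator's own losses are ignored). The voltages are hypothetical; the 150 mV headroom is borrowed from the figure quoted on the regulator slide later in the talk.

```python
def linear_regulator_efficiency(v_out: float, v_sup: float) -> float:
    """First-order efficiency of a linear regulator, ignoring quiescent current."""
    if not 0.0 < v_out <= v_sup:
        raise ValueError("need 0 < v_out <= v_sup")
    return v_out / v_sup


DROPOUT = 0.150  # volts of regulator headroom (assumed value)


def tracked_supply(v_out: float, dropout: float = DROPOUT) -> float:
    # Hybrid scheme: the external switching regulator is adjusted so that its
    # output sits just one dropout above the regulated rail.
    return v_out + dropout


v_out = 0.6  # hypothetical regulated Vdd, volts
fixed = linear_regulator_efficiency(v_out, 1.35)  # hypothetical fixed external supply
hybrid = linear_regulator_efficiency(v_out, tracked_supply(v_out))
print(f"fixed supply: {fixed:.0%}, tracked supply: {hybrid:.0%}")
```

Sharing one external supply across several blocks pushes each block back toward the fixed-supply case, which is the efficiency cost the slide mentions.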

FLL Implementation (1)

• Charge-pump based design
  • Pulse generators + charge pump = analog counter

• nVss serves as global reference (i.e. chip Vss or “0”)
  • Control loops generate the other three rails

[Figure: FLL schematic. Blocks: Pulse Generators, charge pump (up_b, dn_b), 1/N divider, Critical Path Oscillator, pVdd & nVdd Regulators; signals: fref, ffb, flogic, Vc_sup, Vc_thresh (from power loop); rails: Vsup_dd, pVdd, nVdd, pVss, nVss]

FLL Implementation (2): Regulators

• For simplicity used on-chip linear regulators

• Vsup_dd, Vsup_ss – external supplies w/headroom for regulators
  • Vsup_dd ≈ Vdd_max+|ΔVth|max+150mV
  • Vsup_ss ≈ -150mV

• Vc_sup sets pVdd, pVss set by power loop
  • Shifted ground on nVdd regulator feedback makes nVdd = pVdd – pVss

[Figure: linear regulator schematic generating pVdd, nVdd, and pVss from Vsup_dd and Vsup_ss; control inputs Vc_sup and Vc_thresh; internal nodes Vgp_d, Vgn_d, Vgp_s; reference nVss]

Power Minimization Algorithm

• Optimization problem:
  • $\min_{V_{dd},\,V_{th}} P_{avg}(V_{dd}, V_{th})$ s.t. $f = f_{targ}$

• FLL enforces constraint and eliminates Vdd as a variable
  • Vdd is set by ΔVth and operating frequency

• Simplified minimization algorithm:
  • Step 1: Increase ΔVth by 1 step; measure average power
  • Step 2: Decrease ΔVth by 1 step; measure average power
  • Step 3: Move in direction of lower average power, repeat Step 1

• Works as long as P vs. Vth curve has no locally flat regions (except global minimum)
  • Hard to show analytically, but intuitively (and numerically) true

[Figure: Normalized Energy (y) vs. Vth (x)]
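
The three steps can be sketched numerically as follows. This is a toy model, not the authors' simulation: the alpha-power-law frequency model, the exponential leakage term, and every constant are assumptions chosen only so the power curve has an interior minimum. A bisection search stands in for the FLL.

```python
import math

# Toy device model; every constant here is an assumption for illustration.
VTH0 = 0.35              # nominal threshold, volts
ALPHA = 1.5              # alpha-power-law exponent
N_VT = 1.5 * 0.026       # subthreshold slope factor times thermal voltage, volts
LEAK0 = 0.002            # leakage scale relative to dynamic power
STEP = 0.01              # one ΔVth step, volts
K_MIN, K_MAX = -10, 25   # allowed ΔVth range in steps (-0.1 V to 0.25 V)


def frequency(vdd: float, dvth: float) -> float:
    """Normalized frequency with the threshold lowered by dvth."""
    overdrive = vdd - (VTH0 - dvth)
    return overdrive ** ALPHA / vdd if overdrive > 0 else 0.0


def fll_vdd(f_target: float, dvth: float) -> float:
    """Stand-in for the FLL: bisect for the Vdd that meets f_target."""
    lo, hi = 0.05, 2.0
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if frequency(mid, dvth) < f_target:
            lo = mid
        else:
            hi = mid
    return hi


def average_power(k: int, f_target: float) -> float:
    """Power at ΔVth = k * STEP once the FLL has settled Vdd."""
    dvth = k * STEP
    vdd = fll_vdd(f_target, dvth)
    dynamic = vdd ** 2 * f_target
    leakage = LEAK0 * vdd * math.exp(dvth / N_VT)
    return dynamic + leakage


def minimize_power(f_target: float, k: int = 0, iterations: int = 60) -> int:
    for _ in range(iterations):
        up, down = min(k + 1, K_MAX), max(k - 1, K_MIN)
        p_up = average_power(up, f_target)      # Step 1: one step up, measure
        p_down = average_power(down, f_target)  # Step 2: one step down, measure
        k = up if p_up < p_down else down       # Step 3: move toward lower power
    return k


k_final = minimize_power(f_target=0.3)
print(f"settles near ΔVth = {k_final * STEP:.2f} V")
```

Because Step 3 always moves, the loop does not stop at the minimum but dithers within one step of it, so the step size sets the residual error.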

Power Loop Implementation: Measuring Power

• Mirror regulator current to measure block’s current
  • Voltage fixed, so just add currents from pVdd and nVdd to find total power (current)

• Want more processing if external supply is not fixed
  • Multiply Itot by Vsup_ext
  • If Vsup_ext is digitally controlled multiplication could be done in current domain by programming output mirroring ratio M

[Figure: current mirrors off the regulator output devices (gate nodes Vgn_d, Vgp_d; supply Vsup_dd); IpVdd and InVdd sum to Itot, which is mirrored (ratio labels M and 1) into Imirr, Vmirr]
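
A small sketch of the current-domain multiplication idea. The slide does not say which way the ratio M is defined, so the code uses a hypothetical `mirror_gain` (Imirr divided by Itot) and assumes it is programmed proportional to the digital supply code; all numbers are illustrative.

```python
import math


def mirrored_current(i_pvdd: float, i_nvdd: float, mirror_gain: float) -> float:
    """Imirr = mirror_gain * (IpVdd + InVdd)."""
    return mirror_gain * (i_pvdd + i_nvdd)


def gain_for_supply(vsup_code: int, gain_per_code: float = 0.001) -> float:
    # Assumption: the mirror gain is programmed in proportion to the digital
    # code that also sets the external supply voltage.
    return gain_per_code * vsup_code


LSB = 0.010  # hypothetical supply DAC step, volts per code

# Two hypothetical operating points: (IpVdd, InVdd) in amps and the supply code.
point_a = (2.0e-3, 1.0e-3, 100)   # external supply at 1.0 V
point_b = (2.5e-3, 1.5e-3, 80)    # external supply at 0.8 V

imirr_a = mirrored_current(point_a[0], point_a[1], gain_for_supply(point_a[2]))
imirr_b = mirrored_current(point_b[0], point_b[1], gain_for_supply(point_b[2]))
power_a = (point_a[0] + point_a[1]) * point_a[2] * LSB
power_b = (point_b[0] + point_b[1]) * point_b[2] * LSB

# The mirrored currents compare the same way the true powers do.
print(math.isclose(imirr_b / imirr_a, power_b / power_a))
```

Only the relative comparison matters to the minimization loop, so the absolute scale of the mirror gain is free to be chosen for convenience.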

Power Loop Implementation: Minimization Algorithm (I)

• Step 1: Pulse upΔ (+ΔVc), enable dnint (integrate –Imirr)

• Step 2: Pulse dnΔ (–ΔVc), enable upint (integrate +Imirr)

• Step 3 happens automatically since:
  • Vc_th[k+1] > Vc_th[k] if Imirr(+ΔVc)<Imirr(-ΔVc)
  • Vc_th[k+1] < Vc_th[k] if Imirr(+ΔVc)>Imirr(-ΔVc)
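
One hedged way to see why Step 3 is automatic: assume the ±ΔVc perturbation pulses cancel over a full cycle and that both integration phases last the same time t_int onto a loop capacitance C. Both C and the cancellation are assumptions for this sketch, not details given on the slide. The net update per cycle is then

```latex
V_{c\_th}[k+1] = V_{c\_th}[k]
  + \frac{t_{int}}{C}\,\Bigl( I_{mirr}(-\Delta V_c) - I_{mirr}(+\Delta V_c) \Bigr)
```

so the control voltage rises exactly when the power measured at +ΔVc is the lower of the two, matching the two inequalities above.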

Power Loop Implementation: Minimization Algorithm (II)

• To keep polarities correct, need IΔ·tΔ > Imirr·tint

• May need small pump currents and/or large capacitors, especially if shooting for small ΔVc

Outline

• Skewed Supply Logic Circuits

• Adaptive Implementation

• Application to Minimum Energy Systems

• Conclusions

Minimum Energy Systems with Global Supply

• Supply set by global activity vs. leakage energy ratio
  • But blocks may exhibit wide variances in their activities
  • Even a single block’s activity may vary with time (e.g. static vs. dynamic MPEG frame)

[Figure: Adder A, Adder B, Multiplier A, Multiplier B, and Memory, all running at fglob from Vdd_glob]

Minimum Energy Systems With Adaptive Supplies

• In subthreshold, minimum energy is independent of Vth
  • Vth increases: both frequency and leakage decrease, net energy stays the same

• Can get minimum energy by adjusting each Vdd, but:
  • Each block would have to operate at its own frequency…

[Figure: each block with its own supply and frequency: Adder A (Vdd_addA, faddA), Adder B (Vdd_addB, faddB), Multiplier A (Vdd_multA, fmultA), Multiplier B (Vdd_multB, fmultB), plus Memory]
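
A quick numerical check of the first bullet, using the usual exponential subthreshold current model. The slope factor, logic depth, and normalized capacitance are assumptions; the point is only that Vth cancels out of the energy per cycle while the frequency still moves.

```python
import math

N_VT = 1.5 * 0.026  # assumed slope factor times thermal voltage, volts


def subthreshold_point(vdd: float, vth: float, depth: float = 20.0):
    """Return (frequency, energy per cycle) in normalized units for Vdd < Vth."""
    i_on = math.exp((vdd - vth) / N_VT)   # drive current with Vgs = Vdd
    i_off = math.exp(-vth / N_VT)         # leakage current with Vgs = 0
    t_cycle = depth * vdd / i_on          # cycle time ~ depth * C * Vdd / Ion, C = 1
    e_dynamic = vdd ** 2                  # C * Vdd^2 with C = 1
    e_leak = i_off * vdd * t_cycle        # leakage power integrated over one cycle
    return 1.0 / t_cycle, e_dynamic + e_leak


for vth in (0.35, 0.40, 0.45):
    f, e = subthreshold_point(vdd=0.30, vth=vth)
    print(f"Vth={vth:.2f} V  f={f:.3e}  E={e:.6f}")
```

In this model the leakage energy reduces to depth · Vdd² · exp(−Vdd/(n·vT)), which has no Vth term, so only Vdd moves the energy minimum while Vth trades frequency against leakage current.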

Minimum Energy Systems with Adaptive Supplies and Thresholds

• Controlling both Vdd and Vth allows blocks to achieve minimum energy at arbitrary operating frequency
  • All blocks can then operate at the same (system determined) frequency
  • Much simpler system to design and interface with than only adaptive supply…

[Figure: each block with its own supply and threshold, all at fglob: Adder A (Vdd_addA, Vth_addA), Adder B (Vdd_addB, Vth_addB), Multiplier A (Vdd_multA, Vth_multA), Multiplier B (Vdd_multB, Vth_multB), plus Memory]

Outline

• Skewed Supply Logic Circuits

• Adaptive Implementation

• Application to Minimum Energy Systems

• Conclusions

Conclusions

• Skewed supplies are a promising approach to allow direct control/optimization of effective device thresholds
  • Still lots of issues to work out, of course; more research to be done

• For low-power applications, combined adaptation of Vdd and Vth can achieve per-block minimum energy while maintaining global synchronicity
  • No need for software directives; chip constantly adapts itself to keep energy dissipation as low as possible

• This technique is attractive in high-performance applications as well
  • Improvements in power efficiency translate to increased performance in a heat-dissipation limited environment

Bonus Slides

Digital Control Implementation

• Particularly in advanced technologies, can be difficult to get charge pumps to behave as desired
  • Both FLL and power loop well suited to digital control implementations

• FLL:
  • Frequency detect is really easy – just count
  • DAC just needs enough resolution to keep dither small

• Power loop:
  • Power ADC: Use mirrored block current as supply for current-starved ring, count
  • Really need (effectively) monotonic DAC however
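
A minimal sketch of the "just count" frequency detector driving a supply DAC one LSB at a time. The oscillator model, DAC step, window length, and target count are all hypothetical; integer units (mV, kHz, µs) are used so the counts are exact.

```python
LSB_MV = 5       # hypothetical DAC step, millivolts
CODE_MAX = 255   # hypothetical 8-bit DAC
WINDOW_US = 1    # counting window derived from the reference clock, microseconds


def oscillator_khz(vdd_mv: int) -> int:
    # Toy replica oscillator: 2000 kHz per mV above 300 mV (purely illustrative).
    return max(0, 2000 * (vdd_mv - 300))


def count_cycles(code: int) -> int:
    """Frequency detect: count oscillator cycles in one reference window."""
    return oscillator_khz(code * LSB_MV) * WINDOW_US // 1000


def fll_update(code: int, target_count: int) -> int:
    count = count_cycles(code)
    if count < target_count:
        return min(code + 1, CODE_MAX)   # too slow: raise Vdd by one LSB
    if count > target_count:
        return max(code - 1, 0)          # too fast: lower Vdd by one LSB
    return code


code = 0
history = []
for _ in range(200):
    code = fll_update(code, target_count=805)
    history.append(code)
print(history[-4:])   # the last few codes show the one-LSB dither
```

The target here falls between two DAC codes, so the loop toggles between them; a finer LSB shrinks that dither, which is the resolution requirement the FLL bullet refers to.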