A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and...

21
1 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved. Copyright © 2013 SuVolta, Inc. All rights reserved. 1 A 50% Lower Power ARM Cortex CPU using DDC Technology with Body Bias David Kidd August 26, 2013

Transcript of A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and...

Page 1: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

1 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved. Copyright © 2013 SuVolta, Inc. All rights reserved. 1

A 50% Lower Power ARM Cortex CPU

using DDC Technology with Body Bias David Kidd

August 26, 2013

Page 2: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

2 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

DDC transistor and PowerShrink platform overview

ARM Cortex-M0 implementation and results

Application of DDC technology to HKMG nodes

Agenda

Page 3: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

3 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

Deeply Depleted Channel™ (DDC) Transistor

1

Undoped/lightly doped region

Improved matching (low RDF)

Higher mobility ( ID,lin)

Higher analog gain ( gm)

2

3

VT setting offset region

Multiple VT on chip

Screening region

Superior scalability

Improved drive current ( Ieff DIBL)

Higher analog gain ( Rout)

Strong body coefficient (corner pull-in)

3

2

1

One example, not drawn to scale

Source Drain

Gate Gate

Page 4: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

4 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

Corner Pull-in with Body Bias N

LV

t P

LV

t

Slow

Fast

Typical

Corners pull in to typical

ION ION

ION ION

IO

FF

IO

FF

IO

FF

IO

FF

With Body Bias

Slow

Fast

Typical

No Body Bias

Page 5: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

5 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

DDC PowerShrink Platform: Save Active and

Leakage Power Using Body Bias

Delay

Le

ak

ag

e P

ow

er POR

TT

FF

SS

log

TT

FF

SS

DDC

Same VDD, similar VT,sat

Delay

Le

ak

ag

e P

ow

er POR

TT

FF

SS

log

TT

FF

SS

DDC

Lowered VDD, similar VT,sat

Save active power

Lower DDC VDD

Delay

Le

ak

ag

e P

ow

er

POR

TT

FF

SS

log

DDC

Lowered VDD, lowered VT,sat

VT shifted and corners pulled in by body bias

Match delay

Lower DDC VT,sat

Leakage power saving

DDC performance is superior at matched VDD

Body bias reduces leakage power

1.2V 1.2V

1.2V 0.9V

1.2V

0.9V

Matched delay

DDC performance degrades due to lowered VDD

Page 6: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

6 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

ARM Cortex-M0 CPU Bias Targeting

375MHz 350MHz

325MHz Speed

Leakage (delta to optimal)

Target 350MHz Many bias voltages

achieve the target frequency but power is not optimal at all biases

Measured ARM Cortex-M0

Speed and power

response to bias

PMOS Bias (mV)

NM

OS

Bia

s (m

V)

PMOS Bias (mV)

NM

OS

Bia

s (m

V)

Too Slow

TARGET

Increasing Leakage

Page 7: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

7 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

Process Monitor independently measures NMOS and PMOS corners

Digital readout correlates to corner

Bias voltage is calculated from readout

Process Monitor Bias Targeting (55nm Measured)

-0.8

-0.6

-0.4

-0.2

0

0.2

-0.8 -0.6 -0.4 -0.2 0 0.2

PM

OS B

ias (V)

NMOS Bias (V)

400

450

500

550

600

650

700

750

NMOS target

FF

TT

SS

NMOS measured

PMOS measured

FF

TT

SS

PMOS target

Process Monitor Corner Detection

Bias Target from Monitor

Bias target calculated from monitor readout

Dig

ita

l R

ea

do

ut

Page 8: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

8 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

Leakage Correction

Performance is corrected to 13% faster than target

Fast corner leakage is corrected to the same point as typical

Process Monitor Bias Targeting (55nm Measured)

100%

110%

120%

130%

140%

150%

160%

170%

Zero Bias

Target Bias

TT

SS

FF

0%

200%

400%

600%

800%

1000%

Zero Bias

Target Bias

TT

SS

FF

Target

Re

lati

ve

Sp

ee

d

Re

lati

ve

Po

wer

Performance Correction

Page 9: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

9 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

0.0

1.0

2.0

3.0

4.0

5.0

6.0

7.0

8.0

9.0

0 100 200 300 400

Po

we

r (m

W)

Frequency (MHz)

Baseline FF

Baseline SS

DDC FF

DDC SS

Corner Variation

50% power vs baseline at FF (leakage and dynamic)

Matched performance at SS

PowerShrink Platform: ARM Cortex-M0 CPU

1.2V

0.9 V

FF

SS

FF

SS

65nm CORTEX – M0

85 C

DDC corner variation pull-in

with bias

50%

Page 10: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

10 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

0.0

2.0

4.0

6.0

8.0

10.0

12.0

200 300 400 500 600

Po

we

r (m

W)

Frequency (MHz)

BaselineDDC 0.9V

0.9V

Matched performance at 50% power 0.9V

+35% performance at matched power 1.1V

+55% performance at matched voltage 1.2V

Tunable Power / Performance

85 C Performance: SS corner Power: FF corner

65nm CORTEX – M0 50% power

Matched Performance

100% power

1.0V

1.2V

1.1V 1.2V

35% Performance Increase at Matched

Power

55% Performance Increase

Page 11: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

11 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

Minimal routing resources used for bias

network Route in columns

Double-space bias wires for reliability

1 PMOS wire, 1 NMOS wire

ARM Cortex-M0 Body Bias Network Implementation

Bias Columns

Layer Available Tracks Bias Tracks

M2 300 4

M4 300 0

Total 600 4 (0.7%)

Vertical Track Usage per Column

Page 12: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

12 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

Lower SRAM VDD-min Enables Low-VDD Operation

VDD-min of 8Mb SRAM lowered by 150mV at 125°C

Production 55nm SRAM data

55nm silicon

Target Yield

Voltage (V)

Yie

ld (

%)

Page 13: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

13 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

Lower SRAM Retention Power

50% less leakage in Standby and Retention modes at nominal VBS

More than 5X less leakage in Retention mode with VBS = -0.6V

Baseline Retention Power

DDC Retention Power

87% lower

53% lower

65nm silicon

Page 14: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

14 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

Silicon-Gate

DDC Technology in Silicon Gate and HKMG

Technology Nodes Proven, Legacy Advanced

Deeply Depleted Channel™

(DDC) Technology

DDC™ PowerShrink™

Memory

DRAM Other

DDC™ PowerShrink™

28nm, 20nm

CPUs + GPUs Networking

SoC FPGA

DDC™ DesignBoost

28nm, 20nm

CPUs + GPUs Networking

SoC FPGA

DDC™ PowerShrink™

90nm, 65nm, 55nm, 40nm

Imaging CPU

WiFi

Mixed Signal

Display Drivers GPS

RF

Metal-Gate

Page 15: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

15 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

DDC Technology Modular Implementation

Page 16: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

16 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

DDC DesignBoost: Target Leakage Power

Savings

Delay

Le

ak

ag

e P

ow

er

POR

TT

FF

SS

log

TT

FF

SS

DDC

Same VDD, similar VT,sat

Delay

Le

ak

ag

e P

ow

er

POR

TT

FF

SS

log

SS

DDC

Same VDD, higher VT,sat

Save Leakage Power

Increase DDC VT,sat

TT

FF Leakage power saving

DDC DesignBoost

DDC performance is superior at matched VDD

0.9V

0.9V

0.9V 0.9V

Matched performance

Improved performance

Recover Leakage by Trading off Performance

Page 17: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

17 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

LVT and SLVT devices contribute a large portion of the leakage power at advanced nodes

Leakage power reduction is a key value adder for high performance designs

DDC DesignBoost Motivation

25C

HVT

SVT

LVT

SLVT

0.001

0.01

0.1

1

10

100

0.4 0.6 0.8 1 1.2 1.4

No

rmal

ize

d L

eak

age

Normalized Speed

0%

10%

20%

30%

40%

50%

60%

70%

80%

90%

100%

LowPerformance

Leakage

LowPerformance

Usage

HighPerformance

Leakage

HighPerformance

Usage

HVT

SVT

LVT

SLVT

Page 18: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

18 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

DDC DesignBoost Power Savings in HKMG

LVT Devices

Re

lati

ve P

ow

er

Leakage Power Active Power

DDC DesignBoost Target matched performance to reduce Ioff

Baseli

ne

Baseli

ne

DD

C

DD

C

SS Speed FF Power

Small Dynamic Savings Due to better device capacitance

Big Leakage Savings

Page 19: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

19 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

All transistors are DDC transistors

Active Dominated Leakage Dominated

Po

we

r Sa

vin

gs

0.9V DesignBoost

0.9V with Body Bias

0.8V PowerShrink

125C SS Speed FF Power

HKMG design 10% activity factor

DDC DesignBoost No design change

Body Bias improves

power savings

PowerShrink platfrom For active power

dominated circuits

HKMG Power Analysis

100% SVT 100% LVT

0.9

V D

es

ign

Bo

os

t +

Bo

dy B

ias

0.9

V D

esig

nB

oo

st

0.8

V P

ow

erS

hri

nk

Page 20: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

20 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved.

ARM Cortex-M0 validated at 65nm Matched performance at 50% power

+35% performance at matched power

+55% performance at matched voltage

SRAM validated at 65nm and 55nm 150mV Vmin reduction in production

55nm SRAM

Less than 50% static power in standby and retention

DDC technology demonstrated at advanced HKMG nodes Lower static power at matched

performance

Lower active power with body bias

Summary

Page 21: A 50% Lower Power ARM Cortex CPU using DDC … · DDC Lowered VDD, lowered V T,sat V T shifted and corners pulled in by body bias Match delay Lower DDC V ... Slide 1 Author: Kristin

21 HOTCHIPS 2013 Copyright © 2013 SuVolta, Inc. All rights reserved. Copyright © 2013 SuVolta, Inc. All rights reserved. 21