Design of large area, pixelated ASICs for picosecond...

Post on 12-Oct-2020

9 views 0 download

Transcript of Design of large area, pixelated ASICs for picosecond...

Design of large area, pixelated ASICs for picosecond timing

applications

E. Charbon EPFL

edoardo.charbon@epfl.ch

1

Outline

•  Why large area TOA ASICs? •  TDC Basics & Architectures •  Case Studies •  ASIC vs. FPGA •  3D Integration •  Quantum computing •  Conclusions

2 © 2018 Edoardo Charbon

Why large area TOA ASICs?

3

Large Area TOA Applications

•  Optical rangefinder on-pixel (3D camera) •  Fluorescence lifetime imaging microscopy (FLIM) •  Fluorescence Correlation Spectroscopy (FCS) •  Detection of a scintillation shower upon gamma

photon detection in PET •  High energy physics (HEP)

4 © 2018 Edoardo Charbon

HEP

•  Extremely harsh conditions –  Large ionizing doses, Gamma –  Protons, Neutrons

•  Very demanding specs –  TOA resolutions in ns to ps –  Ranges of µs to ms

•  Very low dead times –  Events spaced ns –  Gevents/s

•  Large number of points-of-measurement –  Thousands to million points –  Large surfaces

5 © 2018 Edoardo Charbon

Example (Courtesy: Artur Apresyan)

6 © 2018 Edoardo Charbon

Another Example

•  In time-of-flight PET one needs –  A large number of point-of-measurement –  A high timing resolution

•  Synchronization is extremely important to enable coincidence computation and rejection of singles

7 © 2018 Edoardo Charbon

TDC Basics

8

TDC Objective

But, in most cases:

START

STOP

Time scale

Hits

t

9 © 2018 Edoardo Charbon

TDC Symbol

START

STOP

DATA

10 © 2018 Edoardo Charbon

Basic Definitions •  Bin size or LSB – τ (sec)

–  Minimum distance between time events that can be resolved •  Accuracy & precision (sec)

–  Time-invariant offset –  Time-varying drift

•  Range (sec) –  Maximum time difference that can be measured

•  Conversion rate (MS/sec) •  Latency (sec) •  Non-linearities

–  Differential non-linearity (DNL) –  Integral non-linearity (INL)

•  Single-shot accuracy (sec) 11 © 2018 Edoardo Charbon

Input Non-Idealities

•  Signals are non-Dirac –  Non-zero rise time –  Non-zero width

•  START-STOP sequence is not regular •  Signals have jitter in

–  Time –  Amplitude

•  Temperature •  Supply variations

12 © 2018 Edoardo Charbon

TDC Non-Idealities

13 © 2018 Edoardo Charbon

Time difference between START and STOPO

utpu

t dig

ital c

ode

(LSB

)

Ideal response

Actual response

LSBti

0

1

2

3

4

5

6

7

8DNLi = LSB

ti - LSB

INLi = ∑ DNLji

j=0

Digital code [LSB]

Hist

gram

(a) (b)

Standard deviation, FWHM => single-shot precision

Ideally only one digital code

DNL, INL

•  Integral non-ideality (INL) is the integral of DNL •  Depending upon definition, starts and ends at 0

14 © 2018 Edoardo Charbon

How to Measure: Density Test

•  Poisson distributed uniform START generator •  Measure statistics of TDC measurements per bin •  Normalize to average counts, differences are DNL

points

time

counts

avg. counts

15 © 2018 Edoardo Charbon

Single-Shot Accuracy (SSA)

•  Repeat measurement of single time-of-arrival and construct histogram

•  Derive statistics by Gaussian fitting and calculation of FWHM or σ or 3σ.

time

FWHM

TOA centroid

16 © 2018 Edoardo Charbon

Optical Tests

•  Density test: free running SPAD •  Single-shot experiment:

–  Histogram Δti, i=[1…N] (time-correlated single-photon counting – TCSPC)

GAPD or SPAD

STOP

DATA START

Clock

17 © 2018 Edoardo Charbon

Figures of Merit

•  Power, LSB, DNL/INL, SSA, area •  Temperature stability •  Cross-talk

65 90 130 180 350Tech. (nm)

Area

(mm

2 )

0.001

0.01

10

0.1

1

[18]Conventional TDC array

[1]

[8]

[12]

[15]

[11]

[3]

[2][5]

[16][17][9]

[14]

[7][6]

[10]

18 © 2018 Edoardo Charbon

Architectures

19

The Simplest: A Counter

•  Resolution: τ = 1/fclock

•  Conversion rate = 1/latencySTART STOP

CLOCK DATA

START

STOP

CLOCK

DATA VALID BUSY IDLE 20

Counter – Register

•  Advantage: fast counter can be shared among many HIT lines

•  Fast registers easier to build RESET

HIT

CLOCK

DATA REGISTER

COUNTER

21 © 2018 Edoardo Charbon

Delay Chain

•  Non-Inverting Gates

Register

START

STOP

START(Φ0)

STOP

DATA VALID BUSY IDLE

Φ1

ΦJ+1

ΦJ

J J+1

1 1 1 1 1 0 0 0

τ

22

Delay Chain

•  Resolution: τ = delay element •  Conversion rate = 1/latency •  Latency = N×τ•  Need a thermometer decoder: Nèlog2(N) •  Issues: metastability, bubbles

1 1 1 0 1 0 0 0

23 © 2018 Edoardo Charbon

Phase Interpolator

•  Non-inverting gates

Register

Clock

HIT

CLOCK(Φ0)

HIT

DATA VALID BUSY IDLE

Φ1

ΦJ+1

ΦJ

J J+1

1 1 1 1 1 0 0 0

τ

24

Phase Interpolator

•  Resolution: τ = delay element •  Conversion rate = 1/latency •  Latency = N×τ•  Need a thermometer decoder: Nèlog2(N) •  Issues: metastability, no bubbles

25 © 2018 Edoardo Charbon

Vernier Lines

•  Resolution: τ = τslow - τfast •  Conversion rate = 1/latency •  Latency = N×τslow

•  Need a thermometer decoder: Nèlog2(N) •  Issues: metastability, matching

START

STOP

D D D D D D D D

τslow

τfast

N 1

26 © 2018 Edoardo Charbon

Pulse Shrinking

•  Resolution: τ = τrise - τfall •  Conversion rate = 1/latency •  Latency = N×τslow

•  Need a thermometer decoder: Nèlog2(N) •  Issues: matching

START N 1 STOP

START STOP

Asymmetric rise, fall time

27 © 2018 Edoardo Charbon

Ring Oscillators

•  Resolution: τ = delay element •  Conversion rate = 1/latency •  Latency = N×τ•  Need a thermometer decoder: Nèlog2(N) •  Issues: metastability, matching, asymmetric load

START

COUNTER To extend range

τ

N 1 STOP

START STOP R/O signal

Uncertainty region

28

Actual Implementation

•  Fully differential •  Partial propagation readout

–  lower oscillation frequency or higher resolution –  Rise times and fall times doubles resolution

•  Invariant load to improve linearity

Mandai and C

harbon, ES

SC

IRC

11

29 © 2018 Edoardo Charbon

Delay Element Implementation

•  Uniform rise/fall time •  Bias control used for feedback •  Positive feedback for speed VDD

VBIAS

VDD

In+

In-

Out+

Out-

In+ In-

Out+ Out-

30 © 2018 Edoardo Charbon

Asymmetric Rise/Fall Time

•  E.g. inverter starved cell •  Rise time =VDD� Cload/I

•  Fall time: inverter delay

VDD

In Out

I

31 © 2018 Edoardo Charbon

Semi-Digital TDCs

•  Determine time difference based on propagation through an RC line

R

C

R

C

R

C

R

C

V

t

RC delay chain

t1 t2 t3 t4

32 © 2018 Edoardo Charbon

Time Difference Amplifier (TDA)

•  Time differences are multiplied as in successive approximation ADCs

•  Issues: gain stability, jitter

Mandai and Charbon , ESSCIRC11

33 © 2018 Edoardo Charbon

TDA Base Cell

Bias Circuit

34 © 2018 Edoardo Charbon

TDA Base Cell

Bias Circuit Fast Behavior

35 © 2018 Edoardo Charbon

TDA Base Cell

Bias Circuit Fast Behavior Slow Behavior

36 © 2018 Edoardo Charbon

TDA in a TDC

Upper bitTDC

Upper bitADC

DACLower bit

ADC

VoltageamplifierVin

DelayLower bit

TDC

Time Differenceamplifier

Tstart

Two-stage ADC

Two-stage TDC

×-

+-

Tstop

-

Vin-LSB < V < Vin

Amplified time

residue

×

Amplified voltageresidue

Tdiff-LSB < T < Tdiff

(Tdiff=Tstart-Tstop)

37 © 2018 Edoardo Charbon

TDA in a TDC

Upper bitTDC

Upper bitADC

DACLower bit

ADC

VoltageamplifierVin

DelayLower bit

TDC

Time Differenceamplifier

Tstart

Two-stage ADC

Two-stage TDC

×-

+-

Tstop

-

Vin-LSB < V < Vin

Amplified time

residue

×

Amplified voltageresidue

Tdiff-LSB < T < Tdiff

(Tdiff=Tstart-Tstop)

38

Other Composite TDCs

•  Counter + Phase Interpolator + Vernier Niclass et al., JSSC08

•  Ring Oscillators + Counters Veerappan et al., ISSCC11

•  Ring Oscillators + TDA Mandai and Charbon , ESSCIRC11

… and many more

39 © 2018 Edoardo Charbon

Stabilization Techniques

•  Process, Voltage supply, Temperature (PVT) variations eliminated using a delay locked-loop (DLL) in clock phase generation

Clock

Register HIT1

Register HIT2

CP LF PFD

40 © 2018 Edoardo Charbon

PVT Stabilization in Phase Interpolators

•  DLL running in parallel as a replica of delay chain •  Distribute bias to all delay chains

Clock

CP LF PFD

START1

START2

REPLICA DELAY CHAIN

41 © 2018 Edoardo Charbon

Nested Stabilization Loops Clock

PD

PD

PD

PD

PD

T1

Resolution: T2 – T1 = Δ

T2

42 © 2018 Edoardo Charbon

Metastability in Ring Oscillators

Q

En

En_b

DX

Q1

CLK

Q2Qn

CLK

CLK_B CLK_B En_b

Data

CLK

Metastability of Latch

43 © 2018 Edoardo Charbon

Case 1: Monolithic Fully Parallel TDC

44

An Array of 20,480 TDCs

•  Massive array of pixels comprising –  single-photon avalanche diode (SPAD) –  TDC (ring oscillator type) –  Memory

•  Readout –  Frame rate: 1us –  Fully digital

45 © 2018 Edoardo Charbon

TDC Implementation

Single-gate delay means less power, faster transitions

Analog techniques allow greater architecture flexibility

C. Veerappan, J. Richardson, R. Walker, D.-U. Li, M. W. Fishburn, Y. Maruyama, D. Stoppa, F. Borghetti, M. Gersbach, R.K. Henderson, E. Charbon, ISSCC2011 46

The MEGAFRAME Pixel

47 © 2018 Edoardo Charbon

The MEGAFRAME Chip •  Format: 160x128 pixels •  Timing resolution: 55ps •  Impulse resp. fun.: 140ps •  DCR (median): 50Hz •  R/O speed: 250kfps •  Size: 11.0 x 12.3 mm2

TDC Ring oscillator (3 bits) + counter (7 bits) = 10 bits 48

49

The Megaframe-128 Chip

C. Veerappan, J. Richardson, R. Walker, D.-U. Li, M. W. Fishburn, Y. Maruyama, D. Stoppa, F. Borghetti, M. Gersbach, R.K. Henderson, E. Charbon, ISSCC2011

12.3mm

49

Imager Block Diagram

50 © 2018 Edoardo Charbon

Pixel Architecture

51 © 2018 Edoardo Charbon

Photon Counting

52 © 2018 Edoardo Charbon

Photon Time-of-Arrival

53 © 2013 Edoardo Charbon

TDC Characterization

55ps resolution, 55ns range

INL DNL

54 © 2018 Edoardo Charbon

System-level Timing Blue laser Red laser

55 © 2018 Edoardo Charbon

INL Uniformity

% Resolution Variation

Frac

tion

of

Pix

els

56

Optical Burst Detection

57 © 2018 Edoardo Charbon

IR Drop in MEGAFRAME

•  If a large number of TDCs are operating at once, then IR drop occurs

•  As a result the LSB of TDCs changes in space

58

Pitfalls of MEGAFRAME

•  LSB changes as a function of position of the pixel •  There is a dependency to brightness that will change

the current absorbed •  If a VCO is disrupted, the disruption will propagate

through the array in unpredictable ways

© 2018 Edoardo Charbon 59

You Can Compensate, but…

e.g. A replica of the pixel VCO can be placed in a PLL but mismatch will dominate the error

© 2018 Edoardo Charbon 60

Case 2: Column-Parallel TDC

61

Column-parallel TDC Idea

S. Mandai and E. Charbon, IEEE Nuc. Sci Symp. (NSS) 2012 62

Column-parallel TDC Idea

•  A single VCO distributing the oscillation to all TDCs in a line

•  Pros –  Picosecond skew among TDCs –  No LSB variability –  Good PVT control

•  Cons –  Power & buffers create skews

© 2018 Edoardo Charbon 63

Column-parallel TDC Solutions

192 TDCs

16x26 SPAD

Mandai, C

harbon, IEE

E N

uc. Sci. S

ymp. (N

SS

) 2012

Carim

atto, Mandai, C

harbon, IEE

E IS

SC

C 2015

64

Digitally controlled charge pump

9 x 18 MD-SiPM array

432 TDC array

signal processing

20.1

4 m

m

7.6 mm

6.9

mm

4.654 mm

192 Column-parallel TDC

mask and energy registers

Pixel with 2-bit counter

Pixel with 1.5-bit counter

Pixel with 1-bit counter

SPAD

test

stru

ctur

e

(a)

(c)

4 x 4 SiPMs26 x 16pixels

192 Column-parallel TDC

mask and energy registers

VCO

and

refe

renc

e ge

nera

tor

Row

add

ress

dec

oder

5.24

mm

4.22 mm

44%(D4)

47%(D5)

47%(D6)

51%(D7)

50%(D12)

53%(D13)

54%(D14)

57%(D15)

41%(D9)

39%(D8)

43%(D0)

41%(D1)

46%(D2)

50%(D3)

53%(D10)

56%(D11)

(b)

16x26 SPAD

Carim

atto, Mandai, C

harbon, IEE

E IS

SC

C 2015

Column-parallel TDC Solutions

192 TDCs

16x26 SPAD

Mandai, C

harbon, IEE

E N

uc. Sci. S

ymp. (N

SS

) 2012

65

Digitally controlled charge pump

9 x 18 MD-SiPM array

432 TDC array

signal processing

20.1

4 m

m

7.6 mm

6.9

mm

4.654 mm

192 Column-parallel TDC

mask and energy registers

Pixel with 2-bit counter

Pixel with 1.5-bit counter

Pixel with 1-bit counter

SPAD

test

stru

ctur

e

(a)

(c)

4 x 4 SiPMs26 x 16pixels

192 Column-parallel TDC

mask and energy registers

VCO

and

refe

renc

e ge

nera

tor

Row

add

ress

dec

oder

5.24

mm

4.22 mm

44%(D4)

47%(D5)

47%(D6)

51%(D7)

50%(D12)

53%(D13)

54%(D14)

57%(D15)

41%(D9)

39%(D8)

43%(D0)

41%(D1)

46%(D2)

50%(D3)

53%(D10)

56%(D11)

(b)

16x26 SPAD

SiPMs: The position of the photon detection is lost!

Column-parallel TDC Uniformity

© 2018 Edoardo Charbon 67

192 TDC array

20 40 80 100 120TDC address

60 140 160 1800

1

2

3

4

5

Com

pens

ated

INL

[LSB

]

Column-parallel TDC Uniformity

© 2018 Edoardo Charbon 68

150TDC address

100 400200 300150 250 350

Dig

ital c

ode

16810

16820

16830

16840

16850

Sing

le-s

hot n

oise

in s

igm

a (p

s)

80100

140

60

1st group 2nd group 3rd group

120

432 TDC array

Column-parallel TDC Uniformity

© 2018 Edoardo Charbon 69

(a)

START0

VCO

φ1

φ0

φ2

φ3

Band gap

voltage referece

Bias voltage

START1

START2

Phase

STOP

START143

TDC0TDC1

TDC2

TDC143

TDC431

TDC144

TDC287

TDC288

Phase repeater

START144

START287

START288

START431

τ+Δτ

φ0

φ0

φ1

φ2

φ0

φ1

φ2

φ3 V+

V-

LSB16 phases

φ0

φ0

φ1

φ1

φ2 φ3

phase detector

START

VCO phases

Δτ = LSB

12b CNT

(b)

TDCDUM

Bias voltage

STARTDUM

Reftdc

Reftdc

Reftdc

16 + 16 = 32 phases

phase detector

time

τ

432 Column-parallel TDC

Digital circuit

840μ

m

7.2 mm0.4 mm(c)

VCO

and

refe

renc

e

(a) (b)

0 200Digital code

-4

INL

(LSB

)

400 1000

-2

4

600 800

2

0

0 200Digital code

-1

DN

L (L

SB)

400 1000

-0.5

1

600 800

0.5

0

(a) (b)

0 200Digital code

-4

INL

(LSB

)

400 1000

-2

4

600 800

2

0

0 200Digital code

-1

DN

L (L

SB)

400 1000

-0.5

1

600 800

0.5

0

Column-parallel TDC with Memory Features of ‘Piccolo’:

–  32x32 pixels •  FF: 28% •  Pitch: 28µm •  PDP: 50% (max); 11% (800nm) •  DCR: 60cps/pixel

–  128 TDCs •  LSB: 49ps •  Range: 400ns •  Output: 5.12 Gbps (244 Mphotons/s) •  DNL/INL: +/- 0.15 LSB •  SPTR: better than 90ps (SPAD dom.)

© 2018 Edoardo Charbon 70

S. Lindner et al., IIS

W 2017

ASIC vs. FPGA

FPGA vs. discrete ASIC

•  An application-specific integrated circuit (ASIC) is a chip with static circuitry optimized for one task

•  A field-programmable gate array (FPGA) is a

chip whose configuration, specified by a hardware description language, can be changed many times

72 © 2018 Edoardo Charbon

General Comparison

FPGA •  Fast Development

Time •  Reconfigurable

–  Lower fault risk –  Iterate design

•  Low non-recurring costs –  Development –  Testing

ASIC •  Lower power •  Faster operation •  Smaller footprint •  Better integration •  More flexibility •  Low unit costs

–  High-volume applications

73 © 2018 Edoardo Charbon

How to Build a Delay Chain From multiple adder’s carry units

74 © 2018 Edoardo Charbon

FPGA Caveats: Clock Regions

Bad location for a TDC

75 © 2018 Edoardo Charbon

Example FPGA Architecture Only digital techniques available with existing cells

76 © 2018 Edoardo Charbon

Virtex-6 FPGA TDC

77

Temperature Dependence

78 © 2018 Edoardo Charbon

Location, Location, Location

79 © 2018 Edoardo Charbon

Chip-to-chip Variation

80 © 2018 Edoardo Charbon

TDC Comparison

FPGA •  Best time

uncertainty: 20ps •  Usage examples

–  High-energy physics –  OpenPET

ASIC •  Best time

uncertainty: <1ps •  Examples

–  Time-correlated imaging

–  Frequency synthesizers for RF

81 © 2018 Edoardo Charbon

FPGA- or ASIC-based TDC?

•  Consider an FPGA-based TDC if your application: –  Is low-volume –  Doesn’t require <20ps time uncertainty –  Is sensitive to development time, or is being created in

iterations –  Is open source (FPGA-based TDCs are code-based)

82 © 2018 Edoardo Charbon

3D Integration

83

3D ICs – Hybrid Bonding

J. Mata Pavia, M. Wolf, E. Charbon, JSSC 2015

84 © 2018 Edoardo Charbon

3D ICs – Hybrid Bonding

•  Sony Corp. (?/?) •  STMicroelectronics (65/45nm) •  TSMC (45/65nm) •  Tezzaron (anything/anything)

85 © 2018 Edoardo Charbon

TSMC BSI + 3D-Stacking

•  Tier 1: SPADs + microlenses •  Tier 2: quenching, recharge, TDCs, multi-core,

memories, communication unit, I/O A7A1 A2 A3 A4 A5 A6 A8

ALU

DTOF

ADDR_WRITE

MEM_WRITE

DATA_WRITEMEM_READ

Σ16 10b

8 10bDEC

SHAREDTDC

21b

21b

21b

DIGITALPROCESSING

DATA_ROUT

DATA_NEW

14b

6b

ADDR_ROUT

MeM64x21b

we

31.3%FF

EN_READ

DTOFSAFF* DFF**

VQ

MeM

masking

MODE

RST

SPAD

100n/390n 60n/390n

60n/520nPassivequenching

1.2VTier2Tier1

RST:GlobalResetMODE:Pulse/State

in1

‘1’

in2

‘1’

q1 q2

Q

rst

A

LATCH

DQ

rst

DQ

rst

Q

SYMMETRICOR

ADDRESSSR-LATCH

PhotonsMultiple3Dconnections

perSPAD

Tier2

Tier1

DTOF6b ADDR_READ01

DECISIONMAKER

ADDR_READ[0] ADDR_READ[1] ADDR_READ[2]

MEM_READ[0] MEM_READ[1] MEM_READ[2]

ADDR_WRITE[0] ADDR_WRITE[1]

DATA_WRITE[0] DATA_WRITE[1]

X

X

XX

X MEM_WRITE[0] MEM_WRITE[1]

DTOF

X

Combinedevents(8x8pixels) Ignoredevent Ignoredevent

Q

in1 in2A

Q1

Q2

Q3

Q4

Q5

Q6

Q7

Q8

Q63

Q64

Q61

Q62

Q59

Q60

Q57

Q58

QUENCHING

READOUT

ACCESS&CONTROL

SERIALIZER

DECISIONTREE6levelsofDECISIONMAKERS6levelsofMUXforADDRESS

QUENCHINGARRAYPassivequenchingMasking&multi-modeoutput

QUENCHING ARRAY (8 x 8)

*SAFF(Sense-AmplifierFlip-Flop)usedforROphasesampling**DFF(Standard-cellD-typeFlip-Flop)usedforcountersampling

M.J. Lee, A.R. Ximenes, P. Padmanabhan, Y. Yamashita, D.N. Yaung, E. Charbon, IEDM 2017

86 © 2018 Edoardo Charbon

TDC Sharing

•  Virtually zero skew •  Preservation of origin of pulse

87

A7A1 A2 A3 A4 A5 A6 A8

ALU

DTOF

ADDR_WRITE

MEM_WRITE

DATA_WRITEMEM_READ

Σ16 10b

8 10bDEC

SHAREDTDC

21b

21b

21b

DIGITALPROCESSING

DATA_ROUT

DATA_NEW

14b

6b

ADDR_ROUT

MeM64x21b

we

31.3%FF

EN_READ

DTOFSAFF* DFF**

VQ

MeM

masking

MODE

RST

SPAD

100n/390n 60n/390n

60n/520nPassivequenching

1.2VTier2Tier1

RST:GlobalResetMODE:Pulse/State

in1

‘1’

in2

‘1’

q1 q2

Q

rst

A

LATCH

DQ

rst

DQ

rst

Q

SYMMETRICOR

ADDRESSSR-LATCH

PhotonsMultiple3Dconnections

perSPAD

Tier2

Tier1

DTOF6b ADDR_READ01

DECISIONMAKER

ADDR_READ[0] ADDR_READ[1] ADDR_READ[2]

MEM_READ[0] MEM_READ[1] MEM_READ[2]

ADDR_WRITE[0] ADDR_WRITE[1]

DATA_WRITE[0] DATA_WRITE[1]

X

X

XX

X MEM_WRITE[0] MEM_WRITE[1]

DTOF

X

Combinedevents(8x8pixels) Ignoredevent Ignoredevent

Q

in1 in2A

Q1

Q2

Q3

Q4

Q5

Q6

Q7

Q8

Q63

Q64

Q61

Q62

Q59

Q60

Q57

Q58

QUENCHING

READOUT

ACCESS&CONTROL

SERIALIZER

DECISIONTREE6levelsofDECISIONMAKERS6levelsofMUXforADDRESS

QUENCHINGARRAYPassivequenchingMasking&multi-modeoutput

QUENCHING ARRAY (8 x 8)

*SAFF(Sense-AmplifierFlip-Flop)usedforROphasesampling**DFF(Standard-cellD-typeFlip-Flop)usedforcountersampling

© 2018 Edoardo Charbon

TDC Layer

88 © 2018 Edoardo Charbon

A7A1 A2 A3 A4 A5 A6 A8

ALU

DTOF

ADDR_WRITE

MEM_WRITE

DATA_WRITEMEM_READ

Σ16 10b

8 10bDEC

SHAREDTDC

21b

21b

21b

DIGITALPROCESSING

DATA_ROUT

DATA_NEW

14b

6b

ADDR_ROUT

MeM64x21b

we

31.3%FF

EN_READ

DTOFSAFF* DFF**

VQ

MeM

masking

MODE

RST

SPAD

100n/390n 60n/390n

60n/520nPassivequenching

1.2VTier2Tier1

RST:GlobalResetMODE:Pulse/State

in1

‘1’

in2

‘1’

q1 q2

Q

rst

A

LATCH

DQ

rst

DQ

rst

Q

SYMMETRICOR

ADDRESSSR-LATCH

PhotonsMultiple3Dconnections

perSPAD

Tier2

Tier1

DTOF6b ADDR_READ01

DECISIONMAKER

ADDR_READ[0] ADDR_READ[1] ADDR_READ[2]

MEM_READ[0] MEM_READ[1] MEM_READ[2]

ADDR_WRITE[0] ADDR_WRITE[1]

DATA_WRITE[0] DATA_WRITE[1]

X

X

XX

X MEM_WRITE[0] MEM_WRITE[1]

DTOF

X

Combinedevents(8x8pixels) Ignoredevent Ignoredevent

Q

in1 in2A

Q1

Q2

Q3

Q4

Q5

Q6

Q7

Q8

Q63

Q64

Q61

Q62

Q59

Q60

Q57

Q58

QUENCHING

READOUT

ACCESS&CONTROL

SERIALIZER

DECISIONTREE6levelsofDECISIONMAKERS6levelsofMUXforADDRESS

QUENCHINGARRAYPassivequenchingMasking&multi-modeoutput

QUENCHING ARRAY (8 x 8)

*SAFF(Sense-AmplifierFlip-Flop)usedforROphasesampling**DFF(Standard-cellD-typeFlip-Flop)usedforcountersampling

A.R. Ximenes, P. Padmanabhan, M.J. Lee, Y. Yamashita, D.N. Yaung, E. Charbon, ISSCC 2018

3D-Stacked Chip Micrograph

89 © 2018 Edoardo Charbon

LiDAR Demonstrator

Interference

ModulatedLaser

HistTransmitted

1MHz

TDC

Mod+

- Chip+FPGA

1MHz

Laser

INTERFERENCE

Interferencew/omodulation

Recoveredsignal

Received

Hist

LaserSignature

X-control

Y-control

Dual-axisscanner

Waveformgenerator

Beamsplitter

Laser

TDCcode

TDCcode

Horizontalresolution

Verticalresolution

90 © 2018 Edoardo Charbon

Distance Measurements

Backgroundlight+DCR

UnmodulatedInterference

Target

-12.6dB

-15.6dB

-18.6dB

91 © 2018 Edoardo Charbon

Interference Suppression

Backgroundlight+DCR

UnmodulatedInterference

Target

-12.6dB

-15.6dB

-18.6dB

92 © 2018 Edoardo Charbon

256x256 3D Image Reconstruction

93 © 2018 Edoardo Charbon

Large TDC Arrays

Instead of a large VCO distributing the sync to a large array of TDCs… build a large array of continuously running VCOs Pros

–  Individual FOM improved by 10 log (M) –  Synchronization is ~1ps –  PVT robust –  Robust to local disruptions

CTRL

PLL

Individual FOM improved by 10·log10 (M)

Overall constant FOM

94 © 2018 Edoardo Charbon

Mutual Coupling

© 2018 Edoardo Charbon

•  Use injection locking for coupling VCOs •  The PLL only forces the desired frequency on the VCOs

10k 100k 1M 10M 100M-160

-140

-120

-100

-80

-60

-40

-20

Phas

e N

oise

[dB

c/H

z]

Freq [Hz]

1 x 1 2 x 2 4 x 4 8 x 8 16 x 16

10·log10 (M)

CTRL

PLL

Individual FOM improved by 10·log10 (M)

Overall constant FOM

95

Mutual Coupling

96

Σ

TDC

CTRL

8

8

8

en

en

en

en

en enTo Left RO

To Right RO

To Bottom RO

To Top RO

Coupling Element

3D stacked technology• Only SPADs on top tier• Only processing bottom tier

A

IMAGERModularFlexible

10b

A.R

. Xim

enes, P. Padm

anabhan, E. C

harbon, IISW

2017

Mutual Coupling Measurements

© 2018 Edoardo Charbon 97

coupled

uncoupled

Phas

e N

oise

(dBc

/Hz)

-130

-120

-110

-100

-90

-80

-70

Offset Frequency (Hz)

~18dB

Center Frequency = 500 MHz

uncoupled

coupled

-1401M 10M 100M

Mutual Coupling Measurements

© 2018 Edoardo Charbon 98

uncoupledcoupled

uncoupled: 22 ~ 26%coupled: <0.11%

coupled

uncoupled

coupled

uncoupled

~18dB

~14dBFreq

uenc

y (M

Hz)

100

200

300

400

500

600

700

800

900

1000

Control Voltage (V)0.65 0.6 0.55 0.5 0.45

Phas

e N

oise

(dBc

/Hz)

Inte

grat

ed ji

tter

(ps)

20

40

60

80

0

-100

-95

-90

-85

-105

-110

# Oscillator0 10 20 30 40 50 60

0 10 20 30 40 50 60

Frequency variationPhase noise (3 MHz) and

integrated jitter @ 500MHz

Perspectives for 2020

•  Sub-65nm CMOS •  Large, scalable designs (Lego™ approach) •  Backside illumination (BSI) 3D IC •  Hybrid approaches (InP, GaAs, Ge, polymers) •  Cryogenic operation

99 © 2018 Edoardo Charbon

Moore’s Law Will Help

2006

1 kpixel

32 pixel

10 kpixel

100 kpixel

1 Mpixel 0.8 CMOS 0.35 CMOS

2009

65nm CMOS

2012 2020 2015

128x2

130nm CMOS

1M

512x256

128x128 160x128

64x48

112x4

2003

3D IC SPAD 100 © 2018 Edoardo Charbon

Quantum Computing

The2012NobelPrize

102©2018EdoardoCharbon

Frombitstoqubits•  AquantumbitorqubitisaquantumsysteminwhichtheBooleanstates0and1arerepresentedbyapairofmutuallyorthogonalquantumstateslabeledas

•  QuantumproperAes:superposi7onandentanglement

103

0 , 1

©2018EdoardoCharbon

Semiconductor quantum dots

Superconducting circuits Impurities in diamond or silicon

Semiconductor-superconductor hybrids

Source:L.Vandersypen,2017

Qbits on a Chip

©2017EdoardoCharbon 104

QuantumComputerArchitecture

•  Carrierfrequency:100MHz–15GHz,70GHz•  Pulses:10–100ns

[DiCarlo]

[L.Vandersypen]

Control

Read-out

Quantumbits(qubits)

Quantumprocessor(≪1K)

Classicalcontroller

105©2018EdoardoCharbon

QuantumComputerArchitecture

•  Carrierfrequency:100MHz–15GHz,70GHz•  Pulses:10–100ns

[DiCarlo]

[L.Vandersypen]

Control

Read-out

Quantumbits(qubits)

©2018EdoardoCharbon 106

AReal-lifeQuantumComputer

4K

20mK

300K

x8qubitsx8qubits

77K

107©2018EdoardoCharbon

PossibleSolu7ons

•  Proposedsolu7on–  Electronicsat4K–  OnlyconnecAonsto4Kto20mKareneeded

108

T=20mK T=4K T=300K

ElectronicReadout&control

T=20mK T=4K

ElectronicRead-out&control

T=300K

[Ristèetal.2014-15]

5-qubitcomputer

©2018EdoardoCharbon

PossibleSolu7ons

•  Proposedsolu7on–  Electronicsat4K–  OnlyconnecAonsto4Kto20mKareneeded

•  Ul7matesolu7on

–  Qubitsat4K– MonolithicintegraAon

109

T=20mK T=4K T=300K

ElectronicReadout&control

T=20mK T=4K

ElectronicRead-out&control

T=300K

[Ristèetal.2014-15]

5-qubitcomputer

©2018EdoardoCharbon

ElectronicReadout&Control

110

20-100mK

1-4K 300K

ADC

ADC

DAC

DAC

MUX

DEMUX

QuantumProcessor

TSensorsBias/References

TDC

Digitalcontrol(ASIC/FPGA)

OPTICAL GUIDE APD

E.Charbonetal.,IEDM2016

©2018EdoardoCharbon

CoolingPowerIssue

Courtesy:Oxfordinstruments

20mK

100mK

4K

70K

300K

Dilu7onrefrigerator

T(K)

111©2018EdoardoCharbon

ScalabilityIssue

•  Noisebudget…...........................................<0.1nV/√Hz•  Powerbudget(forscalability)….................<<2mW/qubit•  Physicaldimensions(forscalability)….......30nm•  Bandwidth(formulAplexing)…..................1-12GHz•  Kick-backavoidance

©2018EdoardoCharbon 112

CryogenicElectronics

Cryo-CMOSTechnologies

115

Thinoxide Thickoxide

Thinoxide Thickoxide

DeviceModeling(160nm)

R.M.Incandelaetal.,ESSDERC2017

Thickoxide

116

Thinoxide Thickoxide

Thinoxide

DeviceModeling(40nm)

R.M.Incandelaetal.,ESSDERC2017

BJTsandDTMOSinmKdomain

BJT

DTMOS

4K* 1K 40mK

•  BJTscanworkasbandgapreferenceatT>77K•  DTMOScanbeusedasbandgapreferenceatcryotemperatures

117

To be published

To be published

H. Homulle, E. Charbon, F. Sebastiano, JEDS 2018

SubstrateResis7vity

118©2018EdoardoCharbon

SPICEModels,Farms

•  Wecreatedmodelsfor4KcomponentsinVerilog-AMS,BSIM6,PSP

•  Wearebuildingacompletemodeltoolkitfor40nmand160nmCMOStechnologies

•  Modelsaretestedusingcryogeniccomponentfarms

119©2018EdoardoCharbon

CryogenicCircuits&Systems

Cryo-FPGAs

121

HaraldHomulle

•  Artix-7 full operation down to 4K •  Other FPGAs only limited to 30K

©2018EdoardoCharbon

•  All FPGA components are working in the cryogenic environment down to 4K

•  No modifications required

FPGAfunc7onality

Component Functional Behavior IOs ✓ LVDS ✓ LUTs ✓ Delay change < 5% CARRY4 ✓ Delay change < 2% BRAM ✓ No corruption (800 kB) MMCM ✓ Jitter reduction of roughly 20% PLL ✓ Jitter reduction of roughly 20% IDELAYE2 ✓ Delay change of up to 30% DSP48E1 ✓ No corruption over 400 operations

122©2018EdoardoCharbon

A/DconversiononFPGA•  Principle

–  Timestampthecross-overofinputwithreferenceramp–  UseTDCforAmestamping

•  Booleneck:weareboundtotheCMOStechnologyoftheFPGAC LK /

S TOP

S TART

VREF

V IN

VOUT

2 .5 nsC LK /S TOP

S TART

VREF

V IN

VOUT

2 .5 nsC LK /S TOP

S TART

VREF

V IN

VOUT

2 .5 ns

126©2018EdoardoCharbon 126

ADConFPGA(1.2GSa/s)

15K

Signalbandwidth:2MHz Signalbandwidth:40MHz

300K

H.Homulleetal.,TCASI,63(11),1854-1865,2016

©2018EdoardoCharbon 127

ADConFPGA

15K

300K

Signalbandwidth:2MHz Signalbandwidth:40MHz

H.Homulleetal.,TCASI,63(11),1854-1865,2016

©2018EdoardoCharbon 128

1 2 3 4 5 610-2

10-1

100

101

102

103

104

Excess Bias (V)

DCR

(Hz)

77K120K

160K

200K

250K

300K

300K250K

200K160K

120K

77K

SPAD GreenSPAD Red

50 100 150 200 250 300 350

18

20

22

24

26

28

30

SPAD Green, Φ=12µm

SPAD Red, Φ=12µm

Temperature (K)

Brea

kdow

n Vo

ltage

(V)

p+

n-

n+

V

+

- depletion region

Reverse bias

R Q

VIA

VOP’

OperaAoninproporAonalandGeigermode(SPAD)

Cryo-SPADs

©2017EdoardoCharbon

E.Charbonetal.,ISSCC2017

131

Cryo-SPADs

©2017EdoardoCharbon

1 2 3 4 5 610-2

10-1

100

101

102

103

104

Excess Bias (V)

DCR

(Hz)

77K120K

160K

200K

250K

300K

300K250K

200K160K

120K

77K

SPAD GreenSPAD Red

50 100 150 200 250 300 350

18

20

22

24

26

28

30

SPAD Green, Φ=12µm

SPAD Red, Φ=12µm

Temperature (K)

Brea

kdow

n Vo

ltage

(V)

180nmCMOSbulkB.Patraetal.,JSSC2018

132

Cryo-LNA

•  Standard160nmCMOS•  500MHzBandwidth•  0.1dBNoisefigure•  7Knoise-equivalenttemperature

©2018EdoardoCharbon

E.Charbonetal.,ISSCC2017

134

Cryo-LNA

©2018EdoardoCharbon

B.Patraetal.,JSSC2017

135

Buildingup

IEEESensors2016

20-100mK

1-4K 300K

ADC

ADC

DAC

DAC

MUX

DEMUX

QuantumProcessor

TSensorsBias/References

TDC

Digitalcontrol(ASIC/FPGA)

OPTICAL GUIDE APD

136

IEEEISSCC2017IEEEISSCC2017/JSSC2018

Tobepublished2018

IEEEIEDM2016

IEEEIEDM2016

©2018EdoardoCharbon

IEEEISSCC2017

2DReadoutandControl

•  UseimagingsensorreadoutasinspiraAon

•  Reducenumberoftransistors(ideallytozero)

•  Usetunnelingbarriersasselectors

•  (limited)useof3Dstacking ∂∂∂

∂20mK

4K

©2018EdoardoCharbon 137

PuZngThingsinContext

©H.Homulle2016

©2018EdoardoCharbon

Conclusions

140

Take-homeMessages•  LargearraysofTDCsforTOAarenecessarytoanumberofemergingfields

•  ModularityisanimportantingredienttolargeTDCarraysbutoneneedstobeawareofsynchronizaAon,reliability,anduniformityissues

•  3D-stacking/3DintegraAonisbecomingawayoflife!

•  QuantumCompuAngwillneedthesecircuitsbutwillrequirecryogenicoperaAon

©2018EdoardoCharbon 141

Acknowledgements

142©2018EdoardoCharbon

SwissNaAonalScienceFoundaAonEuropeanSpaceAgencyFP6andFP7NCCR-MICSNOW-STWNIH

h]p://aqua.epfl.ch

©2018EdoardoCharbon