Monitoring and Control of Temperature in Networks-on-Chip
Tim Wegner, Claas Cornelius, Andreas Tockhorn, Dirk Timmermann
University of Rostock, Institute of Applied Microelectronics and Computer Engineering
MEMICS 2010, Mikulov, Czech Republic, October 22-24



Outline

1. Introduction

2. Networks-on-Chip (NoCs)

3. Impact of Temperature on Reliability

4. Monitoring & Control of Temperature in NoCs

5. Summary

1. Introduction

[Figure: transistor count over time. 1954: IBM 704 mainframe; 1981: IBM PC 5150; 2007: Apple iPhone]

Impacts of technological development
• Increasing integration density → rising complexity, shrinking device sizes
• NoCs able to deal with arising requirements (e.g. for communication)
• But: Reliability becomes a dominant factor for chip design

Goal: Increase reliability in NoC-based systems


2. Networks-on-Chip

[Figure: NoC with four IP cores, four routers (R) and clock domains CLK0 to CLK3]

Properties
• Infrastructure for on-chip interconnection
• Point-to-point links replace long global buses
• Parallel packet-based communication
• Separation of communication & computation
• Globally asynchronous locally synchronous (GALS)
• Modularity of IP cores (not part of actual NoC) → reusability, high abstraction level

NoCs are able to satisfy requirements of modern VLSI systems


3. Impact of Temperature on Reliability

Impacts of technological progress
• Increasing integration densities, progress of nanotechnology
• Growing number of transistors per chip = raised probability of failure
• Decreasing structural size of ICs = higher susceptibility to environmental influences & deterioration

• Intel 8086 (1978): ≈879 transistors/mm²
• Intel Bloomfield (2008): ≈2.78 million transistors/mm²

3. Impact of Temperature on Reliability

Why is thermal awareness important?
• Particular physical effects (e.g. TDDB, EM) contribute to deterioration
  • Abetted by high temperatures
• Correlation between temperature & failure mechanisms established by Arrhenius model
  • Exponential decrease of IC lifetime with temperature: $T_{fail} \propto e^{E_a / (k_b \cdot T)}$

Growing influence of on-chip temperature distribution on lifetime, operability, performance etc.


4. Monitoring and Control of Temperature for NoCs

Objective:
• Mitigate effects contributing to deterioration & delay occurrence of failures
• Control of on-chip temperature distribution

Effective mechanisms to monitor & control on-chip temperature

Requirements:
• Integration into existing NoC
• Preservation of modularity & reusability
• Minimum costs (area, frequency)
• Maximum performance of monitoring and control
• Minimum impact on system performance

4.1 Mechanisms for monitoring

Concept: attach physical monitoring probes to every IP core

Event-driven:
• Temperature variation $\Delta T$
• Continuous checking of $T_{IPC}$
• $|T_{IPC,old} - T_{IPC,new}| \geq \Delta T$ ? → Report $T_{IPC,new}$
• Area: 66 LUT/FF pairs
• Frequency: 227 MHz

Time-driven:
• Period of time $\Delta t$
• Report $T_{IPC,new}$ every $\Delta t$
• Area: 80 LUT/FF pairs
• Frequency: 338 MHz
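The two reporting policies can be sketched in software as follows. This is only a minimal Python illustration, not the probe hardware from the talk: the function names, the sample trace and the chosen values of ∆T and ∆t are hypothetical, and the sketch assumes that "old" means the last reported temperature and that the first sample is always reported.

```python
from typing import Iterable, List, Optional, Tuple

Report = Tuple[int, float]  # (sample index, reported temperature)


def event_driven_reports(samples: Iterable[float], delta_T: float) -> List[Report]:
    """Report T_IPC,new whenever |T_IPC,old - T_IPC,new| >= delta_T."""
    reports: List[Report] = []
    t_old: Optional[float] = None
    for i, t_new in enumerate(samples):
        # Assumption: the first sample is reported to establish a baseline.
        if t_old is None or abs(t_old - t_new) >= delta_T:
            reports.append((i, t_new))
            t_old = t_new  # Assumption: "old" is the last reported value.
    return reports


def time_driven_reports(samples: Iterable[float], delta_t: int) -> List[Report]:
    """Report T_IPC,new every delta_t samples, however little it changed."""
    if delta_t <= 0:
        raise ValueError("delta_t must be a positive number of samples")
    return [(i, t) for i, t in enumerate(samples) if i % delta_t == 0]


# Hypothetical temperature trace of one IP core in degrees Celsius.
trace = [50.0, 50.2, 50.1, 53.5, 53.6, 53.4, 49.9, 50.0]
event = event_driven_reports(trace, delta_T=2.0)
timed = time_driven_reports(trace, delta_t=2)
```

On this trace the event-driven policy sends three reports (at the two temperature jumps plus the baseline), while the time-driven policy sends four, two of which carry almost unchanged values.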

4.2 Mechanisms for control

[Figure: NoC with four routers (R), three IP cores and one CCU]

Central Control Unit (CCU):
• Reception & interpretation of probe packets
• Instructions for Dynamic Frequency Scaling to probes (if necessary)
• Area: 507 LUT/FF pairs
• Frequency: 165 MHz

Note: Not the smartest approach, but suffices to test functionality
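How a central unit might turn probe packets into Dynamic Frequency Scaling instructions can be sketched as below. The slides do not describe the CCU's decision rule, so everything here is an assumption made for illustration: the packet fields, the two-threshold policy and all numeric values are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class ProbePacket:
    core_id: int
    temperature: float  # T_IPC,new as reported by a probe


@dataclass
class DfsInstruction:
    core_id: int
    frequency_mhz: float


class CentralControlUnit:
    """Toy CCU: throttle a core at or above t_hot, restore it at or below t_cool."""

    def __init__(self, f_max_mhz: float = 100.0, f_low_mhz: float = 50.0,
                 t_hot: float = 85.0, t_cool: float = 70.0) -> None:
        # All four parameters are hypothetical values chosen for this sketch.
        self.f_max = f_max_mhz
        self.f_low = f_low_mhz
        self.t_hot = t_hot
        self.t_cool = t_cool
        self.current: Dict[int, float] = {}  # core_id -> assigned frequency

    def handle(self, packet: ProbePacket) -> Optional[DfsInstruction]:
        """Interpret one probe packet; return an instruction only if necessary."""
        current = self.current.get(packet.core_id, self.f_max)
        if packet.temperature >= self.t_hot:
            target = self.f_low
        elif packet.temperature <= self.t_cool:
            target = self.f_max
        else:
            target = current  # between the thresholds: keep the setting
        if target == current:
            return None
        self.current[packet.core_id] = target
        return DfsInstruction(packet.core_id, target)


ccu = CentralControlUnit()
replies = [ccu.handle(ProbePacket(0, t)) for t in (60.0, 90.0, 80.0, 65.0)]
```

Using two thresholds instead of one keeps the sketch from toggling the frequency on every small fluctuation; a real controller could use any other policy.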

4.3 Integration of monitoring

• 3 approaches: into IP core, router port of IP core, extra router port
• Different impact on performance & costs

[Figure: three variants for attaching the probe (P) to IP core and router (R), annotated with the following penalties]
• Area penalty: 30.5%, freq. penalty: 8.2%
• Area penalty: 7.3%, freq. penalty: / (but Mux/Demux)
• Area penalty: /, freq. penalty: /

4.4 Impact on system performance

4.5 Performance of monitoring & control

5. Summary

• Implementation of 2 approaches for monitoring on-chip temperature + 3 methods for integration into NoC
• Investigation of:
  • Costs (area, frequency)
  • Impact on system performance
  • Performance of monitoring & control

Conclusion
• Event-driven approach preferable (situational monitoring, better performance, no redundant traffic, lower area costs)
• Integration into NoC using router port of IP core: best trade-off between costs & preservation of modularity/non-intrusiveness

Thanks for your attention! Any questions?

Contact: [email protected]
Homepage: www.networks-on-chip.com

University of Rostock, Germany
Institute of Applied Microelectronics and Computer Engineering

Arrhenius Model

• Establishes relationship between temperature and failure mechanisms
• Describes dependence of chemical reactions on temperature changes
• Assumption: all other parameters constant

$T_{fail} \propto e^{E_a / (k_b \cdot T)}$

[Figure: $T_{fail}$ plotted over temperature]

Lifetime of ICs decreases exponentially with temperature
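To make the exponential dependence concrete, the ratio of expected lifetimes at two temperatures can be computed directly from the Arrhenius relation, because the unknown proportionality constant cancels. The sketch below assumes that T is the absolute temperature in kelvin, that E_a is the activation energy and k_b the Boltzmann constant, and it uses an activation energy of 0.7 eV purely as an illustrative value that does not come from the slides.

```python
import math

K_B_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def lifetime_ratio(t_ref_celsius: float, t_hot_celsius: float,
                   e_a_ev: float = 0.7) -> float:
    """Return T_fail(t_hot) / T_fail(t_ref) for T_fail ∝ exp(E_a / (k_b * T)).

    e_a_ev = 0.7 is a hypothetical activation energy for illustration only.
    """
    t_ref = t_ref_celsius + 273.15  # convert to kelvin
    t_hot = t_hot_celsius + 273.15
    return math.exp(e_a_ev / K_B_EV_PER_K * (1.0 / t_hot - 1.0 / t_ref))


# Relative lifetime at 80 °C compared with 60 °C under the stated assumptions.
ratio = lifetime_ratio(60.0, 80.0)
```

With these assumed numbers a rise of 20 °C shortens the expected lifetime to roughly a quarter, which illustrates why even moderate hot spots matter.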

Time Dependent Dielectric Breakdown (TDDB)

• Inoperability of transistor through gate oxide breakdown (long-term)

Electromigration (EM)

• Transport of material in conductors (i.e. wires)
• Cause: ion movement induced by current flow (ions' mobility increases with temperature)
• Effects:
  • Hillocks → short circuits
  • Voids → interruption of current paths

Intel Processors

Intel 8086:
• Year: 1978
• 29k transistors
• 33 mm²
• 879 transistors/mm²

Intel Bloomfield:
• Year: 2008
• 731 million transistors
• 263 mm²
• 2,779,467 transistors/mm²
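The density figures follow from dividing transistor count by die area. A short check, using only the numbers listed above (the function name is chosen here for illustration):

```python
def transistor_density(transistors: int, area_mm2: float) -> float:
    """Transistors per square millimetre."""
    return transistors / area_mm2


d_8086 = transistor_density(29_000, 33)              # Intel 8086 (1978)
d_bloomfield = transistor_density(731_000_000, 263)  # Intel Bloomfield (2008)
growth = d_bloomfield / d_8086                       # density increase factor
```

The resulting growth factor is a little above 3,000 over the 30 years between the two processors.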


Impact on system performance



Performance of monitoring & control


Synthesis results for monitoring & control

                          Frequency [MHz]   Area [LUT/FF pairs]
Component
  Event-driven probe      227               66
  Time-driven probe       338               80
  Central Control Unit    165               507
Integration method
  Into IP core            122               1901
  Using IP core port      119               1896
  Extra port              112               2312

Unmodified NoC router: 1771 LUT/FF pairs, 122 MHz
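Relative overheads can be derived from these synthesis results by comparing each integration method with the unmodified router. The sketch below assumes that a penalty is defined as the relative change against that baseline, a definition the slides do not state explicitly; the dictionary layout and function name are hypothetical.

```python
from typing import Dict, Tuple

# Unmodified NoC router, taken from the synthesis results.
BASELINE = {"area": 1771, "freq_mhz": 122}

# Integration methods as listed in the synthesis results.
VARIANTS: Dict[str, Dict[str, int]] = {
    "Into IP core": {"area": 1901, "freq_mhz": 122},
    "Using IP core port": {"area": 1896, "freq_mhz": 119},
    "Extra port": {"area": 2312, "freq_mhz": 112},
}


def penalties(variant: Dict[str, int],
              baseline: Dict[str, int] = BASELINE) -> Tuple[float, float]:
    """Return (area penalty %, frequency penalty %) relative to the baseline."""
    area_pct = 100.0 * (variant["area"] - baseline["area"]) / baseline["area"]
    freq_pct = (100.0 * (baseline["freq_mhz"] - variant["freq_mhz"])
                / baseline["freq_mhz"])
    return round(area_pct, 1), round(freq_pct, 1)


results = {name: penalties(values) for name, values in VARIANTS.items()}
```

Under this assumed definition the extra-port figures reproduce the 30.5% area penalty and 8.2% frequency penalty quoted for the integration approaches.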