Sensor Placement and Allocation.pdf

Machine Learning Approach to Sensor Allocation and Placement in System-on-Chips (SoCs). Santanu Sarma, Centre for Embedded and Cyber-Physical Systems (CECS), University of California, Irvine. Spring 2013

Transcript of Sensor Placement and Allocation.pdf

Page 1: Sensor Placement and Allocation.pdf

Machine Learning Approach to Sensor Allocation and Placement in System-on-Chips (SoCs)

Santanu Sarma, Centre for Embedded and Cyber-Physical Systems (CECS)

University of California, Irvine, Spring 2013

Page 2: Sensor Placement and Allocation.pdf

Impact of Temperature

• Elevated temperatures directly impact all key circuit metrics, including lifetime and reliability, speed, power, and cost.

• Hot spots reduce the mean time to failure, as most failure mechanisms have strong temperature dependencies [Pedram2006].

• Different thermal expansion coefficients of chip materials cause mechanical stresses that can eventually crack the chip/package interface [Brooks2007].

• The exponential dependency of leakage power on temperature further increases total power and could lead to thermal runaway [Lin2008].

6/1/12   ©  Santanu  Sarma,  UCI     2  

Page 3: Sensor Placement and Allocation.pdf

Impact of Temperature

• The failure rate due to thermal cycling increases with the magnitude and frequency of the temperature cycles [JEDEC2006].

• A 10 °C rise in temperature can reduce the chip lifetime by half [http://www.nanowerk.com].

• Increasing temperature increases local resistances, and thus circuit delays and IR drop [Santarini2005].

• Elevated temperatures also slow down devices and interconnects, leading to timing failures [Cheng1998], [Pileggi2006].
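
The "10 °C halves the lifetime" rule of thumb in the list above can be turned into a quick estimate. The sketch below assumes the halving compounds exponentially for larger rises, which is an extrapolation and not something the slide states; the function name and default are illustrative.

```python
def relative_lifetime(delta_t_celsius: float, halving_step: float = 10.0) -> float:
    """Lifetime relative to baseline, assuming it halves every `halving_step` degC of rise."""
    return 0.5 ** (delta_t_celsius / halving_step)


for rise in (0, 10, 20, 30):
    # Assumed exponential extrapolation of the rule of thumb.
    print(f"+{rise} degC -> {relative_lifetime(rise):.3f} x baseline lifetime")
```

Under this assumption a 20 °C rise leaves a quarter of the baseline lifetime.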


Page 4: Sensor Placement and Allocation.pdf

Impact of Temperature

• Inaccuracies in thermal tracking decrease the processor's performance and waste power. In particular, it was shown that 1 °C of accuracy translates to 2 W of power savings, and that, because sensors are not close to the hot spots, sensor measurements and hot spot temperatures can differ by up to 10 °C [Rotem2006].

• In mobile computers, 1.5 °C of accuracy in temperature measurement is equivalent to 1 W of CPU power [Rotem2006].

• An inaccuracy of 1 °C in thermal estimates can trigger DTM and cause an unwanted performance loss of up to 14.4% [Zhang 2011].

Page 5: Sensor Placement and Allocation.pdf

Thermal Profile Characteristics

• Hot spot locations and temperatures are application dependent [Skadron2005]

– Hot spots will not always remain in the same locations on the chip during execution of a single program [Hamann2007, IBM]

– Various applications running on the same chip will show hot spots in different regions [Hamann2007, IBM]

– Within-die temperature variation can be up to 50 °C [Borkar2003]. A large number of delay violations would occur if the peak temperature exceeds 85 °C [Skadron2003].

– Maximum temperature varies across layers [Im2000]

Page 6: Sensor Placement and Allocation.pdf

Thermal Distribution During Booting [Hamann2007, IBM]

Page 7: Sensor Placement and Allocation.pdf

Thermal Gradient as High as 50 °C

Temperatures for gcc benchmark [Han2007]

Page 8: Sensor Placement and Allocation.pdf

Within-Die Temperature Variation

Within-die maximum temperature variation of up to 50 °C [Borkar2003]

Page 9: Sensor Placement and Allocation.pdf

Sensor Placement Far Away From Hottest Block

Temperature for nbench benchmark [Han2006]

Page 10: Sensor Placement and Allocation.pdf

Maximum Temperature Distribution Across Layers

Maximum temperature distribution along vertical distance from the substrate to the top metal layer [Im2000]

Page 11: Sensor Placement and Allocation.pdf

Maximum Temperature Distribution Across Benchmarks

[Srinivasan 2004]

Page 12: Sensor Placement and Allocation.pdf

Thermal Profile Characteristics

Properties | Magnitude | Remarks

Independent Dimensions | Spatial & temporal | In all three dimensions and time; technology and workload dependent

Maximum Spatial Variations | Up to 40-60 °C across adjacent blocks | On the order of block size

Maximum Spatial Gradients | Up to 40-60 °C per nm^2 |

Maximum Temporal Variations | Up to 40-100 °C in seconds in the same block | [0.1 °C / 30 µs ≈ 3333 °C/s] [Skadron2003]

Maximum Temporal Gradients | ? |

Page 13: Sensor Placement and Allocation.pdf

Sensor Placement Requirements

• The sensor placement configuration must suffice for all hot spots that may arise during the execution of any program

– A solution optimized for a single application is unlikely to be sufficient for other workloads

• The overheads of using a large number of sensors need to be minimized

• Placement needs to ensure an overall reduction in the thermal profile reconstruction error

Page 14: Sensor Placement and Allocation.pdf

Problem Classification

Placement Techniques

• Based on Approach
– Static placement
– Dynamic placement [Buedo2004]

• Based on Geometry
– Uniform
– Non-uniform

• Based on Transformed Domain

Reconstruction Techniques

• Direct Methods

• Inverse Methods
– Based on Sensing Approach
– Based on Sampling Mechanism
– Based on Transformed Domain
– Based on Information Theory

• Adaptive Methods
– Direct/Indirect [Sharif 2010]
– Phase based [Reda 2012]
– Predictive Model based [Coskun 2009]

Page 15: Sensor Placement and Allocation.pdf

Problem Classification

[Diagram: classification tree]

THERMAL CHARACTERIZATION

– SENSOR PLACEMENT: APPROACH, GEOMETRY, XFORMS

– RECONSTRUCTION: DIRECT, INVERSE, ADAPTIVE

– HYBRID (IR Imaging)

Page 16: Sensor Placement and Allocation.pdf

Thermal Characterization

• Design-Time Technique
– Allocate sensors near potential hot spot locations
– Sensor placement algorithms fall into these categories:
  • Uniform Sensor Placement
  • Nonuniform Sensor Placement
  • Hybrid

• Run-Time Technique
– Full thermal map characterization & hot spot detection
– Categories:
  • Direct Method
  • Inverse/Indirect Method
  • Hybrid

Page 17: Sensor Placement and Allocation.pdf

Thermal Sensor Placement Techniques

• Approach:
– Static Placement Technique
  • Dynamic Selection & Scheduling [Forte 2013]
– Dynamic Placement Technique [Buedo2004]

• Geometric Sensor Placement
– Uniform
– Non-Uniform

• Transformed Domain Placement

• Information Theoretic Sensor Placement

Page 18: Sensor Placement and Allocation.pdf

Uniform Sensor Placement

• Intended for use with chips that have an unknown typical thermal pattern

• Sensors are placed in a uniform static grid throughout the entire chip

• Only a fine-grained grid of sensors is capable of achieving near-perfect accuracy

• Significant cost restrictions are associated with sensor overheads
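
A uniform static grid as described above is simple to generate. The sketch below places one sensor at the centre of each cell of a rows × cols grid; the die dimensions and the function name are illustrative assumptions, not values from the slides.

```python
def uniform_grid(width: float, height: float, rows: int, cols: int) -> list[tuple[float, float]]:
    """Return sensor (x, y) coordinates at the centres of a rows x cols grid of equal cells."""
    cell_w, cell_h = width / cols, height / rows
    return [((c + 0.5) * cell_w, (r + 0.5) * cell_h)
            for r in range(rows) for c in range(cols)]


# Hypothetical 16 mm x 16 mm die with a 4 x 4 grid (16 sensors).
sensors = uniform_grid(width=16.0, height=16.0, rows=4, cols=4)
print(len(sensors), sensors[0], sensors[-1])
```

Note that the placement needs no thermal profiling data at all, which is why the hot-spot-to-sensor distance cannot be controlled.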


Page 19: Sensor Placement and Allocation.pdf

Uniform Sensor Placement

[Figure panels: Actual | Sensed] [Skadron 2005, Sankaranarayanan2009]

Page 20: Sensor Placement and Allocation.pdf

Uniform Sensor Placement

[Sankaranarayanan2009]

Page 21: Sensor Placement and Allocation.pdf

Uniform Sensor Placement

Advantages
• Does not rely on thermal profiling data
• No knowledge of hot spot locations and temperatures needs to be acquired before implementing a technique of this type

Disadvantages
• The accuracy of the uniform grid model is limited
• Distances between the sensor locations and the hot spots cannot be minimized
• May not detect hot spots as accurately as the same number of sensors located near common hot spots

Page 22: Sensor Placement and Allocation.pdf

Interpolation-Based Sensing [Memik 2008]

• Uses uniform sensor placement

• A straightforward linear interpolation approach

• Accounts for the restriction on grid granularity by refining the temperature measurements

• An interpolation scheme with a 4 × 4 grid of sensors improves upon a static uniform grid of the same size with no interpolation by an average of 1.59 °C
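
To illustrate how interpolation refines readings from a uniform grid, the sketch below uses bilinear interpolation, the two-dimensional form of linear interpolation, between the four sensors surrounding a query point. This is an illustrative choice and not necessarily the exact scheme of the cited work; the sensor readings are made up.

```python
def bilinear(x, y, x0, y0, x1, y1, t00, t10, t01, t11):
    """Estimate T(x, y) inside the rectangle [x0, x1] x [y0, y1] from its corner sensors.

    t00 is the reading at (x0, y0), t10 at (x1, y0), t01 at (x0, y1), t11 at (x1, y1).
    """
    fx = (x - x0) / (x1 - x0)              # fractional position along x
    fy = (y - y0) / (y1 - y0)              # fractional position along y
    bottom = t00 * (1 - fx) + t10 * fx     # interpolate along the lower edge
    top = t01 * (1 - fx) + t11 * fx        # interpolate along the upper edge
    return bottom * (1 - fy) + top * fy    # blend the two edges


# Hypothetical readings (degC) at the corners of one grid cell.
print(bilinear(0.5, 0.5, 0, 0, 1, 1, 60.0, 70.0, 80.0, 90.0))  # centre of the cell
```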


Page 23: Sensor Placement and Allocation.pdf

Interpolation-Based Sensing

[Sankaranarayanan2009]

Page 24: Sensor Placement and Allocation.pdf

Non-uniform Sensor Placement

• Intended for use where thermal maps from typical chip execution across several applications are available

• Takes advantage of the known hot spots to determine the most advantageous locations

• Methods:
– Hot Spots based Placement
– Analytical Model based Non-Uniform Placement [Skadron2005a]
– Quality-Threshold Clustering [Yun 2008]
– K-Means Clustering [Mukherjee 2006]
– Power-Driven Correlation Clustering Based [Wang 2013]

Page 25: Sensor Placement and Allocation.pdf

Hot Spots Based Placement

• Hot spot based placement: place a sensor on each hot spot found through thermal profiling across several applications

Page 26: Sensor Placement and Allocation.pdf

Hot Spots Based Placement

Advantages
• Easy to detect hot spots for a given set of thermal maps
• Can detect thermal violations with a limited number of sensors (fewer than uniform placement)
• Temperatures found via thermal profiling of several applications can provide good estimates

Disadvantages
• Hot spot locations and temperatures are application dependent
• A solution optimized for a single application will not be sufficient for other workloads
• There can be too many hot spot points, and hence many sensors

Page 27: Sensor Placement and Allocation.pdf

Analytical Model based Non-Uniform Placement [Skadron2005a]

• Non-uniform sensor placement based on hot spot locations and temperatures found via thermal profiling

• Describes the maximum radius R between a hot spot and a potential thermal sensor location, while capping the error to a degree ΔT

ΔT denotes the difference between the maximum and minimum temperature value in the chip

Page 28: Sensor Placement and Allocation.pdf

Analytical Model based Non-Uniform Placement [Skadron2005a]

[Figure: radius R]

Page 29: Sensor Placement and Allocation.pdf

Quality-Threshold Clustering [Yun 2008]

• Hot spot groupings and corresponding sensor locations are determined based on the values of Tmax

• Incorporates the analytical model of sensor placement radius with the quality threshold (QT) clustering algorithm commonly used in gene clustering

• Iterative technique that assigns hot spots to clusters
– based on their physical locations on the chip relative to the other hot spots

Page 30: Sensor Placement and Allocation.pdf


Page 31: Sensor Placement and Allocation.pdf

Quality-Threshold Clustering [Yun 2008]

• After a candidate hot spot is added, the sensor location for each cluster is refined to be the centroid of the included hot spots
– obtains the best possible sensor location for the given set of hot spot data points

• QT clustering resulted in placing 23 sensors in the Alpha 21364 with an average error of 0.2899 °C.

• To place fewer sensors using QT clustering:
– the hot-spot-to-sensor distance value must be increased,
– which may decrease the accuracy of the entire model's results.
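
The clustering loop described above can be sketched as follows. This is a simplified greedy variant written for illustration, not the implementation from the cited thesis: `max_radius` stands in for the radius R from the analytical model, and the hot spot coordinates are made up.

```python
import math


def qt_cluster(points, max_radius):
    """Greedy QT-style clustering: repeatedly keep the largest candidate cluster.

    A hot spot joins a candidate only if every member stays within
    max_radius of the refined centroid (the would-be sensor location).
    """
    def centroid(group):
        return (sum(p[0] for p in group) / len(group),
                sum(p[1] for p in group) / len(group))

    remaining = list(points)
    sensors = []
    while remaining:                      # runs until every hot spot is clustered
        best = []
        for seed in remaining:
            group = [seed]
            # Try the other hot spots nearest-first.
            for p in sorted(remaining, key=lambda q: math.dist(q, seed)):
                if p is seed:
                    continue
                trial = group + [p]
                c = centroid(trial)
                if all(math.dist(q, c) <= max_radius for q in trial):
                    group = trial
            if len(group) > len(best):
                best = group
        sensors.append(centroid(best))    # one sensor per cluster
        remaining = [p for p in remaining if p not in best]
    return sensors


hot_spots = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]   # hypothetical coordinates
sensors = qt_cluster(hot_spots, max_radius=1.5)
print(len(sensors), sensors)
```

With a tighter radius such as 0.4, the same data needs one sensor per hot spot, which is the trade-off between sensor count and distance noted in the last bullet.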


Page 32: Sensor Placement and Allocation.pdf

Quality-Threshold Clustering [Yun 2008]

Advantages
• The algorithm proves to be sufficient for monitoring thermal events
• Obtains the best possible sensor location for the given set of hot spot data points

Disadvantages
• The algorithm does not end execution until every hot spot is placed in a cluster
• Creates new clusters where necessary to include hot spots that are located far away from the others
• The number of sensors required by QT clustering may be too large for a practical design

Page 33: Sensor Placement and Allocation.pdf

K-Means Clustering Based Sensor Placement [Mukherjee 2006]

• Hot spots are placed into k different clusters, with a temperature sensor placed at the centroid of each cluster

• Cluster assignments are chosen such that the mean squared distance from each hot spot to the nearest cluster center is minimized

• Algorithm:
– First, the k cluster centers are chosen randomly from the set of known hot spot points
– Each hot spot is then assigned to the cluster with the nearest center
– Each cluster center is updated at the end of each iteration
– Euclidean distances between the hot spots and the cluster centers are then recomputed
– If a new minimum distance between a hot spot and a different cluster center is found, the hot spot is reassigned to the corresponding cluster
– The process is repeated until no hot spots are reassigned to a different cluster
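
The algorithm steps above correspond to standard (Lloyd-style) k-means. The sketch below is a minimal version for 2-D hot spot coordinates; the `init` and `seed` parameters and the sample coordinates are assumptions added so the example is reproducible.

```python
import math
import random


def kmeans(hot_spots, k, init=None, seed=0, max_iter=100):
    """Lloyd-style k-means; returns (sensor_locations, assignment)."""
    rng = random.Random(seed)
    # Initial centers: random hot spots unless given explicitly.
    centers = list(init) if init is not None else rng.sample(hot_spots, k)
    assignment = [None] * len(hot_spots)
    for _ in range(max_iter):
        # Assign each hot spot to the nearest center (Euclidean distance).
        new_assignment = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in hot_spots
        ]
        if new_assignment == assignment:
            break                          # no hot spot changed cluster
        assignment = new_assignment
        # Move each center to the centroid of its members.
        for j in range(k):
            members = [p for p, a in zip(hot_spots, assignment) if a == j]
            if members:                    # keep the old center if a cluster empties
                centers[j] = (sum(p[0] for p in members) / len(members),
                              sum(p[1] for p in members) / len(members))
    return centers, assignment


hot_spots = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]  # hypothetical
centers, assignment = kmeans(hot_spots, k=2, init=[(0, 0), (0, 1)])
print(centers, assignment)
```

Even with both initial centers deliberately chosen inside the same group, the iteration separates the two groups here and places one sensor at each centroid.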


Page 34: Sensor Placement and Allocation.pdf

k-Means Clustering Sensor Placement [Mukherjee 2006]

Thermal-Aware K-Means: place the temperature sensors closer to hot spots that typically have higher temperatures

Page 35: Sensor Placement and Allocation.pdf

Thermal-Aware k-Means Clustering [Mukherjee 2006]

Clustering results on the same hot spot set

Sensors have been placed closer to the hot spots of higher temperature and further from the hot spots of lower temperature

Page 36: Sensor Placement and Allocation.pdf

Thermal-Aware k-Means Clustering [Mukherjee 2006]

Advantages
• Thermal-gradient-aware k-means clustering is effective for single-core processors
• Works well under many conditions
• Better than [Long 2008] for a given number of sensors

Disadvantages
• Not appropriate for multi-core processors with strong inter-core thermal interaction
• Not always optimal in complex hot spot distribution scenarios and may produce solutions worse than the basic k-means approach
• Hot spots are often sorted into inappropriate clusters due to their common temperature regardless of position

Page 37: Sensor Placement and Allocation.pdf

Non-uniform Subsampling Method Based Placement [Sabuncu2004]

• In many-core architectures, there is a high likelihood of measuring a very large number of global hot spots

• The number of hot spots can be so large that clustering methods are not able to place a sufficient number of sensors near the hottest points

• To reduce the number of points to be clustered while maintaining a clear representation of the thermal data, non-uniform subsampling algorithms can be used

• Obtains a subset of key thermal analysis locations on a chip

• Types of non-uniform subsampling:
– Deterministic Subsampling
– Stochastic Subsampling

Page 38: Sensor Placement and Allocation.pdf

Deterministic Hotspot Subsampling [Sabuncu2004]

Samples are selected more frequently in regions of high gradient

Page 39: Sensor Placement and Allocation.pdf

Stochastic Hotspot Subsampling [Sabuncu2004]

More samples in the hottest region than in the coolest region
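
One way to obtain more samples in hotter regions is to draw grid cells with a probability that grows with temperature. The sketch below weights each cell by its temperature above the map minimum; this weighting rule, the function name, and the tiny thermal map are assumptions for illustration and are not taken from [Sabuncu2004].

```python
import random
from collections import Counter


def stochastic_subsample(temps, n_samples, seed=0):
    """Draw grid cells (with replacement) with probability proportional to T - T_min.

    Assumes the map is not perfectly flat, otherwise all weights are zero.
    """
    cells = [(r, c) for r, row in enumerate(temps) for c, _ in enumerate(row)]
    t_min = min(min(row) for row in temps)
    weights = [temps[r][c] - t_min for r, c in cells]   # coolest cell gets weight 0
    rng = random.Random(seed)
    return rng.choices(cells, weights=weights, k=n_samples)


thermal_map = [[50.0, 60.0],
               [70.0, 90.0]]          # hypothetical temperatures in degC
picks = Counter(stochastic_subsample(thermal_map, n_samples=50))
print(picks)
```

The distinct cells in `picks` form the reduced point set that would then be passed to a clustering step.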

Page 40: Sensor Placement and Allocation.pdf

Correlation Clustering based Sensor Placement [Wang 2013]

• Uses a direct method (like HotSpot) to reconstruct the hot spots using approximate power estimates of the blocks

• Corrects the approximate power estimates of the blocks by using the on-chip thermal measurements

• Exploits the correlation between power estimation errors among functional blocks to perform sensor placement

• Applies the correlation clustering algorithm [Bansal2002] to determine both the locations of sensors and the number of sensors automatically

Page 41: Sensor Placement and Allocation.pdf

Correlation Clustering based Sensor Placement [Wang 2013]

• Reports better results than uniform and k-means clustering methods

Page 42: Sensor Placement and Allocation.pdf

Thermal Sensor Placement

References | Uniform | Non Uniform | Interpolated sensing | Accuracy | Remarks

[Memik 2008] | Yes | No | Yes | 1.59 °C with 16 sensors | Nearest neighbor interpolation

[Sankaranarayanan2009] | Yes | No | Yes | Better than [Memik 2008] | Bilinear / Cubic Spline

[Long2008] | Yes | No | Yes | 3.1 °C with 16 sensors | Non-linear Spline

[Skadron2005a] | No | Yes | Yes | | Analytical Model Based

[Yun 2008] | No | Yes | No | 0.2899 °C with 23 sensors | QT Clustering + Analytical Model

[Mukherjee06] | No | Yes | No | 4.58 °C with 16 sensors | Basic K-means clustering

[Mukherjee06] | No | Yes | No | 2.1 °C with 16 sensors | Thermal-aware K-means clustering

Page 43: Sensor Placement and Allocation.pdf

Thermal Sensor Placement

References | Uniform | Non Uniform | Interpolated sensing | Accuracy | Remarks

[Long2008] | No | Yes | Yes | ~2.0 °C with 16 sensors | Local & Global Hotspots

[Sabuncu2004] | No | Yes | Depends | - | Sub-sampling

[Wang 2013] | No | Yes | Yes | 0.26 °C with 14 sensors | Correlation Clustering

Page 44: Sensor Placement and Allocation.pdf

Thermal Reconstruction Methods

• Direct Methods
– HotSpot [Skadron et al.]
– Temptor [Koren et al.]
– TILTS [Koren et al.]
– 3D-ICE [Atienza et al.]
– FloTHERM [Mentor Graphics]
– ANSYS [Commercial FEM]

• Inverse Methods
– Based on Sensing Approach
  • Physical / Computational (Interpolated / Virtual)
  • Hard / Soft [Reda 2011]
  • Dynamic Selection [Jong2008]
– Based on Sampling Mechanism
  • Uniform Deterministic Subsampled / Stochastic Random Subsampled
– Based on Transformed Domain
  • FFT / DFT / DCT / KLT / DWT
– Based on Information Theory
  • Eigenmaps based
  • Entropy based
  • Bayesian Statistics based

• Adaptive Methods
– Direct/Indirect [Sharif 2010]
– Phase based [Reda 2012]
– Predictive Model based [Coskun 2009]

Page 45: Sensor Placement and Allocation.pdf

Thermal Reconstruction Problem

• The thermal map of a processor can be estimated using two dual strategies:

– Solution of the direct problem, given the heat sources and the physical model of the temperature diffusion (e.g. a nonlinear diffusion equation),

– Solution of the inverse problem, given the value of the temperature in some locations and some a priori information about the thermal map.

Page 46: Sensor Placement and Allocation.pdf

Direct Methods: R-C Network Based Thermal Profile Reconstruction

• Direct Problem Formulation

• Tools:
– HotSpot [Skadron et al.]
– Temptor [Koren et al.]
– TILTS [Koren et al.]
– 3D-ICE [Atienza et al.]
– FloTHERM [Mentor Graphics]
– ANSYS [Commercial Tool]

Page 47: Sensor Placement and Allocation.pdf

Direct Method Thermal Dissipation Model

Heat diffusion through an IC is given by Poisson's PDE:

Page 48: Sensor Placement and Allocation.pdf

Direct Methods: R-C Network Based Thermal Profile Reconstruction

Page 49: Sensor Placement and Allocation.pdf


Page 50: Sensor Placement and Allocation.pdf


Page 51: Sensor Placement and Allocation.pdf

RC-Network Based Direct Method

T_i: temperature at node i
T_j: temperature at node j
P_i: power dissipation at node i
C_i: thermal capacitance at node i
G_ij: lateral conductance between nodes i and j (= 1/R_ij)
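
The slide's equation is not preserved in this transcript. A common lumped form of an RC thermal network, written with the symbols listed above, is C_i dT_i/dt = P_i − Σ_j G_ij (T_i − T_j). The sketch below integrates that form with forward Euler; the per-node conductance to ambient (`g_ambient`), the time step, and all numeric values are assumptions added so the example has a well-defined steady state.

```python
def simulate_rc(power, cap, g_lateral, g_ambient, t_ambient, dt, steps):
    """Forward-Euler integration of C_i dT_i/dt = P_i - sum_j G_ij (T_i - T_j) - G_amb_i (T_i - T_amb)."""
    n = len(power)
    temps = [t_ambient] * n                      # start at ambient temperature
    for _ in range(steps):
        new_temps = []
        for i in range(n):
            flow = power[i] - g_ambient[i] * (temps[i] - t_ambient)
            flow -= sum(g_lateral[i][j] * (temps[i] - temps[j]) for j in range(n))
            new_temps.append(temps[i] + dt * flow / cap[i])
        temps = new_temps
    return temps


# Hypothetical two-block chip: block 0 dissipates 2 W, block 1 is idle.
temps = simulate_rc(
    power=[2.0, 0.0],
    cap=[1.0, 1.0],
    g_lateral=[[0.0, 1.0], [1.0, 0.0]],          # symmetric lateral conductances G_ij
    g_ambient=[1.0, 1.0],
    t_ambient=45.0,
    dt=0.01,
    steps=5000,
)
print(temps)   # approaches the steady state 45 + 4/3 and 45 + 2/3
```

The need for a power value at every node, and the cost of stepping a fine grid, are what make this direct approach hard to use at run time.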


Page 52: Sensor Placement and Allocation.pdf

Direct Methods

Advantages
• Highly accurate
• An automated R-C network based on a Finite Element Model (FEM) can be generated
• Supported by many tools

Disadvantages
• Computationally intensive
• Not feasible as a run-time approach
• Requires power at every block/grid point

Page 53: Sensor Placement and Allocation.pdf

Inverse Methods

• Inverse methods: given the value of the temperature in some locations and some a priori information about the thermal map, reconstruct the complete map.

Page 54: Sensor Placement and Allocation.pdf

Transformed Domain Methods

• FFT/DFT Based [Cochran 2009]
• DCT Based [Nowroz2010]
• KLT/PCA Based [Juri 2012]
• DWT Based [Cho2009]

Page 55: Sensor Placement and Allocation.pdf

FFT Based Reconstruction [Cochran2010]

• Considers temperature as simply a space-varying signal and performs spectral Fourier analysis

• Space-domain convolution (interpolation) is replaced by multiplication in the frequency domain

• Proposes methods to handle uniform and non-uniform thermal sensor placements
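
The idea of interpolating in the frequency domain can be shown in one dimension. The sketch below is generic Fourier (zero-padding) interpolation of uniformly spaced samples, written with a plain DFT so it has no dependencies; it is not the algorithm of [Cochran2010], and the temperature profile is synthetic and band-limited by construction.

```python
import cmath
import math


def dft(x):
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def fourier_upsample(samples, factor):
    """Interpolate uniformly spaced samples by zero-padding their spectrum.

    Assumes the signal has no energy at or above the Nyquist frequency.
    """
    m = len(samples)
    spectrum = dft(samples)
    split = (m + 1) // 2                       # non-negative frequencies come first
    padded = spectrum[:split] + [0j] * (m * factor - m) + spectrum[split:]
    return [factor * v.real for v in idft(padded)]


def profile(x):
    """Synthetic temperature along a line across the die, x in [0, 1)."""
    return 60.0 + 10.0 * math.cos(2 * math.pi * x) + 5.0 * math.sin(4 * math.pi * x)


coarse = [profile(t / 8) for t in range(8)]    # 8 uniformly placed sensors
fine = fourier_upsample(coarse, factor=4)      # 32 reconstructed points
print(max(abs(fine[t] - profile(t / 32)) for t in range(32)))
```

Zero-padding the spectrum is equivalent to convolving the samples with an ideal low-pass kernel in space, which is the convolution-to-multiplication trade noted above.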


Page 56: Sensor Placement and Allocation.pdf

FFT Based Reconstruction [Cochran2010]

Page 57: Sensor Placement and Allocation.pdf

FFT Based Reconstruction [Cochran2010]

Page 58: Sensor Placement and Allocation.pdf

FFT Based Reconstruction [Cochran2010]

Page 59: Sensor Placement and Allocation.pdf

FFT Based Reconstruction [Cochran2010]

[Figure panels: hot spot estimation | full thermal characterization]

Page 60: Sensor Placement and Allocation.pdf

K-LSE: DCT Based Placement [Nowroz 2010]

• On-chip thermal gradients lead to sparse signals in the frequency domain

• Uses a DCT based transformation to establish the sparsity in the frequency domain

• Exploits this observation to
– devise thermal sensor allocation techniques,
– devise signal reconstruction techniques that fully characterize the thermal status
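
The sparsity claim can be checked on a small example. The sketch below computes an unnormalized DCT-II of a smooth one-dimensional profile, keeps only the largest coefficients, and inverts. It only illustrates sparsity in the DCT domain; it is not the k-LSE estimator itself, and the profile is synthetic and deliberately built from three cosine terms.

```python
import math


def dct2(x):
    """Unnormalized DCT-II."""
    n = len(x)
    return [sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
            for k in range(n)]


def idct2(c):
    """Inverse of dct2 (a scaled DCT-III)."""
    n = len(c)
    return [(c[0] / 2 + sum(c[k] * math.cos(math.pi * (i + 0.5) * k / n)
                            for k in range(1, n))) * 2 / n
            for i in range(n)]


def keep_largest(coeffs, k):
    """Zero all but the k largest-magnitude coefficients."""
    keep = set(sorted(range(len(coeffs)), key=lambda i: abs(coeffs[i]), reverse=True)[:k])
    return [c if i in keep else 0.0 for i, c in enumerate(coeffs)]


n = 16
# Synthetic smooth profile (degC): a mean level plus two low-frequency cosines.
profile = [60.0
           + 20.0 * math.cos(math.pi * (i + 0.5) / n)
           + 5.0 * math.cos(3 * math.pi * (i + 0.5) / n)
           for i in range(n)]

coeffs = dct2(profile)
significant = sum(1 for c in coeffs if abs(c) > 1e-6)
approx = idct2(keep_largest(coeffs, 3))
error = max(abs(a - b) for a, b in zip(approx, profile))
print(significant, error)
```

Real thermal maps are only approximately sparse, so in practice the number of retained coefficients trades accuracy against the number of sensors needed to estimate them.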


Page 61: Sensor Placement and Allocation.pdf

K-LSE: DCT Based Placement [Nowroz 2010]

Page 62: Sensor Placement and Allocation.pdf

K-LSE: DCT Based Placement [Nowroz 2010]

[Figure panels: placed at the centroids | placed at the center]

Page 63: Sensor Placement and Allocation.pdf

K-LSE: DCT Based Placement [Nowroz 2010]

Page 64: Sensor Placement and Allocation.pdf

Eigenmaps (KLT/PCA Based) [Juri 2012]

• Uses Principal Component Analysis (PCA) to determine the transform

• Exploits the structural correlation and temporal variations in the thermal map to achieve very high reconstruction accuracy

• Performs sensor placement and allocation to the most important locations corresponding to the principal components

• Considers non-ideal sensors with noise and error

• Proposes an LSE formulation for reconstruction

• Greedy algorithm for placement

Page 65: Sensor Placement and Allocation.pdf

Eigenmaps (KLT/PCA Based) [Juri 2012]

Page 66: Sensor Placement and Allocation.pdf

Eigenmaps (KLT/PCA Based) [Juri 2012]

The reconstruction error as a function of the number of sensors used.

The reconstruction error in the presence of measurement noise as a function of the SNR using 16 sensors.

Page 67: Sensor Placement and Allocation.pdf

Information Theoretic Approaches

• Compressive Sensing Based [Candes 2006, Donoho2006, Tropp2007, Zang2011a]

• Bayesian Statistics Based [Zang2010, Zang2011a]

• Entropy Based [Zhou 2012]

Page 68: Sensor Placement and Allocation.pdf

Compressive Sensing Based Reconstruction [Donoho2006]

• Key idea: the thermal profile is sparse in either the temporal or the spatial domain

• Sampling in the time or spatial domain is random, i.e. the sensor placement can be random

• From a few random samples it is possible to reconstruct the complete profile if the thermal signal is sparse

Page 69: Sensor Placement and Allocation.pdf

Bayesian Statistics Based Reconstruction [Zang2010]

• Uses the idea of Bayesian inference and information theory from statistics
– to determine an optimal set of sampling locations where test structures/sensors should be deployed and measured
– to monitor spatial variations with maximum accuracy

• Unlike random sampling in compressive sensing, it uses Bayesian inference to select the best locations

• Can be used to characterize and monitor spatial temperature

Page 70: Sensor Placement and Allocation.pdf

Bayesian Statistics Based Reconstruction [Zang2010]

Page 71: Sensor Placement and Allocation.pdf

Entropy Based Optimal Temperature Sensor Allocation [Zhou 2012]

Temperature sensor locations are selected by different allocation algorithms: (a) the k-means clustering method, (b) the partition method, (c) the Bayesian method, and (d) the entropy method [Zhou2012]

Key idea: the entropy of the thermal map can precisely identify the hot spots, so that sensors can be placed near them.

Entropy = measure of randomness or variations in the signal
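
As a concrete reading of "entropy = measure of randomness or variations in the signal", the sketch below computes the Shannon entropy of a temperature trace after quantizing it into bins. The binning scheme and the sample traces are assumptions for illustration; the exact formulation used in [Zhou 2012] is not given on the slide.

```python
import math
from collections import Counter


def shannon_entropy(readings, bin_width=1.0):
    """Shannon entropy (bits) of readings quantized into bins of `bin_width` degC."""
    bins = Counter(math.floor(r / bin_width) for r in readings)
    total = len(readings)
    return -sum((count / total) * math.log2(count / total) for count in bins.values())


steady = [55.0, 55.0, 55.0, 55.0]      # hypothetical location with no variation
varying = [50.0, 60.0, 70.0, 80.0]     # hypothetical location with large swings
print(shannon_entropy(steady), shannon_entropy(varying))
```

A location whose temperature never changes scores zero, while one that swings across many bins scores high, which makes it a more informative candidate for a sensor.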

Page 72: Sensor Placement and Allocation.pdf

Adaptive Online Methods

• Ability to update parameters/model at run time (online)

• Model-Based Control Centric Approach and System Identification

• Measurement Driven Estimation
– State Estimators & Observers
– Kalman Filters
– Adaptive Filters

• Regression based Prediction
– AR/ARMA/Other Parametric Models
– PCA
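
As an example of measurement-driven estimation, the sketch below is a scalar Kalman filter that tracks one temperature from noisy sensor readings. It assumes a random-walk temperature model; the variances, initial values, and readings are hypothetical.

```python
def kalman_1d(measurements, process_var, sensor_var, x0, p0):
    """Scalar Kalman filter with a random-walk state model; returns the estimates."""
    x, p = x0, p0                     # state estimate and its variance
    estimates = []
    for z in measurements:
        p = p + process_var           # predict: uncertainty grows between samples
        k = p / (p + sensor_var)      # Kalman gain: trust in the new reading
        x = x + k * (z - x)           # update the estimate toward the measurement
        p = (1 - k) * p               # updated uncertainty
        estimates.append(x)
    return estimates


readings = [70.0] * 50                # hypothetical sensor stuck at 70 degC
estimates = kalman_1d(readings, process_var=0.01, sensor_var=1.0, x0=60.0, p0=1.0)
print(estimates[0], estimates[-1])
```

The gain `k` adapts at run time: it is large while the estimate is uncertain and settles to a small value once the filter has converged.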


Page 73: Sensor Placement and Allocation.pdf

Regressive Model based Prediction & Reconstruction [Coskun2008]

Auto-regressive moving average (ARMA) based forecasting

Online adaptation when the existing model does not fit the current workload
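
Regression-based forecasting can be illustrated with a much simpler stand-in than ARMA. The sketch below fits a first-order autoregressive model, T[t+1] ≈ a·T[t] + b, by ordinary least squares and rolls it forward; it is not the ARMA predictor of [Coskun2008], and the temperature trace is synthetic.

```python
def fit_ar1(series):
    """Least-squares fit of series[t+1] = a * series[t] + b; returns (a, b)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def forecast(series, steps):
    """Predict the next `steps` values by iterating the fitted AR(1) model."""
    a, b = fit_ar1(series)
    last, out = series[-1], []
    for _ in range(steps):
        last = a * last + b
        out.append(last)
    return out


# Synthetic trace warming from 50 degC toward 70 degC.
trace = [50.0]
for _ in range(19):
    trace.append(0.9 * trace[-1] + 7.0)

a, b = fit_ar1(trace)
print(a, b, forecast(trace, 3))
```

Refitting `a` and `b` on a sliding window of recent samples is one simple way to adapt when the current model stops matching the workload.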

Page 74: Sensor Placement and Allocation.pdf

Full-Chip Run-time Thermal Estimation and Prediction [Wang2011]

Correlation based method for error compensation

Page 75: Sensor Placement and Allocation.pdf

Phase Prediction Based Reconstruction [Reda 2013]

Page 76: Sensor Placement and Allocation.pdf

Other Adaptive Methods [Noise Compensating]

• Based on Kalman & Adaptive Filters:

– Zhang, Y., and Srivastava, A. “Adaptive and autonomous thermal tracking for high performance computing systems,” in DAC, 2010.

– Zhang, Y., and Srivastava, A. “Accurate temperature estimation using noisy thermal sensors,” in DAC, 2009.

• Sensor Error Compensation

– Compensating & calibrating noisy and erroneous on-chip sensors [Sharif 2010]

Page 77: Sensor Placement and Allocation.pdf

Hybrid Method: Infrared (IR) Imaging

Page 78: Sensor Placement and Allocation.pdf

References

• [Skadron2003] K. Skadron, M. R. Stan, W. Huang, S. Velusamy, K. Sankaranarayanan, and D. Tarjan, “Temperature aware microarchitecture,” in Proc. Int. Symp. Comput. Architect., Jun. 2003, pp. 2–13.

• [Skadron2005] Skadron, K., Lee, K.: Using Performance Counters for Runtime Temperature Sensing in High-Performance Processors. In: 19th IEEE International Parallel and Distributed Processing Symposium, pp. 232a–232a (2005), http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1420152

• [Skadron2005a] Skadron, K., Huang, W.: Analytical model for sensor placement on microprocessors. In: 2005 International Conference on Computer Design, pp. 24–27 (2005), http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=1524125

• [Hamann2007] H. F. Hamann, A. Weger, J. Lacey, Z. Hu, P. Bose, E. Cohen, and J. Wakil. Hotspot-limited Microprocessors: Direct Temperature and Power Distribution Measurements. IEEE Journal of Solid-State Circuits, 42:56–65, January 2007.

• [Im200] S. Im and K. Banerjee, “Full chip thermal analysis of planar (2-D) and vertically integrated (3-D) high performance ICs,” in Tech. Dig. IEEE Int. Electron Devices Meeting, 2000, pp. 727–730.

• [Han2006] Y. Han, I. Koren, and C. Krishna. Temptor: A lightweight runtime temperature monitoring tool using performance counters. Third Workshop on Temperature-Aware Computer Systems in conjunction with ISCA-33, June 2006.

• [Han2007] Y. Han, I. Koren, and C. M. Krishna. TILTS: A fast architectural-level transient thermal simulation method. Journal of Low Power Electronics, 3(1), 2007.

• [Memik2008] Memik, S.O., Mukherjee, R., Ni, M., Long, J.: Optimizing Thermal Sensor Allocation for Microprocessors. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 27(3), 516–527 (2008), http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?arnumber=4454017


Page 79: Sensor Placement and Allocation.pdf

References

• [Borkar2003] S. Borkar, T. Karnik, S. Narendra, J. Tschanz, A. Keshavarzi, and V. De, “Parameter variations and impact on circuits and microarchitecture,” IEEE DAC, pp. 338–342, 2003.

• [Sankaranarayanan2009] K. Sankaranarayanan, W. Huang, M. R. Stan, H. Haj-Hariri, R. J. Ribando, and K. Skadron. Granularity of microprocessor thermal management: a technical report. Technical Report CS-2009-03, University of Virginia Department of Computer Science, April 2009.

• [Long2008] J. Long, S. O. Memik, and G. Memik, “Thermal monitoring mechanisms for chip multiprocessors,” ACM Trans. Architect. Code Optim., vol. 5, no. 2, pp. 9.1–9.23, Aug. 2008.

• [Yun2008] Yun, X.: On-Chip Thermal Sensor Placement, Master’s thesis, University of Massachusetts Amherst (2008), http://scholarworks.umass.edu/cgi/viewcontent.cgi?article=1242&context=theses

• [Mukherjee2006] Mukherjee, R., Memik, S.O.: Systematic temperature sensor allocation and placement for microprocessors. In: Proceedings of the 43rd Annual Design Automation Conference, DAC 2006, pp. 542–547. ACM, New York (2006)

• [Sabuncu2004] Sabuncu, M.R., Ramadge, P.J.: Gradient based nonuniform subsampling for information-theoretic alignment methods. In: 26th Annual International Conference of the IEEE on Engineering in Medicine and Biology Society (IEMBS), pp. 1683–1686 (2004)

• [Cochran2009] R. Cochran and S. Reda. Spectral Techniques for High-Resolution Thermal Characterization with Limited Sensor Data. In Design Automation Conference, pages 478–483, 2009.

• [Buedo2002] S. Lopez-Buedo, J. Garrido, E. I. Boemo, ‘Dynamically inserting, operating and eliminating thermal sensors of FPGA-based systems’, IEEE Transactions on Components and Packaging Technologies, Vol. 25, No. 4, Dec 2002.

• [Buedo2004] L. Buedo and E. Boemo, “Making Visible the Thermal Behaviour of Embedded Microprocessors on FPGAs. A Progress Report,” FPGA’04, February 22–24, 2004, Monterey, California, USA.


Page 80: Sensor Placement and Allocation.pdf

References  

• [NOWROZ2010] Nowroz, A. N., Cochran, R., and Reda, S. Thermal monitoring of real processors: techniques for sensor allocation and full characterization. In DAC (2010).

• [REDA2011] Reda, S., Cochran, R., and Nowroz, A. N. Improved Thermal Tracking for Processors Using Hard and Soft Sensor Allocation Techniques. IEEE Trans. Comput. 60, 6 (Nov. 2011), 841–851.

• [Sharif2010] Sharif et al., “Accurate Direct and Indirect On-Chip Temperature Sensing for Efficient Dynamic Thermal Management,” IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 29, no. 10, October 2010.

• [Zhang2010] Y. Zhang, A. Srivastava, and M. Zahran, “On-Chip Sensor-Driven Efficient Thermal Profile Estimation Algorithms,” ACM Trans. Design Automation of Electronic Systems, vol. 15, no. 3, p. 25:1, 2010.

• [Zhang2010] Zhang, Y., and Srivastava, A. Adaptive and autonomous thermal tracking for high performance computing systems. In DAC (2010).

• [Zhang2009] Zhang, Y., and Srivastava, A. Accurate temperature estimation using noisy thermal sensors. In DAC (2009).


Page 81: Sensor Placement and Allocation.pdf

References  

• [Donoho2006] D. Donoho, “Compressed sensing,” IEEE Trans. Information Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.

• [Candes 2006] E. Candès. Compressive Sampling. Proceedings of the International Congress of Mathematicians, pages 1–20, 2006.

• [Tropp2007] Joel Tropp and Anna Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Trans. on Information Theory, Vol. 53, No. 12, pp. 4655–4666, December 2007.

• [Tibshirani1996] R. Tibshirani, “Regression shrinkage and selection via the Lasso,” Journal of Royal Statistical Society, vol. 58, no. 1, pp. 267–288, 1996.

• [Zang2011a] W. Zhang, K. Balakrishnan, Xin Li, D. Boning, and R. Rutenbar, “Toward efficient spatial variation decomposition via sparse regression,” IEEE ICCAD, pp. 162–169, 2011.

• [Zang2010] W. Zhang, X. Li, and R. Rutenbar, “Bayesian virtual probe: Minimizing variation characterization cost for nanoscale IC technologies via Bayesian inference,” in Proc. DAC, 2010, pp. 262–267.

• [Kudithipudi2013] Thermal Management in Many Core Systems, Springer, 2013.


Page 82: Sensor Placement and Allocation.pdf

References  

• [Zhang2010] Zhang, Y., and Srivastava, A., “Adaptive and autonomous thermal tracking for high performance computing systems,” in DAC, 2010.


• [Zhang2009] Zhang, Y., and Srivastava, A., “Accurate temperature estimation using noisy thermal sensors,” in DAC, 2009.

• [Wang2013] Hai Wang, Sheldon X.-D. Tan, Sahana Swarup, and Xue-Xin Liu, “A Power-Driven Thermal Sensor Placement Algorithm for Dynamic Thermal Management,” DATE 2013.

• [Bansal2002] Correlation Clustering, http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.10.3857


Page 83: Sensor Placement and Allocation.pdf

www.variability.org   www.nsf.gov   www.uci.edu  


Thanks