The Curiosity Rover Landing · lu/cse467s/slides/mid_review.pdf · 2014-03-05


The Curiosity Rover Landing

• http://www.youtube.com/watch?v=a4YqNoLkmxE

Chenyang Lu 1


Landing a Spacecraft on Mars

• The control software onboard the spacecraft consists of about 3 MLOC. Mostly in C, with a small portion (mostly for surface navigation) in C++.

• The code runs on a radiation hardened CPU. The CPU is a version of an IBM PowerPC 750, called RAD750. It has 4 GB of flash memory, 128 MB of RAM, and runs at 133 MHz.

• ~75% of the code is autogenerated from formalisms, e.g., state-machine descriptions and XML files. The remainder was handwritten, in many cases building on code from earlier Mars missions.

[Figure: Code size, exponential growth trend. Lines of code in thousands (KLOC, log scale from 1 to 10,000) versus year (1970 to 2015), with data points for Viking, Pathfinder, Mars exploration rovers, Phoenix, and MSL Rover.]

FIGURE A. The amount of flight code that is flown to land spacecraft on Mars has grown exponentially in the last 36 years. Its Compound Annual Growth Rate comes out at roughly 1.20—close to the median value of 1.16 from previous columns.

Holzmann, Gerard J., "Landing a Spacecraft on Mars," IEEE Software, 30(2), pp. 83-86, March-April 2013.


Midterm Demo

• 3/26, in class.
• 20 min/team (including 3 min for discussion).
• Must show something real!
• Test and set up your demo in advance.
• Email Rahav a summary of your demo and progress.
  - Deadline: 11:59pm, 3/26.
  - Clearly state the contribution of each team member.


Midterm Exam

• Open book, notes, papers.
• Bring your calculator!
• Lecture slides override the textbook when the two are inconsistent.


Scope

• Introduction
• Power management
• Program optimization
• TinyOS


Embedded Systems

• Non-functional constraints
  - Real-time
  - Power
  - Energy
  - Memory
  - Cost
  - Size
• Designed to tight deadlines by small teams


Alternative Technologies

• Application-Specific Integrated Circuits (ASIC)
• Microprocessor
• Field-Programmable Gate Arrays (FPGA)


ASIC

✓ Performance
✓ Power: fewer logic elements → low power
✗ Development cost: very high
  - $2 million to start production of a new ASIC
  - Needs a long time and a large team
✗ Reprogrammability: none!
  - Difficult to upgrade systems
  - Single-purpose devices


Microprocessor

– Performance
  ✗ Programmable architecture is fundamentally slow!
    • Fetch, decode instructions
  ✓ Highly optimized architecture and manufacturing process
    • Pipeline; cache; clock frequency; circuit density; multi-core…
✗ Power
  - Processors perform poorly in terms of performance/watt!
  - Power management can alleviate the power problem.
✓ Flexibility, development cost and time
  - Let software do the work!


State of the Practice

• Microprocessor is the dominant player
  - Flexibility + low development cost >> low performance/watt
  - Power management is crucial.
• Microprocessor + ASIC is common
  - Ex: cell phone
• FPGA is expected to improve


Power vs. Energy

• Power = energy consumption per unit time
• Power → Heat
• Energy → Battery life


Hardware Support

• Clock gating
• Shut down power supply
• Dynamic Voltage Scaling


Requirements

• Minimize power under performance constraints
  - Real-time applications
• Optimize performance under power constraints
  - Battery lifetime constraint
• Different tradeoff points in design space


Factors in Dynamic Power Management

• Device: Power State Machine (PSM)
• Workload: distribution of active and idle intervals


SA-1100 Power State Machine

State    Power
run      P_ON = 400 mW
idle     P_OFF = 50 mW
sleep    P_OFF = 0.16 mW

Transition      Time
run → idle      10 µs
idle → run      10 µs
run → sleep     90 µs
idle → sleep    90 µs
sleep → run     160 ms

P_TR = P_ON


Analysis

• Inherent exploitability
  - No performance penalty
  - Assume full knowledge of workload in advance
• Actual energy saving and performance penalty under a practical policy


Break-Even Time T_BE

• Entering an inactive state is beneficial only if the idle time is longer than the break-even time.
  - P_TR ≤ P_ON: T_BE = T_TR = T_ON,OFF + T_OFF,ON
  - P_TR > P_ON: larger T_BE to compensate for the extra energy cost of the transitions
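
The two cases above can be sketched in C. The first branch is the slide's rule. The second branch is an assumption: it comes from equating the energy of staying on, P_ON * T, with the energy of a shutdown, P_TR * T_TR + P_OFF * (T - T_TR), and solving for T. The function and parameter names are illustrative.

```c
#include <assert.h>

/* Break-even time T_BE for one inactive state.
 * p_on, p_off, p_tr: power while on, while inactive, and during the
 * transitions (same unit, e.g. mW).
 * t_tr: total transition time T_ON,OFF + T_OFF,ON.
 * The P_TR > P_ON branch is an assumed energy-balance formula, not
 * something stated on the slide. */
double break_even_time(double p_on, double p_off, double p_tr, double t_tr)
{
    assert(p_on > p_off);   /* otherwise the inactive state never pays off */

    if (p_tr <= p_on)
        return t_tr;        /* slide's first case: T_BE = T_TR */

    /* Extra idle time needed to win back the costly transitions. */
    return t_tr + t_tr * (p_tr - p_on) / (p_on - p_off);
}
```

For example, with P_ON = 400 mW, P_OFF = 0.16 mW, P_TR = P_ON, and a total transition time of 160.09 ms, the function returns 160.09 ms unchanged, because the transitions cost no more than staying on.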

Inherent Exploitability

Metrics

• (Performance) Safety: Prob(p|o)
  - If an observed event happens → the probability that T_idle > T_BE
  - Overprediction → lower safety → higher performance penalty
• (Energy) Efficiency: Prob(o|p)
  - If T_idle > T_BE → the probability of successfully predicting it
  - Underprediction → lower efficiency → more wasted energy
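
A minimal sketch of how the two metrics could be estimated from a recorded trace, assuming o means "the policy predicted a long idle period" and p means "T_idle > T_BE actually held". The struct layout, the function name, and the choice to report 1.0 when a denominator is zero are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One idle period from a trace. */
struct idle_sample {
    int predicted_long;   /* 1 if the policy fired (event o) */
    double t_idle;        /* measured idle time */
};

struct pm_metrics {
    double safety;        /* Prob(p | o) */
    double efficiency;    /* Prob(o | p) */
};

struct pm_metrics evaluate_policy(const struct idle_sample *s, size_t n,
                                  double t_be)
{
    size_t o = 0, p = 0, both = 0;

    assert(s != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        int is_long = s[i].t_idle > t_be;        /* event p */
        int fired = s[i].predicted_long != 0;    /* event o */
        o += (size_t)fired;
        p += (size_t)is_long;
        both += (size_t)(fired && is_long);
    }

    struct pm_metrics m;
    /* Assumed convention: an empty condition counts as a perfect score. */
    m.safety = o ? (double)both / (double)o : 1.0;
    m.efficiency = p ? (double)both / (double)p : 1.0;
    return m;
}
```

A policy that fires too often lowers safety (more wake-up penalties), while one that fires too rarely lowers efficiency (missed chances to save energy).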


Fixed Timeout Policy

• Enter the inactive state when the system has remained idle for T_TO.
• Wake up in response to activity.
• Premise: if a system has been idle for T_TO → it will remain idle for > T_BE.
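
The policy can be explored with a small simulation over a list of idle interval lengths. This is a sketch under simplifying assumptions: a single inactive state, wake-up starting only when activity arrives, and no extra penalty if activity arrives in the middle of the shutdown transition. All type and field names are illustrative.

```c
#include <assert.h>
#include <stddef.h>

struct psm {            /* two-state power model */
    double p_on;        /* power while on (mW) */
    double p_off;       /* power while inactive (mW) */
    double p_tr;        /* power during transitions (mW) */
    double t_down;      /* T_ON,OFF (ms) */
    double t_up;        /* T_OFF,ON (ms) */
};

struct timeout_result {
    double energy;      /* energy over all idle periods (mW * ms = uJ) */
    double delay;       /* total wake-up latency added (ms) */
    size_t shutdowns;   /* number of times the device was shut down */
};

struct timeout_result fixed_timeout(const struct psm *m, const double *idle,
                                    size_t n, double t_to)
{
    struct timeout_result r = {0.0, 0.0, 0};

    assert(m != NULL && (idle != NULL || n == 0));
    for (size_t i = 0; i < n; i++) {
        double t = idle[i];
        if (t <= t_to) {                 /* timeout never expires: stay on */
            r.energy += m->p_on * t;
            continue;
        }
        /* Wait t_to while on, transition down, rest, transition up. */
        double off = t - t_to - m->t_down;
        if (off < 0.0)
            off = 0.0;
        r.energy += m->p_on * t_to
                  + m->p_tr * (m->t_down + m->t_up)
                  + m->p_off * off;
        r.delay += m->t_up;              /* activity waits for the wake-up */
        r.shutdowns++;
    }
    return r;
}
```

Running the function for several values of `t_to` on the same trace shows the tradeoff: a short timeout saves more energy but shuts down on idle periods shorter than T_BE, while a long timeout wastes energy waiting.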

Impact of Timeout Threshold

Possible Improvement

• Predictive shutdown: shut down immediately when the processor becomes idle.
  - Avoid wasting energy before reaching the timeout threshold
  - More efficient, less safe
• Predictive wakeup: wake up before activity occurs.
  - Reduce performance penalty for wake up
  - Less efficient, safer


Advanced Configuration and Power Interface

Open standard for power management.

[Figure: ACPI layers, bottom to top: hardware platform (devices, processor, chipset); device drivers and ACPI BIOS; OS kernel with power management; applications.]

ACPI System Power States

Used as a contract between hardware and OS vendors.

Power Consumers

• Instruction execution (CPU)
• Cache (instruction, data)
• Main memory
• Storage
• Display
• Network interface
• I/O devices


Energy Efficiency of Memory Operations

Relative energy per operation: register < cache < memory

• memory transfer: 33
• external I/O: 10
• SRAM write: 9
• SRAM read: 4.4
• multiply: 3.6
• add: 1
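
The relative costs above can be used for a rough comparison of two operation mixes. The sketch below only multiplies operation counts by the listed weights; the enum, array, and function names are illustrative, and the result is in units of "one add", not joules.

```c
#include <assert.h>
#include <stddef.h>

enum op {
    OP_ADD, OP_MULTIPLY, OP_SRAM_READ, OP_SRAM_WRITE,
    OP_EXTERNAL_IO, OP_MEMORY_TRANSFER, OP_COUNT
};

/* Relative energy per operation from the list above (add = 1). */
static const double rel_energy[OP_COUNT] = {
    1.0, 3.6, 4.4, 9.0, 10.0, 33.0
};

/* Weighted sum of operation counts, in units of one add. */
double relative_energy(const unsigned long counts[OP_COUNT])
{
    double total = 0.0;

    assert(counts != NULL);
    for (int i = 0; i < OP_COUNT; i++)
        total += rel_energy[i] * (double)counts[i];
    return total;
}
```

With these weights, a single memory transfer costs as much as 33 adds, which is why keeping values in registers matters for energy as well as speed.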


Power Optimizations

• Reduce memory footprint
  - Reduce code and data size
  - Analyze footprint to find right size
• Find correct cache size
  - Analyze cache behavior (size of working set)
• Minimize memory and cache access
  - Use registers efficiently → fewer cache accesses
  - Eliminate cache conflicts → fewer memory accesses
• Shorter execution time → more idle time


Program Optimization and Analysis

• Performance
• Memory footprint


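Original loop:

    for (i=0; i<N; i++)
      for (j=0; j<M; j++)
        z[i][j] = b[i][j];

With explicit address computation:

    zptr = &z[0][0]; bptr = &b[0][0];
    for (i=0; i<N; i++)
      for (j=0; j<M; j++) {
        zind = i*M+j;
        bind = i*M+j;
        *(zptr+zind) = *(bptr+bind);
      }

After induction variable elimination (zind and bind merged into zbind):

    zptr = &z[0][0]; bptr = &b[0][0];
    for (i=0; i<N; i++)
      for (j=0; j<M; j++) {
        zbind = i*M+j;
        *(zptr+zbind) = *(bptr+zbind);
      }

After strength reduction (the multiplication is replaced by an increment):

    zptr = &z[0][0]; bptr = &b[0][0]; zbind = 0;
    for (i=0; i<N; i++)
      for (j=0; j<M; j++) {
        *(zptr+zbind) = *(bptr+zbind);
        zbind++;  /* increment after use so that copying starts at index 0 */
      }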
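
As a check that the transformed loop still does the same work, the first and last versions can be written as complete C functions and compared. This is a sketch: the array sizes and function names are chosen for illustration, and in practice an optimizing compiler usually applies these transformations on its own.

```c
#include <assert.h>
#include <string.h>

enum { N = 4, M = 6 };   /* illustrative sizes, not from the slide */

/* Original form: each access implies computing i*M + j. */
void copy_indexed(int z[N][M], int b[N][M])
{
    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            z[i][j] = b[i][j];
}

/* After induction variable elimination and strength reduction:
 * one shared offset, advanced by an increment instead of a multiply. */
void copy_strength_reduced(int z[N][M], int b[N][M])
{
    int *zptr = &z[0][0];
    int *bptr = &b[0][0];
    int zbind = 0;

    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++) {
            *(zptr + zbind) = *(bptr + zbind);
            zbind++;                 /* increment after use */
        }
    assert(zbind == N * M);          /* every element visited exactly once */
}

/* Returns 1 if both versions produce the same destination array. */
int copies_match(void)
{
    int b[N][M], z1[N][M], z2[N][M];

    for (int i = 0; i < N; i++)
        for (int j = 0; j < M; j++)
            b[i][j] = 100 * i + j;
    memset(z1, 0, sizeof z1);
    memset(z2, 0, sizeof z2);

    copy_indexed(z1, b);
    copy_strength_reduced(z2, b);
    return memcmp(z1, z2, sizeof z1) == 0 && z2[N - 1][M - 1] == b[N - 1][M - 1];
}
```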


Array Conflicts in Cache

    for (i=0; i<N; i++)
      for (j=0; j<M; j++)
        a[i][j] = a[i][j] + b[i][j];

[Figure: main memory and cache. a[0,0] at address 1024 and b[0,0] at address 4099 in main memory map to the same cache location (figure label: 256).]
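
The conflict can be reproduced with a small index calculation. The cache geometry here is an assumption made for illustration: a direct-mapped cache with 256 lines of 4 bytes each. Only the addresses 1024 and 4099 and the number 256 come from the slide.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed geometry: direct-mapped, 256 lines, 4 bytes per line. */
enum { LINE_BYTES = 4, NUM_LINES = 256 };

/* Cache line that a byte address maps to. */
unsigned cache_line(uint32_t addr)
{
    return (unsigned)((addr / LINE_BYTES) % NUM_LINES);
}

/* Two addresses conflict when they sit in different memory blocks
 * that map to the same cache line. */
int conflicts(uint32_t x, uint32_t y)
{
    return cache_line(x) == cache_line(y)
        && x / LINE_BYTES != y / LINE_BYTES;
}

/* Self-check for the slide's addresses under the assumed geometry. */
void check_slide_addresses(void)
{
    assert(conflicts(1024, 4099));           /* a[0][0] vs b[0][0] */
    assert(!conflicts(1024, 4099 + 4));      /* b shifted by one line */
}
```

Under this geometry every a[i][j] and b[i][j] pair evicts each other on each loop iteration. Shifting one array by a line (padding) moves the pair to different cache lines.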


More

• Function inlining
  - Cost of function calls
  - Code size vs. performance
• Register allocation


Nested Function Calls (ARM)

    int main() { f1(x); }
    void f1(int a) { f2(a); }

    ; f1 is called by main()
    LDR r0, [r13]        ; load para. into r0 from stack
    STR r14, [r13]       ; store f1's return addr.

    ; f1 calls f2()
    STR r0, [r13, #4]!   ; push para. for f2 to stack
    BL f2                ; branch and link to f2

    ; f1 receives return from f2()
    SUB r13, #4          ; pop f2's para. off stack

    ; f1 returns to main()
    LDR r15, [r13]       ; restore register and return


Execution Time Analysis

• Execution time is affected by both program path and instruction timing
  - Program path depends on input data values.
  - Instruction timing depends on pipelining, cache behavior…
• Accurate execution time is unknown a priori
• Compile-time analysis vs. measurement


Reducing Code Size

• Function inlining?
• Avoid loop unrolling.
• Use processors with dense instruction sets.
• Use compact instruction set.
• Hardware support for code compression.


TinyOS Two-level Scheduling

• Tasks do intensive computations
  - Non-preemptive FIFO scheduling
  - Bounded number of pending tasks
• Events handle interrupts
  - Interrupts trigger lowest level events
  - Events can signal events, call commands, or post tasks
• Two priorities
  - Event/command
  - Tasks

[Figure: timeline. Hardware interrupts trigger events; events call commands and POST tasks to the FIFO task queue; events preempt tasks.]
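
The task half of this scheme can be sketched in plain C as a bounded FIFO of function pointers. This is not the TinyOS or nesC API: the names `post_task` and `run_next_task`, the queue size, and the two example tasks are hypothetical, and the interrupt masking that a real implementation needs around the queue updates is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_TASKS 8                 /* bound on pending tasks (illustrative) */

typedef void (*task_fn)(void);

static task_fn task_queue[MAX_TASKS];
static size_t queue_head, queue_count;

/* Called from event handlers: only enqueue, never run the task here.
 * Returns false when the bounded queue is full. */
bool post_task(task_fn t)
{
    if (queue_count == MAX_TASKS)
        return false;
    task_queue[(queue_head + queue_count) % MAX_TASKS] = t;
    queue_count++;
    return true;
}

/* Run the oldest pending task to completion (tasks never preempt
 * each other). Returns false when there is nothing to run. */
bool run_next_task(void)
{
    if (queue_count == 0)
        return false;
    task_fn t = task_queue[queue_head];
    queue_head = (queue_head + 1) % MAX_TASKS;
    queue_count--;
    t();
    return true;
}

/* Two example tasks that record the order in which they ran. */
static int trace[4];
static size_t trace_len;

static void task_a(void) { assert(trace_len < 4); trace[trace_len++] = 1; }
static void task_b(void) { assert(trace_len < 4); trace[trace_len++] = 2; }
```

A main loop would call `run_next_task()` repeatedly and put the processor to sleep when it returns false. Events, running at the higher priority, interrupt that loop and keep their handlers short by posting the heavy work as tasks.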


Good luck! :)
