Scrubbing Optimization via Availability Prediction (SOAP) for Reconfigurable Space Computing
HPEC 2012
Quinn Martin, Alan George

SOAP
- Background
  - FPGAs and Radiation in Space
  - Traditional Scrubbing Methods
- SOAP Approach
  - Mission Parameters
  - Markov Models
- Mission Case Studies
- Results
- Conclusions

FPGAs
- Field-Programmable Gate Arrays (FPGAs)
  - Implement custom digital logic hardware with a fabric of logic resources and interconnect
    - Lookup tables (LUTs) implement combinational logic
    - User flip-flops (FFs) implement sequential logic
    - Switch and connection boxes route among resources
  - Many are reconfigurable
    - Allows update of routing and logic state
    - Partial reconfiguration can update a partition of the device
    - E.g., Virtex from Xilinx and Stratix from Altera

Reconfigurable FPGAs in Space
- Advantages
  - Very high performance/power ratio
  - Reconfigurable (fully and partially)
    - Adaptable to changing environments and mission requirements
    - Can update design after launch
- Disadvantages
  - Relatively difficult to design/test applications
  - Configuration memory vulnerable to radiation
    - Upsets can change the application's processor architecture in unpredictable ways
    - Must repair upsets via configuration scrubbing

Radiation Effects on FPGAs
- Single-event Effects (SEE)
  - Single-event Latchup (SEL) – Causes current spike that may damage device
  - Single-event Upset (SEU) – Changes state of bit(s), e.g., from logic ‘0’ to ‘1’
    - Can be single-bit upset (SBU) or multi-bit upset (MBU)
  - Single-event Functional Interrupt (SEFI) – Like SEU, but affecting critical device resource
- Total Ionizing Dose
  - Degrades performance over time, leading to eventual device failure

Xilinx V-5/V-6 Configuration
- Programmed via SelectMAP interface
  - Runtime configuration interface
  - Also allows readback of existing configuration
  - 32 bits per configuration word
  - Parallel bus width of 8, 16, or 32 bits
  - Max clock frequency 100 MHz
- Configuration memory arranged in frames
  - Minimum unit of access to config. memory
  - Virtex-5 – 41 words per frame
  - Virtex-6 – 81 words per frame
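
The figures on this slide are enough for a rough estimate of how long one frame takes to cross the SelectMAP bus. The sketch below is only arithmetic on the slide's numbers; the function names are illustrative, and it assumes an ideal transfer with no command overhead or pad frames.

```python
# Words per configuration frame, as listed on the slide.
WORDS_PER_FRAME = {"Virtex-5": 41, "Virtex-6": 81}
BITS_PER_WORD = 32


def frame_bits(device: str) -> int:
    """Size of one configuration frame in bits."""
    return WORDS_PER_FRAME[device] * BITS_PER_WORD


def frame_transfer_seconds(device: str, bus_width_bits: int, cclk_hz: float) -> float:
    """Ideal time to move one frame across SelectMAP (assumes no protocol overhead)."""
    if bus_width_bits not in (8, 16, 32):
        raise ValueError("SelectMAP bus width must be 8, 16, or 32 bits")
    return frame_bits(device) / (bus_width_bits * cclk_hz)


for dev in WORDS_PER_FRAME:
    t = frame_transfer_seconds(dev, bus_width_bits=32, cclk_hz=100e6)
    print(f"{dev}: {frame_bits(dev)} bits/frame, {t * 1e9:.0f} ns at 32 bits x 100 MHz")
```

At the widest bus and the maximum clock this gives a lower bound per frame; a narrower bus or slower clock scales the time up proportionally.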

FPGA Scrubbing
- FPGA Configuration Scrubbing
  - Quickly repairs SEUs before accumulation
    - Accumulation defeats redundancy strategies (e.g., TMR)
    - Fast repair can prevent SEUs from manifesting as errors
  - Can be decomposed into basic scrubbing techniques
    - Correction techniques repair upsets
    - Detection techniques discover and locate upsets

FPGA Scrubbing Techniques
- Correction Techniques
  - Golden Copy – Repairs configuration based on known “golden” copy (e.g., in rad-hard PROM)
  - Frame ECC – Repairs based on per-frame error syndrome code stored on-chip
- Detection Techniques
  - Frame ECC – Detects based on per-frame SECDED Hamming code
  - CRC-32 – Detects using device-wide CRC-32
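
As a rough illustration of device-wide CRC-32 detection, the sketch below computes one checksum over a set of frames and shows that a single flipped bit changes it. Python's `zlib.crc32` is used as a stand-in; the exact CRC computed by the device hardware, the frame contents, and the helper names are all assumptions.

```python
import zlib


def device_crc(frames: list[bytes]) -> int:
    """CRC-32 over all frames in order (zlib's CRC-32 used as a stand-in)."""
    crc = 0
    for frame in frames:
        crc = zlib.crc32(frame, crc)
    return crc


def flip_bit(frame: bytes, bit_index: int) -> bytes:
    """Return a copy of `frame` with one bit inverted, modelling an SEU."""
    data = bytearray(frame)
    data[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(data)


FRAME_BYTES = 41 * 4  # one Virtex-5 frame: 41 words of 32 bits
golden = [bytes([i]) * FRAME_BYTES for i in range(4)]  # hypothetical contents
golden_crc = device_crc(golden)

upset = list(golden)
upset[2] = flip_bit(upset[2], 100)  # inject a single-bit upset in frame 2

print("clean matches golden CRC:", device_crc(golden) == golden_crc)
print("upset matches golden CRC:", device_crc(upset) == golden_crc)
```

A device-wide checksum of this kind reports that an upset exists but not which frame holds it, which is why it pairs naturally with golden-copy correction of the whole device.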

FPGA Scrubbing Strategies
- Scrubbing Strategies
  - Any combination of detection and correction techniques with controller to implement algorithm
  - Blind Scrubbing – Golden copy correction only
  - Readback Scrubbing – Some detection technique used
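
The difference between the two strategies can be sketched as two small controller loops over an in-memory model of configuration frames. This is a toy model under stated assumptions: frames are Python `bytes`, the detection technique is a per-frame `zlib.crc32` comparison chosen only for illustration, and the function names are hypothetical.

```python
import zlib


def blind_scrub(config: list[bytes], golden: list[bytes]) -> int:
    """Rewrite every frame from the golden copy; return frames written."""
    for i, frame in enumerate(golden):
        config[i] = frame
    return len(golden)


def readback_scrub(config: list[bytes], golden: list[bytes]) -> int:
    """Read back each frame and rewrite only those whose CRC-32 mismatches."""
    written = 0
    for i, frame in enumerate(golden):
        if zlib.crc32(config[i]) != zlib.crc32(frame):
            config[i] = frame
            written += 1
    return written


golden = [bytes([i]) * 8 for i in range(4)]  # hypothetical frame contents

# Corrupt one bit of frame 1, then repair with each strategy.
damaged = list(golden)
damaged[1] = bytes([golden[1][0] ^ 0x80]) + golden[1][1:]

readback_target = list(damaged)
blind_target = list(damaged)
print("readback wrote", readback_scrub(readback_target, golden), "frame(s)")
print("blind wrote", blind_scrub(blind_target, golden), "frame(s)")
```

Both loops leave the configuration equal to the golden copy; the blind loop needs no detection logic but always writes every frame, while the readback loop spends time reading and checking in exchange for writing only where needed.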

SOAP Approach
- Scrubbing Optimization via Availability Prediction (SOAP)
  - Uses system availability as primary metric for scrubbing efficacy
  - Models scrubbing strategies as Markov diagrams
  - Varies free parameters to find optimal scrubbing system
    - Environmental parameters λ and α (orbits)
    - System parameters B and f_CCLK (memory and pin constraints)
    - Scrubbing parameters μ and γ (device configuration capability)

Environmental Parameters
- λ – SEU rates for devices in various orbits of interest
  - Calculated per-bit and per-device using CREME96
- α – Correction factors for single-bit and multi-bit upsets (SBU/MBU)
  - From beam tests on Virtex-5 devices

System Parameters
- Factors chosen by the system designer based on available memories, power budget, etc.
- Affect scrubbing detection and correction rates (see equations on next slide)
- B – Configuration bus width in bits
- f_CCLK – Configuration clock speed in Hz

Scrubbing Parameters
- μ – Repair rate for scrubbing technique (per second)
- γ – Detection rate for scrubbing technique (per second)
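
The equations relating these rates to B and f_CCLK are not reproduced in this transcript. As one plausible simplification, and only as an assumption, a rate can be taken as the number of full passes over configuration memory that the bus can sustain per second. The device size and function names below are hypothetical.

```python
def pass_seconds(config_bits: int, bus_width_bits: int, cclk_hz: float) -> float:
    """Ideal time for one full pass over configuration memory (no overhead)."""
    return config_bits / (bus_width_bits * cclk_hz)


def pass_rate(config_bits: int, bus_width_bits: int, cclk_hz: float) -> float:
    """Passes per second; an assumed stand-in for a rate such as mu or gamma."""
    return 1.0 / pass_seconds(config_bits, bus_width_bits, cclk_hz)


# Hypothetical device: 1000 frames of 41 words x 32 bits each.
CONFIG_BITS = 1000 * 41 * 32

slow = pass_rate(CONFIG_BITS, bus_width_bits=8, cclk_hz=33e6)
fast = pass_rate(CONFIG_BITS, bus_width_bits=32, cclk_hz=100e6)
print(f"8 bits x 33 MHz:   {slow:.1f} passes/s")
print(f"32 bits x 100 MHz: {fast:.1f} passes/s")
```

Under this assumption the rate scales linearly with both bus width and configuration clock, which is why those two system parameters are the natural knobs for a designer.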

Markov Algorithm Models
- Blind
  - No detection
- Built-in CRC-32
  - Basic detection
- Frame ECC with CRC-32
  - CRC acts as “safety net” for upsets undetected by Frame ECC
- Frame ECC with CRC-32 and Essential Bits (EB)
  - Only scrubs errors that may be critical

Blind Scrubbing
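
The Markov diagram for this slide is not recoverable from the transcript. A minimal sketch, assuming the simplest possible model with one "up" state and one "upset" state, upsets arriving at rate λ and blind scrubbing repairing at rate μ, gives the textbook two-state steady-state availability μ / (λ + μ). The numeric rates are hypothetical.

```python
def two_state_availability(upset_rate: float, repair_rate: float) -> float:
    """Steady-state probability of the 'up' state in an up/down Markov chain."""
    if upset_rate < 0 or repair_rate <= 0:
        raise ValueError("upset_rate must be >= 0 and repair_rate must be > 0")
    return repair_rate / (upset_rate + repair_rate)


# Hypothetical rates, per second: one upset per 10,000 s, 200 repairs per second.
lam = 1e-4
mu = 200.0
print(f"availability = {two_state_availability(lam, mu):.9f}")
```

Because there is no detection state in this assumed model, availability depends only on the ratio of repair rate to upset rate.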

Readback CRC-32 Scrubbing
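
The diagram for this model is likewise missing from the transcript. One way to sketch a detection-based strategy, as an assumption rather than the authors' model, is a three-state cycle: OK, upset present but undetected, and upset detected and under repair, with rates λ, γ, and μ. The solver below finds the steady state of any small generator matrix by Gaussian elimination in plain Python.

```python
def steady_state(Q: list[list[float]]) -> list[float]:
    """Solve pi Q = 0 with sum(pi) = 1 by Gauss-Jordan elimination."""
    n = len(Q)
    # Rows of A are the balance equations (columns of Q) with a zero right-hand
    # side; the last row is replaced by the normalisation sum(pi) = 1.
    A = [[Q[j][i] for j in range(n)] + [0.0] for i in range(n)]
    A[-1] = [1.0] * n + [1.0]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(n):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][n] / A[i][i] for i in range(n)]


def readback_generator(lam: float, gamma: float, mu: float) -> list[list[float]]:
    """Assumed cycle: OK -(lam)-> undetected -(gamma)-> detected -(mu)-> OK."""
    return [
        [-lam, lam, 0.0],
        [0.0, -gamma, gamma],
        [mu, 0.0, -mu],
    ]


# Hypothetical rates, per second.
pi = steady_state(readback_generator(lam=1e-4, gamma=100.0, mu=200.0))
print(f"P(OK) = {pi[0]:.9f}")
```

In this assumed cycle the probability of the OK state equals (1/λ) / (1/λ + 1/γ + 1/μ), so adding a detection stage lowers availability relative to a two-state model unless detection is fast compared with repair.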

CRC-32 w/ Frame ECC Scrubbing

Case Study
- Applies SOAP method to hypothetical systems with realistic parameters
- Devices
  - Xilinx Virtex-5
  - Xilinx Virtex-6
- Orbits
  - ISS low Earth orbit (LEO)
  - Molniya highly elliptical orbit (HEO)
- 8-bit SelectMAP bus at 33 MHz
  - Accounts for access speed of slow rad-hard PROM

Case Study
- Two mission types
  - Non upset critical (non-UC) – System continues to run upon detection and correction of upset
    - Only count critical upsets as system “unavailable”
  - Upset critical (UC) – System requires reset upon detection of upset to ensure state integrity
    - Requires detection
    - All detected upsets render system unavailable for reset period
    - Will benefit from essential bits mask used in detection
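
To see why an essential bits mask helps an upset-critical mission, consider a rough sketch in which every detected upset costs a fixed reset period and the mask lets the detector ignore upsets in non-essential bits. The model, the parameter names, and all numbers below are assumptions for illustration; detection latency is ignored.

```python
def uc_availability(upset_rate: float, reset_seconds: float,
                    essential_fraction: float = 1.0) -> float:
    """Mean up time / (mean up time + reset time) for an alternating up/reset cycle.

    essential_fraction is the assumed share of upsets that hit essential bits
    and therefore trigger a reset when an EB mask is used in detection.
    """
    if not 0.0 <= essential_fraction <= 1.0:
        raise ValueError("essential_fraction must be within [0, 1]")
    effective_rate = upset_rate * essential_fraction
    if effective_rate == 0.0:
        return 1.0
    mean_up = 1.0 / effective_rate
    return mean_up / (mean_up + reset_seconds)


# Hypothetical values: one upset per 1000 s, 10 s reset, 20% of bits essential.
no_mask = uc_availability(1e-3, 10.0)
with_mask = uc_availability(1e-3, 10.0, essential_fraction=0.2)
print(f"without EB mask: {no_mask:.6f}")
print(f"with EB mask:    {with_mask:.6f}")
```

In this sketch the mask reduces the rate of resets rather than shortening any single reset, so its benefit grows as the essential fraction of the design shrinks.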

Non-UC Results
- Continuous blind scrubbing offers highest availability
- CRC-32 offers similar availability with low implementation complexity
- Frame ECC suffers because triple-bit upsets (TBUs) can be falsely corrected, resulting in further errors
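
The false-correction effect can be reproduced with any SECDED code. The sketch below uses a toy (8,4) extended Hamming code, chosen only for brevity and not the code used by Frame ECC on the devices: a single flip is repaired, a double flip is flagged, and a triple flip looks like a single flip and gets "corrected" into a fourth wrong bit.

```python
def syndrome(word: list[int]) -> int:
    """XOR of the positions (1..7) of all set bits; 0 for a valid codeword."""
    s = 0
    for pos in range(1, 8):
        if word[pos]:
            s ^= pos
    return s


def encode(data_bits: list[int]) -> list[int]:
    """Encode 4 data bits into an 8-bit SECDED word (index 0 = overall parity)."""
    word = [0] * 8
    for pos, bit in zip((3, 5, 6, 7), data_bits):
        word[pos] = bit
    syn = syndrome(word)
    for p in (1, 2, 4):
        if syn & p:
            word[p] = 1  # parity bit cancels this bit of the syndrome
    word[0] = sum(word[1:]) % 2
    return word


def decode(word: list[int]) -> tuple[str, list[int]]:
    """Return (status, word after any attempted correction)."""
    syn = syndrome(word)
    parity_ok = sum(word) % 2 == 0
    fixed = list(word)
    if syn == 0 and parity_ok:
        return "clean", fixed
    if not parity_ok:
        fixed[syn] ^= 1  # odd number of flips is assumed to be a single error
        return "corrected", fixed
    return "double-error detected", fixed


def flip(word: list[int], positions: tuple[int, ...]) -> list[int]:
    """Copy of `word` with the given bit positions inverted."""
    out = list(word)
    for pos in positions:
        out[pos] ^= 1
    return out


good = encode([1, 0, 1, 1])
for label, positions in (("single", (5,)), ("double", (5, 6)), ("triple", (1, 2, 4))):
    status, fixed = decode(flip(good, positions))
    print(f"{label}: {status}, matches original: {fixed == good}")
```

The triple-flip case reports a successful correction while leaving the word different from the original, which is the kind of silent damage the slide attributes to TBUs.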

UC Results

Results
- Frame ECC with CRC-32 and Essential Bits mask offers highest availability
  - Roughly one extra nine over other methods
  - Xilinx-provided soft-error mitigation (SEM) core implements similar strategy
- Other strategies still competitive
  - Complex state machine or software and additional memory required for Frame ECC/EB
  - Model does not account for vulnerability associated with internal scrubbing
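
"One extra nine" means unavailability drops by about a factor of ten. The short helper below, with hypothetical availability values, converts an availability figure into a count of nines so that strategies can be compared on that scale.

```python
import math


def nines(availability: float) -> float:
    """Number of nines in an availability figure, e.g. 0.999 is about 3."""
    if not 0.0 <= availability < 1.0:
        raise ValueError("availability must be within [0, 1)")
    return -math.log10(1.0 - availability)


# Hypothetical availabilities for two strategies.
baseline = 0.999
improved = 0.9999
print(f"baseline: {nines(baseline):.2f} nines")
print(f"improved: {nines(improved):.2f} nines")
print(f"gain:     {nines(improved) - nines(baseline):.2f} nines")
```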

Conclusions
- Predicts availability for various FPGA scrubbing strategies on real and hypothetical platforms
- Uses analytical models rather than experimentation
  - Markov availability modeling with parametric approach
- Allows optimization of scrubbing strategy during design phase
- In case study, blind scrubbing best for non-UC and Frame ECC with EB mask best for UC