John David Eriksen Jamie Unger-Fink

Using Simulated Partial Dynamic Run-Time Reconfiguration to Share Embedded FPGA Compute and Power Resources across a Swarm of Unpiloted Airborne Vehicles

Background
Objective
Previous Work
Resource Sharing Within Single Node
Resource Sharing Across All Nodes
Conclusions
Questions

Outline

Unmanned Air Vehicles
◦ Lower cost, less risk to pilot, longer flight times
◦ Autonomous and/or remote control
◦ Military applications
    Reconnaissance
    Attack
◦ Scientific applications
    Hurricane data collection
    Severe climates

Background

Background

United States Air Force Global Hawk (Reconnaissance)

Background

MQ-9 Reaper (Combat)

Micro Air Vehicles (MAVs)
◦ Less than 25 kg
◦ Wingspans smaller than 3 meters
◦ Limited resources
    Computation power requirement is significant compared to flight power requirement, unlike in larger UAVs.

UAV Swarms
◦ Groups of UAVs that cooperate in a decentralized fashion, enabling them to complete tasks that no individual UAV could complete.

Background

Background

US Military – MAV Prototype

Background

Lockheed Martin MicroSTAR

Demonstrate a scheme for sharing a single FPGA among multiple tasks.

Allow tasks to migrate between different UAVs.

Effectively allow power to be shared between UAVs, and provide for replacement of swarm members that temporarily or permanently leave the swarm.

Objective

Builds on the relatively new field of dynamically sharing an FPGA among multiple applications.
◦ System must address:
    Allocation
    Partitioning
    Placement
    Routing
◦ Can take advantage of system-on-chip (SoC) and network-on-chip (NoC) research.

Previous Work

Advantages of UAV swarms
◦ Military roles
    Cheaper, more expendable swarms can approach a target and collect data from closer proximity with less risk.
    Swarms provide fault tolerance via redundancy: if one UAV is lost, others are still in place to continue the mission.
    Multiple UAVs can be used to precisely locate a target by combining the readings of their sensors and employing geolocation techniques.

Previous Work

Task mobility scenarios
◦ Continuous surveillance
    Individual UAVs in a swarm are retired for refueling in a staggered fashion, so some minimum number N of UAVs remain in flight at any one time.
◦ Uneven power consumption
    A specific sensor may consume a great amount of power during operation.
    If task migration does not exist, then the flight time of the swarm is limited by the UAV that operates this sensor.
    If task migration does exist, then UAVs can take turns engaging this sensor to distribute the power consumption load across the swarm.
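The uneven power consumption scenario can be made concrete with a small simulation. The sketch below is illustrative only: the function name, the battery and cost figures, and the "host the sensor on the UAV with the most remaining power" policy are all assumptions, and the cost of a migration itself is ignored.

```python
def swarm_flight_time(n_uavs, battery, flight_cost, sensor_cost, migrate):
    """Return the number of whole time steps the entire swarm stays airborne.

    All parameters are hypothetical units. Every UAV pays flight_cost per
    step; the UAV hosting the sensor task additionally pays sensor_cost.
    """
    levels = [battery] * n_uavs
    steps = 0
    while True:
        if migrate:
            # Assumed policy: the UAV with the most remaining power hosts the sensor.
            host = max(range(n_uavs), key=lambda i: levels[i])
        else:
            # Without task migration the sensor task stays on UAV 0.
            host = 0
        next_levels = [
            level - flight_cost - (sensor_cost if i == host else 0)
            for i, level in enumerate(levels)
        ]
        if min(next_levels) < 0:
            # Some UAV cannot afford the next step, so the swarm is no longer complete.
            return steps
        levels = next_levels
        steps += 1


fixed = swarm_flight_time(4, battery=100, flight_cost=2, sensor_cost=6, migrate=False)
rotating = swarm_flight_time(4, battery=100, flight_cost=2, sensor_cost=6, migrate=True)
print(fixed, rotating)
```

With these made-up numbers, rotating the sensor task keeps the full swarm in flight for more than twice as many steps, because no single UAV carries the sensor's power cost alone.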

Previous Work

The computational agent paradigm
◦ Properties
    Autonomy (to act and decide without external direction)
    Social interaction (inter-agent communication via messaging)
    Reactivity (ability to respond to nondeterministic changes)
◦ Mobile agents
    Can migrate between host computer systems during execution.
    Agent environment facilitates this mobility.

Previous Work

Not all UAV tasks need to be loaded at the same time
◦ Selectively loading tasks as needed reduces costs by:
    Eliminating unneeded computation
    Allowing for smaller FPGAs

A run-time system or operating system must intervene to coordinate loading of tasks
◦ Constraint-based dynamic FPGA configuration writing is a two-dimensional geometric packing problem.
◦ Fast packing heuristics must be used: Best Fit, Bottom Left, Bazargan's, Minkowski Sum
◦ Minkowski Sum considered to be best: reasonably fast, and does not introduce much fragmentation of the FPGA surface.

Resource Sharing Within Single Node

Minkowski sum chosen for application allocation
◦ Can be used to allocate non-rectangular cores on the FPGA
◦ Good handling of holes left over after a portion of the FPGA has been de-allocated
◦ Good run-time performance characteristics
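To illustrate the packing problem, here is a minimal sketch of a Bottom Left style placement on a grid model of the FPGA surface. It is not the Minkowski sum method that was chosen; it handles rectangular cores only, and the grid model and all function names are assumptions made for illustration.

```python
def make_grid(cols, rows):
    """Model the FPGA surface as a grid of cells; None marks a free cell."""
    return [[None] * cols for _ in range(rows)]


def place_bottom_left(grid, width, height, task_id):
    """Place a width x height core at the lowest, then left-most, free position.

    Returns the (x, y) origin of the placement, or None if the core does not fit.
    """
    rows, cols = len(grid), len(grid[0])
    for y in range(rows - height + 1):
        for x in range(cols - width + 1):
            cells = [(x + dx, y + dy) for dy in range(height) for dx in range(width)]
            if all(grid[cy][cx] is None for cx, cy in cells):
                for cx, cy in cells:
                    grid[cy][cx] = task_id
                return (x, y)
    return None


def release(grid, task_id):
    """De-allocate every cell owned by task_id, leaving a hole behind."""
    for row in grid:
        for x, owner in enumerate(row):
            if owner == task_id:
                row[x] = None


grid = make_grid(4, 4)
print(place_bottom_left(grid, 2, 2, "A"))  # (0, 0)
print(place_bottom_left(grid, 2, 2, "B"))  # (2, 0)
print(place_bottom_left(grid, 4, 2, "C"))  # (0, 2)
release(grid, "A")
print(place_bottom_left(grid, 1, 1, "D"))  # (0, 0)
print(place_bottom_left(grid, 2, 2, "E"))  # None: the hole is now fragmented
```

The last two calls show the fragmentation problem: a small core dropped into a freed hole can block a later core that would otherwise have fit there.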


Dynamic Partial Reconfiguration (PR)
◦ Needed for dynamic allocation of portions of the FPGA surface.
◦ Currently impractical due to technological limitations.
◦ Current FPGAs with support for PR are not easily capable of handling non-rectangular regions.
◦ Simulated PR implemented to compensate for these deficiencies. Checkpointing used.


Checkpointing for PR
◦ Entire FPGA configuration is saved. Tasks store their state in off-FPGA memory.
◦ Saved configuration is modified to deallocate unused areas and allocate new applications, then the entire FPGA is reconfigured. Task states are reloaded from memory.
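The checkpoint-based sequence above can be sketched in software as follows. This is a toy model under stated assumptions: the `Task` and `SimulatedFpga` classes, the region-to-task dictionary standing in for the FPGA configuration, and the function name are all hypothetical, not the presenters' implementation.

```python
class Task:
    """Hypothetical task whose only state is a counter."""

    def __init__(self, name):
        self.name = name
        self.counter = 0

    def step(self):
        self.counter += 1

    def save_state(self):
        return {"counter": self.counter}

    def load_state(self, state):
        self.counter = state["counter"]


class SimulatedFpga:
    """Toy device that only supports full reconfiguration."""

    def __init__(self):
        self.config = {}  # region name -> task name (stands in for the configuration)
        self.tasks = {}   # task name -> live Task instance

    def full_reconfigure(self, config, task_factory):
        # A full reconfiguration wipes all on-chip task state.
        self.config = dict(config)
        self.tasks = {name: task_factory(name) for name in config.values()}


def simulated_partial_reconfigure(fpga, remove, add, task_factory=Task):
    # 1. Save the whole configuration; tasks store state in off-FPGA memory.
    saved_config = dict(fpga.config)
    off_chip_memory = {name: task.save_state() for name, task in fpga.tasks.items()}
    # 2. Modify the saved configuration: deallocate regions, allocate new ones.
    for region in remove:
        saved_config.pop(region, None)
    saved_config.update(add)
    # 3. Reconfigure the entire FPGA with the modified configuration.
    fpga.full_reconfigure(saved_config, task_factory)
    # 4. Reload the state of every task that survived the change.
    for name, task in fpga.tasks.items():
        if name in off_chip_memory:
            task.load_state(off_chip_memory[name])


fpga = SimulatedFpga()
fpga.full_reconfigure({"r0": "camera", "r1": "radio"}, Task)
for _ in range(3):
    fpga.tasks["camera"].step()
simulated_partial_reconfigure(fpga, remove=["r1"], add={"r2": "tracker"})
print(fpga.config, fpga.tasks["camera"].counter)
```

From the surviving task's point of view the result looks like a partial reconfiguration, even though every region was rewritten and every task was briefly stopped.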


Checkpointing Implementations
◦ Cooperative
    Run-time system notifies tasks that they should take a checkpoint. Run-time system waits until all tasks finish their checkpoints.
    Risk: an unresponsive task can cause the entire system to hang
◦ Pre-emptive
    Tasks periodically take their own checkpoints. Run-time system can then force a reconfiguration at any arbitrary time.
    Easier for task developers to interface with, since there is no need to handle a reconfiguration interrupt sent by the system.
    Chosen over cooperative checkpointing due to increased run-time safety and decreased implementation complexity.
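A minimal sketch of the pre-emptive variant, assuming a hypothetical `PeriodicTask` class and a fixed checkpoint interval measured in task steps. It shows the cost side of the choice: any work done since the last periodic checkpoint is lost when the run-time system forces a reconfiguration.

```python
class PeriodicTask:
    """Hypothetical task that checkpoints itself every `interval` steps."""

    def __init__(self, interval):
        self.interval = interval
        self.progress = 0    # live on-chip state
        self.checkpoint = 0  # last state written to off-FPGA memory

    def step(self):
        self.progress += 1
        if self.progress % self.interval == 0:
            # The task checkpoints on its own schedule, with no run-time prompt.
            self.checkpoint = self.progress

    def restart_from_checkpoint(self):
        """Called after a forced reconfiguration; returns the number of steps lost."""
        lost = self.progress - self.checkpoint
        self.progress = self.checkpoint
        return lost


task = PeriodicTask(interval=4)
for _ in range(10):
    task.step()
# The run-time system reconfigures now without asking the task first.
print(task.restart_from_checkpoint(), task.progress)  # 2 8
```

The run-time system never waits on the task, so an unresponsive task cannot hang it; the price is that up to `interval - 1` steps of work may have to be redone.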


Sharing off-chip memory among applications
◦ Three necessary components identified:
    On-chip network
        Wiring definitions
        Read-write request protocols
    Arbitrator
        Manages accesses of multiple tasks to shared external memory
    Memory partitioning policy
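How an arbitrator and a partitioning policy fit together can be sketched as below. The round-robin grant order and the equal static partitions are assumptions chosen for simplicity; the source does not state which policies were used. The class and method names are hypothetical.

```python
from collections import deque


class Arbitrator:
    """Toy arbitrator for one shared external memory."""

    def __init__(self, memory_size, task_names):
        self.memory = [0] * memory_size
        share = memory_size // len(task_names)
        # Assumed partitioning policy: equal, fixed-size partitions per task.
        self.partitions = {name: (i * share, share) for i, name in enumerate(task_names)}
        self.queues = {name: deque() for name in task_names}
        self.order = deque(task_names)

    def request(self, task, op, offset, value=None):
        """Queue a 'read' or 'write' at an offset inside the task's own partition."""
        base, size = self.partitions[task]
        if not 0 <= offset < size:
            raise IndexError(f"{task} accessed offset {offset} outside its partition")
        self.queues[task].append((op, base + offset, value))

    def grant_next(self):
        """Serve one pending request in round-robin order; None if all queues are empty."""
        for _ in range(len(self.order)):
            task = self.order[0]
            self.order.rotate(-1)  # the next grant starts with the following task
            if self.queues[task]:
                op, address, value = self.queues[task].popleft()
                if op == "write":
                    self.memory[address] = value
                    return (task, None)
                return (task, self.memory[address])
        return None


arb = Arbitrator(8, ["camera", "tracker"])
arb.request("camera", "write", 0, 11)
arb.request("camera", "write", 1, 12)
arb.request("tracker", "write", 0, 21)
print([arb.grant_next()[0] for _ in range(3)])  # ['camera', 'tracker', 'camera']
print(arb.memory)
```

Even though the camera task queued two requests first, the tracker is served in between, so no single task can monopolize the memory.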


Memory network topologies explored:
◦ Bus
◦ Star
◦ Mesh
◦ Ring
◦ Tree
◦ Fat Tree


Fat Tree Topology

Memory network topology characteristics:
◦ Ease of implementation
◦ Wire routing cost
◦ Concurrency support
◦ Latency
◦ Scalability
◦ Impact on area allocation


Resource Sharing Within Single Node

Topology    Ease of implementation    Wire routing cost    Concurrency    Latency    Scalability
Bus         ++                        -                    --             -          --
Star        ++                        --                   ++             ++         --
Ring        --                        --                   +              +          +
Mesh        ++                        +                    -              0          0
Tree        0                         ++                   0              +          +
Fat Tree    -                         -                    +              +          +

From favorable to unfavorable: {++, +, 0, -, --}

Star topology is the favored topology
◦ Good concurrency support and low latency are highly favored in the UAV swarm scenario, so the star topology is selected.
◦ The UAV swarm scenario does not anticipate the need to run a large number of applications simultaneously, so high wiring cost and poor scalability are not seen as an issue here.
◦ In scenarios where a large number of applications is anticipated, mesh or tree topologies should be explored.
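The selection reasoning can be checked with a weighted score over the ratings in the comparison table. The numeric mapping of the symbols and both sets of weights are assumptions made for illustration; the source gives only the qualitative ratings and the stated priorities.

```python
SYMBOL_VALUE = {"++": 2, "+": 1, "0": 0, "-": -1, "--": -2}
CRITERIA = ("ease", "wiring", "concurrency", "latency", "scalability")

# Ratings copied from the comparison table, in CRITERIA order.
RATINGS = {
    "Bus":      ("++", "-",  "--", "-",  "--"),
    "Star":     ("++", "--", "++", "++", "--"),
    "Ring":     ("--", "--", "+",  "+",  "+"),
    "Mesh":     ("++", "+",  "-",  "0",  "0"),
    "Tree":     ("0",  "++", "0",  "+",  "+"),
    "Fat Tree": ("-",  "-",  "+",  "+",  "+"),
}


def score_topologies(weights):
    """Return {topology: weighted sum of its ratings} for the given weights."""
    return {
        name: sum(weights[c] * SYMBOL_VALUE[symbol] for c, symbol in zip(CRITERIA, symbols))
        for name, symbols in RATINGS.items()
    }


# Hypothetical weights: few applications, so wiring and scalability are ignored.
swarm_weights = {"ease": 1, "wiring": 0, "concurrency": 3, "latency": 3, "scalability": 0}
# Hypothetical weights: many applications, so wiring and scalability dominate.
many_apps_weights = {"ease": 1, "wiring": 3, "concurrency": 1, "latency": 1, "scalability": 3}

swarm_scores = score_topologies(swarm_weights)
many_scores = score_topologies(many_apps_weights)
print(max(swarm_scores, key=swarm_scores.get))  # Star
print(max(many_scores, key=many_scores.get))    # Tree
```

Under the first set of weights the star topology scores highest; shifting the weight toward wiring cost and scalability moves tree and mesh to the top, in line with the recommendation for scenarios with many applications.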


Resource arbitration implementation
◦ Off-chip arbitrator
◦ Off-chip memory
◦ On-chip network implemented using star topology
    Data bus
    Address bus
    Command bus (specifies read, write, or stream operations)
    Clock and control lines

Feasibility experimentally verified using a Celoxica RC1000 development board with built-in memory, running a reconnaissance role simulator consisting of several tasks running in parallel.
◦ Some performance loss due to contention for memory resources was revealed, but this loss was judged to be acceptable.


Management of resource sharing
◦ Decentralized scheme preferred, since a single point of failure is very undesirable in the UAV swarm scenario

A decentralized scheme using computing agents was determined to be the best solution
◦ Static agents
    Cameras
    Other sensors
◦ Mobile agents
    Applications that can move between different computation nodes
    Interact with static and mobile agents

Resource Sharing Across All Nodes

Agents identified using unique identifier tuple:
◦ {sequence_number, home_node, class, current_node, ability_list}
    sequence_number – unique ID with respect to home node
    home_node – node where the agent was created
    class – type of agent
    current_node – node where the agent is currently located
    ability_list – services or capabilities the agent is equipped with
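The identifier tuple maps naturally onto a small record type. In the sketch below the field types, the example values, and the two helper methods are assumptions; `class` is renamed `agent_class` only because `class` is a reserved word in Python.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AgentId:
    sequence_number: int  # unique ID with respect to the home node
    home_node: str        # node where the agent was created
    agent_class: str      # type of agent ("class" in the original tuple)
    current_node: str     # node where the agent is currently located
    ability_list: tuple   # services or capabilities the agent is equipped with

    def origin_key(self):
        """The part of the identifier that never changes when the agent migrates."""
        return (self.home_node, self.sequence_number)

    def moved_to(self, node):
        """Return a copy of the identifier with current_node updated."""
        return replace(self, current_node=node)


agent = AgentId(7, "uav-1", "tracker", "uav-1", ("image_processing",))
migrated = agent.moved_to("uav-3")
print(migrated.current_node, migrated.origin_key() == agent.origin_key())
```

Because sequence_number is unique only with respect to the home node, the pair (home_node, sequence_number) is what identifies an agent across the swarm, while current_node tracks where it is running.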


Agent environment requirements
◦ Facilitate discovery of other agents
◦ Facilitate communication between agents
◦ Provide node information
◦ Facilitate migration between nodes
    All-or-nothing transactional node migration
◦ Provide message routing and forwarding mechanisms
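All-or-nothing migration means an agent is never lost or duplicated if a transfer fails partway. A minimal sketch, assuming a hypothetical `Node` class with a fixed number of free slots and a copy-then-commit order of operations:

```python
class Node:
    """Hypothetical computation node with a limited number of agent slots."""

    def __init__(self, name, free_slots):
        self.name = name
        self.free_slots = free_slots
        self.agents = {}  # agent name -> agent state

    def admit(self, agent_name, state):
        if self.free_slots <= 0:
            raise RuntimeError(f"{self.name} has no free capacity")
        self.free_slots -= 1
        self.agents[agent_name] = dict(state)

    def evict(self, agent_name):
        self.free_slots += 1
        return self.agents.pop(agent_name)


def migrate(agent_name, source, target):
    """Move an agent from source to target; on failure the source is left untouched."""
    state = source.agents[agent_name]
    try:
        # Copy first: the agent keeps running at the source until this succeeds.
        target.admit(agent_name, state)
    except RuntimeError:
        return False
    # Commit: remove the original only after the copy is safely in place.
    source.evict(agent_name)
    return True


uav1, uav2, uav3 = Node("uav-1", 1), Node("uav-2", 1), Node("uav-3", 0)
uav1.admit("tracker", {"frame": 42})
print(migrate("tracker", uav1, uav3))  # False: uav-3 is full, the agent stays on uav-1
print(migrate("tracker", uav1, uav2))  # True
```

A real swarm would also have to handle a lost radio link between the copy and the commit, which this sketch does not model.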


Agent migration considerations
◦ High cost
    Power consumption
    Lost application throughput due to downtime
◦ Migrations should be carefully managed to reduce cost
◦ Migration decisions depend on the following factors:
    CPU and memory usage
    Physical location
    Resource availability
    Power usage
    Communication bandwidth to other nodes
◦ Migration decisions are carried out using a rule-based system; fuzzy logic was determined to be useful since it is suited to dynamic systems


Example of rule:
IF (the visibility of sensors on this platform is LOW) AND (the visibility on another platform is HIGH) THEN (desire to migrate is HIGH)

Fuzzy logic: multi-valued, non-binary logic
For example, in addition to the LOW and HIGH values above, a MEDIUM value could be introduced.
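The example rule can be evaluated numerically once membership functions are chosen. The sketch below assumes visibility is normalized to the range 0 to 1, uses simple linear membership functions, and takes the minimum as the fuzzy AND (a common choice); none of these details come from the source.

```python
def clamp(x):
    """Restrict a value to the range [0, 1]."""
    return max(0.0, min(1.0, x))


def low(x):
    """Degree to which a normalized value is LOW (1 at 0, falling to 0 at 1)."""
    return clamp(1.0 - x)


def medium(x):
    """Degree to which a normalized value is MEDIUM (peaks at 0.5)."""
    return clamp(1.0 - abs(2.0 * x - 1.0))


def high(x):
    """Degree to which a normalized value is HIGH (0 at 0, rising to 1 at 1)."""
    return clamp(x)


def desire_to_migrate(local_visibility, remote_visibility):
    """IF local visibility is LOW AND remote visibility is HIGH THEN desire is HIGH.

    The rule's strength is the fuzzy AND (minimum) of its two conditions.
    """
    return min(low(local_visibility), high(remote_visibility))


print(desire_to_migrate(0.0, 1.0))  # rule fires fully
print(desire_to_migrate(1.0, 1.0))  # local visibility is already good
print(desire_to_migrate(0.2, 0.7))  # partial desire, between the two extremes
```

Instead of a yes or no answer, the rule yields a degree of desire that can be weighed against other rules, for example ones covering power usage or communication bandwidth.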


The authors addressed the problem of designing a simulation that exhibits the capability to distribute computation and power loads across a swarm of simulated UAVs by using a combination of technologies and paradigms spanning reconfigurable computing, embedded systems, and distributed computing.

Conclusions

They addressed one of the stumbling blocks regarding the newly emerging technology of partial reconfiguration by using checkpointing, but did not address how checkpointing impacts system availability and application throughput.

Conclusions

The techniques described within could be applied to wireless sensor networks, microsatellites, and other distributed systems built from many relatively small and resource-constrained nodes.

Future work:
◦ Examine security concerns
◦ Examine impact of checkpointing-based partial reconfiguration

Conclusions

Questions?
