
Deliberative Optimization of Reactive Agents for Autonomous Navigation

by

Anthony Douglas Jones

Bachelor of Science
Mechanical Engineering Technology
University of Arkansas at Little Rock
2007

Associate of Science
Computer Programming
University of Arkansas at Little Rock
2008

A dissertation submitted to the College of Engineering at Florida Institute of Technology in partial fulfillment of the requirements for the degree of

Doctor of Philosophy in Ocean Engineering

Melbourne, Florida
July 2013


© Copyright 2013 Anthony Douglas Jones. All Rights Reserved.

The author grants permission to make single copies.


We the undersigned committee hereby approve the attached dissertation, "Deliberative Optimization of Reactive Agents for Autonomous Navigation," by Anthony Douglas Jones.

Stephen L. Wood, Ph.D., P.E.
Major Advisor
Program Chair, Ocean Engineering

Ronnal P. Reichard, Ph.D.
Committee Member
Professor, Engineering

Marius C. Silaghi, Ph.D.
Committee Member
Assistant Professor, Computer Science

Hector M. Gutierrez, Ph.D., P.E.
Committee Member
Associate Professor, Mechanical and Aerospace Engineering

George A. Maul, Ph.D.
Department Head, Marine and Environmental Systems


Abstract

Deliberative Optimization of Reactive Agents for Autonomous Navigation

Author: Anthony Douglas Jones

Major Advisor: Stephen L. Wood, Ph.D., P.E.

This work develops a type of reactive agent that navigates according to a steering behavior that is governed by the agent's preferences. These preferences are optimized during a deliberative planning phase, and the agent's trajectory can be recorded as a series of waypoints or encoded as the parameters governing its preferences. The agent behaviors are tested in realistic ocean environments, and their performance is compared against simple tracking and homing behaviors as well as against an A*-based path planner.

The environmental model for testing uses historic oceanographic and bathymetric data in two regions of interest. The first region consists entirely of open water in the Gulf of Mexico, and the second region lies in the southeastern Caribbean, where it is obstructed by a chain of islands. The Gulf region has no obstructions but faster local currents, while the Caribbean region has slower currents but obstructing land masses. In general, the algorithm performs better than simple tracking or homing behaviors, but not as well as the A* planner. The algorithm shows its most promise in its simplistic encoding of mission behaviors that are much more complex than homing or tracking, but it falls short of replacing a synoptic mission plan.

To explore the feasibility of physically implementing one of these agents, a simple autonomous surface vehicle (ASV) was designed and constructed, relying primarily on commercial off-the-shelf electronics in a custom polycarbonate hull. During a field test, the ASV successfully carried out a series of simple 250-meter point-to-point navigation missions.


Contents

1 Introduction
1.1 Intellectual Merit
1.2 Broader Impacts

2 Project Background and Literature Review
2.1 Graphs
2.1.1 Dijkstra and A* Graph Search
2.2 Artificial Potential Fields
2.3 Swarm Intelligence
2.3.1 MAKLINK Graph – Habib 1991
2.3.2 Ant Colony Optimization – Bonabeau 1999
2.3.3 Intensified ACO – Fan 2003
2.3.4 Dijkstra and ACO – Tan 2006
2.3.5 Hybrid ACO with APF – Xing 2011
2.3.6 Particle Swarm Optimization – Kennedy 1995
2.3.7 Optimized Reactive Systems – Rhoads 2010
2.4 Iterative Optimization – Perdomo 2011
2.5 Steering Behaviors – Reynolds 1999
2.6 Tracking and Homing

3 Software Description
3.1 Concept of Operation
3.2 Detailed Description of Operation
3.3 Supporting Software Utilities
3.3.1 ncinfo
3.3.2 nc2npz
3.3.3 toponc2npz
3.3.4 GenWaypoints and GenWaypointsGulf
3.3.5 GrouperBehavior-P3d
3.3.6 GrouperView-P3d
3.3.7 popro
3.3.8 HullProp

4 Environmental Model and Regions of Interest
4.1 Bathymetric Data
4.2 Regions of Interest
4.2.1 Gulf of Mexico
4.2.2 Caribbean

5 Software Development
5.1 2-D A* Planner
5.2 4-D A* Planner
5.2.1 4-D Grid Independent Design
5.2.2 Testing of Prototype Assumptions
5.2.3 Speed of Planning
5.3 NetLogo Prototype
5.4 Python Prototype in Gulf ROI
5.4.1 Agents
5.4.2 Test Conditions
5.4.3 Results
5.5 Modifications for Use in Obstructed Environments
5.6 Pseudo Scientific Tests and Modifications
5.6.1 Selecting Values of M and T
5.6.2 Exploration vs. Exploitation
5.6.3 Simple Learning Does Not Help
5.7 Impact of DORA's Individual Parameters
5.7.1 The Influence of A
5.7.2 The Need for C

6 Platform
6.1 Hull Design
6.2 Hull Construction
6.3 Electronics
6.4 Control Software
6.5 Estimated Speed

7 Experiment
7.1 Performance Evaluation
7.2 Choice of Parameters
7.2.1 A*
7.2.2 DORA
7.2.3 All Agents
7.3 Program Environment Specifications
7.4 Gulf of Mexico ROI
7.4.1 Trial at 3.0 m/s
7.4.2 Trial at 1.5 m/s
7.4.3 Trial at 0.4 m/s
7.5 Caribbean ROI
7.5.1 Trial at 3.0 m/s
7.5.2 Trial at 1.5 m/s
7.5.3 Trial at 0.4 m/s
7.6 Field Trials
7.6.1 Impromptu Experiment
7.6.2 Design Improvements

8 Future Developments and Conclusions
8.1 Future Development
8.2 Conclusions

A Arduino Source Code

B NetLogo Source Code

C Python Source Code
C.1 environment.py
C.2 navigation.py
C.3 agents.py
C.4 asearch.py
C.5 nc2npz.py
C.6 toponc2npz
C.7 ncinfo.py
C.8 GenWaypoints.py
C.9 GenWaypointsGulf.py
C.10 GrouperBehavior-P3d.py
C.11 GrouperView-P3d
C.12 popro.py
C.13 HullProp.py
C.14 FieldTest.py
C.15 csv2kml.py
C.16 ANav.py
C.17 GrouperNav.py
C.18 GrouperNav0A.py
C.19 PNav.py
C.20 CNav.py


List of Figures

2.1 Weighted Graph
2.2 A* vs. Dijkstra
2.3 MAKLINK Graphs
2.4 Tracking and Homing

4.1 Gulf of Mexico Region of Interest
4.2 Caribbean Region of Interest
4.3 Gulf of Mexico Velocity Field
4.4 Gulf of Mexico Bathymetric Map
4.5 Caribbean Velocity Field
4.6 Caribbean Bathymetric Map

5.1 2-D A* Path
5.2 Astar Island Navigation
5.3 NetLogo Prototype
5.4 Tracking vs. DORA
5.5 PHD-Significance

6.1 Hull Plan
6.2 ASV
6.3 ASV Wiring
6.4 ASV Schematic
6.5 ASV Power vs. Speed
6.6 ASV Range vs. Speed

7.1 Example Path from Gulf
7.2 Field Trial 0
7.3 Field Trial 2

List of Tables

5.1 First Performance Trade Study
5.2 Results of Planar Earth Trial
5.3 Results of Round Earth with High Speed Trial
5.4 Success Rates for Round Earth with Low Speed Trial
5.5 Results of Solved Trials on Round Earth with Low Speed

6.1 ASV Hull
6.2 ASV Performance

7.1 Results of Gulf Trials at 3.0 m/s
7.2 Results of Gulf Trials at 1.5 m/s
7.3 Results of Gulf Trials at 0.4 m/s
7.4 Results of Caribbean Trials at 3.0 m/s
7.5 Results of Caribbean Trials at 1.5 m/s
7.6 Results of Caribbean Trials at 0.4 m/s

Acknowledgements

I would like to express my appreciation to everyone who has made this project possible, including the following people and organizations in particular:

• My Major Professor, Dr. Stephen Wood

• The members of my doctoral committee:

– Dr. Ronnal Reichard

– Dr. Marius Silaghi

– Dr. Hector Gutierrez

• My colleagues from the Underwater Technology Laboratory

• The Department of Defense SMART Scholarship Program and my mentors at SPAWAR Systems Center Pacific

– Marshal Dvorak

– Michael Tall

• My parents, Danny and Connie Jones

• My family and friends


Dedication

Dedicated to my wife, Jenna.


Chapter 1

Introduction

Most autonomous underwater vehicles (AUVs) are severely limited in intelligence. In general, they are only capable of carrying out a mission that is carefully scripted before deployment. They may be able to reactively avoid collisions or make small departures from the plan, but they inevitably either return to the plan as soon as possible or abort the mission entirely. Indeed, the AUV-related algorithms described in the literature review all address pre-mission planners rather than any onboard capability. As AUVs navigate from one point to the next, they usually do so by attempting to travel along the shortest distance to their objectives. A number of planning methods are currently used to develop these missions, including a variety of A*-based graph searching methods, some swarm-based techniques, and other optimization methods, which will be discussed at length in Chapter 2.

However, these planners aspire to find an efficient path consisting of a number of waypoints or bearings from a starting point to a goal. Conversely, this project attempts to allow greater flexibility in path planning by using a more complex steering behavior than simply moving toward the next objective. This additional flexibility potentially reduces the need for a detailed list of waypoints. In fact, when successful and properly tuned, the behavior that is developed here allows an agent to navigate around obstacles or opposing currents and to achieve its objective without any global information other than the location of its objective. It makes all course adjustments based only on local information. This steering behavior is purely reactive, but it is optimized by a deliberative mission planner, so the algorithm is called Deliberative Optimization of Reactive Agents (DORA); it is described in detail in Chapter 3.

Since the agent only has access to local data, it lacks the synoptic view of the mission that is employed by waypoint-to-waypoint planning methods. This has the potential benefit of reducing the memory required to store a mission, since some of the trajectory is inherently a function of the agent's environment. Notwithstanding these differences, an A*-based point-to-point planner is the standard of comparison used here, as it is a well-proven and well-researched method that has the benefits of mathematical elegance and successful use in AUV navigation. But due to the fundamentally different nature of these two systems, some difficulties of comparison arise. The A* algorithm searches in a different problem space than DORA, and this must be borne in mind during any comparative study of the two. Since both algorithms are being used to solve the same physical problem, their comparison is still practical, but comparisons will generally be made in terms of mission-related performance measures, such as computation time and the quality of the mission that is planned, rather than in regard to metrics which are defined by the algorithms' problem spaces. Both the A*-based software that is used as the standard of comparison and the current version of the DORA software system have undergone much development and iterative improvement, and this process is discussed in Chapter 5.

To evaluate the potential use of DORA aboard autonomous vehicles, a number of simulations were run that tasked both algorithms with planning paths through historic ocean current fields in the Gulf of Mexico and the Caribbean. The environmental data used to model these regions is described in Chapter 4. In these simulations, the software was tasked with finding paths for agents operating at different speeds, from much slower to slightly faster than the local ocean currents. It was also required to navigate through a region that was divided by an intervening island chain. In all cases, the success rates of the two methods were compared, as was the average quality of their solutions. In general, the A*-based planner was more successful and planned better paths in shorter time than did DORA. However, there were some exceptions in which DORA was the better option, and the details of these cases are discussed in Chapter 7.

Since this algorithm is intended to be a path planner for a mobile robot, and the agents are simplistic virtual approximations of autonomous vehicles, the final test of the feasibility of this system is the deployment of an autonomous surface vehicle (ASV) that navigates according to the steering behavior that is developed by this research. The ASV's navigation system instantiates a DORA agent that is optimized before deployment in the Eau Gallie River, where it successfully navigates to a waypoint 250 meters from its launch and then returns to its starting position. The design and construction of the ASV are described in Chapter 6.

Finally, the conclusions drawn from the experiments address the questions that began this research, but they also raise additional questions and open the way for future improvements and developments. Chapter 8 then summarizes the outcome of this project and describes some potential directions for its continued development.

The preceding several paragraphs provide an overview of this project; however, this is an experiment, and as such it must be formalized before moving forward. All of the following chapters serve to address this hypothesis:

An agent that is influenced by ocean currents and is subject to collisions with terrain need not carry a detailed, synoptic mission plan, but rather can act according to simple preferences based on its locally observable environment, provided that these preferences are deliberatively optimized before its deployment. Further, the trajectory of such an agent will serve as a path plan that is comparable in speed and quality to a plan generated by an A*-based mission planner.

1.1 Intellectual Merit

The merit of this project lies primarily in the development of a simplistic means of encoding a mission trajectory through a series of four floating-point numbers. This represents a lower memory requirement for a path plan than a long list of waypoints. This is most similar to the use of spline-based representations of mission paths; however, fitting a spline to a mission plan requires either neglecting environmental features or arbitrarily adjusting the number of parameters used to describe the path. The use of optimized reactive agents, whose reactions are based on environmental conditions, allows for the construction of trajectories that are tuned to the particular mission.

1.2 Broader Impacts

The agents developed to enable this algorithm are simple enough to implement with only high-school-level math. More importantly, the construction of the field-testing ASV primarily from commercial off-the-shelf (COTS) components, drawn largely from the hobby electronics community, will enable it to be easily and inexpensively reproduced and programmed. This will make the ASV a viable platform for additional navigation-related research, as well as a potential tool for educational outreach, similar to the terrestrial robotics kits that are commonly used for that purpose. Beyond its potential use for robotics education, the ASV's mathematically derived hull allows for very simple calculation of many parameters that are commonly used in naval architecture, which in turn allows for simple trade studies between design parameters and hull properties. To support these broader impacts, the ASV is fully described in Chapter 6, which includes the description of the hull and all onboard electronics, and Appendix A includes the source code for the ASV firmware.


Chapter 2

Project Background and Literature Review

There is a large volume of literature regarding general path planning, path planning for autonomous robots, and path planning for autonomous underwater vehicles in particular. This section summarizes the concepts and papers that are most relevant to the current research. Seto [34] provides an excellent reference on this topic and includes a good overview of several common, current methods for autonomous underwater vehicle navigation and path planning. In addition to algorithms specifically related to maritime navigation, Millington and Funge [25] describe a number of methods useful for real-time and high-speed planning, including a discussion of some of the algorithms presented below, and Russell and Norvig [32] is a popular reference regarding the subject of artificial intelligence in general.


2.1 Graphs

A graph is a data structure that is used to represent the relationships between its member elements, or nodes. In the case of path planning, a graph is often used as a convenient representation of an environment. Consider that any port on an ocean or navigable river can eventually be accessed from any other port, but in general major shipping lanes only connect the larger ports, whereas smaller coastal ports are serviced by lighter shipping, and river ports are serviced by barges and smaller riverine craft. Thus cargo traveling from any port to any other may require a number of transitions along typical shipping routes. Further, transitioning from one port to another will incur some cost associated with the distance traveled, the conditions en route, and a number of other factors. This type of problem lends itself well to representation by a weighted graph such as the one in Figure 2.1.

Figure 2.1: Visual representation of a simple graph where the numbered circles represent the nodes and the lines connecting them the edges. Costs of transitioning from one node to another are indicated by the numbers along the edges.

"Graph" describes the data structure as consisting of a number of nodes tied together via connections, and "weighted" refers to the fact that there is a cost associated with a transition from one node to another [25]. The path planning software described in Chapter 3 will use the positions of mission waypoints as nodes, and the time required to navigate along a straight line between them will be the cost. Fortunately, graph problems are useful for many applications beyond robotic mission planning, and there are a number of algorithms that have been developed for their solution. Two of these that are particularly applicable to the problem at hand will be described in the following section. Before moving on, it should be noted that the AUV path planning problem's graph is much more complicated than the one shown, because the cost of transitioning from one node to another is not likely to be the same as transitioning back from that node, and in fact, all costs in the graph continually vary with time due to the changing ocean currents. Thus a node is not simply fixed in space, but also in time. This problem is addressed through dynamic modification of the graph by the search software, but the details of that process are better reserved for Section 5.2.
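To make the representation concrete, the following is a minimal sketch, in Python (the language of the dissertation's own utilities in Appendix C), of a weighted, directed graph stored as a dictionary of dictionaries. The port names and costs are purely illustrative and are not taken from the dissertation; note that the return cost of an edge can differ from the outbound cost, as it would in a current field.

    # Illustrative weighted, directed graph: transit_time[u][v] is the cost
    # (say, hours) of moving from node u to node v. All values are made up.
    transit_time = {
        "port_a": {"port_b": 14.0, "port_c": 9.5},
        "port_b": {"port_a": 16.0, "port_c": 4.2},  # the return leg differs:
        "port_c": {"port_a": 11.0, "port_b": 5.0},  # currents are directional
    }

    cost_out = transit_time["port_a"]["port_b"]   # 14.0
    cost_back = transit_time["port_b"]["port_a"]  # 16.0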

2.1.1 Dijkstra and A* Graph Search

There are a number of methods that have been developed to solve the problem of efficiently navigating from one node in a graph to another, but only two are relevant to this project. These are A* and Dijkstra's algorithm.

Figure 2.2: Example graph navigation problem that could be solved using A* or Dijkstra's algorithm.

Hart et al. [19] introduce the A* algorithm for graph search problems, and the technique is described well in several later publications, including [32] and [25]. A* and Dijkstra's algorithm are both intended to find the minimum-cost path through a graph. A* works by choosing to explore nodes in an order based on a heuristic function, which estimates the total cost from the current node to the goal as the cost of reaching a successor node plus some estimate of the remaining cost from the successor node to the goal node. To ensure that an optimal path is planned, the heuristic estimate must be "admissible." An admissible heuristic is one which predicts a cost that is never greater than the actual path cost. For instance, in geographic problems, the straight-line distance from a successor node to the goal is often used, because whatever the best available path is, it can never be shorter than the straight-line distance. It is critical to note that a heuristic's admissibility requirement is only necessary to ensure an optimal solution. If a sub-optimal, "good" solution is acceptable, it may be advisable to relax the admissibility requirement in the interest of speed of execution. In robotic path planning, it is sometimes better to have a good plan quickly than an optimal plan that requires more time.

Dijkstra's algorithm preceded the A* algorithm in publication [13], but is presented afterwards because it may be considered a special case of the A* approach. Dijkstra suggests expanding successor nodes in order of minimum cost, so that no estimate of the total remaining distance to the goal is needed. In practice, this can be faster than other search methods such as breadth-first search, as it is guided by cost, but it is generally not as fast as A*, since the path with the total minimum cost is not always the same as the path with the minimum cost between any two successors [25].

For comparison of the two methods, consider Figure 2.2, which shows a simple graph problem. Here an agent needs to navigate from the start node to the goal node and must choose the shortest route. Clearly the shortest path lies along the edges at the top, and this is the route that A* would explore first. It would evaluate the cost of going from Node 0 to Node 1 and estimate the remaining distance to Node 3. It would also evaluate the cost of traveling from Node 0 to Node 2 and estimate the remaining distance from Node 2 to Node 3. The straight-line distance from Node 2 to Node 3 is not much less than the distance from Node 0 to Node 3, but Node 2 is very near Node 3, so A* chooses this route.

Dijkstra's algorithm, on the other hand, is based exclusively on the total cost from the start node to the next node (not the goal). So Dijkstra's algorithm would evaluate the cost from Node 0 to Node 1 as well as the cost from Node 0 to Node 2. Seeing that the cost of reaching Node 2 was lower, it would expand Node 2 and begin the comparison again from that point. Dijkstra's algorithm would finally select the same route as A*, because eventually the cost incurred along the lower route would become greater than the cost of moving to Node 1, but the critical difference is that Dijkstra's algorithm would require all of the additional processing time to explore a portion of the longer route before finding the shorter route.
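The following sketch makes the comparison concrete. It is a minimal Python implementation of A* over the dictionary-of-dictionaries graph shown earlier, not the dissertation's own planner; passing a heuristic that always returns zero makes it behave exactly like Dijkstra's algorithm.

    import heapq
    from itertools import count

    def astar(graph, start, goal, h):
        """Minimum-cost path by A*; with h = lambda n: 0.0 this is Dijkstra."""
        tie = count()  # breaks ties among entries with equal estimated cost
        frontier = [(h(start), 0.0, next(tie), start, None)]
        came_from, best_g = {}, {start: 0.0}
        while frontier:
            f, g, _, node, parent = heapq.heappop(frontier)
            if node in came_from:
                continue              # already expanded via a cheaper route
            came_from[node] = parent
            if node == goal:          # walk the parent links back to start
                path = [node]
                while came_from[path[-1]] is not None:
                    path.append(came_from[path[-1]])
                return list(reversed(path))
            for succ, w in graph[node].items():
                if succ not in best_g or g + w < best_g[succ]:
                    best_g[succ] = g + w
                    heapq.heappush(frontier, (g + w + h(succ), g + w,
                                              next(tie), succ, node))
        return None                   # the goal is unreachable

    # Example, using the illustrative port graph from Section 2.1:
    # astar(transit_time, "port_a", "port_c", lambda n: 0.0)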

The A* algorithm has been applied to AUV navigation by several research groups [10, 17, 16, 35]. Carroll et al. [10] used A* with a gridded discretization of the ocean to plan AUV paths that took advantage of ocean currents, avoided obstacles, and allowed for temporally varying exclusion zones. Garau et al. [17] revisited A*-based AUV path planning and used a different assumption of vehicle behavior that employed constant thrust power throughout the mission, rather than constant vehicle velocity. Fernandez-Perdomo et al. [16] developed another A* algorithm specifically tailored to AUVs, called Constant Time Surfacing A* (CTS-A*). This algorithm differed from previous similar systems by discretizing the environment in a way that assured constant time between vehicle surfacing events, rather than discretizing with respect to environmental geometry. Smith and Dunbabin [35] implemented an A* path planner not for AUVs, but for drift buoys, which could only control their vertical positions in the water column and thus needed to take advantage of natural currents for all navigation. All of these examples demonstrate the viability of A* as a candidate mission planner for AUVs.

2.2 Artificial Potential Fields

Khatib [23] describes the use of artificial potential fields (APF) in robotic manipulator motion and discusses its extension to the field of mobile robotic path planning. One of the most important contributions of this paper is the simplification of the path planning problem, such that a simple agent can navigate along collision-free trajectories without deliberative planning. Robots and robotic manipulators can then choose collision-free paths in real time instead of relying on a separate deliberative process to pre-plan such paths offline. The concept of simplifying the problem of navigation to one that is soluble in real time is important to the algorithm that is proposed in Section 3.1.

To enable this simplified behavior, the method of APF models a robot's environment as a potential field in which obstacles carry a repulsive charge and the objective carries an attractive charge. In this model, any point in the environment has an associated potential that acts to drive the agent away from obstacles and toward goals; thus the robot need not stop to deliberate about its next course of action. It can simply move as if motivated by the artificial potential, and, assuming there are no local minima, it will eventually arrive at its objective without collision.
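A minimal sketch of this idea, following Khatib's attractive/repulsive formulation, might look like the following. The gains, the obstacle influence radius, and the time step are all assumed values for illustration; the robot simply descends the artificial potential one small step at a time.

    import math

    def apf_step(pos, goal, obstacles, k_att=1.0, k_rep=100.0,
                 influence=5.0, dt=0.1):
        """One step of artificial potential field navigation in the plane."""
        # the attractive force pulls the agent straight toward the goal
        fx = k_att * (goal[0] - pos[0])
        fy = k_att * (goal[1] - pos[1])
        for ox, oy in obstacles:
            dx, dy = pos[0] - ox, pos[1] - oy
            d = math.hypot(dx, dy)
            if 0.0 < d < influence:
                # repulsion grows sharply as the agent nears the obstacle
                mag = k_rep * (1.0 / d - 1.0 / influence) / d ** 2
                fx += mag * dx / d
                fy += mag * dy / d
        return (pos[0] + fx * dt, pos[1] + fy * dt)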

However, for practical robot navigation in complex environments, the assumption of no local minima is often invalid. Saska et al. [33] give a simple example of an obstacle that lies directly between a mobile robot and its objective. In this case, the obstacle's repulsive force is exactly balanced by the objective's attractive force, and a robot that is guided only by APF navigation will simply stop at the balance point. The algorithm presented below bears some similarities to APF, and in fact could be recast as a special case of APF navigation, so the vulnerability to local minima or maxima is an important concern for the agent behavior described in Section 3.1.


2.3 Swarm Intelligence

Another variety of optimization and path planning methods relies on the use of swarm intelligence. Before exploring path planning literature that implements swarm intelligence, some discussion of the core concept is in order.

Several papers suggest the use of Ant Colony Optimization (ACO) techniques [37, 14, 40], while others use Particle Swarm Optimization (PSO) to solve robotic path planning problems [33, 24, 28, 11]. None of these applications is applied to the marine environment, and most are not tested at all aboard physical robots, but are validated through simulation. A notably different approach is used by Witt and Dunbabin [39], who employed a PSO algorithm in combination with other optimization programs to develop a mission planner that was successfully tested aboard an AUV in an estuary.

2.3.1 MAKLINK Graph – Habib 1991

While not a path planning algorithm in itself, the development of the MAKLINK graph is important because it demonstrates one method of discretizing a continuous environment and modeling it as a searchable graph, specifically intended for robot navigation [18]. As such, it is employed as an initial step in some of the following methods.

The MAKLINK graph begins by mapping all obstacles in an environment. Each obstacle is then expanded by a margin of safety that is specified for the type of robot that will be used. A line then connects each vertex of an obstacle to all other vertices that are visible from it, such that the newly drawn lines pass only from one vertex, through empty space, to another vertex. Next, a line is drawn from the midpoint of each of these new lines to the midpoints of all other new lines. Finally, all of the initial construction lines are removed, as are any lines that collide with objects, and the result is a graph that follows through the center of the environment's free space, thus avoiding collisions. An example from [18] is shown in Figure 2.3.

Figure 2.3: Example MAKLINK graph discretizing navigable space for robot navigation. Images directly reproduced from [18].
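As a rough illustration of the vertex-linking step only, the sketch below uses the shapely geometry library to keep the links between obstacle vertices that pass through free space and returns their midpoints. It is an assumption-laden simplification: expanding the obstacles by the safety margin and connecting the midpoints into the final graph are omitted for brevity.

    from itertools import combinations
    from shapely.geometry import LineString, Polygon

    def free_space_midpoints(obstacles):
        """Midpoints of collision-free links between obstacle vertices.

        obstacles: list of shapely Polygons, already grown by a safety margin.
        """
        verts = [v for poly in obstacles for v in poly.exterior.coords[:-1]]
        midpoints = []
        for p, q in combinations(verts, 2):
            link = LineString([p, q])
            # keep only construction lines that cross no obstacle interior
            if not any(link.crosses(poly) or link.within(poly)
                       for poly in obstacles):
                midpoints.append(link.interpolate(0.5, normalized=True))
        return midpoints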

2.3.2 Ant Colony Optimization – Bonabeau 1999

Bonabeau et al. [8] describe the principles of ACO. A large number of individual agents ("ants") are initialized in a graph, and each independently wanders between nodes. The agents in ACO choose successor nodes based on a probabilistic choice between nodes that are close (as in Dijkstra's algorithm) and nodes that have been commonly visited by other agents. The assumption is that better routes will be chosen more often than worse routes, and eventually the population will tend to converge to good solutions of the problem. It is important that the preference for close nodes or heavily trafficked nodes is probabilistic, because this ensures that some agents will wander off of the established routes, which limits the tendency to settle into local minima. ACO has proven to produce good solutions to computationally difficult problems, but there is no guarantee that it will find the optimal solution to any given problem.
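The probabilistic successor choice at the heart of ACO is commonly written as a weighting between pheromone strength and closeness. A minimal sketch follows, with the conventional exponents alpha and beta assumed, and with pheromone stored per edge in a dictionary; none of these names come from the dissertation itself.

    import random

    def choose_successor(node, graph, pheromone, alpha=1.0, beta=2.0):
        """Pick the next node with probability ~ pheromone^alpha * closeness^beta."""
        succs = list(graph[node])
        weights = [pheromone[(node, s)] ** alpha
                   * (1.0 / graph[node][s]) ** beta   # closeness = 1 / cost
                   for s in succs]
        return random.choices(succs, weights=weights)[0]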

2.3.3 Intensified ACO – Fan 2003

Fan et al. [14] use a black-and-white image as a map of an environment. In this model, black pixels represent obstructions, and white pixels represent free space. An agent is considered to occupy one pixel at a time and can choose to move from its current position to any of the four pixels immediately adjacent to it. The agents search for an optimal path by wandering through their environment, guided by an attraction to simulated pheromone, until reaching their objective.

To encourage faster solution when no pheromone is present, an ant will choose its next position based on a heuristic estimate of the remaining distance to the goal. When suitable paths are found, they are post-processed to smooth them, and the algorithm iterates until an acceptable solution is discovered. Fan et al. [14] demonstrate that this process converges to short, collision-free paths through obstructed environments after some time.

This algorithm implements principles of ACO for path planning, but the act of reducing the problem space to a 4-connected graph and then weighting nodes by heuristic estimates of remaining distance reduces the benefit of a stochastic system like ACO. Since a heuristic estimate of distance is known for each node, and the agent is only allowed to move along the four-connected grid, the problem is effectively collapsed to a graph search, and A* would find the shortest unobstructed path in a single iteration.

2.3.4 Dijkstra and ACO – Tan 2006

Tan et al. [37] use a three-step approach to finding an optimal path through an environment. First, a MAKLINK graph of the environment is generated, so that only paths through free space exist in the graph. Next, Dijkstra's algorithm is used to find the shortest path through the MAKLINK graph. Since the MAKLINK graph keeps the path roughly centered between obstacles, it does not permit an agent to take a shorter route that would pass closer to an obstacle. So, as a final step, the graph is expanded to include the unobstructed space on either side of the chosen path, and the new graph of the free space is searched using ACO.

The initial discretization according to the MAKLINK algorithm ensures the safety of the robot, and Dijkstra's algorithm, while less efficient than A*, does find the shortest route through that graph. However, as in [14], the final formulation of the environment is still a graph with known costs, and the reason for abandoning proven graph searching algorithms is unclear.


2.3.5 Hybrid ACO with APF – Xing 2011

Xing et al. [40] use a hybrid of artificial potential fields and ant colony optimization. In this algorithm, the environment is initialized with a certain amount of pheromone, where the pheromone is distributed according to distance from the goal and from obstacles. This method violates a core concept of swarm algorithms, in that it is not biologically feasible, but it also reduces the initial amount of random wandering required by a pure ACO. The environment is then searched for a viable route using normal ACO, where at the end of each iteration, the pheromone strengths of all successful routes are reinforced in inverse proportion to their lengths. This process is repeated until a solution is found.

Then, when the robot is deployed, it navigates according to an APF algorithm. However, instead of only having an attraction to the objective and a repulsion from the obstacles, it also has an attraction to the ACO-defined path. This modification helps to overcome some of the local extrema problems associated with APF navigation. To further help the robot avoid oscillating between local extrema, the navigation system maintains a taboo list of previously visited locations [40].

Hybrid ACO with APF differs from the preceding two methods in that it does not use a graph-based model of the environment. Potential fields and pheromone strengths can be modeled as continuous functions of space, so that discretization is not required.


2.3.6 Particle Swarm Optimization – Kennedy 1995

Kennedy and Eberhart [22] introduce the PSO algorithm as a general function optimization method. In this case, a large number of agents ("particles") are randomly distributed in the problem space. Each of these agents is a potential solution to the problem, and it is capable of moving in a space with the same number of dimensions as the solution's number of parameters. The fitness of the agents is based on the quality of the solutions that they represent, and all agents tend to move toward their more fit neighbors. The motion of the particles in their space is roughly Newtonian, such that each particle has a velocity and acceleration in each dimension as well as some inertia. At each iteration, each agent experiences an acceleration in the direction of the particle with the highest fitness, as well as an acceleration in the direction of the highest-fitness solution ever discovered by the group of agents. As time progresses, the particles tend to settle into good solutions, and careful tuning of the parameters of the algorithm helps to avoid local minima. Like ACO, PSO does not guarantee an optimal solution to a problem.
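A minimal sketch of one particle's update is shown below, using the common formulation with an inertia weight w and acceleration coefficients c1 and c2 (all values assumed for illustration). The personal-best and swarm-best terms play the role of the accelerations toward fit solutions described above.

    import random

    def pso_update(x, v, p_best, g_best, w=0.7, c1=1.5, c2=1.5):
        """One velocity/position update for a single particle (per dimension)."""
        new_v, new_x = [], []
        for xi, vi, pi, gi in zip(x, v, p_best, g_best):
            vi = (w * vi                                # inertia
                  + c1 * random.random() * (pi - xi)    # pull toward own best
                  + c2 * random.random() * (gi - xi))   # pull toward swarm best
            new_v.append(vi)
            new_x.append(xi + vi)
        return new_x, new_v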

The papers reviewed here that use PSO for robotic path planning use some variation of spline optimization. They construct a series of spline curves from the robot's position to its goal, and then use PSO to optimize the parameters of the splines so that they minimize the overall cost. The cost generally includes distance, time, or energy, as well as obstacle collision [33, 24, 28, 11]. One problem common to all of these methods is that a spline can be of arbitrary order, and the choice of order restricts the number of obstacles that can be avoided. Some compromise is necessary, which either results in sub-optimal paths or unnecessarily long computation.


2.3.7 Optimized Reactive Systems – Rhoads 2010

Purely reactive navigation systems have the advantage of speed because they have very specific responses mapped to stimuli, so that no "thinking" is required. However, since reactive systems do not deliberate, they have no means of compensating for unexpected circumstances, so it is important to ensure that a reactive system is well optimized for its intended environment. One example of an optimized reactive system can be found in control systems whose parameters are adjusted for their expected environments. These are described in terms of mechatronic applications by Oberschelp and Hestermeyer [27]. Similarly, Rhoads et al. [31] mathematically derive a feedback control law that would allow an AUV to navigate through a known ocean current field in minimum time.

While tuned control systems do use deliberative optimization to develop well-suited reactive systems, they employ a different implementation than the one proposed below. Control systems are largely built on detailed mathematical models, which allow optimization by solution of the model's equations, whereas the agents used by the DORA algorithm are optimized through stochastic exploration of the problem space.

2.4 Iterative Optimization – Perdomo 2011

Fernandez-Perdomo et al. [15] suggest an algorithm in which an agent is introduced to a modeled ocean environment and navigates to its goal by choosing a bearing. At each step of the algorithm, the agent is allowed to move through the forecast environment according to a simple kinematic particle model. The agent travels in the direction of its bearing at a set speed until the next iteration of the algorithm, when its position and trajectory are evaluated. This process is repeated until the agent arrives at its goal, and the sequence of bearings is returned as the result. Iterative optimization is applied to improve the quality of the solution. Fernandez-Perdomo et al. [15] find that this method works well and provides realistic, smooth trajectories for slow-moving AUVs navigating in ocean currents.
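A minimal sketch of the kinematic particle model that such a planner steps through is given below: the agent's own velocity along its chosen bearing and the forecast current are simply superimposed over each time step. The current_at function is an assumed stand-in for the forecast model, and the bearing is measured from north, matching the sin/cos convention used in Chapter 3.

    import math

    def kinematic_step(pos, bearing, speed, current_at, t, dt):
        """Advance the agent one time step through the forecast current."""
        cx, cy = current_at(pos, t)   # forecast current (east, north), m/s
        east = pos[0] + (speed * math.sin(bearing) + cx) * dt
        north = pos[1] + (speed * math.cos(bearing) + cy) * dt
        return (east, north)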

2.5 Steering Behaviors – Reynolds 1999

Finally, one of the works most fundamental to this project is the description of agent steering behaviors by Reynolds [30]. Reynolds describes a number of very simple, reactive behaviors that allow autonomous agents to move through an environment in various useful ways without deliberation and without careful scripting or planning by a human. These behaviors are an expansion of his earlier work developing a system to model visually realistic bird flocking by using a combination of simple steering behaviors [30]. The emphasis throughout Reynolds' work is on the development of mathematically simplistic behaviors that can be used in combination to produce some desired effect without user supervision. In his work these are often applied to cinematic rendering and video game character behaviors [30].

Among the behaviors Reynolds describes are Seeking and Fleeing, where an agent attempts to either close on a position or evacuate from it; Pursuit and Evasion, where an agent attempts to follow or escape from another agent; Arrival, where an agent decelerates on approach to an objective to avoid overshooting it; and Path Following, Obstacle Avoidance, and even Flow Field Following, where agents attempt to follow paths, avoid collisions, and align themselves with vector fields.
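As a concrete instance, a minimal sketch of the simplest of these, Seek, follows Reynolds' formulation: the steering force is the difference between the desired velocity (full speed toward the target) and the agent's current velocity. The names here are illustrative.

    import math

    def seek(pos, vel, target, max_speed):
        """Reynolds-style seek: steer toward the target at full speed."""
        dx, dy = target[0] - pos[0], target[1] - pos[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            return (0.0, 0.0)   # already at the target; no steering needed
        desired = (dx / dist * max_speed, dy / dist * max_speed)
        return (desired[0] - vel[0], desired[1] - vel[1])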

Reynolds describes a number of ways these behaviors can be combined to create desired effects, and provides demonstration applets via a website at http://www.red3d.com. These behaviors form the fundamental basis for some of the key mechanics used in the steering behavior described in Chapter 3. An important note is that the basis of the DORA algorithm is more closely related to these simplistic steering behaviors than to theoretical control systems such as those described by [31]. While this method lacks the mathematical rigor of the control systems approaches, it has the benefit of very simple implementation, which makes it more versatile than some other alternatives. Since the only math required is algebra and trigonometry, the parameters of the algorithm can be easily modified for different applications.

2.6 Tracking and Homing

Two trivial solutions to the problem of navigating in an ambient current field will serve as standards of comparison in the experiments of Chapter 7. Homing is a technique that ignores the local current and always orients the agent to face directly toward its objective. This has the advantage of simplicity, but making no attempt to compensate for currents increases the length of the trajectory traveled by a homing agent when compared to tracking. Tracking attempts to minimize the distance traveled by compensating for cross-track components of the ambient current. Orienting the agent so that it faces upstream from its objective allows it to avoid being pushed off of a straight-line path by the current. These two approaches are shown in Figure 2.4.

Figure 2.4: Tracking and homing behaviors illustrated. Tracking compensates for local currents, while homing always orients directly toward its objective.
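A minimal sketch of both behaviors is given below. Homing simply points at the goal; tracking crabs into the current by exactly the angle needed to cancel the cross-track current component, and falls back to homing when the current is too strong to cancel. All names and conventions (planar x/y coordinates, angles from the x-axis) are illustrative assumptions.

    import math

    def homing_heading(pos, goal):
        """Face directly toward the objective, ignoring the current."""
        return math.atan2(goal[1] - pos[1], goal[0] - pos[0])

    def tracking_heading(pos, goal, current, speed):
        """Crab upstream so the cross-track current component cancels."""
        track = homing_heading(pos, goal)
        # current component perpendicular to the desired track (left positive)
        c_perp = -current[0] * math.sin(track) + current[1] * math.cos(track)
        if abs(c_perp) > speed:
            return track   # current too strong to hold the track; just home
        return track + math.asin(-c_perp / speed)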

Since the ideas and algorithms of graph search, the bio-inspired aspects of swarm intelligence, and the reactive steering behaviors of Reynolds' work form the basis for the development of the algorithm described in the next chapter, these preceding discussions provide useful context. However, there is, as always, some room for further exploration. Specifically, the swarming algorithms are constrained to biological feasibility, sometimes at the expense of optimal solutions. The graph search algorithms require discretizing the environment into representations that are amenable to graph-based methods, which may or may not remain faithful to the original problem. The reactive systems are limited to reactions to the stimuli for which they are designed. And all robotic navigation algorithms, including the one presented here, are constrained to solving an abstract mathematical model of the navigation problem, rather than the problem itself. It is the hope of the following chapter that the proposed algorithm, while hardly optimal, provides a new and useful balance of some of the concerns associated with the works described above.


Chapter 3

Software Description

3.1 Concept of Operation

The algorithm evaluated here takes inspiration from the aggregate spawning behavior of some species of fish. To reproduce, such fish must navigate over large distances without overall maps of their routes and arrive at a specific time of year. Fish that fail to complete the journey, or fail to arrive within a temporal window, also fail to reproduce, so they do not contribute to the next generation's population. So the constraints on these creatures suggest a framework for development of an artificial agent that operates in a similar environment and with similar goals. Specifically, the agent is subject to navigation in temporally and spatially variable ocean currents. It must navigate from a starting point to an objective within a limited amount of time while using a limited amount of energy, and it is incapable of making a global plan of its route.

While biology provides inspiration and an interesting context for the problem, this system is not intended to model any actual organism, and biological feasibility is only taken as a guideline rather than a constraint. This work then could be categorized as bio-inspired or swarm-based, but it is not truly biomimetic, nor is it a swarm intelligence method, as biomimicry is a prerequisite for swarms.

Toward this end, each agent will attempt to satisfy the following four preferences:

1. Avoid collisions with land.

2. Travel with strong currents.

3. Maintain heading.

4. Travel toward the objective.

Intuitively, each of these preferences will be more or less important depending on the agent's circumstances, so each is weighted based on some general conditions. Then, during optimization, the relative weighting of each preference is adjusted to tailor the agent to its particular circumstances. Often these preferences will conflict with one another, and while there are various schemes that can be used to resolve such conflicts, the one chosen here is a simple weighted average of vectors, which expresses the agent's preferences. This summation is shown in vector form in Equations 3.1-3.6.


$$\mathbf{p} = \mathbf{a} + \mathbf{b} + \mathbf{c} + \mathbf{d} \tag{3.1}$$

$$\mathbf{a} = A \left( \frac{1}{1 + e^{M t_c - M T}} \right) \left( 1 - \frac{1}{1 + e^{t_c}} \right) \nabla F \tag{3.2}$$

$$\mathbf{b} = B \left( \frac{V_\infty}{V_a + V_\infty} \right) \langle \sin\theta_\infty, \cos\theta_\infty \rangle \tag{3.3}$$

$$\mathbf{c} = C \, \langle \sin\theta_a, \cos\theta_a \rangle \tag{3.4}$$

$$\mathbf{d} = D \, \langle \sin\theta_T, \cos\theta_T \rangle \tag{3.5}$$

$$t_c = \frac{h}{\nabla F \cdot \langle \sin\theta_a, \cos\theta_a \rangle} \tag{3.6}$$

where:

p: Agent's Preferred Direction Vector
a: Collision Avoidance Preference Vector
b: Local Current Preference Vector
c: Stability Preference Vector
d: Homing Preference Vector
M: Slope Parameter for Collision Avoidance
T: Temporal Horizon for Collision Avoidance
θ∞: Heading of Local Current
V∞: Local Current Velocity
Va: Agent's Velocity
tc: Estimated Time to Collision
∇F: Gradient of Ocean Floor
θa: Agent's Current Heading
θT: Heading to Agent's Target
A, B, C, D: Behavioral Tuning Coefficients

The preference weighting parameters $A$, $B$, $C$, and $D$ are each constrained to values between $-1$ and $1$ inclusive, and they are normalized so that $|A| + |B| + |C| + |D| = 1$. This way the total preference is expressed as a unit vector oriented in the direction that the agent would like to steer. These are the parameters that the algorithm optimizes during execution. It should be noted that a weight of zero eliminates a preference from consideration, and this may be used to reduce the computational complexity. For instance, in deep, open water, constraining $A$ to zero and searching in the space defined by the other three preferences is perfectly viable; this configuration is used as an early proof-of-concept test for the algorithm.
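As an illustration, the weighted sum above can be written in a few lines of Python. The following is a minimal sketch, not the dissertation's actual source: the function and variable names are hypothetical, and planar two-component vectors stand in for the full navigation frame.

```python
import numpy as np

def normalize_weights(A, B, C, D):
    """Scale the tuning coefficients so that |A| + |B| + |C| + |D| = 1."""
    total = abs(A) + abs(B) + abs(C) + abs(D)
    return A / total, B / total, C / total, D / total

def preferred_direction(A, B, C, D, grad_F, h, M, T,
                        v_inf, theta_inf, v_a, theta_a, theta_T):
    """Weighted preference sum of Equations 3.1-3.6 (planar sketch)."""
    A, B, C, D = normalize_weights(A, B, C, D)
    heading = np.array([np.sin(theta_a), np.cos(theta_a)])

    # Eq. 3.6: estimated time to collision (assumes a nonzero denominator).
    t_c = h / np.dot(grad_F, heading)

    # Eq. 3.2: collision avoidance, gated by two sigmoids in t_c.
    near_gate = 1.0 / (1.0 + np.exp(M * t_c - M * T))
    sign_gate = 1.0 - 1.0 / (1.0 + np.exp(t_c))  # suppresses negative t_c
    a = A * near_gate * sign_gate * grad_F

    # Eq. 3.3: local current preference, scaled by relative current strength.
    b = B * (v_inf / (v_a + v_inf)) * np.array([np.sin(theta_inf),
                                                np.cos(theta_inf)])

    # Eq. 3.4: stability preference along the current heading.
    c = C * heading

    # Eq. 3.5: homing preference toward the target.
    d = D * np.array([np.sin(theta_T), np.cos(theta_T)])

    return a + b + c + d  # Eq. 3.1
```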

Examining the preference vectors in turn, $\mathbf{a}$ represents the agent's first preference: avoiding collisions with land. However, collisions with land are of little concern in open water, or in extremely deep water, so the term is weighted by two sigmoid functions that reduce this preference's consideration when the agent considers collision to be unlikely. In particular, $t_c$ estimates a time to collision based on the ocean floor's gradient and depth and the agent's speed, as shown in Equation 3.6.

The term $\left( \frac{1}{1 + e^{M t_c - M T}} \right)$ is a sigmoid function that varies the weight of the preference vector from zero to one. As the estimated time to collision becomes small, the weight of the preference increases. The rate at which the weight changes with respect to $t_c$ is determined by the slope term $M$. Large values of $M$ correspond to very rapid changes in weight, whereas small values of $M$ correspond to gradual changes. For all trials presented here, $M = 0.001$. The temporal horizon term $T$ is a measure, in seconds, of the value of $t_c$ for which the function reaches 0.5. In general, a value of 2,700 seconds is used for these experiments. This way, when the gradient of the ocean floor is small or distant, the agent has very little preference to steer away from it, but as a threat becomes more imminent, or the margin of safety is reduced, steering to avoid collisions increases in importance.

However, because of the definition of $t_c$ in Equation 3.6, if the ocean floor is dropping away from the agent, $t_c$ becomes negative, which would give the collision avoidance term a large weight when collision is least likely. Since this is counter to the behavior that is needed, the second term, $\left( 1 - \frac{1}{1 + e^{t_c}} \right)$, is defined. This drives the weight to zero for negative values of $t_c$. For a more rapid change in weight, the exponent $t_c$ can be multiplied by an arbitrarily large number, but for the tests here, its coefficient is left at 1.

The second preference, $\mathbf{b}$, is weighted by the relative strength of the current compared to the agent's speed through the term $\frac{V_\infty}{V_a + V_\infty}$. This way, as the local current speed increases in comparison to the agent's speed, the preference to either fight or avoid the local current becomes more important. In practice, this is a good example of a preference whose weight is often optimized to a negative value. With a negative value of $B$, the agent will tend to turn into the current by an amount related to the current's strength, which applies some compensation for being pushed off course by the current.

The stability preference, $\mathbf{c}$, acts as a means of reducing (or promoting) the agent's tendency to change course. Since, for positive values of $C$, the vector acts in the direction that the agent is already facing, this can introduce an artificial sense of inertia and prevent the agent from changing course very rapidly in a rapidly changing current field.

The homing preference, $\mathbf{d}$, is the most important of the four preferences because it is the only one that represents a valid general solution to the problem of navigation in the absence of other influences. Since this vector always points toward the objective, the agent could travel all the way to its destination while ignoring all other preferences, assuming the agent could maintain sufficient speed and had no intervening obstacles. While the other three preferences are independent of one another, they all depend on the homing preference to drive the agent in the direction of the goal.

Weighted averaging has the advantage of simplicity, but it also has limitations for navigation. The most significant limitation is that opposing preferences may be averaged to a solution that does not sufficiently address either. If all other terms are ignored and the weights for navigating with currents and toward the agent's objective are equal, then it is easy to anticipate a situation where the local current heads directly away from the objective, and the agent resolves this conflict of preferences by setting a course perpendicular to both the current and the objective. Far from taking advantage of the benefits of either behavior, the agent simply wanders off. Millington and Funge [25] discuss this phenomenon as well as some solutions.

Carefully choosing preference weights and tailoring them to the environments that the agent is likely to encounter should significantly reduce these occurrences, but will not completely eliminate them as possibilities. Since this algorithm employs a large number of agents, some anomalous behavior is tolerable, but this limitation cannot be neglected.

One solution to the averaging problem is to execute a behavior based on only one of the preferences at a time. This way the agent could, for example, take immediate action to avoid a collision, and then, when sufficiently far from a threat, could turn its attention to efficient navigation. Different schemes can be used for prioritizing responses, and the averaging and prioritizing methods can be combined to produce different behaviors.

An agent navigating with a prioritized response runs the risk of settling into oscillation about a local minimum on the boundary of priority between conflicting behaviors. As an example, if other preferences are ignored and an obstacle lies directly between an agent and its objective, the agent could be prompted to head directly toward the objective until it approached within a critical range of the obstacle. Then the priority would change to obstacle avoidance, and the agent would head directly away from the objective until it exceeded the critical range. Priority would shift back to objective homing, and this cycle would repeat indefinitely.

Trapping in local minima will probably be infrequent because of the temporal variability of the current field and the natural complexity of the environment. Indeed, the variability of the currents alone ensures that the agent's problem space is never constant, and simply repeating the same action in the same location will lead to varying results as time passes. While temporal variations combined with the use of stochastic behaviors will reduce the probability of an agent's becoming trapped, these same properties prevent any mathematical claim to optimality, or even a guaranteed solution of the problem.

For this reason, the algorithm can only be demonstrated to be practical in many cases; it cannot be shown to be universally applicable. Conversely, an algorithm such as A* is guaranteed to find the minimum cost path in the minimum time; however, it is practically limited by its model of the environment. A* searches a graph very effectively, but the physical world is not a graph, and any attempt to model it as one necessarily sacrifices some detail. Increasing the quality of the model, by increasing the spatial or temporal resolution of the graph, increases the size of the graph and the time required to search it. Some experimental results of these compromises are discussed in Section 5.2.3.

The experiments in Section 7 will compare the proposed swarm-based stochastic algorithm to A*, not in terms of absolute accuracy or mathematical optimality, but in terms of practical implementation for real-world robotic path planning. In this context, the speed of solving the problem and the ability to replan and adapt to changing environments can quickly become more important than the optimality of the solution. Likewise, a small memory footprint can be critical for execution aboard a deployed vehicle with very limited computational resources.

3.2 Detailed Description of Operation

This section details the internal structure of the present version of the software used to test the DORA algorithm. Section 5 describes the progressive development of the path planning software, including iterative design modifications and preliminary experiments, while Section 7 discusses trials and results.

The software is written in the Python 2.7 programming language, and it takes advantage of the third-party libraries numpy, scipy [21], and matplotlib [20], as well as the included Python standard libraries time, sys, and random.

The software begins by loading four-dimensional velocity and three-dimensional bathymetric data for the operational area for a set of trials. These data are stored in Velocity Field and Height Field data structures, respectively. To be successfully loaded, the data must be stored in numpy's native compressed *.npz format. Preprocessing to convert data from the NetCDF *.nc files that are commonly used by the scientific community into a compatible format is performed using a separate script, nc2npz.py, whose source code appears in the Appendix.
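For illustration, loading such an archive is brief with numpy. The array names below are hypothetical, since the actual layout is defined by nc2npz.py:

```python
import numpy as np

# Hypothetical array names; the real layout is defined by nc2npz.py.
with np.load("currents.npz") as data:
    lats, lons = data["lats"], data["lons"]
    times, depths = data["times"], data["depths"]
    u, v = data["u"], data["v"]  # 4-D East and North velocity components
```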

In addition to the data, a list of waypoints is loaded from a comma-separated values (CSV) file. Each line in the file consists of two latitude and longitude pairs representing the starting and ending points of a trial. For the trials used in this research, the CSV files are generated with a separate Python script, GenWaypoints.py.

Once the data are loaded, a number of global variables are used to configure the trial, and the names and values of the configuration variables are exported to an XML file.

Following the initial configuration, the software executes the trials enumerated in the waypoints CSV file.

Each trial consists of two phases: Exploration and Exploitation. The exploration phase uses a large number of randomly initialized agents to search for a valid solution to the navigation problem. As soon as a viable solution is found, exploration ends and the exploitation phase begins. Exploitation uses the solution found during exploration to begin a simulated annealing process. Specifically, the preference parameters of the best solution found during exploration are used as the mean for normally distributing the preferences of successive generations of agents. In each epoch of exploitation, the standard deviation of the normal distribution is reduced in order to refine the solution.
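A minimal sketch of this sampling scheme follows; the starting spread, the halving schedule, and the names are illustrative assumptions rather than the dissertation's actual settings:

```python
import numpy as np

def sample_generation(best_params, sigma, n_agents=50):
    """Draw one generation of (A, B, C, D) preference weights normally
    distributed about the best solution found so far."""
    raw = np.random.normal(loc=best_params, scale=sigma, size=(n_agents, 4))
    raw = np.clip(raw, -1.0, 1.0)
    # Normalize so |A| + |B| + |C| + |D| = 1 for each agent.
    return raw / np.abs(raw).sum(axis=1, keepdims=True)

sigma = 0.5                                  # assumed starting spread
best = np.array([0.25, -0.25, 0.25, 0.25])   # from the exploration phase
for epoch in range(7):
    agents = sample_generation(best, sigma)
    # ... model each agent; update `best` if a faster arrival is found ...
    sigma *= 0.5                             # anneal: narrow the search
```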

The internal mechanisms of the exploration and exploitation phases are similar. Both initialize a number of agents at the trial's origin, and both model the motion of the agents according to their preferences and the local current from the velocity field data. At each time step, each agent chooses a heading based on its preferences and the local conditions of the environment. The agent's speed is resolved into East and North components and added to the local components of current. The agent's position is updated based on the resulting velocity vector, and the process repeats until all agents have been eliminated by meeting some stop condition. The conditions that eliminate agents from consideration are enumerated below.

1. Traveling outside of the region of interest.

2. Exceeding time allowed to complete mission.

3. Expending all available energy.

4. Colliding with terrain.

5. Arriving at objective.

At each timestep, Condition 1 is evaluated by a simple comparison of the agent's position with the minimum and maximum latitude and longitude of the region of interest. If the agent's position is outside these bounds, it is eliminated from consideration.

Conditions 2 and 3 are evaluated by simple checks of the agent's mission timer and available energy, respectively. If either limit is exceeded, the agent is eliminated.

Condition 4 is evaluated by estimating the depth of the ocean floor at the agent's location using bilinear interpolation of the height field data set. If the agent's depth is greater than the estimated depth of the ocean floor, then the agent is eliminated.
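As a sketch, bilinear interpolation over a regular latitude/longitude grid can be written as follows (names are hypothetical, and the axis arrays are assumed to be sorted ascending with the query point in the grid's interior):

```python
import numpy as np

def floor_depth(lat, lon, lats, lons, depth_grid):
    """Bilinearly interpolate seafloor depth at (lat, lon) from a regular
    grid with axis arrays `lats`, `lons` and 2-D array `depth_grid`."""
    i = np.searchsorted(lats, lat) - 1
    j = np.searchsorted(lons, lon) - 1
    # Fractional position inside the enclosing grid cell.
    fy = (lat - lats[i]) / (lats[i + 1] - lats[i])
    fx = (lon - lons[j]) / (lons[j + 1] - lons[j])
    return ((1 - fy) * (1 - fx) * depth_grid[i, j]
            + (1 - fy) * fx * depth_grid[i, j + 1]
            + fy * (1 - fx) * depth_grid[i + 1, j]
            + fy * fx * depth_grid[i + 1, j + 1])
```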

Condition 5 is the only condition for which the agent's preferences are recorded. If the agent is within a specified range of its objective, it is considered to have arrived. To ensure that the agent's arrival is detected, the threshold is set to a range equal to the sum of the agent's speed and the maximum recorded ocean current velocity, multiplied by the model's timestep. If the range threshold were set to a smaller value, it would be possible for the agent to "skip" across its objective between timesteps. However, this range introduces a small error in time tracking. To reduce the effect of the error, the agent's travel time is not simply recorded as the time required to reach the range threshold. Instead, the agent's rate of approach is calculated for the arrival timestep, and the time required to travel the exact remaining distance to the objective at that rate of approach is added to the mission time recorded when the arrival is first detected.

If the agent is the first to arrive at the objective, or if its travel time is less than any previously recorded, the agent's time and preference parameters are recorded as the solution to the problem.

Once all agents have been eliminated from consideration, the epoch counter is incremented, a new group of agents is generated, and the modeling process repeats for the specified number of epochs. Once all epochs have been completed, the relevant parameters from the trial are recorded in a CSV file.

3.3 Supporting Software Utilities

A number of separate Python scripts were written to facilitate pre- and post-processing of the data used in the path planning experiments. Each of the scripts is described below, and their source code is recorded in Appendix C.

Each script is designed for use by the program developer rather than an end user, and as such requires modification of its source code for use.

3.3.1 ncinfo

The ncinfo.py script provides information regarding the internal structures and information stored in a NetCDF file. In general, the data provided include the names of the variables, the minimum and maximum values, the number of entries in each data set, and the units of the data stored in each variable. This is one of the few utility scripts that can be operated via a command line interface (CLI): the name of the NetCDF file is passed as a command line argument, and the script prints the internal information to the console. While there is a CLI for this script, the structure of NetCDF files varies significantly between sources, so modification of the source code may still be necessary for effective use.


3.3.2 nc2npz

The nc2npz.py script reads in a NetCDF (*.nc) file that stores four-dimensional, regularly gridded ocean current data and converts it into a numpy-native format consisting of six arrays. Each array stores one of the following datasets: latitudes, longitudes, times, depths, East components of velocity, and North components of velocity. These arrays are then saved to a single compressed file with the .npz file extension, which allows for easy import using the numpy library.
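A conversion of this kind can be sketched with scipy's netcdf_file reader (assumed here; this is not the script's actual source). The NetCDF variable names below are hypothetical, since they differ between data sources, which is why the ncinfo.py utility exists:

```python
import numpy as np
from scipy.io import netcdf_file

# Hypothetical NetCDF variable names; inspect the source file first.
f = netcdf_file("currents.nc", "r")
np.savez_compressed(
    "currents.npz",
    lats=f.variables["lat"][:].copy(),
    lons=f.variables["lon"][:].copy(),
    times=f.variables["time"][:].copy(),
    depths=f.variables["depth"][:].copy(),
    u=f.variables["water_u"][:].copy(),   # East velocity component
    v=f.variables["water_v"][:].copy(),   # North velocity component
)
f.close()
```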

3.3.3 toponc2npz

The toponc2npz.py script performs a function similar to that of nc2npz.py. It reads in topographic data saved in a NetCDF file and converts it to a set of numpy arrays storing latitudes, longitudes, and depths, which are then saved in a compressed .npz file.

3.3.4 GenWaypoints and GenWaypointsGulf

The GenWaypoints.py and GenWaypointsGulf.py scripts were used to generate a comma-separated values (CSV) file in which each line holds two latitude and longitude pairs: the starting and ending points of a trial. Modifying the script allows the programmer to specify the types of waypoints generated, and the resulting CSV file is then taken as input by the path planner. This allows for a randomized set of trials that can be repeated for consistency between experiments.


3.3.5 GrouperBehavior-P3d

The GrouperBehavior-P3d.py program is a visualization program used only for verification of agent behavior. It uses the free 3-D engine Panda3D to render a model of an agent navigating through a simple current field. The user can modify the source code to choose the behavioral preference parameters, the local current strength and direction, and the positions of the starting and ending waypoints. This allows a visual verification that the agent is behaving as expected in the presence of the current field. During early development, this script was used to find and correct navigation errors, including those with great-circle navigation and local current compensation.

3.3.6 GrouperView-P3d

The GrouperView-P3d.py script is used to visualize the behavior of an agent using the full four-dimensional ocean current data. It is similar in function to GrouperBehavior-P3d.py, except that GrouperView-P3d.py uses the full ocean current data rather than the simple, constant, uniform field used by GrouperBehavior-P3d.py.

3.3.7 popro

The popro.py script is used for numerical post-processing. It reads in the data from a trial log, which is generated as described in Section 3.2. The data are stored in a numpy array, which allows simple processing for analysis of results. The script can be adjusted to report success and failure rates for each of the different types of agents, as well as the average, standard deviation, and extrema of any other parameters reported in the trial log. This is the software used to compile the information reported in the tables of this dissertation.

3.3.8 HullProp

HullProp.py automates the hull shape formulas described in Section 6.1, such that design parameters can be changed in the script and a new hull drawn and analyzed by the program. This script simplifies calculation of hull shape and flotation properties as well as construction material requirements. It also provides a table of offsets to facilitate building.

Having reviewed a conceptual model of the software, the following chapters explore its implementation. Much of engineering consists in the transition from design to functional device, and the process is similar for software development. So if this chapter, Chapter 3, has been analogous to design, Chapter 5 documents the initial stages of building a prototype and its iterative revisions and improvements. However, a brief detour is in order: before exploring the details of DORA's early development, a discussion of its intended operating environment will provide useful context, and that discussion is presented next in Chapter 4.

Chapter 4

Environmental Model and Regions of Interest

Oceanographic data for the trials below are four-dimensional arrays of North and East components of velocity. There are two regions of interest (ROIs) used in the experiments of Section 7: one in the Gulf of Mexico that consists entirely of open water at the surface, and one in the eastern Caribbean, which is obstructed by a chain of islands. The time chosen for these trials is the week beginning at midnight, May 5, 2012. This timeframe was chosen because it showed a complex current field in the Gulf dataset, which provides a more challenging environment for navigation. The data grid has a temporal resolution of three hours and a horizontal resolution of approximately 1/36 degree at standard depths from zero to 5000 meters (http://www.northerngulfinstitute.org/edacNCOM AmSeas.php). The source of the data was described in e-mail correspondence with Dr. John Harding of the Northern Gulf Institute as follows:

The U.S. Navy AMSEAS nowcast/forecast retrospective data was provided through a NOAA/Navy cooperative effort including the Naval Oceanographic Office, NOAA's National Coastal Data Development Center, and the Northern Gulf Institute, a NOAA Cooperative Institute.

Since the data are provided on a regular grid and the paths to be planned by the software will not be thus constrained, a four-dimensional linear interpolation is used to estimate the local magnitudes of the East and North components of velocity at an agent's location, given by its latitude, longitude, depth, and mission time. For simplicity, all missions are assumed to begin at the same time as the data.
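As a sketch of such a grid-independent lookup, scipy's RegularGridInterpolator can provide the four-dimensional linear interpolation. The small synthetic grid below is an assumption standing in for the real AMSEAS arrays, with axis sizes chosen to match the resolutions stated above:

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

# Synthetic grid: axes ordered (time, depth, lat, lon); u = East velocity.
times = np.linspace(0.0, 604800.0, 57)   # one week at 3-hour steps, seconds
depths = np.array([0.0, 25.0, 50.0, 100.0])
lats = np.linspace(23.1, 26.1, 109)      # ~1/36 degree spacing
lons = np.linspace(-88.5, -83.5, 181)
u = np.zeros((times.size, depths.size, lats.size, lons.size))

interp_u = RegularGridInterpolator(
    (times, depths, lats, lons), u,
    method="linear", bounds_error=False,
    fill_value=np.nan,  # non-numeric value for out-of-range requests
)

# East component of current at one (time, depth, lat, lon) query point.
u_local = interp_u([(3600.0, 25.0, 24.5, -85.0)])[0]
```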

Figure 4.1: Gulf of Mexico Region of Interest with line segments showing the randomly generated navigation problems posed to competing navigation systems. Image from Google Earth; software overlays from custom-generated .kml.

Figure 4.2: Caribbean Region of Interest with line segments showing the randomly generated navigation problems posed to competing navigation systems. Image from Google Earth; software overlays from custom-generated .kml.

4.1 Bathymetric Data

Like the ocean current data, the bathymetric data used for collision detection and agent preference are provided on a regular grid, in this case the ETOPO1 gridded data set. These data have a horizontal resolution of 1/60 degree. ETOPO1 is available in its entirety, but only the data relevant to the regions of interest are used in each case, which reduces the program's memory requirement. These data are acquired through ETOPO1's project webpage at http://www.ngdc.noaa.gov/mgg/global/global.html. Similar to the ocean current data, the depth at an arbitrary position is linearly interpolated from the nearest points that lie on the grid.


4.2 Regions of Interest

4.2.1 Gulf of Mexico

Figure 4.3: Gulf of Mexico Velocity Field at Initial Timestep. Image generated by ERDAP Service at http://www.northerngulfinstitute.org

For open water trials, in which the agents cannot collide with the ocean floor, the region of interest lies between 23.1° and 26.1° North latitude and between 83.5° and 88.5° West longitude. These data are used in the experiments in Chapter 7, as well as in the software prototype described in Section 5.4.

Figure 4.4: Gulf of Mexico ROI Bathymetric Map. Closely spaced contours on the right of the image are at the boundary of Southwestern Florida's continental shelf. Image generated using the matplotlib Python library [20] with data from ETOPO1 [7].

4.2.2 Caribbean

For obstructed water trials, in which successful agents must avoid collisions with land masses or bathymetric features, the region of interest is bounded by 14.5° and 16.5° North latitude and between −63.0° and −59.5° East longitude. This region was chosen because there is a chain of islands running roughly North to South through the center of the region. For these tests, all waypoint pairs are oriented East-West, so that the island chain presents a series of obstacles while the starting and ending points remain in open water.

Figure 4.5: Caribbean Velocity Field at Initial Timestep. Image generated by ERDAP Service at http://www.northerngulfinstitute.org


Figure 4.6: Caribbean ROI Bathymetric Map, where white is at or above 0 m elevation, and gray is at or below 0 m elevation. Image generated using the matplotlib Python library [20] with data from ETOPO1 [7].

Chapter 5

Software Development

This project has seen much development over a fairly long period of time, with early work on mission planners conducted during summer internships in 2011 and 2012 at the Navy's Space and Naval Warfare Systems Center Pacific (SSC-Pac). This chapter provides a synoptic description of the software's progression from an early 2-D graph-based mission planner, which laid the foundations for later work, through the 4-D A* planner that is used as this project's standard of comparison, and finally through the early concept tests of DORA. As software development bears a strong similarity to other varieties of engineering design, some decisions were made based on theory and calculation, some were made as matters of reasoned assumptions and experiential best guesses, and some were absolutely arbitrary matters of preference. This chapter documents the important milestones in the software development and explains the motivations and justifications for the decisions that were made along the way.

5.1 2-D A* Planner

An initial prototype of graph-based path planning software was developed during an internship at the Space and Naval Warfare Systems Center (SSC) Pacific. This software imported oceanographic current data and used it to construct an 8-connected graph of an AUV's operational area. The weights of the graph's edges were set equal to the time the vehicle would require to traverse them, assuming a constant vehicle speed. An arbitrarily high cost was assigned to edges with opposing currents greater than the vehicle's operational speed, as well as to edges that resulted in collisions with land. Another cost term was included as an inverse distance from user-specified points; this allowed the user to specify particular points for the AUV to avoid, such as areas of high risk or areas occupied by shipping traffic. Once this graph was generated, the user was able to specify an arbitrary-length sequence of waypoints for the AUV to visit. The planner then used an A* search to find the minimum cost route between the initial waypoint and the final one. An example of a path planned by the software is shown in Figure 5.1. In this example, the current data was read from a file produced by the U.S. Navy's COAMPS-OS forecasting software. The vehicle speed was set to 1 m/s, which is comparable to the maximum local currents. The figure shows that the software planned a path that significantly diverged from a straight-line route between the start point to the East and the goal point to the West. The divergence from the straight-line path took advantage of the local currents and avoided collision with the island between the two waypoints.

Figure 5.1: Output of the 2-D A* path planning program showing velocity vectors and the planned route for a mission from East to West off the coast of California.

The path planner seemed to work very well in the cases tested. The planned path would consistently avoid tracks that collided with land or opposed strong currents, and it would navigate away from the user-specified points as required.

This software was developed to support the Unmanned Underwater Vehicle (UUV) Control Center at SSC Pacific, so the program also included a means to export the planned paths to a custom XML format, to Google Earth's KML format, to bitmap images, and to the Navy's Composable FORCEnet (CFN) software.

5.2 4-D A* Planner

While the two-dimensional planner successfully planned good routes through its problem space, the basic assumptions upon which it was built neglected significant aspects of the ocean environment. In order to address these limitations, the planner was rewritten to include ocean current variations with depth and time. Additionally, the new planner searches a dynamically generated graph that is independent of the data grid and allows for arbitrary graph geometry.

The new algorithm begins with a single root node at the vehicle's initial position. It then searches for a path by building a tree in accordance with the A* algorithm. The cost to travel between any two points is calculated by modeling the vehicle's motion through the current field according to user-specified parameters.

Since the graph is generated independently from the source data grid, it is unlikely that current data will be available for the particular point in space and time that the software needs for a cost calculation. The local ocean currents are therefore linearly interpolated between points from the source data. Higher order interpolation schemes may be employed by later versions of the software.

An example of the algorithm dynamically generating a tree and choosing a path to avoid collision with an island is shown in Figure 5.2.

Figure 5.2: Example of an A* search avoiding collision with an island, using a branching factor of 3 and an included angle of 90°. The left image shows construction of the graph, while the right image shows the route as returned.

5.2.1 4-D Grid Independent Design

The cost of transitioning between two nodes in the graph is calculated by modeling the agent's motion through a velocity field defined by the gridded ocean current data. Specifically, the agent's speed and heading are resolved into North and East components independent of the ambient current. Then the ambient current's North and East components at the agent's location and time are linearly interpolated from the gridded data and summed with the agent's velocity. The agent's position is updated by calculating the action of this velocity over a user-specified time step. The interpolation-calculation step is repeated until the agent is within an acceptable range of its objective or until the agent reaches another stop condition.

The agent’s speed and heading are determined by its navigational behavior.

Preliminary tests have assumed that the vehicle homes toward its objective by

changing its heading to directly face its target at each time step. To model the

changes of depth during gliding, the agent is capable of descending at a specified

angle. The agent begins at zero depth, then upon reaching a specified maximum

depth, the agent begins an ascent at the same angle. This cycle repeats until

the agent reaches its objective.

By dynamically generating the graph during planning, and by using interpolation of the data to predict current velocities at arbitrary points in time and space, the planner remains independent of the data grid. This has the advantage of allowing for increased data resolution without increased calculation time, a substantial advantage over the original prototype of the software, which directly tied the graph resolution to the data resolution and imposed significant time penalties on using increased data resolutions for a given plan.

Additionally, the separation of the planning graph from the data grid allows the plan to be easily cross-referenced against other data sets. For instance, during the cost calculation, the agent's predicted motion can be checked against a bathymetric data set to predict and prevent collisions with the sea floor, or to allow other environmental factors to be considered during planning.

5.2.2 Testing of Prototype Assumptions

Initially, the four-dimensional planner was used to test the assumptions of the two-dimensional planner by constraining the new software to paths along the surface and in a constant current field, effectively collapsing it to two dimensions. To ensure a fair comparison, the new planner was also restricted to planning paths along cardinal and intercardinal directions, and at the same resolution as the data grid.

These tests showed that the two-dimensional assumptions are wholly invalid. The paths planned using only 2-D data often had higher costs than a simple straight-line path. Ocean currents are highly variable by region, and it may be feasible to use two-dimensional path planning in some areas; however, using four-dimensional data provides a significant improvement in planner performance. Indeed, in the cases tested, using no data at all improved performance over using the 2-D approximation.

5.2.3 Speed of Planning

While v0.2 produced more cost-effective plans than v0.1, the time required to plan the routes was significantly longer. In order to make the planner practical for real-world applications, a series of paths were planned between the same start and end points and with the same environmental data. For each of these trials, the branching factor, included angle, spatial resolution, or temporal resolution was varied.

Table 5.1 shows the results of these trials, which are based on the 72-hour forecast of 1/8 degree resolution NCOM data for the period beginning at midnight, July 2, 2012. All of these trials searched for routes from 32.9° N, 119.2° W to 33.5° N, 119.2° W. All trials used a spatial resolution of 6.6 kilometers (except Trial 3, which used 3.3 kilometers, as shown in the table) and a speed of 0.5 m/s.

Table 5.1: First Performance Trade Study

Trial | Branch Factor | Angle (deg) | Temp. Res. (s) | Spat. Res. (km) | Cost (h) | Comp. Time (s)
------|---------------|-------------|----------------|-----------------|----------|---------------
0     | 9             | 360.0       | 600.0          | 6.6             | 49.58    | 4275.53
1     | 3             | 90.0        | 600.0          | 6.6             | 49.58    | 59.67
2     | 3             | 90.0        | 60.0           | 6.6             | 49.58    | 166.71
3     | 3             | 90.0        | 600.0          | 3.3             | 48.76    | 11067.37
4     | 5             | 180.0       | 600.0          | 6.6             | 49.58    | 1240.08
5     | 5             | 90.0        | 600.0          | 6.6             | 49.58    | 312.34

The choices of branching factor, included angle, and temporal and spatial resolution clearly have significant impacts on the software's performance, both in the time required to plan the path and in the quality of the path that is planned. Any specific choice of parameters is necessarily arbitrary, and the parameters chosen for one particular problem may not be ideal for another. However, this series of trials points to some useful, practical guidelines for parameter choice.

First, all of the trials planned paths with identical total costs, with the exception of Trial 3, whose higher resolution allowed for a better solution. However, in the case of Trial 3, the predicted savings in mission time was 0.82 hours, or about 1.7%. A more meaningful observation arises from a comparison of the cost to the benefit. While the higher resolution saved 0.82 hours of mission time, the planner required 3.06 hours longer than Trial 1, which used exactly the same parameters other than resolution. In other words, the higher resolution cost substantially more time in planning than it saved in the mission.

Second, reducing the branching factor or reducing the included angle reduces the number of solutions that the planner considers and risks the exclusion of the best route. To evaluate the loss of performance due to the exclusion of optimal paths, the included angle was varied between 360.0°, 180.0°, and 90.0°. Similarly, the branching factor was varied from 9 to 5 to 3. In all cases, the planner found the same route, but the computation time varied from a maximum of 4275.53 seconds down to 59.67 seconds. For this reason, the path planner is set to a default branching factor of three with an included angle of 90°. Spatial resolution is set between about 1/10 and 1/7 of the distance between waypoints.

Temporal resolution has a less significant impact on the calculation time than the other parameters, but it does have a critical limit. Since the cost function models the motion of the agent through space and considers the agent to have completed a transit when it is within a certain threshold, the temporal resolution must at least be sufficient to prevent the agent from skipping past its objective between time steps. As an example, if the agent is required to navigate at 0.5 meters per second to within 100 meters of an objective, with an ambient current magnitude of 0.25 meters per second, then the temporal resolution may be at most 100/(0.25 + 0.5) = 133.33 seconds. A finer resolution improves the accuracy of the model, but a coarser resolution would allow the vehicle to enter and exit its threshold range between timesteps and could leave it continually oscillating across its objective.

One common solution to this problem is to allow an agent to track its range to target, and to consider its objective attained when the range stops decreasing and starts increasing. This is a good solution in the absence of ambient currents, or in any case where the agent can close on its objective at will. However, since the agent can be pushed away from the objective by the local currents, this method is not useful for the present problem.

5.3 NetLogo Prototype

Figure 5.3: Screenshots from the NetLogo prototype showing the agents converging to a solution over time. The left frame shows the agents widely scattered early in the experiment, while the right frame reflects their narrower dispersion about a valid solution.

To test the concept of iterative optimization of reactive agents using the agent preference formula, a NetLogo [38] model was created, which generated a current field either randomly or according to user preferences, and then modeled a number of agents' motion through the field. Figure 5.3 shows screenshots from different stages of an experiment.

Since this program was only developed as a concept test, no real-world oceanographic data were used. Instead, to model the current field, three nodes were set at user-specified coordinates, and each node was assigned a vector that represented the current at that location. The current at any location in the model space was then calculated as a distance-weighted average of the three vectors. This ensured that the current measured at each node was equal to the current specified at that node, and that the values transitioned smoothly from one node to another throughout the space.
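The exact weighting is not specified here; a minimal sketch using inverse-distance weighting, which has the stated property of reproducing each node's vector at that node, might look like the following (all names and values are hypothetical):

```python
import numpy as np

# Hypothetical node positions and their assigned current vectors.
nodes = np.array([[10.0, 10.0], [50.0, 80.0], [90.0, 20.0]])
currents = np.array([[0.5, 0.0], [0.0, -0.3], [-0.2, 0.4]])

def current_at(point, eps=1e-9):
    """Inverse-distance weighted average of the three node currents."""
    d = np.linalg.norm(nodes - point, axis=1)
    w = 1.0 / (d + eps)   # closer nodes dominate; exact at a node
    w /= w.sum()
    return w @ currents

print(current_at(np.array([30.0, 40.0])))
```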

For this model, no concept of an epoch was used; the program simply maintained the number of agents in the field by introducing a new agent at the starting point any time an agent either arrived at the objective or was eliminated. Simulated annealing was used to improve the solutions found by the initial random search. This algorithm showed good results, with agents often solving problems that were not soluble with simple tracking or homing behaviors, and consistently producing comparable results in cases that were soluble with the simpler behaviors. The original concept for the DORA algorithm incorporated genetic algorithms as the optimization method; however, the positive results using simulated annealing led to its continued use in later experiments.

5.4 Python Prototype in Gulf ROI

To test the basis of the algorithm for operations in real-world environments, a Python program was developed, which functioned very similarly to the NetLogo prototype, except that it used data supplied by the Northern Gulf Institute's oceanographic data repository for estimating ocean currents. To simplify these initial tests, no bathymetric data were used, so a simplified preference equation was employed, which neglected the first term. In practice, no first term was implemented at all, but mathematically this is equivalent to using the full formula with A = 0. To ensure that neglecting the bathymetric data would not impact the validity of the test, all trials were conducted in the Gulf of Mexico test area, at a depth less than the minimum seafloor depth in the area. The results of these trials were positive, with the DORA system solving difficult navigation problems more frequently than the simpler agents.

5.4.1 Agents

All experiments began with 50 randomly initialized DOR agents, as well as one agent that used a homing behavior and one agent that used a tracking behavior. The homing agent aligned its heading directly toward its objective. Since all of its available speed was in the direction of its objective, it made no attempt to compensate for local currents, which resulted in its traveling faster along its path, but over a greater distance, than the tracking agent.

The tracking agent resolved the local current into vector components, one oriented parallel to the agent's path and the other perpendicular. The parallel component only acted to speed or slow the agent's approach to its objective, but did not alter the agent's path. The perpendicular, or cross-track, component pushed the agent off its course.

The tracking agent oriented itself to compensate for the cross-track component when this component was slower than the agent's speed. This compensation minimized the distance that the agent was required to travel, and it represented an ideal solution in the case of a constant, uniform velocity field slower than the agent's speed. When the cross-track component was greater than or equal to the agent's speed, the agent used the homing behavior. A sketch of this compensation appears below.

In some trials, these constraints on the tracking behavior produced the counter-intuitive result that the agent would expend so much of its speed overcoming the cross-track current that it would be pushed away from its objective. To evaluate the impact this had on performance, the tracking behavior was modified so that the agent would only compensate for cross-track current in cases where it was still able to make progress toward its objective, and the experiment was repeated. This trial showed a negligible decrease in overall performance, and the behavior was restored to its original state. The lack of apparent impact is likely a result of the nature of the trials: they were conducted in fields that did not have guaranteed solutions, so in cases where the agent could not make definitive progress toward its target, it was unlikely to be successful regardless of its response; and in cases where it had sufficient speed to close on its objective, its contingency behavior was irrelevant.

Physical Model

The agents in this algorithm are intended to model the behavior of physical vehicles deployed on a real-world mission, so the agents operate in a simplistic kinematic model of an ocean current field. Each agent has a heading, a speed, and a position stored as internal, local variables. At each time step, the agent's speed is resolved into East and North components of velocity. These are added to the East and North components of the local current, and the agent's position is updated by multiplying the resulting total velocity vector by the model's timestep.

In experiments that evaluate the agents' ability to solve known problems, the model is constrained to moving and navigating on a planar surface in a constant, uniform velocity field. In the more practical experiments, however, motion is modeled along a spherical approximation of Earth that assumes a radius of 6371.0 kilometers [9]. Agent position is tracked as latitude and longitude. Since the kinematic model uses normal SI units, the agent's change in position is calculated in meters and then converted to a change in latitude and longitude, where the conversion factor is based on the agent's position before the update. This introduces an error, since the conversion from meters to degrees of longitude changes continuously with latitude; but because the change in position during a typical timestep of the model is small compared to the radius of Earth, the error is also small.
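A minimal sketch of this update step, under the spherical-Earth conversion described above (function and variable names are hypothetical):

```python
import numpy as np

R_EARTH = 6371.0e3  # spherical Earth radius in meters [9]

def step(lat, lon, heading, speed, u_cur, v_cur, dt):
    """Advance an agent one timestep.

    lat, lon in degrees; heading in radians (0 = North); speed in m/s;
    u_cur, v_cur are East/North current components in m/s; dt in seconds.
    """
    # Resolve agent speed into East/North components and add the current.
    v_east = speed * np.sin(heading) + u_cur
    v_north = speed * np.cos(heading) + v_cur

    # Convert the displacement in meters to degrees, using the
    # pre-update latitude for the longitude conversion factor.
    dlat = np.degrees(v_north * dt / R_EARTH)
    dlon = np.degrees(v_east * dt / (R_EARTH * np.cos(np.radians(lat))))
    return lat + dlat, lon + dlon
```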

5.4.2 Test Conditions

For these experiments, all trials were conducted using a fixed depth of 25.0 meters. Since the physical model and planner can run at an arbitrarily specified resolution, four-dimensional linear interpolation is used to estimate the velocity at any requested depth, time, and position within the data set. For any request outside the valid range of the data, the interpolation function returns a non-numeric error value. The first timestep is shown in Figure 4.3.

The objective of each trial was to minimize the time required to navigate a 150 kilometer course. To ensure a wide variety of problems, the start and end points of each test mission were selected using the programming environment's native pseudo-random number generator.

First, a latitude and longitude pair was chosen within a window of the data set that was at least 150 kilometers from the boundary of the data. Next, a heading was chosen, and the point that lay 150 kilometers from the first point in that direction was selected. There was some small variation in the actual distance between the start and end points, because the 150 kilometer range was estimated using the planar range approximation at the mean latitude of the operational area. To prevent a bias toward outward-bound courses, the software randomly selected one of the two points as the starting point, and the other was set as the destination.
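A minimal sketch of this selection process, assuming the planar approximation at the mean latitude (the names, the boundary margins, and the kilometers-per-degree constant are illustrative):

```python
import math
import random

def make_trial(lat_min, lat_max, lon_min, lon_max, range_km=150.0):
    """Pick a random start/end pair separated by approximately range_km."""
    mean_lat = 0.5 * (lat_min + lat_max)
    km_per_deg_lat = 111.19  # approximate, from a 6371 km Earth radius
    km_per_deg_lon = km_per_deg_lat * math.cos(math.radians(mean_lat))

    # First point, kept range_km inside the data boundary.
    m_lat = range_km / km_per_deg_lat
    m_lon = range_km / km_per_deg_lon
    lat1 = random.uniform(lat_min + m_lat, lat_max - m_lat)
    lon1 = random.uniform(lon_min + m_lon, lon_max - m_lon)

    # Second point range_km away along a random heading.
    hdg = random.uniform(0.0, 2.0 * math.pi)
    lat2 = lat1 + (range_km * math.cos(hdg)) / km_per_deg_lat
    lon2 = lon1 + (range_km * math.sin(hdg)) / km_per_deg_lon

    # Randomize direction to avoid an outward-bound bias.
    pair = [(lat1, lon1), (lat2, lon2)]
    random.shuffle(pair)
    return pair
```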

In each case, 50 agents were initialized at the trial's starting point, and their positions and headings were updated at each timestep of the physical model according to their respective behaviors. If an agent moved outside the boundaries of the data or arrived at the destination, the agent was removed from the trial. If an agent reached the destination, its time was recorded, and the overall minimum time for each type of agent was tracked for each trial. Additionally, the behavioral tuning parameters of the DOR agent with the minimum time were recorded. For this project, only the starting point and the objective were varied between trials, so all agents in a given trial attempted to solve the same problem.

The criterion for successfully reaching the intended destination was that the agent must be no farther from its destination than the spatial resolution of the physical model, which was calculated by adding the maximum agent speed to the maximum local current speed and multiplying by the model's time step. While there is some error introduced by using this simple range threshold criterion, the error can be kept to an acceptable minimum by choosing a sufficiently high temporal resolution. The range threshold criterion is described by Equation 5.1.

To reduce the error produced by the range threshold approximation, the agent's rate of change of distance to its objective was calculated for the timestep preceding the agent's crossing of the range threshold, as in Equation 5.2. The remaining distance to the objective was then divided by this rate of change to give an estimate of the remaining time required to reach the destination, as shown in Equation 5.3. This time estimate was then added to the time required to reach the range threshold to produce the total estimated cost for a trial.

$$r_T = (V_a + V_{max}) \times \delta t \tag{5.1}$$

$$s_{prev} = \frac{r_{prev} - r}{\delta t} \tag{5.2}$$

$$t_m = t_T + \frac{r}{s_{prev}} \tag{5.3}$$

where:

rT: Range threshold for arrival criterion
Va: Magnitude of velocity of agent
Vmax: Maximum magnitude of velocity in data set
δt: Incremental timestep of model
r: Range to objective
rprev: Range to objective at previous timestep
sprev: Average rate of change of range to objective over the previous timestep
tT: Time when satisfaction of arrival criterion is detected
tm: Total mission time recorded for agent upon arrival
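Equations 5.1-5.3 translate directly into a short helper; the function and variable names below are hypothetical:

```python
def range_threshold(v_agent, v_max, dt):
    """Eq. 5.1: arrival threshold, the maximum distance covered per step."""
    return (v_agent + v_max) * dt

def refined_mission_time(t_T, r, r_prev, dt):
    """Eqs. 5.2-5.3: extrapolate the remaining transit time from the
    rate of approach over the previous timestep."""
    s_prev = (r_prev - r) / dt   # Eq. 5.2: rate of approach, m/s
    return t_T + r / s_prev      # Eq. 5.3: total recorded mission time
```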

A single epoch of a trial consisted of the span of time during which any agent was still being modeled. Once all agents were removed due to success or failure, the next epoch began with the introduction of another 50 DOR agents, one tracking agent, and one homing agent. Each trial continued until it had executed seven epochs.

At the end of each trial, the relevant data were recorded to a file, including the trial number, the starting and ending points, the minimum time achieved by each class of agent, and the tuning parameters of the best DOR agent.

5.4.3 Results

Planar Earth with High Velocity

To test the performance of the algorithm and of the physical model, an experiment was run in which all navigation and physical modeling were carried out using a planar model of the mission area. The local velocity field was set to a constant 2.0 m/s directly from West to East, and the agents' speed was set at 3.0 m/s. This choice of a simplistic field and a relatively high agent speed ensured that all trials would have valid, analytically calculable solutions. The results of these trials are shown in Table 5.2.

Table 5.2: Results of Planar Earth Trial

Tracking Solved           100.00 %
Homing Solved             100.00 %
DORA Solved               100.00 %
Tracking Best              97.50 %
Homing Best                 0.00 %
DORA Best                   2.50 %
Average Tracking Error     2.23 × 10⁻⁶ s
Average DORA v. Tracking   2.37 × 10⁻² %
Average DORA v. Homing    −13.39 %
Average Epoch               4.41
Average Time              163.10 s

As expected, all agents found solutions in this simple case. Analytically, the

tracking agents should arrive at a point objective in a predictable time, which

76

Page 78: Deliberative Optimization of Reactive Agents for ...my.fit.edu/~swood/Anthony Jones Dissertation 2013.pdf · Deliberative Optimization of Reactive Agents for Autonomous Navigation

is calculated by dividing the distance from the starting point to the objective

by the net speed remaining after correcting for a local current. Over the 200

trials in this experiment, the average absolute difference between the predicted

transit time and the modeled transit time was 2.23 microseconds for tracking

agents, while their average total mission duration was 18,780.7 seconds.

The tracking agents implemented the best solution found in 97.5% of trials,

while the DOR agents found a better solution in 2.5% of trials. The reason

that the tracking solution was not the best possible solution found in all cases

is that the range threshold criterion for arrival permitted a slight improvement

in mission time by using more of the agents’ available speed to move toward

the objective, and accepting a small amount of leeward drift. Since the distance

remaining when an agent crosses the range threshold is based on the rate of

approach before crossing the threshold, the agent need not contend with opposing currents for the final distance to the objective. This solution is shown

in Figure 5.4. Finding this slightly improved solution was a rare occurrence,

and on average the DOR agents performed about 2.37 × 10⁻² % worse than the tracking agents. Still, this discovery by the software suggests a high degree of

optimality within the boundaries of the current trial, and also emphasizes the

importance of proper representation of the physical environment to ensure valid

solutions.

This result emphasizes some of the advantages and disadvantages of

the algorithm. Specifically, it consistently approached the optimal

solution, and produced results that were comparable to the tracking

agents’, but without any assurance of optimality.

The average amount of time required for each trial was 163.10 seconds for all


Figure 5.4: DORA found a slightly better solution than tracking in a problem space that does not apply to physical space. This solution is shown schematically with the origin to the left and the destination to the right.

seven epochs; however, the average number of epochs required to reach the DOR

agents’ best solution was 4.41. This suggests that some time may be saved in

future iterations by concluding the trials after fewer epochs. However, some

trials required the full seven epochs to find their solutions, so using fewer epochs

will necessarily sacrifice some optimality. The sensitivity of the solution to the

number of epochs permitted is left for later experimentation, and all trials for

this project are limited to seven epochs.

Spherical Earth with High Velocity

To verify that the agents’ performance and the physical model transitioned well

from the simplistic planar approximation to the spherical model of Earth, the

same initial configuration from the previous trial was retained, but all navigation

and physical model functions were replaced with the spherical model. Table

5.3 summarizes the results of this experiment.


Table 5.3: Results of Round Earth with High Speed Trial

Tracking Solved             100.00 %
Homing Solved               100.00 %
DORA Solved                 100.00 %
Tracking Best                83 %
Homing Best                   0 %
DORA Best                    17 %
Average DORA v. Tracking  3.70 × 10⁻² %
Average DORA v. Homing      −15.07 %
Average Epoch                 4.47
Average Time (s)            120.90

Again all three classes of agent found solutions in every trial. In this experiment,

the tracking agents found the best solution in 90.5% of cases, while the DOR

agents found the best solution in the remaining 9.50%.

The DOR agents performed comparably to the tracking agents with an average

performance 3.70 × 10⁻² % worse than the tracking agents', but 15.07% better

than the homing agents’.

The average time required for execution of all seven epochs was 120.90 seconds,

while the DOR agents’ required an average 4.47 epochs to reach their best

solutions.

These results are very similar to those from the planar approximation, as would

be expected. The tracking behavior still represents the best solution for nav-

igation to a point, but the DORA optimization is again able to find a better

solution for navigation to a small circle around the objective in a few cases. The

overall result is that the DOR agents are nearly equivalent to the tracking agents

in this experiment, as they were in the preceding planar experiment.


Spherical Earth with Low Velocity

The preceding experiments were conducted to validate the performance of the

DORA algorithm in fields with known solutions, and to compare the model’s

results with the analytical solution to the tracking problem. Direct comparisons

with the tracking agents showed that on average, the DOR agents found com-

parable solutions to the problem; however, since a constant, uniform current is a

trivial case, there is hardly justification for using any optimization at all.

A more practical problem includes the variations observed in currents in a real

ocean. Here, currents vary significantly in space and time, and are not guaran-

teed to be less than the vehicle’s speed. This presents a much more challenging

environment for navigation and a much more reasonable approximation of the

environment that an AUV employing this steering behavior might face. So for

the next trial, the data described in Section 5.4.2 were used as the source for

ocean currents, and the vehicle speed was limited to 0.50 m/s.

Since the maximum local velocity in the data set was slightly larger than 2.0

m/s, there was no guarantee that the slow-moving agents would be able to

navigate to their objectives. To ensure a sufficient number of successful results

for evaluation, this experiment was run for 1200 trials rather than the 200 in

the preceding experiments. The fact that no solution was guaranteed to exist

allowed for evaluation of the DOR agents’ robustness and ability to be adapted

for use in situations when simple tracking or homing behaviors failed. The

success rates of the three classes of agents are shown in Table 5.4.

Of the 1200 trials conducted, only 524 were solved by any of the agents. DOR

agents found solutions to 43.5% of cases, which was almost twice the number of


Table 5.4: Success Rates for Round Earth with Low Speed Trial

Tracking Solved    24.92 %
Homing Solved      21.25 %
DORA Solved        43.50 %
Average Epoch       4.38
Average Time (s)  124.55

solutions found by either of the other classes of agents. This is instructive when

considering the robustness of the agents, but for comparison between them, only

the cases when at least one class of agent found a solution are considered, which

is the basis of the percentages shown in Table 5.5.

Table 5.5: Results of Solved Trials on Round Earth with Low Speed

                      Tracking   Homing     DOR
Successful Trials          299      255     522
Success Rate (%)         57.06    48.66   99.62
Best Solution               31        0     493
Best Solution (%)         5.92     0.00   94.08

For the 524 trials with successful solutions, the DOR agents had the best perfor-

mance with both the highest number of solutions found and the highest number

of best solutions found.

5.5 Modifications for Use in Obstructed

Environments

Incorporating a collision-avoidance behavior for use in obstructed waters required modification of both the agent preference equation and the simulation environ-

ment from those used in the open water trials. First, the simulation environment

was modified to incorporate bathymetric data. Specifically, the ETOPO2[2]


database was used for initial testing, then was replaced with the newer and

higher resolution ETOPO1 [7] database. These data were used to eliminate

agents that collided with terrain as well as to provide agents with local approx-

imations of bottom topography.

In the open water model, agents were eliminated from the simulation any time

that they strayed into areas that did not have good model data. This prevented

wandering out of bounds, and provided a crude means of preventing agents' travel

beneath the sea floor. However, since the ocean current model data were lower

resolution than the terrain data, a more accurate means of checking for colli-

sions was needed. First, if ocean current data were unavailable, the simulation

environment assumed that the local current was the same as the last valid cur-

rent information for the agent, so that an agent could maneuver in near-shore

areas without accurate current models and still survive, assuming the agent did

not collide with terrain.

To check for collisions with terrain, the agent’s depth was compared with

an interpolation-based estimate of ocean depth provided by the bathymetric

database. If the agent’s depth was greater than the depth estimate, the agent

was eliminated from consideration.
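A minimal Python sketch of this check, assuming a regular latitude/longitude grid with depths stored positive-downward (the helper names are illustrative, not the project's actual code):

```python
import numpy as np
from scipy.interpolate import RegularGridInterpolator

def make_depth_estimator(lats, lons, depths):
    """Bilinear interpolator over a regular bathymetry grid (axes ascending;
    depths in meters, positive downward)."""
    return RegularGridInterpolator((lats, lons), depths)

def collides_with_terrain(depth_at, lat, lon, agent_depth):
    """An agent deeper than the interpolated sea floor is eliminated."""
    return agent_depth > float(depth_at((lat, lon)))
```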

Since the agent preference formula was updated to include bathymetric data,

the simulation environment provided each agent an estimate of the local bottom

gradient vector as well as the depth at the agent’s position. To adhere to

the principle that agents only have access to local information, presumably

information that could be obtained in the field, the gradient calculation was

based on estimates of bottom depth within a 100-meter radius of the agent’s

position.
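A sketch of such a local gradient estimate, using the interpolator from the previous sketch and central differences at the 100-meter radius (again illustrative only):

```python
import numpy as np

def local_bottom_gradient(depth_at, lat, lon, radius_m=100.0):
    """Central-difference estimate of the bottom gradient, using only depth
    samples within roughly 100 m of the agent (local information only)."""
    m_per_deg_lat = 111320.0  # rough spherical-earth conversion
    m_per_deg_lon = m_per_deg_lat * np.cos(np.radians(lat))
    dlat, dlon = radius_m / m_per_deg_lat, radius_m / m_per_deg_lon
    d_east = (depth_at((lat, lon + dlon)) - depth_at((lat, lon - dlon))) / (2.0 * radius_m)
    d_north = (depth_at((lat + dlat, lon)) - depth_at((lat - dlat, lon))) / (2.0 * radius_m)
    return np.array([float(d_east), float(d_north)])  # usable as grad F
```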


Finally, to keep agents restricted to the region of interest, each agent’s global

position was compared to the extrema of latitude and longitude in the data sets,

and agents outside of these bounds were eliminated.

5.6 Pseudo-Scientific Tests and Modifications

Many of the design decisions during the development of the test software were

incidental to the operation of the algorithm itself. It was important to make

reasonable decisions, but these were matters of developer preference rather than

of scientific rigor. This section discusses a number of software design develop-

ments that were based solely on experience, intuition, statistically insignificant

trials, and whim. Nothing presented here is claimed to be optimal, ideal, or well-

tested, but it does provide insight into the justification for a number of design

decisions which are not, at their cores, the focus of this research.

Figure 5.5: “Piled Higher and Deeper” by Jorge Cham www.phdcomics.com


5.6.1 Selecting Values of M and T

\[ \mathbf{a} = A \left( \frac{1}{1 + e^{M t_c - M T}} \right) \left( 1 - \frac{1}{1 + e^{t_c}} \right) \nabla F \]
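A direct Python transcription of this term, with grad_F standing for the local bottom gradient estimate, might look as follows (illustrative only):

```python
import numpy as np

def avoidance_vector(A, M, T, t_c, grad_F):
    """Terrain avoidance term: A * grad F scaled by two logistic weights in
    the estimated time to collision t_c (slope M, temporal horizon T)."""
    w_horizon = 1.0 / (1.0 + np.exp(M * t_c - M * T))
    w_near = 1.0 - 1.0 / (1.0 + np.exp(t_c))
    return A * w_horizon * w_near * np.asarray(grad_F)
```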

The slope and temporal horizon terms in the obstacle avoidance vector (shown

above for reference) can take on any positive value, and are not among the

parameters that are optimized during normal execution of the program. In an

attempt to measure the influence of these terms on the algorithm’s probability of

success, a series of experiments was conducted in which 20 trials were run with

every combination of slope values M = {1.0, 0.1, 0.001, 0.0001} and temporal

horizon values T = {0, 900, 1800, 2700, 3600}. The success rates of the algorithm

were monitored, and the one with the highest rate was chosen. This combination

corresponded to M = 0.001 and T = 2700.
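The sweep itself is a plain grid search; a sketch, with run_trial standing in as a hypothetical function that executes one navigation test and reports success:

```python
import itertools

M_VALUES = [1.0, 0.1, 0.001, 0.0001]
T_VALUES = [0, 900, 1800, 2700, 3600]

def sweep(run_trial, n_trials=20):
    """Count successes for every (M, T) pair; return the best-scoring pair."""
    scores = {}
    for M, T in itertools.product(M_VALUES, T_VALUES):
        scores[(M, T)] = sum(1 for _ in range(n_trials) if run_trial(M, T))
    return max(scores, key=scores.get)
```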

To evaluate the robustness of the conclusions drawn from the preceding test,

another experiment was conducted in which the same values of M and T were

used in successive trials with the same number of agents as in the previous

experiment. This test showed that success rates between tests often varied by

up to 5 trials. The largest difference between successive tests in the earlier

experiment was far smaller. Therefore, it is unreasonable to conclude that the

chosen values of M and T correspond to an optimal solution even among the

limited values considered in the first trial. However, since the choice of values

for these parameters is arbitrary, and the values used here worked for several of

the trials, no further attempt to optimize M or T was made.

Since the first stage of the algorithm is a random exploration, increasing the

number of agents increases the effective coverage of the search space, so as the


number of agents is increased, the consistency between trials is also increased.

However, this also increases the computation time required for a given trial.

This trade between search space coverage and computation time led to the

developments described in Section 5.6.2.

5.6.2 Exploration vs. Exploitation

For all prior trials and for the first several hundred tests of the obstructed

environment agent preference formula, the number of agents in the environment

was kept constant. Fewer agents meant a lower likelihood of success, but also

less time required for execution. More agents improved the chances of success

at the expense of the amount of time required to execute.

Using this system, exactly the same amount of resources was used for explo-

ration as for exploitation; the algorithm spent exactly as much effort optimizing

its solutions as it did finding candidate solutions.

Since it is more important to have a valid solution than a good solution, a

short trial over the first ten waypoint pairs was conducted with a strong bias

toward exploration. The algorithm initialized with 600 agents, but on the epoch

following the discovery of a valid solution, the number of agents was reduced by

a factor of 10. Thus, 600 agents searched for a path, but once it was located,

the simulated annealing only operated with 60 agents per epoch. The results of

this trial were compared with a prior trial that was run on the same computer,

and the modification was found to slightly improve the number of solutions

discovered, but to reduce their average quality.
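A sketch of this population schedule (the function name is illustrative):

```python
def epoch_population(valid_solution_found, explore_n=600, exploit_divisor=10):
    """600 agents while searching for any valid path; once one is found,
    drop to 60 agents per epoch for the simulated-annealing refinement."""
    return explore_n // exploit_divisor if valid_solution_found else explore_n
```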


This was expected, as the resources were allocated to emphasize discovery over

optimization. However, the more important difference between the two tests

was that the average time required was reduced by approximately 91%. This

suggested an important concept for moving forward: by managing the num-

ber of agents allocated to different phases of the search, the quality

of solutions can be effectively traded for savings in computation time.

This also implies that for a practical case of real-world operations, it may be-

hoove an operator to bias a search strongly toward early discovery of a viable

solution, so that the rough solution can be used, and then improved during the

course of a mission.

5.6.3 Simple Learning Does Not Help

In hopes of improving the algorithm’s performance through experience, a sim-

ple means of learning from previous successes was incorporated. This learning

consisted simply of maintaining a vector that stored the mean values of the behavioral parameters from all successful trials.

To evaluate the influence of this change, a trial was conducted in which the mean

values of all behavioral parameters were recorded by the software, and 1/5 of

the agents generated during the exploration phase were initialized in the region

of the average according to a Gaussian distribution with a standard deviation

of 0.25. During this trial, there was a negligible increase in success rate, but

also an increase in the number of epochs required to find an initial solution.

This result was discouraging, and further development of a learning component

was suspended. However, this does not imply that learning from experience


will not improve the algorithm’s performance. The small differences seen here

were only relevant to exactly the parameters of the trial, and it is possible that

a better learning system, or even the same system with more carefully chosen

parameters could perform quite well.
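For reference, a minimal sketch of the seeding scheme just described, assuming behavioral parameters normalized to [−1, 1]:

```python
import numpy as np

def init_population(n_agents, n_params, mean_vector=None, frac=0.2, sigma=0.25):
    """Exploration-phase initialization: uniform random parameters by default
    (assumed normalized to [-1, 1]); when a mean of past successes exists,
    seed 1/5 of the agents near it with a Gaussian of sigma = 0.25."""
    pop = np.random.uniform(-1.0, 1.0, (n_agents, n_params))
    if mean_vector is not None:
        k = int(frac * n_agents)
        pop[:k] = np.random.normal(mean_vector, sigma, (k, n_params))
        np.clip(pop[:k], -1.0, 1.0, out=pop[:k])
    return pop
```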

Further exploration of experiential learning is left for later research, as the

emphasis here is on the characterization of the algorithm itself.

5.7 Impact of DORA’s Individual Parameters

Before comparison against the A* algorithm, the necessity and influence of two

of the preference vectors were evaluated using the same 1.5 m/s speed and test

environment as the baseline trial described above.

5.7.1 The Influence of A

\[ \mathbf{a} = A \left( \frac{1}{1 + e^{M t_c - M T}} \right) \left( 1 - \frac{1}{1 + e^{t_c}} \right) \nabla F \]

Since A increases the dimension of the search space for the algorithm, an ex-

periment was conducted in which A was constrained to zero. This trial showed

that exclusion of the terrain avoidance behavior had a small negative impact on

the algorithm’s ability to find a solution. In this trial, the success rate dropped

from about 67% to about 64%. However, this is a small difference, and the gains

associated with elimination of an ocean floor database or measuring from a


deployed platform would likely warrant this sacrifice. Since all other behaviors can

be implemented using a single GPS module and a simple microcontroller, and

the terrain avoidance behavior would need either a bottom tracking SONAR

or a terrain database for its operational area, it is likely that any agent simple

enough to warrant the use of this algorithm would be too simple to warrant the

monetary, computational, and energy expenses associated with terrain avoid-

ance.

This method of neglecting the terrain avoidance behavior is used in the field

trials described in later sections, as the test craft uses only a microcontroller

with a GPS and compass to navigate.

5.7.2 The Need for C

\[ \mathbf{c} = C \left\langle \sin\theta_a, \cos\theta_a \right\rangle \]

The ability of the course stability preference to take on a negative value can lead

to very counter-intuitive behavior. The entire reason for adding this preference

is that changing directions expends energy.

Moreover, including this behavior introduces hysteresis into the agent’s trajec-

tory. To know which direction an agent will attempt to travel, it is insufficient

to know only its position and preference values. Instead, the agent’s course

at the previous update is also important. But to know that course, the state

preceding that one is necessary, and so on back to the beginning of the agent’s

travel. So beyond being merely counter-intuitive, the inclusion of the course


parameter is also mathematically inconvenient.

Three experiments were conducted with two sets of 20 trials each in the Eastern

Caribbean test area. One experiment was conducted with C constrained to the

normal range of −1 to 1 inclusive. Another experiment was conducted with

C constrained to positive values 0 to 1 inclusive, and in the final experiment,

the course stability preference was eliminated by setting C = 0. Eliminating

C reduced the algorithm’s ability to solve the navigation problems from a 75%

success rate down to 55%. Constraining C to the positive half of its range had

no apparent impact on success rates, but reduced the quality of the solutions,

by increasing the required transit time by about 4%. Given the small number of

trials, it is unlikely that this 4% holds any significance whatsoever, but without

a clear advantage to constraining C, it was left with its original definition, which

encompasses a larger variety of potential behaviors.


Chapter 6

Platform

To evaluate the potential use of the DORA system for real-world navigation,

a small-scale autonomous surface vehicle (ASV) was constructed, and was pro-

grammed to navigate according to the agent preference formula. This chapter

discusses the design, construction, and testing of the ASV, while the details of

its field trials are reserved for Section 7.6. Figure 6.2 shows the vehicle under

way.

6.1 Hull Design

The ASV design is built on a simple, wall-sided monohull. To facilitate analysis,

the hull shape was chosen so that it is represented by two geometric curves. The

forward section is described by a parabola, which is parallel to the vessel’s longi-

tudinal axis at its widest point, and converges to a point on the longitudinal

axis at the bow. The aft section is described by a line that terminates at the


outer corner of the transom, and is tangent to the parabola. Since the line can

only pass through a fixed point and be tangent to the parabola at one point,

the hull geometry is fully constrained. To simplify modeling, the functions de-

scribing the hull shape were derived in terms of the following design parameters:

Length of Forward Section, Length of Aft Section, Maximum Beam, and Width

of Transom. Equations 6.1 through 6.3 define the hull geometry. If

drawn on a Cartesian grid, these formulas plot a hull with its widest point at

x = 0, the transom at x = −A, the bow at x = F , and the point of tangency of

the two curves at XFA. The formulas can be easily transformed to a different

position if needed, but since useful design data are obtained directly from the

formulas, those transformations are not carried out here.

\[ h_F = -\frac{B}{2}\left(\frac{x^2}{F^2} - 1\right) \qquad (6.1) \]

\[ h_A = \frac{B}{F^2}\left(-X_{FA}\,x - A\,X_{FA}\right) + \frac{B_A}{2} \qquad (6.2) \]

\[ X_{FA} = -A + \frac{F^2}{B}\sqrt{\frac{B}{F^2}\left(\frac{B A^2}{F^2} - B + B_A\right)} \qquad (6.3) \]

F : Length of Forward Hull Section
A : Length of Aft Hull Section
B : Maximum Beam
B_A : Beam at Transom
X_FA : Point of Transition
h_F : Half Breadth from X_FA to Bow
h_A : Half Breadth from Stern to X_FA
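These formulas translate directly into code; a minimal Python sketch (illustrative, not the design software used here):

```python
import math

def transition_point(F, A, B, BA):
    """X_FA, where the tangent line meets the parabola (Equation 6.3)."""
    return -A + (F**2 / B) * math.sqrt((B / F**2) * (B * A**2 / F**2 - B + BA))

def half_breadth(x, F, A, B, BA):
    """Hull half breadth at station x (Equations 6.1 and 6.2); widest point
    at x = 0, transom at x = -A, bow at x = F."""
    XFA = transition_point(F, A, B, BA)
    if x >= XFA:
        return -(B / 2.0) * (x**2 / F**2 - 1.0)           # forward parabola
    return (B / F**2) * (-XFA * x - A * XFA) + BA / 2.0   # aft straight line

# For the dimensions of Table 6.1 (F=14, A=12, B=8, BA=4 inches),
# transition_point returns approximately -5.22 inches.
```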

Using a mathematically described hull shape facilitates analysis by eliminating

approximation. By integrating the curve from the transom to the bow, the area

is exactly calculated, and the displaced volume is found by multiplying that area


by the vessel’s draft. The formula for the waterplane area is given by Equation

6.4.

\[ A_W = A_{WA} + A_{WF} \qquad (6.4) \]

\[ A_{WA} = -\frac{B X_{FA}}{F^2}\left(X_{FA}^2 + 2 A X_{FA} + A^2\right) + B_A\left(X_{FA} + A\right) \qquad (6.5) \]

\[ A_{WF} = B\left(\frac{X_{FA}^3}{3 F^2} - X_{FA} + \frac{2F}{3}\right) \qquad (6.6) \]

A_W : Waterplane area
A_WA : Waterplane area from −A to X_FA
A_WF : Waterplane area from X_FA to F
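A sketch of the area calculation, which also serves as a check: for the dimensions later given in Table 6.1 it reproduces the tabulated waterplane area.

```python
import math

def waterplane_area(F, A, B, BA):
    """Waterplane area per Equations 6.4 through 6.6."""
    XFA = -A + (F**2 / B) * math.sqrt((B / F**2) * (B * A**2 / F**2 - B + BA))
    AWA = -(B * XFA / F**2) * (XFA**2 + 2.0 * A * XFA + A**2) + BA * (XFA + A)
    AWF = B * (XFA**3 / (3.0 * F**2) - XFA + 2.0 * F / 3.0)
    return AWA + AWF

print(waterplane_area(14.0, 12.0, 8.0, 4.0))  # ~151.4, matching Table 6.1
```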

Similarly, the lengths of the side walls can be calculated using the formula for

arc length and suitable integration tables such as those given by Stewart [36].

The length of a side is shown in terms of design dimensions by Equation 6.7.

This length is useful for layout of material and for estimation of weight. Since

the sides of the hull are vertical, and are thin compared to the length and depth

of the vessel, the volume of material in the walls can be estimated by multiplying

the arc length of the side by the total height of the hull and the thickness of the

side walls.


\[ L = -\frac{F^2}{2B}\left[\frac{B}{F}\left(\frac{X_{FA}}{F}\sqrt{C_0} - \sqrt{C_1}\right) + \ln\left(\frac{-\frac{B}{F} + \sqrt{C_1}}{-\frac{B X_{FA}}{F^2} + \sqrt{C_0}}\right)\right] + \sqrt{\left(X_{FA} + A\right)^2 C_0} \qquad (6.7) \]

\[ C_0 = 1 + \left(\frac{B X_{FA}}{F^2}\right)^2 \qquad (6.8) \]

\[ C_1 = 1 + \left(\frac{B}{F}\right)^2 \qquad (6.9) \]

The particulars of the hull constructed for the ASV are shown in Table 6.1,

and the hull plan is shown in Figure 6.1. These dimensions were chosen to

provide ample internal space for the ASV’s electrical and control systems, as

well as sufficient beam to support the later addition of solar panels to the deck

without compromising the vessel’s stability. The following section describes the

construction of the hull as well as testing of a simple prototype to evaluate the

buoyancy and stability of the chosen shape.

Table 6.1: ASV Hull Properties; coefficients calculated according to formulas from Rawson and Tupper [29].

F      14.0 inches
A      12.0 inches
B       8.0 inches
B_A     4.0 inches
LOA    26.0 inches
A_W   151.4 inches²
C_WP    0.728
C_B     0.728
C_P     0.732
C_M     1.0
C_VP    1.0


Figure 6.1: ASV Hull Plan with scales shown in inches. Image generated using matplotlib [20].

Figure 6.2: ASV operating autonomously on the Indian River Lagoon during field trials.


6.2 Hull Construction

To validate the calculations described in the preceding section as well as to

evaluate vessel stability, a one-to-one scale model of the hull was constructed

using foam board with a laminate of Duck Tape to seal against water. The test

hull was massed and ballasted to match the fully loaded configuration of the

vehicle. The hull was then evaluated in FIT’s Southgate Pool, and it floated

at the predicted draft. It was also submerged to 18 inches without damage or

flooding. Stability was examined by manually rolling the hull and releasing it to

verify its ability to right itself, which it did at heel angles up to approximately 90

degrees; however, it also exhibited strong stability after inversion, suggesting that

if the vessel capsizes it will be unlikely to recover in the present configuration. If

necessary, this tendency can be mitigated by adding a weighted spar below the

hull to lower the vessel’s center of gravity, and bias it further toward recovering

to an upright attitude. The successful tests of this prototype suggest that for

simple field testing, the Duck Tape over foam board construction method is a

viable and inexpensive alternative to more elaborate means.

Following construction of the initial model, a more resilient hull of the same

shape was built from 0.085-inch thick polycarbonate (PC). The components

were all cut from a single sheet measuring 28 x 30 inches, and were structurally

joined with hot melt adhesive (HMA). The seams were sealed with marine sili-

cone adhesive/sealant. To facilitate maintenance, an access hatch was cut in the

upper deck, and a cover plate was attached with a pattern of six 6-32 x 1/2-inch

machine screws, which were permanently bonded to the upper deck plate with

cyanoacrylate (CA) adhesive. The cover plate was sealed with silicone grease

to prevent water ingress and enable easy removal of the plate for access to the


internal electronics.

Vertical structural supports were added at the center of the transom and on both

sides of the hull at 1/4, 1/2, and 3/4 of the vehicle’s length, which improved

the rigidity of the deck. This was largely a response to testing of the earlier

prototype hull, which used a single structural support at midship, and showed

minor buckling of the fore and aft decks after maintenance.

For steering, the ASV used a simple PC rudder, which was mounted to the

transom via a cabinet hinge, and fastened with 6-32 x 3/8-inch machine screws.

During pool and early field testing, the rudder’s 0.75-inch chord provided insuf-

ficient force to turn the ASV against ambient wind and currents, so an extension

was improvised, which increased the rudder’s chord to 1.25 inches and improved

its steering ability. A permanent improvement to the rudder will incorporate a

1.5-inch minimum chord below the waterline.

6.3 Electronics

The ASV uses a microcontroller and a number of peripheral systems to achieve

its autonomy. Each of these systems and modules will be discussed in turn,

but a wiring diagram of the entire electrical system appears in Figure 6.3, while

a schematic of the control board is shown in Figure 6.4. The description pre-

sented here represents a single snapshot in the ASV’s development, and many

improvements will be incorporated as the project continues. This section de-

scribes the vehicle in the state that it was during its field trials in the Indian

River Lagoon.


The ASV is controlled by a microcontroller compatible with the Arduino Mega

2560 Rev. 3 specifications and the Arduino IDE. This microcontroller is used

because it has four onboard serial ports, which enable simple interfacing with

TTL serial instruments and peripherals. In addition to simple interfacing, the

Arduino board and programming environment are commonly used in the hobby

electronics community, so this gives the ASV a basis of familiarity among other

potential users of the platform. Finally, the Arduino board includes a simple

USB programming and serial interface, which reduces the number of external

components and equipment needed to modify the ASV’s behavior. In short

the Arduino is the control board of choice because of its common use and its

simplicity, rather than for any other design consideration.

A remote communication capability is provided by a Roving Networks WiFly

module, which is connected to the Arduino’s first serial port. The WiFly mod-

ule enables the ASV to send and receive serial communication over an 802.11

wireless network connection[26], so that the ASV can be remotely monitored or

controlled by any computer with a normal wireless network interface. Since the

Arduino uses its first serial port for reprogramming, the ASV cannot be repro-

grammed when the WiFly module is connected, so while all of its supporting

circuitry and software is included in the ASV, the WiFLy module was removed

during field testing. To address this problem, the most recent revision of the

control board includes a power cutoff switch, which shuts down the commu-

nication board, and uses relays to disconnect its communications lines to the

microcontroller.

Since this vehicle is intended to provide a useful development platform for other

projects, it should be noted that the WiFly module communicates over a network


socket, and it is unclear whether it is capable of maintaining a connection

to more than one other similar module. This would be a significant limitation

for coordinated or swarm-oriented behaviors, and if necessary the WiFly can be

replaced with a Digi XBee wireless module. The XBee is pin-compatible with

the WiFly, and it supports a variety of network architectures [5]; however, it uses

a different wireless protocol that is not compatible with common wifi network

adapters. In either case, the range of the wireless connection will be short due

to the antenna’s proximity to the water’s surface.

For orientation the ASV uses a Devantech CMPS09 tilt-compensated compass,

which responds to TTL serial queries with the heading measured to 0.1 degrees

[1]. The compass is connected to the Arduino’s second serial port. When

queried, the compass provides heading, pitch, and roll; however, only the heading

is currently used by the ASV’s control system.

During operation, the ASV records mission data to an onboard micro SD card,

which interfaces through a Sparkfun OpenLog serial adapter connected to the

Arduino’s third serial port. The current version of the ASV’s control software

does not allow for remote download of recorded data, so the deck panel must

be removed to physically access the card and retrieve recorded data.

The vessel fixes its position with a 50-channel GP-635T GPS module produced

by ADH Technology. The GPS provides position updates at a rate of

1 Hz to an accuracy of 2.5 meters [3]. This GPS receiver was chosen for its

low cost and small size of 35 mm x 8 mm [3], which allows it to be mounted

directly to unused space on the control board. Since the DORA algorithm

requires heading updates at an interval that is longer than the time required for the

vehicle to maneuver, the relatively slow rate of 1 Hz does not cause a significant


problem for this application.

Control of the ASV’s rudder is provided by a Traxxas 2080 waterproof servo,

which is recessed into a cutout on the ASV’s after deck. Since the servo is

waterproof, its joints to the hull are sealed with silicone grease, but no further

action is taken to protect it from spray or overwash. It is powered by the ASV’s

regulated 5-Volt bus, and is controlled with a pulse-width modulated (PWM)

signal from the Arduino.

The ASV’s thrust is provided by a DC motor rated for 9-18 Volts, which is pot-

ted in wax and sealed in a film canister. This method is used by the SeaPerch

educational outreach ROV, and is described in detail in the SeaPerch Construc-

tion Manual [6]. The thruster is mounted external to the hull, which reduces

the complexity and the required tolerance of fittings by eliminating the need

to pass a rotating shaft through the hull. This also allows for a very modular

design. Since the thruster is attached to the hull with cable ties and connected

to the ASV’s controlling electronics by disconnectable waterproof fittings, it

can easily be replaced by different thruster configurations, or the controlling

electronics can be transplanted to a different hull. The motor is connected to

the ASV's battery power supply, but its speed is controlled by an N-type

MOSFET transistor whose gate is driven by a PWM signal from the Arduino.

To prevent the motor’s speed from changing with decreasing battery voltage,

the battery voltage is measured by the Arduino, and the PWM signal is scaled

to maintain a constant effective voltage at the motor.
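A sketch of the compensation logic; the reference voltage is an assumption for illustration, not the firmware's actual constant:

```python
def compensated_pwm(requested_pwm, v_battery, v_reference=9.0):
    """Scale the 8-bit PWM duty so the average motor voltage stays near
    requested_pwm * v_reference / 255 as the battery sags; v_reference is
    an assumed constant, not the firmware's actual value."""
    pwm = int(round(requested_pwm * v_reference / v_battery))
    return max(0, min(255, pwm))
```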

During operation, power is supplied by a lithium polymer battery, which is

rated for 6 Amp-Hours of capacity. Most of the ASV’s onboard systems receive

power from a 5-Volt Murata 78-S/R-5/2-C switching regulator, which is rated


for 2 Amps at up to 93% efficiency according to its datasheet. This switching

regulator replaced an LM78S05 linear regulator in order to improve the efficiency

of the system. The systems that are not powered by the 5-Volt regulator are the

wireless serial module, which is powered by an independent 3.3-Volt regulator,

and the Arduino, which has a built-in regulator and is powered directly from

the vehicle’s power supply. Also, the GPS module has an independent backup

power supply, which is provided by two AA batteries.

To allow for debugging and reprogramming the ASV without draining the bat-

tery, and to allow for recharging its battery without opening the deck access

panel, two power supply cables are passed through the hull. One provides shore

power directly the ASV’s systems, and the other provides power to the ASV’s

built-in battery charge controller. A center-off three-position switch mounted to

the vehicle’s deck allows the operator to shut down the ASV or select whether

to run from shore or battery power. To conserve power, the wireless commu-

nication board can be independently toggled using another switch mounted to

the deck. A final switch on the deck allows the user to reset the microcontroller

without toggling power to the entire ASV, which is helpful during operation

because it does not depower any of the peripheral systems. The GPS in par-

ticular is sensitive to cycling of its power, and often requires significant time to

reacquire a position fix after losing power.

All cables that penetrate the hull pass through waterproof cable fittings, and

all switches, fasteners, and temporary joints are potted with silicone grease to

prevent water from being admitted to the hull.


Figure 6.3: ASV system wiring diagram.


6.4 Control Software

The ASV’s software implements two layers of control: a reactive layer that

implements the DOR agent preference system, and a hardware control layer

that actuates the rudder to maintain a set heading. The reactive layer also

manages the mission by keeping track of the vehicle’s waypoints and its mission

completion criteria. Implementing the agent preference behavior and managing

the mission require data from the GPS, which can take more than a second to

acquire. Also, since DORA neglects the effects of vehicle dynamics, it is built

on the assumption that its update cycle is longer than the time required for the

vehicle to turn to a set heading. For this reason, the vehicle’s set heading is

updated at a much slower rate than its heading control system.

To maintain a heading that is set by the DOR preference calculation, the ASV

software uses a simple proportional feedback controller for the rudder servo,

where the rudder’s angle of deflection is proportional to the difference in degrees

between the ASV’s preferred heading and its measured heading (Kp = −3.75).

During each pass through the hardware control loop the ASV polls the compass

for heading, sets the deflection of the rudder, updates a mission timer and an

update timer, and checks the latter against the heading update interval.
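The heading controller reduces to a few lines; a Python sketch of the logic (the deflection limit and sign convention are assumptions):

```python
KP = -3.75  # proportional gain used by the ASV control software

def rudder_deflection(preferred_deg, measured_deg, max_deflection=45.0):
    """Deflection = Kp * heading error, with the error wrapped to
    [-180, 180) degrees; the deflection limit and the sign convention of
    the servo linkage are assumptions."""
    error = (preferred_deg - measured_deg + 180.0) % 360.0 - 180.0
    return max(-max_deflection, min(max_deflection, KP * error))
```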

If the timer has exceeded the update interval, the software acquires a new GPS

fix. It then estimates the local current by checking its GPS position against

its dead reckoning position, calculates a new preferred heading, logs relevant

mission parameters, and resets the update timer. This process repeats until the

ASV measures its range to its next waypoint as less than a preset threshold. At

this point, the update function carries out some completion action. The details


of the mission behavior will be described with the field test in Section 7.6.

6.5 Estimated Speed

Since the agent’s speed is an important parameter in the DORA navigation

system, the ASV’s still-water operating speed was measured in pool tests. The

size of a swimming pool was estimated from Google Earth images at 12 meters,

and the ASV was set to traverse the length of the pool using heading hold

autopilot, and the time required for the transit was recorded. To compensate for

directional influences, the ASV was then timed on its return, and the average of

the two speeds was recorded. This method is admittedly rough, but gave results

that were at least accurate enough for the algorithm to be successful. Certainly

much more careful measurement and characterization would be needed before

any current measurements by the ASV could be considered useful data, but for

simplistic estimates this method was sufficient.

Table 6.2: ASV performance data recorded from pool and tank tests.

PWM (8-bit)   Speed (m/s)   Current (A)   Duration (h)   Range (km)
      0          0.00          0.56           8.93          0.00
     64          0.32          0.84           5.95          6.86
     85          0.44          0.98           5.10          8.08
    100          0.48          1.04           4.81          8.31
    128          0.56          1.26           3.97          8.00
    170          0.64          1.57           3.18          7.34
    191          0.66          1.76           2.84          6.75
    255          0.79          2.24           2.23          6.35

The ASV’s motor controller is open-loop, so its actual shaft speed is unknown.

For this reason the test data are recorded in terms of the PWM signal sent to

the motor. The results of the trials are shown in Table 6.2. Two useful results


of these experiments are shown in Figures 6.5 and 6.6. The first shows the

power required to achieve a given speed, and the second shows an estimate of

the range that is attainable at that speed. It is apparent from the figure that

the greatest range–at least within the values tested–corresponds to a speed of

0.48 m/s. Slower speeds require less power, but also cover less distance per

time, whereas higher speeds cover more distance but at a higher energy cost.

All of the trials conducted in the Indian River Lagoon were carried out at 50%

power for a speed setting of 0.56 m/s, which is a slightly higher speed than

for the maximum range, but also allows the ASV to maneuver against stronger

currents.
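The duration and range columns of Table 6.2 are consistent with a simple endurance model, endurance equal to effective capacity over current draw and range equal to speed times endurance; a sketch, where the effective capacity of about 5 Amp-Hours appears to be a derated fraction of the battery's 6 Amp-Hour rating:

```python
EFFECTIVE_CAPACITY_AH = 5.0  # reproduces Table 6.2; apparently a derated
                             # fraction of the 6 Amp-Hour battery rating

def endurance_hours(current_a):
    return EFFECTIVE_CAPACITY_AH / current_a

def range_km(speed_ms, current_a):
    return speed_ms * 3600.0 * endurance_hours(current_a) / 1000.0

print(range_km(0.48, 1.04))  # ~8.3 km, the maximum-range row of Table 6.2
```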

This chapter has provided an overview of the design, construction, and program-

ming of the ASV that is used for the field trial of the DORA algorithm, and

thus concludes all of the background information necessary before discussing

actual experiments. The next chapter describes the different cases examined to

evaluate DORA’s performance both in realistic simulations, and in a small-scale

real-world deployment.


Figure 6.4: ASV Control board schematic.

Figure 6.5: ASV Power Vs. Speed.


Figure 6.6: ASV Range Vs. Speed.


Chapter 7

Experiment

This chapter describes the experiments conducted using simulations of the

agents’ behaviors and the navigation system’s optimization as well as the field

test carried out using the ASV. While many experiments were conducted dur-

ing the course of the software’s development–some of which were documented

in Chapter 5–the experiments of this chapter were designed to provide the most

consistent comparison of the aspects under consideration and thus the most

useful insight regarding DORA’s benefits and liabilities as well as the circum-

stances under which it might or might not be a good candidate. Throughout,

four different methods are compared: DORA, A*-based navigation, Tracking,

and Homing.


7.1 Performance Evaluation

For each of the experiments, the following properties were examined: Success

Rate, Mission Time, Effective Speed, Computation Time, and Successful Computa-

tion Time.

Success Rate is the fraction of trials that are solved by the specified method in

a given experiment. Since in some cases no solution is certain to exist, and since

in all cases the DORA algorithm is stochastic, this fraction gives a measure of

an algorithm’s robustness.

Mission Time is the average number of hours required to execute the success-

fully planned missions. This is a rough estimate of the quality of a solution.

Since it is important to not only find solutions that are possible, but to find

solutions that are as efficient as possible this measure provides some insight.

However, since path lengths are not standardized, a low mission time may only

reflect that an algorithm is better at solving easier problems. The effective

speed is introduced to address this concern.

Effective Speed is the average great circle distance between start and end

points divided by the mission time. This eliminates the potential for an algo-

rithm to prefer shorter problems and thus bias its results toward an appearance

of higher quality.
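For reference, a sketch of this measure using the haversine great-circle distance on a spherical Earth:

```python
import math

def great_circle_m(lat1, lon1, lat2, lon2, radius_m=6371000.0):
    """Haversine great-circle distance in meters on a spherical Earth."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin((p2 - p1) / 2.0)**2
         + math.cos(p1) * math.cos(p2) * math.sin(math.radians(lon2 - lon1) / 2.0)**2)
    return 2.0 * radius_m * math.asin(math.sqrt(a))

def effective_speed_ms(start, end, mission_time_s):
    """Effective speed: straight-line distance over total mission time."""
    return great_circle_m(start[0], start[1], end[0], end[1]) / mission_time_s
```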

Computation Time is the average amount of time required for each algorithm

to complete its search for a solution, whether successful or not. For robotics and

applications that require real-time or short-term use of a mission plan, compu-

tation time is critically important. A plan is of no use to a robot if the data


upon which the plan is built have expired before the plan is developed. A related

concern is that computers aboard robotic vehicles in general, and AUV’s in par-

ticular, often have very limited processing power, so computational efficiency is

critical for any algorithm that will be deployed aboard a vehicle.

Successful Computation Time is the average amount of time required for

each algorithm to complete a successful search. This gives some measure of

comparison between the time taken to look for a solution, and the time re-

quired to actually find a solution. For instance, since it is complete, A* needs

to exhaustively search a graph before declaring that a solution does not exist.

However, since DORA is based on a stochastic model, it searches for some spec-

ified amount of time or for the life of a population of agents, and has then either

succeeded or failed, which allows some control of the amount of time allocated

to searching for a solution.

7.2 Choice of Parameters

The tracking and homing behaviors are purely deterministic, and there is no

room for modification, so their cases are fairly trivial. However, both the A*

search algorithm and DORA have a great many factors that influence their

likelihood of success, the quality of their solutions, and the computational time

they require. Since an exhaustive study of all possible combinations of param-

eters is impossible, a specific subset is chosen based on experience, and these

parameters are used consistently throughout the different trials. This attempts

to capture the performance of not DORA vs. A* in general, but specifically, of

this implementation of DORA vs. this implementation of A*. So it is important


to remember that conclusions drawn from these trials cannot establish absolute

boundaries on either algorithm, but instead only aspire to provide some useful

insights about their suitability for problems that are fundamentally similar to

the ones discussed here.

7.2.1 A*

Since DORA is not complete or optimal, it was compared against the 4-D A*

planner described in Section 5.2. For all of the tests with A*, it is important

to remember that A* is guaranteed to find the best possible solution in the

minimum possible time for its search space. Since it is a graph search method,

its ability to find a path through a continuous, real-world environment as well

as the quality of that path and the time required to find it, are all very subject

to the environmental model, and the algorithm parameters that are used.

Increasing A*’s resolution either by increasing its branching factor or by re-

ducing the distance traveled during each branch improves the quality of the

solution that is produced, but at the expense of computation time. Therefore,

to be practical for use in the field, resolution must be reduced to a level that

allows for solution in a reasonably short amount of time, but it must be kept

high enough to ensure sufficient quality.

Because of these considerations, it would not be unreasonable to find cases in

which the A* search required more computation time or produced a higher cost

path than an alternative algorithm that was searching in a different problem

space. The choice of parameters for the comparison with A* is somewhat arbitrary, and is based entirely on experience with the software that is being


tested. Caution must be used here, as the parameters could be artificially cho-

sen to prevent A*’s finding a solution, or finding one in a reasonable amount of

time.

For these tests, the parameters used are a branching factor of 3 with one branch

directly ahead of the agent, one branch 45° to port, and the other 45° to

starboard. To attempt to capture the influence of resolution on quality and

performance, two resolutions are tested for each trial, one at 10 kilometers of

length, and the other at 20 kilometers of length based on the planar earth

approximation at the point where the node branches. To ensure consistency,

the cost function uses the same simple kinematic simulation that is used in the

DORA software, and with the same time step.
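A sketch of the resulting successor function, with advance as a hypothetical helper that dead-reckons a new position along a heading on the local planar approximation:

```python
def successors(lat, lon, heading_deg, edge_km, advance):
    """Branching rule for the A* comparison: three children per node, one
    dead ahead and one 45 degrees to either side, each edge_km long;
    advance is a hypothetical helper that dead-reckons the new position
    on the local planar-earth approximation."""
    for dh in (-45.0, 0.0, 45.0):
        new_lat, new_lon = advance(lat, lon, heading_deg + dh, edge_km)
        yield (new_lat, new_lon, heading_deg + dh)
```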

7.2.2 DORA

The DORA system also depends on many parameters for operation. Two of

the most important are the size of the population, and the number of epochs

allowed for searching for a solution. Since it is built on a stochastic, rather

than a deterministic process, increasing the population or the duration of the

test increases the chances that a solution will be found if one exists, because

the random distribution of agents covers a larger portion of the search space.

However, like A*, a practical limit is reached when the computation required to

process the search is greater than what is permissible for a particular mission.

Also, since DORA proceeds in two stages, one searching for an initial solution,

and the other attempting to improve on that solution, the number of agents

and epochs allocated to each phase is critically important. For the initial tests


of DORA, 500 agents are allocated to the initial search and are given 5 epochs

to find a solution. 50 agents are allocated to the improvement of the original

solution, and are allowed 10 epochs for their search. Next, to evaluate the

importance of the terrain avoidance term, two trials are conducted with its

preference coefficient constrained to zero, and a limit of fifty agents.

Typically, varying only one parameter at a time is better for identifying its influence on a particular problem; however, removal of one of the preference

terms simultaneously reduces the dimension of the search space.

7.2.3 All Agents

All agents are allocated a speed at the beginning of each trial, and maintain that

speed in their own frames of reference throughout the experiment. Three speeds

are chosen at 0.4 m/s, 1.5 m/s, and 3.0 m/s. Intuitively choosing speeds that

are related to the maximum speed of ocean current in a velocity field seems

like the most reasonable approach for evaluation of the algorithms; however,

that is not the scheme chosen here. Rather, these speeds are chosen to more

accurately represent a practical application for this algorithm. There is little

physical value in exploring the software’s ability to solve a problem that only

exists for a hypothetical system.

The slowest speed is chosen to be in the range of operation for a common AUV

glider [12]. The next speed at 1.5 m/s, corresponds to the speed of a typical

thruster-driven AUV, or of a ship surveying with a tow-fish. Finally, 3.0 m/s

is more closely related to the velocity field, as it is faster than the maximum

speed in the Gulf region of interest. In general, an unobstructed environment


with a speed slower than the agents’ is the only condition where a solution is

guaranteed to exist for all four behaviors. Therefore, to evaluate the software’s

ability to find known solutions, a speed higher than the local current velocity

is necessary. While artificially based on the need to explore the algorithm’s

relationship to the current field, 3.0 m/s corresponds to roughly six knots, so it

is well within a range that is typical of ship operations.

7.3 Program Environment Specifications

All trials were conducted using the Python 2.7 programming language employ-

ing the included modules time, random, and sys. The third-party modules

numpy and scipy were used for most mathematical and data access operations.

The experiments were executed on a laptop computer with an Intel i7-2630QM

2.0 GHz quad-core processor and 8 GB of RAM, running 64-bit Ubuntu 12.04 initially and later upgraded to Ubuntu 12.10.

7.4 Gulf of Mexico ROI

The Gulf of Mexico test area has no obstructions but relatively high current

speeds, so it is a good field for evaluating the performance of the different

navigation systems under the influence of strong currents alone. It presents a

problem that is in some ways simpler than the Caribbean test area, which has intervening obstacles, but the relatively stronger currents in this region still

produce some cases in which there is no solution for slower agent speeds.


7.4.1 Trial at 3.0 m/s

This test evaluated the different navigation methods in an environment where a

solution was certain to exist for all of them. Since the agent velocity was faster

than the maximum local current, there are no cases where a path connecting

the starting point to the ending point could not be found. The results of the

experiment are shown in Table 7.1.

Table 7.1: Results of Gulf Trials at 3.0 m/s

                      DORA   DORA 0A-5   DORA 0A-15   A*(10km)   A*(20km)   Track    Home
Success Rate           97%       94%         95%         100%       100%     100%     99%
Mis. Time (hrs)      14.98     15.08       15.00        15.00      15.05    15.09    15.12
Eff. Spd (m/s)        2.99      2.97        2.99         2.97       2.96     2.97     2.96
Comp. Time (s)       534.8     246.3       399.9        553.8       30.6      0.8      0.7
Suc. Cmp. Time (s)   519.7     254.3       401.4        553.8       30.6      0.8      0.7

This test provides a good baseline for discussion of the computational require-

ments and success rates of the different methods. Since there are no obstructions

and the local currents are slower than the agent speeds, the trivial solutions of

homing and tracking are both successful. Both provide good results and re-

quire less than one second of computation time. Similarly, the A* approaches

produce good results and a 100% success rate, but since they are deliberative

processes, they require substantial computation time. Finally, all three of the

DORA solutions produce results of similar quality and with high success rates,

but since they are stochastic processes, they lose the benefit of 100% success.

For this case, the overall best solution is tracking since it has 100% success,

requires less than one second of computation time, and achieves the highest

quality of solution.


Table 7.2: Results of Gulf Trials at 1.5 m/s

                     DORA    DORA 0A-5  DORA 0A-15  A*(10km)  A*(20km)  Track  Home
Success Rate         92%     89%        90%         100%      100%      90%    86%
Mis. Time (hrs)      31.12   30.87      31.04       32.08     32.67     31.81  31.73
Eff. Spd. (m/s)      1.50    1.50       1.50        1.47      1.46      1.47   1.45
Comp. Time (s)       588.1   300.2      484.6       2757.7    105.1     0.9    0.8
Suc. Cmp. Time (s)   535.32  318.2      483.5       2757.7    105.1     0.8    0.7

7.4.2 Trial at 1.5 m/s

In this experiment, the agents are running slower than the maximum current,

so no solution is certain to exist, but the high rate of success demonstrates that

this problem is still not particularly difficult for any of the algorithms. However,

there is a drop in DORA’s success rate, and the computation time required for

A* increases dramatically in both cases, which shows that the problem is no

longer trivial. The average solution found by DORA is marginally better than
the average solution found by either A* algorithm. However, the success rates
of DORA are comparable to those of tracking and homing, which
concluded in less than one second of computation time.

7.4.3 Trial at 0.4 m/s

This experiment has all of the agents operating much slower than the maximum

local current, and it represents the only trial for which DORA showed a distinct

advantage over A*. For these problems, the 10-kilometer A* test was aborted

after the first six trials, which required in excess of eight hours to complete.

The time required for each search was considered too great to be of practical

use, and it was dropped from consideration. It is possible that the higher

resolution would have been more successful and found higher quality solutions,


but at the expense of prohibitively long computation times. This underscores

the importance of careful parameter choice, to which A* is more sensitive.

Since the time required for A* to find a solution is related to the number of

paths that it must explore during its search, changing properties of the flow

field can dramatically impact the time required to solve the same search prob-

lem. Whereas DORA’s computational requirements are most dependent on the

number of agents and the agents are constrained to a finite lifespan, so chang-

ing the flow field does not have as significant of an impact on its computation

requirements. This independence comes at the price of completeness. When A*

fails to find a solution, it is because no solution exists for its current parameters,

and this is known conclusively. However, when DORA fails to find a solution, it

could be a simple matter of insufficient exploration, and nothing can be inferred

about the existence of a solution that is not found.

As mentioned above, this trial is remarkable for DORA’s advantage over A*.

Again this does not imply, even in this flow regime, that a stochastic algorithm

is better suited than a graph-based one, but rather that this version of A*, with its
model of the environment, loses its advantage in this flow field. It is quite con-
ceivable that the same A* algorithm with a different resolution or different
parameters would exceed the performance of DORA in this environment, but

in the interest of consistency for comparison, the parameters are maintained for

all trials.

Another interesting observation from these trials is that the average effective speed

of all of the successful solutions is higher than the agents’ own speed, which

suggests that, on average, taking advantage of favorable currents was important

to the solution of these trials. With higher vehicle speeds, simply overcoming a


Table 7.3: Results of Gulf Trials at 0.4 m/s

                     DORA    DORA 0A-5  DORA 0A-15  A*(10km)  A*(20km)  Track  Home
Success Rate         31%     26%        29%         –         23%       14%    12%
Mis. Time (hrs)      91.77   93.89      87.94       –         106.28    94.10  102.39
Eff. Spd. (m/s)      0.59    0.57       0.62        –         0.46      0.59   0.58
Comp. Time (s)       730.5   155.6      293.2       –         291.4     0.8    0.7
Suc. Cmp. Time (s)   436.1   289.7      329.1       –         255.7     0.6    0.6

current is often sufficient, and this is reflected by the earlier sections' effective
speeds, which were at most equal to the agents' own speeds. These are only average

results, and all of these trials showed standard deviations that represented a

large fraction of the agents’ speed, so there were many exceptions, but this is

an example of an intuitive concept that is well supported by the results.
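
As a concrete check on this observation, the averages in Table 7.3 can be
unpacked with a few lines of Python (the language of the trial software
described in Section 7.3). The definition of effective speed as the straight-line
start-to-goal distance divided by mission time is assumed here for illustration.

    # Back-of-envelope check of the 0.4 m/s Gulf trials, using the DORA
    # averages from Table 7.3. Assumes effective speed = straight-line
    # start-to-goal distance / mission time.
    agent_speed = 0.4                  # m/s, speed through the water
    mission_time_hrs = 91.77           # average successful mission time
    effective_speed = 0.59             # m/s, average effective speed

    distance_km = effective_speed * mission_time_hrs * 3600.0 / 1000.0
    assist = effective_speed - agent_speed
    print("implied straight-line distance: %.0f km" % distance_km)
    print("average net assist from currents: %.2f m/s" % assist)

The implied straight-line distance of roughly 195 kilometers and the 0.19 m/s
net assist, nearly half of the agents' own speed, make the scale of the
current's contribution explicit.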

7.5 Caribbean ROI

The Caribbean trials represent a field where the velocities are lower, but since

islands obstruct many of the paths, the tracking and homing solutions are not

valid for all cases, even at high agent speeds.

7.5.1 Trial at 3.0 m/s

The tests at both 3.0 m/s and 1.5 m/s use agent speeds faster than the local velocities

in these fields, so a path should always exist, and this provides some measure

of the algorithms’ abilities to find paths when one is certain to exist, but when

intervening islands can obstruct the path. Therefore, this is most useful in the

context of measuring collision avoidance capabilities.

The A* algorithm’s synoptic view of the environment shows a great advantage


in the context of collision avoidance, where it is able to maintain a 100% success

rate for both the 3.0 m/s and 1.5 m/s experiments. The quality of the solutions

here is very slightly lower, which is almost certainly due to the inclusion of large

detours to avoid islands. Since A* shows a much higher success rate than the

others, it naturally includes many routes that were not found by any of the other

algorithms. Since the trivial solutions work for clear, straight-line paths, their

few successes would only include cases where no detour was necessary, and thus

would have shorter path lengths. Similarly, the DORA solutions have very high

success rates for straight-line paths, and lower for other cases, so their solutions

are likely to be weighted toward simplicity.

This is also a convenient time to discuss the tracking and homing behaviors’

surprisingly low success rates. Since they typically work well for straight-line

paths, at first it seems that the two should have similar numbers of successes,
aligning well with the probability of a collision-free path between islands.
However, since the homing agents make no correction for local currents,

they have the possibility of drifting into islands while attempting to travel to-

ward their objectives, whereas the tracking agents maintain straight-line (or

more accurately great circle) routes directly from their starting points to their

objectives.
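
The distinction between the two behaviors is small in code but large in
consequence. The sketch below restates the homing (pigeon) and tracking (crab)
steering rules from the NetLogo prototype in Appendix B as Python; the flat-plane
geometry and names are simplifications for illustration, with headings in radians
measured clockwise from north.

    import math

    def homing_heading(pos, goal):
        # Homing: point directly at the objective; the local current is
        # ignored, so the track made good drifts downstream.
        return math.atan2(goal[0] - pos[0], goal[1] - pos[1])

    def tracking_heading(pos, goal, cur_u, cur_v, speed):
        # Tracking: crab into the current so the track made good stays on
        # the straight line to the objective (cf. the crabs in Appendix B).
        bearing = math.atan2(goal[0] - pos[0], goal[1] - pos[1])
        # Cross-track component of the current (positive pushes the agent
        # to the right of its intended track).
        cross = cur_u * math.cos(bearing) - cur_v * math.sin(bearing)
        if abs(cross) < speed:
            return bearing - math.asin(cross / speed)  # steer upstream
        return bearing  # current too strong to hold the line

    # Due-east goal with a 0.3 m/s northward current and a 1.5 m/s agent:
    print(math.degrees(homing_heading((0, 0), (1000, 0))))                # 90.0
    print(math.degrees(tracking_heading((0, 0), (1000, 0), 0.0, 0.3, 1.5)))  # 101.5

Only the tracking agent offsets its heading to cancel the cross-track drift,
which is why its route, unlike homing's, cannot be pushed sideways into an island.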

Another interesting result from this trial is that the full DORA solution has

almost twice the success rate of the versions that ignore the collision avoidance

term. This is probably because at this speed, the local currents can be dom-

inated by the agents’ own speed, so the ocean current preference is much less

important.


Table 7.4: Results of Caribbean Trials at 3.0 m/s

                     DORA    DORA 0A-5  DORA 0A-15  A*(10km)  A*(20km)  Track  Home
Success Rate         62%     37%        40%         100%      100%      22%    13%
Mis. Time (hrs)      15.783  14.59      14.48       15.73     15.89     14.70  14.41
Eff. Spd. (m/s)      2.73    2.90       2.93        2.67      2.66      2.89   2.96
Comp. Time (s)       762.5   177.2      369.3       168.8     12.6      0.4    0.3
Suc. Cmp. Time (s)   479.7   276.9      371.7       168.8     12.6      0.7    0.7

Table 7.5: Results of Caribbean Trials at 1.5 m/s

                     DORA    DORA 0A-5  DORA 0A-15  A*(10km)  A*(20km)  Track  Home
Success Rate         64%     42%        50%         100%      100%      22%    14%
Mis. Time (hrs)      31.55   30.73      31.11       31.27     32.53     31.56  31.80
Eff. Spd. (m/s)      1.39    1.44       1.42        1.33      1.38      1.42   1.40
Comp. Time (s)       858.6   204.0      371.9       455.7     23.4      0.4    0.3
Suc. Cmp. Time (s)   661.6   308.8      388.7       455.7     23.4      0.8    0.7

7.5.2 Trial at 1.5 m/s

Many of the observations described in Section 7.5.1 still apply in this section

because the agents still have a travel speed greater than the maximum local

currents. In fact, in the original formulation of this experiment, no 3.0 m/s trial

was included because of its redundancy; however, it was added later to keep the

trial speeds equal for both regions of interest.

Tracking, homing, and both A* algorithms performed almost identically in this
trial as in the last, with the exception of the greater computation time required for
A*. However, the DORA algorithms are more successful in this case than before.

At first, this seems counterintuitive because this is a more difficult planning
problem. However, recall that this is closer to the regime for which DORA was
designed: one with agent velocities that are lower than or close to local current

velocities. Since the ocean current term diminishes with decreased local velocity,

it is effectively eliminated from consideration for very low current speeds. As its

weight is increased here, it expands the range of solutions and thus the flexibility

of the algorithm.


Table 7.6: Results of Caribbean Trials at 0.4 m/s

                     DORA    DORA 0A-5  DORA 0A-15  A*(10km)  A*(20km)  Track  Home
Success Rate         36%     35%        34%         68%       54%       9%     9%
Mis. Time (hrs)      111.46  110.41     110.64      100.81    93.33     65.10  114.901
Eff. Spd. (m/s)      0.44    0.44       0.44        0.47      0.49      0.67   0.40
Comp. Time (s)       914.6   171.7      302.0       1516.3    46.1      0.5    0.3
Suc. Cmp. Time (s)   585.6   284.6      326.7       1626.1    39.6      0.4    0.7

7.5.3 Trial at 0.4 m/s

This final experiment presents a very difficult environment to all of the planning

methods. The speed is lower than the local current, and the paths are often

obstructed by islands. A* still has the highest success rate overall. It is

very interesting to note that the performance of all three DORA algorithms

is almost identical at this speed. The performance is not improved; in fact

it is dramatically reduced, but it shows results that are very similar to those

from the slow speed trials in the Gulf ROI. Here it seems that the faster ocean

currents make the current-following preference dominant to the point that the

obstacle avoidance term does not meaningfully contribute to the likelihood of

success.

In addition to the general results described above, this trial gave rise to an

interesting emergent behavior that supports the need for the full range of values

of the course preference behavior as described in Section 5.7.2. In about thirty

percent of the successful trials from this experiment, a negative value of C was

chosen, which caused the agent to dramatically change its heading at every

timestep. The entire trajectory from Trial 1 is shown as the Northern line

around the island in Figure 7.1, and a closer view of a segment of this trajectory

is shown in the right frame. Since this is a counter-intuitive solution, it was

investigated further.


Examination of the ocean currents along the trajectory shows that the currents

are generally from East to West, which should assist the agent along its course,

so the behavior was not an attempt to overcome an adverse current. Manually

constraining the value of C to small positive values or to zero consistently led

to the agent’s collision with the intervening island, so the choice of a negative

value of C was not simply a settlement into a successful, but poor solution.

Rather, the negative value of C, assuming the same combination of A, B, and
D, was critical to the success of the trial. This zigzagging pattern effectively

reduced the agent’s speed relative to the water, so that it drifted along the local

current, which naturally bypassed the island.
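
The mechanism is easy to reproduce in isolation. The sketch below applies the
preference update in the same form as the ASV control code of Appendix A,
with the collision-avoidance term A omitted for open water; the weights are
illustrative values chosen to exhibit the effect, not the values recovered in
this trial.

    import math

    def dora_heading(hdg, cur_u, cur_v, tgt_x, tgt_y, B, C, D, speed):
        # One preference update: weighted sum of the current, own-heading,
        # and target unit vectors, with the current term damped as in the
        # ASV code of Appendix A (A term omitted for open water).
        cur_mag = math.hypot(cur_u, cur_v) or 1e-9
        tgt_mag = math.hypot(tgt_x, tgt_y) or 1e-9
        w = cur_mag / (cur_mag + speed)
        x = B * w * cur_u / cur_mag + C * math.sin(hdg) + D * tgt_x / tgt_mag
        y = B * w * cur_v / cur_mag + C * math.cos(hdg) + D * tgt_y / tgt_mag
        return math.atan2(x, y)

    # Illustrative weights only; C < 0 penalizes keeping the current heading.
    B, C, D, speed = 0.2, -0.5, 0.3, 0.4
    cur_u, cur_v = -0.3, 0.0          # east-to-west current, due-west target
    hdg = 0.0
    for step in range(6):
        hdg = dora_heading(hdg, cur_u, cur_v, -1.0, 0.0, B, C, D, speed)
        print("step %d: heading %5.1f deg" % (step, math.degrees(hdg) % 360.0))

With C negative, each update points the preference vector away from the previous
heading, so the commanded heading alternates between two widely separated
directions (here roughly 218 and 349 degrees). Much of the through-water motion
cancels over successive timesteps while the westward drift due to the current
accumulates, which is exactly the indirect throttling effect described above.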

This result is intriguing because it demonstrates two useful properties. First, it

shows that DORA is able to find counter-intuitive solutions to the problem and

explains the observed improvement in performance when C is not constrained

to zero or positive values. Second, it suggests that rather than fix the agents’

speed as a constant in the test, a future version of this algorithm would benefit

from being able to adjust an agent’s speed directly, since this oscillation is

an effective, albeit indirect, means of throttling the agent. Indeed, the original

reason for disallowing negative values of C was an attempt to improve agent

speed.

7.6 Field Trials

The feasibility of implementing the agents’ steering behavior as a navigation

system was tested by deploying the ASV on the Eau Gallie River and the Indian

River Lagoon, and tasking it with simple point-to-point missions. Upon reaching


Figure 7.1: Left: Paths generated from Trial 0 in the Gulf of Mexico ROI. The Northern route was produced by DORA, the Southern route by the 20-km A*. Right: A closer view of DORA's route shows the zigzagging trajectory that results from a negative value of C. Images from Google Earth.

its destination, the vehicle was to return to the position of its initial GPS fix.

Since the current and wind fields were unknown, the NetLogo software prototype

was used to optimize preferences for agents acting in a constant, uniform field

with the current directly perpendicular to the agents’ path, and at 75% of the

agents’ speed. The preferences found from this optimization are shown below,

and were programmed into the ASV’s control software.

B = −0.422

C = 0.097

D = 0.491
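
Between heading updates, the control software dead-reckons where the commanded
heading and speed should have carried the vessel and attributes the residual
displacement to the local current, which is the only environmental input the
preference formula needs. A minimal Python restatement of that estimate
(planar coordinates in meters; the names are simplified from the Appendix A
source) is:

    import math

    def estimate_current(prev_xy, now_xy, set_hdg_deg, speed, dt):
        # Attribute unexplained displacement to the local current, as in the
        # ASV control loop of Appendix A. Note the firmware's convention:
        # the estimate is (expected - observed) / dt.
        hdg = math.radians(set_hdg_deg)
        exp_x = prev_xy[0] + speed * math.sin(hdg) * dt  # dead-reckoned east
        exp_y = prev_xy[1] + speed * math.cos(hdg) * dt  # dead-reckoned north
        u = (exp_x - now_xy[0]) / dt
        v = (exp_y - now_xy[1]) / dt
        return u, v

    # Commanded due north at 0.67 m/s for 30 s, but the new fix is 5 m east
    # of the dead-reckoned position:
    u, v = estimate_current((0.0, 0.0), (5.0, 20.0), 0.0, 0.67, 30.0)
    print("estimated current: u = %.3f, v = %.3f m/s" % (u, v))

The estimated current, the current commanded heading, and the bearing to the
target are then combined as unit vectors weighted by B, C, and D respectively,
and the new preferred heading is taken from the arctangent of the resulting
components.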

While running the missions, a new GPS fix was acquired and a new preferred

heading was set every thirty seconds. During these updates, a number of pa-

rameters were recorded to the vessel’s onboard memory card for post-mission


analysis (PMA). One set of parameters that the ASV recorded was its GPS

positions, and the path that they define is shown in Figure 7.2, which is a

screenshot from Google Earth software.

The missions that the ASV executed were generated onboard during its

initial startup in the field. The ASV was programmed to acquire a GPS fix and

compass heading upon activation. It set this initial position as its home location,

and projected a waypoint 250 meters ahead of it. It then attempted to navigate

to this waypoint using the DORA preference formula with the parameters shown

above. Upon arrival within 15 meters of its target waypoint, it was programmed

to reset its objective back to its initial position. The mission was to be completed

when the ASV had returned to within 15 meters of its original position, and shut

down its drive motor. Since DORA makes no claim to symmetry, each mission

constituted two separate trials of the DORA preference-based navigation.

Early tests failed due to a mathematical error in the navigation software, where

the heading is calculated using the arctangent of the preference vector components.

In code, the X and Y components had been accidentally transposed, which

obviously returned an erroneous heading. Once this problem was corrected, the

ASV successfully completed a mission in the Eau Gallie River.
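
The transposition is easy to see in isolation. With the convention used
throughout the ASV code (heading measured clockwise from north, with x east and
y north), the heading of a preference vector is atan2(x, y); swapping the
arguments reflects every heading about the northeast-southwest axis:

    import math

    x, y = 1.0, 0.0                          # preference vector due east
    good = math.degrees(math.atan2(x, y))    # 90.0: east, as intended
    bad = math.degrees(math.atan2(y, x))     # 0.0: north; in general the
                                             # transposed call returns 90 - good
    print("%.1f %.1f" % (good, bad))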

7.6.1 Impromptu Experiment

While the transposition error was destructive for the field tests, it inspired a

simple experiment with the deliberative software. Since the agent chose its


Figure 7.2: Path recorded from the ASV's mission log. The Northern segment represents the outbound leg and the Southern segment the return. Plotted using Google Earth software.


heading based on the erroneous calculation, it clearly could not navigate
successfully. However, the chosen heading was still a function of the local

current, so in principle the algorithm might still be able to find a solution, given

an accurate model of the agent’s behavior.

To test the potential for versatility, the NetLogo optimization software was

modified so that all of the agents would incorrectly calculate their headings

based on the transposed arguments to the arctangent function. However, the

physical model was unchanged, so that the incorrect agents were operating in a

correct model of the environment.

In this experiment, the software still converged to a solution that allowed the

agent to effectively navigate to its objective. This has potential implications

for DORA’s fault tolerance and versatility, since it appears that agents act-

ing on profoundly incorrect information can still be tuned by the deliberative

optimization system to solve the problem. This is not immediately useful, as

the deliberative system must first be able to model the problem that the agent

is experiencing before it can correct for it, and that implies a self-awareness

that is well beyond the ambitions of this project. Further, fault management in

instrumentation and calculation would typically be addressed by some vehicle

or software system other than the path planning software, but the implication

that the path planner itself could be used to address some system malfunctions

is interesting in its own right.

This simple, single-run, virtual experiment is not included here as rigorous

evidence to support any claim to versatility, but merely as a side note that

some potential may exist to expand beyond the range of tasks that are more

traditionally within the purview of path planning software.


7.6.2 Design Improvements

After the first successful mission in the Eau Gallie River, the ASV was upgraded

with a higher-voltage battery, and the motor power supply was rerouted so that

it drew power directly from the battery. These design modifications brought

the ASV up to the specifications described in Section 6.5.

Following these improvements, the ASV was deployed for a series of five missions

of two trials each in the Indian River Lagoon, just North of the Eau Gallie

Causeway. Using the same mission profile as in the initial field test, the ASV

successfully completed all trials by navigating to an objective 250 meters from

its starting point and then returning. Figure 7.3 shows a screenshot from Google

Earth, which displays the ASV’s mission log from one of the missions.

Since all twelve field trials of the DORA algorithm were successfully completed

by the ASV, it is apparent that implementing the steering behavior aboard a

physical platform is feasible. Further, since the NetLogo software was used to

optimize the parameters with no actual data regarding the physical velocity

fields that the ASV encountered, some measure of versatility can be implied. It

is unreasonable to assume that in all twelve trials, the velocity fields were similar

to the constant, steady, perpendicular field assumed during the optimization, so

it is a straightforward conclusion that under some circumstances the algorithm

can tolerate environments that vary from those that were assumed, but there

is no reasonable way to quantify or predict that tolerance or variation without

much further investigation.


Figure 7.3: Path recorded from the ASV's mission log during its third field trial. Track plotted using Google Earth software.


This chapter discussed the experiments used to evaluate DORA’s performance

as well as to explore the feasibility of deploying DORA’s steering behavior

aboard a practical vehicle. During the course of these experiments, the ca-

pabilities and limitations of the algorithm were demonstrated, and a number of

conclusions may be drawn from these results. Similarly, some areas for potential

further exploration and improvement were discovered. Both of these, the
questions answered and the questions raised, are discussed in the following chapter.


Chapter 8

Future Developments and

Conclusions

This final chapter summarizes some of the useful conclusions that can be drawn

from the results presented above. These outcomes will hopefully provide useful

information to support further development in artificially intelligent planners

and systems. Further, there are many questions that were raised during the

course of the research that remain for future development and characterization

of the DORA system, and some of these are discussed below.

8.1 Future Development

While this project has addressed its original hypothesis, it has also laid the

foundations for future investigations. Following are some of the more
interesting areas that bear further investigation.


• The agent preference formula has much potential for improvement. The

collision avoidance term in particular could be implemented in other ways

that may or may not improve its performance. Two such means would be

to shift it from downward-looking to forward-looking or some combination

of those, or to completely rewrite the collision avoidance behavior so that

instead of relying on the gradient to choose its direction, it could apply a

reduction to the stability term. This way it would not need to measure the

actual gradient of the ocean floor, but only log the change in depth with

time, and tend to shy away from directions that were leading to shallow
water (a minimal sketch of this variant appears after this list). This latter
option has the advantage of simplifying the equation

to three vectors rather than four, and reducing the amount of information

that is required for an agent to act. It is unclear whether this would help

or hurt an agent’s performance, and trials would need to be conducted to

evaluate both options.

• The algorithm should be able to benefit from experience. The simple

learning trial tested during the course of this project did not show a signif-

icant advantage, but that was only one method with one choice of param-

eters. A greater focus on tailoring the learning algorithm to the problem

might prove to be beneficial, particularly by way of reducing computa-

tion time. Since the initial random search takes a great deal of time, it

is reasonable to expect that a search that is focused in the area of prior

successes might dramatically reduce this time. At this point, however, this
is mere conjecture, and it needs to be validated through experimentation.

• The parameters to the collision avoidance term, M and T , were chosen

based on optimization runs too limited to be statistically significant, as was the ratio of agents


used in exploration to those used in exploitation. The performance of the

software may be improved by a more rigorous study of the relationships

between each of these parameters and the outcome of the search.

• Simulated annealing was used for optimization of the parameters in the

exploitation phase of the search; however, there are a great many other op-

timization algorithms that may prove better for this particular problem.

An additional study of optimization with respect to this problem may im-

prove the quality of the solutions that are produced, as well as the time

required to produce them. One potential candidate that is of particular

interest is genetic algorithms (GA). Genetic algorithms are good at find-

ing creative solutions to problems, and their random mutation helps to

minimize the tendency of an algorithm to become trapped in local min-

ima. The simulated annealing method used here does little to prevent

trapping in local minima, so this could represent a substantial improve-

ment. Also, the initialization of a random population would provide a

convenient means to implement the ability to learn from experience as

described above.

• The agents could be allowed to interact with one another. One of the

ways in which DORA departs from most true swarming algorithms is that

no agent directly influences any other agent. In most swarms, multiple

interactions are very beneficial to propagating useful information to other

agents. Adding this communication architecture to the system would

add computational overhead, but it is possible that it would improve the

performance of the system enough to offset that expense.

• The ASV shows promise as a lab-scale development platform, and the


hull form has potential for use in other field experiments; however, it needs

further improvement and characterization. The hull is designed to be easy

to construct, but it sacrifices any attempt at hydrodynamic optimization.

Further studies to characterize the flow around the hull could improve

its performance. Also, using manufacturing processes that are capable of

producing three-dimensional curves would allow for another dimension of

optimization and the elimination of the sharp corners at the joint between

the side wall and the bottom plate of the vessel.
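
As a concrete illustration of the depth-change variant raised in the first item
above, one possible shape for it is sketched below in Python. The function name,
the gain, and the linear reduction are hypothetical choices for illustration,
not part of the present implementation.

    def reduced_stability_weight(C, depth_now, depth_prev, dt, gain=1.0):
        # Hypothetical sketch: shrink the stability (course-keeping) weight C
        # when the logged depth is decreasing, so the agent stops favoring a
        # heading that is carrying it into shoaling water. The gain and the
        # linear form are illustrative assumptions only.
        shoaling_rate = max(0.0, (depth_prev - depth_now) / dt)  # m/s shallower
        reduction = min(1.0, gain * shoaling_rate)
        return C * (1.0 - reduction)

Because the agent needs only its own depth log, this variant removes the
requirement to measure the bathymetric gradient, which is the simplification
described in the item above.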

All of these areas for further exploration and improvement represent questions

raised by this work, but the next and final section aspires to briefly summarize

the questions that have been successfully addressed.

8.2 Conclusions

The original hypothesis that began this project was that the behaviors of sim-

ple agents using the preferences in Equation 3.1, could be optimized so that

they would navigate along good trajectories, which could then be used as path

plans for AUV’s. The experiments performed here showed that as a long range

path planner for an AUV, this algorithm failed to reliably solve the problem

of navigation, and when it did solve the problem, the paths found were often

of poorer quality than even a very rough A*-based algorithm. In some cases,

DORA performed quite well, finding good solutions in short times, but it was

not consistent enough to supplant existing techniques that are more robust.

Thus, as a competitor against modern path planners, DORA is not a favorable

alternative, particularly in complex environments with obstructions.


However, some interesting outcomes were discovered during the experimentation

and optimization phases of this work, which suggest some useful contributions

to navigation, and are considered to be the primary accomplishments of this

research.

First, the agents that were modeled here were not originally intended to rep-

resent any physical vehicle, but rather were intended as intermediaries used to

generate the trajectory that would be returned as the output of the algorithm.

However, since they were subject to the influence of a model of realistic ocean

currents and used only locally measurable information, it became feasible to

physically implement them as very simple unmanned vehicles with the agent

preference formula as a navigation system. This concept was tested with the

construction of an ASV, and in the limited cases that it was evaluated, the sys-

tem was very successful. The one series of field tests is not enough to draw any

conclusions beyond proving that physical implementation of the DORA system

is practical, but this practicality has strong implications for the merit of the

algorithm.

Second, the ASV itself provides an inexpensive, field-ready test bed for fur-

ther development. It is primarily built of inexpensive commercial off-the-shelf

technology that is already common in academic and hobby communities, and is

designed to be modular so that individual components can be used in other de-

signs. These features make it a viable platform for high-volume construction, for

additional experiments in behaviors and navigation, as well as for swarm-based

projects.

Third, the agent preference formula itself, while not a strong basis for a long-
range path planner, did prove to be much more versatile and effective as the


basis for a steering behavior than simple homing or tracking. In general, homing

was not as reliable as tracking, and tracking was only successful in a fraction

of the cases for which tuned agent preferences were used. Further, the agent

preference-based navigation allows for much more complex paths than either of

these, and so fills a gap between control systems and full-fledged path planners

and navigation systems. The fact that it outperformed established steering

behaviors is significant, as its flexibility might allow for efficient graph-based

path planning with much sparser graphs. If DORA can find very good

solutions to unobstructed point-to-point navigation problems, it may relieve an

A*-based algorithm of much of its need for resolution. Since A*’s computation

time increases dramatically with increased resolution, it is easy to envision a

system that would use a very sparse graph, searched by A* to find collision-
free paths that could then be very efficiently navigated by DOR agents. This,
however, is a question left for later explorations.


References

[1] CMPS09 Documentation. URL http://www.robot-electronics.co.uk/

htm/cmps09doc.htm.

[2] ETOPO2v2 Global Gridded 2-minute Database. URL http://www.ngdc.

noaa.gov/mgg/global/etopo2.html.

[3] Data Sheet / GP-635T. URL http://www.adh-tech.com.tw/?12,

gp-635t.

[4] L78S05 Datasheet, 2006. URL https://www.jameco.com/Jameco/

Products/ProdDS/889461.pdf.

[5] XBee / XBee Pro Manual, 2007.

[6] SeaPerch Construction Manual. AUVSI Foundation, Arlington, VA, 2011.

URL http://www.seaperch.org/.

[7] C. Amante and B.W. Eakins. ETOPO1 1 Arc-Minute Global Relief Model:

Procedures, Data Sources and Analysis. Technical Report 11, National

Geophysical Data Center, Boulder, CO, 2009.


[8] Eric Bonabeau, Marco Dorigo, and Guy Theraulaz. Swarm Intelligence:

From Natural to Artificial Systems. Oxford University Press, Santa Fe,

New Mexico, USA, 1999. ISBN 978-0-19-513159-8.

[9] Nathaniel Bowditch. The American Practical Navigator.

Number 9. National Imagery and Mapping Agency, Bethesda, MD, 2002.

[10] K.P. Carroll, S.R. McClaran, E.L. Nelson, D.M. Barnett, D.K. Friesen, and

G.N. Williams. AUV path planning: an A* approach to path planning with

consideration of variable vehicle speeds and multiple, overlapping, time-

dependent exclusion zones. In Autonomous Underwater Vehicle Technol-

ogy, 1992. AUV’92., Proceedings of the 1992 Symposium on, pages 79–

84. IEEE, 1992. ISBN 0780307046. URL http://ieeexplore.ieee.org/

xpls/abs_all.jsp?arnumber=225191.

[11] Xin Chen and Yangmin Li. Smooth path planning of a mobile robot using

stochastic particle swarm optimization. In Mechatronics and Automation,

Proceedings of the 2006 IEEE International Conference on, pages 1722–

1727. IEEE, 2006. ISBN 1424404665. URL http://ieeexplore.ieee.

org/xpls/abs_all.jsp?arnumber=4026352.

[12] Russ E Davis, Charles C Eriksen, and Clayton P Jones. Autonomous
buoyancy-driven underwater gliders. 2003.

[13] E. W. Dijkstra. A Note on Two Problems in Connexion with Graphs.

Numerische Mathematik, 1(1):269–271, 1959.


[14] X Fan, Xiong Luo, Sheng Yi, and S Yang. Optimal path planning for

mobile robots based on intensified ant colony optimization algorithm. In

Proceedings of the 2003 IEEE International Conference on Robotics, In-

telligent Systems and Signal Processing, number October, pages 131–136,

2003. ISBN 078037925X. URL http://ieeexplore.ieee.org/xpls/abs_

all.jsp?arnumber=1285562.

[15] E Fernandez-Perdomo, D Hernandez-Sosa, J Isern-Gonzalez, J Cabrera-

Gamez, A C Dominguez-Brito, and V Prieto-Maranon. Single and multiple

glider path planning using an optimization-based approach. In OCEANS,

2011 IEEE - Spain, pages 1–10, June 2011. doi: 10.1109/Oceans-Spain.

2011.6003641.

[16] E Fernandez-Perdomo, J Cabrera-Gámez, D Hernández-Sosa, J Isern-
González, A C Domínguez-Brito, A Redondo, J Coca, A G Ramos,
E A Fanjul, and M García. Path planning for gliders using Regional
Ocean Models: Application of the Pinzón path planner with the ES-
EOAT model and the RU27 trans-Atlantic flight data. In OCEANS 2010
IEEE - Sydney, pages 1–10, May 2010. doi: 10.1109/OCEANSSYD.2010.
5603684.

[17] B Garau, M Bonet, A Alvarez, and S Ruiz. Path Planning for Autonomous

Underwater Vehicles in Realistic Oceanic Current Fields: Application to

Gliders in the Western Mediterranean Sea. Journal of Maritime Research,

VI(Ii):5–22, 2009. URL http://www.jmr.unican.es/pub/00602/00602.

pdf#page=7.


[18] Maki K. Habib and Hajime Asama. Efficient Method to Generate Colli-

sion Free Paths for an Autonomous Mobile Robot Based on New Free Space

Structuring Approach. In IEEE/RSJ International Workshop on Intelli-

gent Robots and Systems, number 91, 1991. URL http://ieeexplore.

ieee.org/xpls/abs_all.jsp?arnumber=174534.

[19] Peter E Hart, Nils J Nilsson, and Bertram Raphael. A Formal Basis for
the Heuristic Determination of Minimum Cost Paths. IEEE Transactions
on Systems Science and Cybernetics, 4(2):100–107, 1968.

[20] J D Hunter. Matplotlib: A 2D graphics environment. Computing In Science

& Engineering, 9(3):90–95, 2007.

[21] Eric Jones, Travis Oliphant, Pearu Peterson, and others. SciPy: Open
source scientific tools for Python. URL http://www.scipy.org/.

[22] James Kennedy and R. Eberhart. Particle swarm optimization. In Neu-

ral Networks, 1995. Proceedings., IEEE International Conference on, vol-

ume 4, pages 1942–1948. IEEE, 1995. URL http://ieeexplore.ieee.

org/xpls/abs_all.jsp?arnumber=488968.

[23] O Khatib. Real-time obstacle avoidance for manipulators and mobile

robots. In Robotics and Automation. Proceedings. 1985 IEEE Inter-

national Conference on, volume 2, pages 500–505, March 1985. doi:

10.1109/ROBOT.1985.1087247.


[24] Li Lu and Dunwei Gong. Robot Path Planning in Unknown Environments

Using Particle Swarm Optimization. 2008 Fourth International Conference

on Natural Computation, pages 422–426, 2008. doi: 10.1109/ICNC.2008.

923. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.htm?

arnumber=4667318.

[25] Ian Millington and John Funge. Artificial Intelligence for Games. Morgan

Kaufmann Publishers, Burlington, MA, second edition, 2009.

[26] Roving Networks. User Manual and Command Reference WiFly GSX and

WiFly EZX, 2011.

[27] O Oberschelp and Thorsten Hestermeyer. Design of self-optimizing agent-

based controllers. In 3rd International Workshop on Agent-Based Simu-

lation, 2002. URL http://citeseerx.ist.psu.edu/viewdoc/download?

doi=10.1.1.11.8918&rep=rep1&type=pdf.

[28] Y.Q. Qin, D.B. Sun, N. Li, and Y.G. Cen. Path planning for mobile

robot using the particle swarm optimization with mutation operator. In

Machine Learning and Cybernetics, 2004. Proceedings of 2004 Interna-

tional Conference on, volume 4, pages 2473–2478. IEEE, 2004. URL

http://ieeexplore.ieee.org/xpls/abs_all.jsp?arnumber=1382219.

[29] K. J. Rawson and E. C. Tupper. Basic Ship Theory. International Marine

/ McGraw-Hill, Woburn, MA, 2001.


[30] Craig W Reynolds. Steering Behaviors For Autonomous Characters. In

Game Developers Conference, pages 763–782, San Jose, California, 1999.

Miller Freeman Game Group. URL http://www.red3d.com/cwr/papers/

1999/gdc99steer.html.

[31] Blane Rhoads, I Mezić, and Andrew Poje. Minimum time feedback

control of autonomous underwater vehicles. In 49th IEEE Conference

on Decision and Control, pages 5828–5834, Atlanta, GA, 2010. IEEE.

ISBN 9781424477463. URL http://ieeexplore.ieee.org/xpls/abs_

all.jsp?arnumber=5717533.

[32] Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Ap-

proach. Prentice Hall, Upper Saddle River, NJ, second edition, 2002.

[33] Martin Saska, Martin Macas, Libor Preucil, and Lenka Lhotska. Robot

Path Planning using Particle Swarm Optimization of Ferguson Splines.

2006 IEEE Conference on Emerging Technologies and Factory Au-

tomation, pages 833–839, September 2006. doi: 10.1109/ETFA.2006.

355416. URL http://ieeexplore.ieee.org/lpdocs/epic03/wrapper.

htm?arnumber=4178249.

[34] Mae L Seto, editor. Marine Robot Autonomy. Springer, 2013. URL http://www.

springer.com/engineering/robotics/book/978-1-4614-5658-2.


[35] R.N. Smith and Matthew Dunbabin. Controlled Drift: An Investiga-

tion into the Controllability of Underwater Vehicles with Minimal Actu-

ation. Technical report, Queensland University of Technology, Brisbane,

Australia, 2008. URL http://robotics.usc.edu/~ryan/Publications_

files/acra.pdf.

[36] James Stewart. Calculus. Thomson Brooks/Cole, Belmont, CA, 6th edi-

tion, 2008.

[37] Guan-zheng Tan, Huan He, and Sloman Aaron. Global optimal path plan-

ning for mobile robot based on improved Dijkstra algorithm and ant system

algorithm. Journal of Central South University of Technology, 13(1):80–

86, February 2006. ISSN 1005-9784. doi: 10.1007/s11771-006-0111-8. URL

http://www.springerlink.com/index/10.1007/s11771-006-0111-8.

[38] U. Wilensky. NetLogo, 1999. URL http://ccl.northwestern.edu/

netlogo.

[39] Jonas Witt and Matthew Dunbabin. Go with the Flow: Optimal
AUV Path Planning in Coastal Environments.

Technical report, 2008. URL http://www.mendeley.com/research/

go-flow-optimal-auv-path-planning-coastal-environments/.


[40] Li-Ning Xing, Philipp Rohlfshagen, Ying-Wu Chen, and Xin Yao. A Hy-

brid Ant Colony Optimization Algorithm for the Extended Capacitated

Arc Routing Problem. IEEE transactions on systems, man, and cyber-

netics. Part B, Cybernetics : a publication of the IEEE Systems, Man,

and Cybernetics Society, pages 78–88, February 2011. ISSN 1941-0492.

doi: 10.1109/TSMCB.2011.2107899. URL http://www.ncbi.nlm.nih.

gov/pubmed/21896393.


Appendix A

Arduino Source Code

// Autonomous Surface Vehicle Control Software

// Anthony Jones

// April 15, 2013

/////////////////

// Import libraries and define global variables

#include<Servo.h>

#include<math.h>

#define REarth 6378000.0 //Typographical error--should be 6371000.0

#define Declination -6.36222222222 //Local Declination at

// the center of zip code 32935

// Declare buffers to store serial data

char s0buff[256];

char s1buff[6];

char s2buff[256];

char s3buff[256];

// Set range for length of leg

const float tgtRange=250.0; //Meters to Target

int upInterval=30; //Navigation Update Interval in Seconds

const float spd=12.0/18.0; //ASV Speed in m/s at full speed

float Heading,Pitch,Roll,SetHdg,t0,t1=0.0;

float B,C,D=1.0;

float Lat,Lon,LatPrev,LonPrev,tgtLat,tgtLon,hmeLat,

hmeLon,hh,mm,ss=0.0;

unsigned char SSpd=0;

Servo Rudder;

// Declare and initialize variables to store timing information

float misTime,startTime,iterTime=0.0;

// Set the range for successful arrival at objective

float RngThresh=15.0;

///////////////////////////////////////////////////////////////

//////////////// SETUP ////////////////////////////

///////////////////////////////////////////////////////////////

void setup()


{

// Configure Input and Output hardware

pinMode(3,OUTPUT);

Rudder.attach(2);

Serial.begin(9600); // XBee

if (Serial)

{

Serial.println("Serial Initialized");

}

Serial1.begin( 9600); // Compass

Serial1.setTimeout(100);

if(Serial1)

{

Serial.println("Serial1 Initialized");

}

Serial2.begin(9600); // OpenLog

if (Serial2)

{

Serial.println("Serial2 Initialized");

}

Serial3.begin(9600); // GPS

if(Serial3)

{

Serial.println("Serial3 Initialized");

} Serial.println("Getting GPS");

// Get initial GPS and Compass readings

getGPS();

Serial.println("Getting Heading");

Heading=getH();

Serial.println( Heading);

SetHdg=Heading;

hmeLat=Lat;

hmeLon=Lon;

float hmeLatM=REarth*rads(hmeLat);

float hmeLonM=REarth*cos(rads(hmeLat))*rads(hmeLon);

float tgtLatM=hmeLatM+tgtRange*cos(rads(SetHdg));

float tgtLonM=hmeLonM+tgtRange*sin(rads(SetHdg));

// Set target position

tgtLat=degs(tgtLatM/REarth);

tgtLon=degs(tgtLonM/(REarth*cos(rads(hmeLat))));

Serial.print(tgtLat,5);

Serial.print(’\t’);

Serial.println(tgtLon,5);

//////// Set the preference values//////////////

B=-0.422;

C=0.087;

D=0.491;

float BCDsum=B+C+D;

B=B/(BCDsum);

C=C/(BCDsum);

D=D/(BCDsum);

// Start Timers


startTime=millis();

iterTime=millis();

// Set motor drive PWM to full

SSpd=255;

}

///////////////////////////////////////////////////////////////

//////////////// CONTROL LOOP ///////////////////////////

///////////////////////////////////////////////////////////////

void loop()

{

while(Serial1.available()>0)

{

Serial1.read();

}

Serial1.write(0x23);

t0=millis();

while(Serial1.available()<4 and millis()-t0<100);

if (Serial1.available()==4)

{

Serial1.readBytes(s1buff,4);

}

// Update Vehicle Heading

Heading=getH();

misTime=millis()-iterTime;

// Check update timer

if(misTime>upInterval*1000)

{

// If the update timer has expired, use DORA steering

// behavior to calculate new heading, unless within

// RngThresh of the objective.

if (planeRTT()<RngThresh)

{

if(tgtLat==hmeLat && tgtLon==hmeLon)

{

SSpd=0;

}

tgtLat=hmeLat;

tgtLon=hmeLon;

}

LatPrev=Lat;

LonPrev=Lon;

// Update Position

getGPS();

// Calculate new preferred heading

float dt=(millis()-iterTime)/1000.0;

float LatCalc=(LatPrev+Lat)/2.0;

float LonM=cos(rads(LatCalc))*REarth*rads(Lon);

float LatM=REarth*rads(Lat);

float LatPrevM=REarth*rads(LatPrev);

float LonPrevM=cos(rads(LatCalc))*REarth*rads(LonPrev);

float dLonM=cos(rads(LatCalc))*REarth*rads((Lon-LonPrev));

float dLatM=REarth*rads(Lat-LatPrev);


float asvU=sin(rads(SetHdg))*spd;

float asvV=cos(rads(SetHdg))*spd;

float asvMag=sqrt(asvU*asvU+asvV*asvV);

float expLatM=LatPrevM+asvV*dt;

float expLonM=LonPrevM+asvU*dt;

float lclU=(expLonM-LonM)/dt;

float lclV=(expLatM-LatM)/dt;

float lclMag=sqrt(lclU*lclU+lclV*lclV);

float tgtLatM=REarth*rads(tgtLat);

float tgtLonM=REarth*cos(rads(LatCalc))*rads(tgtLon);

float tgtX=tgtLonM-LonM;

float tgtY=tgtLatM-LatM;

float tgtMag=sqrt(tgtX*tgtX+tgtY*tgtY);

tgtX=tgtX/tgtMag*D;

tgtY=tgtY/tgtMag*D;

float asvX=asvU/asvMag*C;

float asvY=asvV/asvMag*C;

float lclX=lclU/lclMag*B*(lclMag/(lclMag+spd));

float lclY=lclV/lclMag*B*(lclMag/(lclMag+spd));

float totalX=lclX+asvX+tgtX;

float totalY=lclY+asvY+tgtY;

// set heading to newly calculated direction

SetHdg=degs(atan2(totalX,totalY));

if (SetHdg<0)

{

SetHdg=360.0+SetHdg;

}

// update the mission log

Serial2.print((millis()-startTime)/1000.0,3);

Serial2.print(’,’);

Serial2.print(hh);

Serial2.print(’,’);

Serial2.print(mm);

Serial2.print(’,’);

Serial2.print(ss);

Serial2.print(’,’);

Serial2.print(Lat,5);

Serial2.print(’,’);

Serial2.print(Lon,5);

Serial2.print(’,’);

Serial2.print(tgtLat,5);

Serial2.print(’,’);

Serial2.print(tgtLon,5);

Serial2.print(’,’);

Serial2.print(dt,3);

Serial2.print(’,’);

Serial2.print(lclU,3);

Serial2.print(’,’);

Serial2.print(lclV,3);

Serial2.print(’,’);

Serial2.print(lclMag,3);

Serial2.print(’,’);


Serial2.print(RngThresh,3);

Serial2.print(’,’);

Serial2.print(SetHdg);

Serial2.print(’,’);

Serial2.print(Heading);

Serial2.print(’,’);

Serial2.print(SSpd);

Serial2.print(’,’);

Serial2.println(planeRTT(),3);

// Reset the update timer

iterTime=millis();

}

// Set steering direction and shaft speed

Rudder.write(Steering(Heading,SetHdg));

analogWrite(3,SSpd);

// check for any incoming commands on Serial

while(Serial.available()>0)

{

char con=Serial.read();

if(con==’r’)

{

Rudder.write(Serial.parseInt());

}

if(con==’s’)

{

SSpd=Serial.parseInt();

}

else

{

Serial.read();

}

}

}

//////////////////////END CONTROL LOOP/////////////////////////

// Calculate a rudder steering angle implementing

// P-Controller

int Steering(float NowHdg,float SetHeading)

{

int Signal;

float dHeading=Heading-SetHdg;

if (abs(dHeading)>360.0)

{

dHeading=dHeading-360.0;

}

Signal=90+(int)((dHeading)*-3.75);

if (Signal>125)

{

Signal=125;

}

if (Signal<55)

{


Signal=55;

}

return Signal;

}

// Update the global heading variable with a new

// reading from the CMPS09 compass

float getH()

{

char H[4];

float Hfloat=0.0;

while(Serial1.available()>0)

{

Serial1.read();

}

Serial1.write(0x23);

t0=millis();

while(Serial1.available()<4 and millis()-t0<100);

if (Serial1.available()==4)

{

Serial1.readBytes(H,4);

Hfloat=((byte)H[0]*256+(byte)H[1])/10.0;

Hfloat+=Declination;

if (Hfloat<0)

{

Hfloat+=360.0;

}

}

else

{

Hfloat=Heading;

}

return Hfloat;

}

// Update the global position variables with a new

// reading from the GPS

void getGPS()

{

char ID[]="GPXXX";

boolean KeepRun=true;

while(KeepRun==true)

{

while(ID[0]!=’$’)

{

ID[0]=Serial3.read();

}

Serial3.readBytes(ID,5);

if ( (String)ID=="GPGLL")

{

Serial3.readBytesUntil(13,s3buff,200);

if (s3buff[38]==’A’)


{

float LatDeg=(c2f(s3buff[1])*10.0+c2f(s3buff[2]));

float LatMin=(c2f(s3buff[3])*10.0+c2f(s3buff[4]));

LatMin+=c2f(s3buff[6])/10.0+c2f(s3buff[7])/100.0

+c2f(s3buff[8])/1000.0+c2f(s3buff[9])/10000.0

+c2f(s3buff[10])/100000;

Lat=LatDeg+LatMin/60.0;

if (s3buff[12]==’S’)

{

Lat=Lat*-1.0;

}

float LonDeg=(c2f(s3buff[14])*100.0+c2f(s3buff[15])*10.0+

c2f(s3buff[16]));

float LonMin=c2f(s3buff[17])*10.0+c2f(s3buff[18])

+c2f(s3buff[20])/10.0+c2f(s3buff[21])/100.0

+c2f(s3buff[22])/1000.0+c2f(s3buff[23])/10000.0

+c2f(s3buff[24])/100000.0;

Lon=LonDeg+LonMin/60.0;

if (s3buff[26]==’W’)

{

Lon=Lon*-1.0;

}

hh=c2f(s3buff[28])*10+c2f(s3buff[29]);

mm=c2f(s3buff[30])*10+c2f(s3buff[31]);

ss=c2f(s3buff[32])*10+c2f(s3buff[33])+c2f(s3buff[34])/10.0

+s3buff[35]/100.0;

KeepRun=false;

}

}

}

}

// Convert from an ASCII character representation of a number to

// its decimal equivalent

float c2f(char inchar)

{

return inchar-48.0;

}

// Convert from radians to degrees

float degs(float rads)

{

return rads/PI*180.0;

}

//Convert from degrees to radians

float rads(float degs)

{

return degs/180.0*PI;

}

// Calculate the planar approximated range in meters from


// current GPS fix to objective

float planeRTT()

{

float LatCalc=(tgtLat+Lat)/2.0;

float LatM=rads(Lat)*REarth;

float tgtLatM=rads(tgtLat)*REarth;

float LonM=cos(rads(LatCalc))*REarth*rads(Lon);

float tgtLonM=cos(rads(LatCalc))*REarth*rads(tgtLon);

return sqrt( pow((tgtLonM-LonM),2) +pow(tgtLatM-LatM,2));

}


Appendix B

NetLogo Source Code

; NetLogo Prototype of DORA software (written for NetLogo v5.0.3)

; Anthony Jones

; April 15, 2013

;

; Configure Global Variables

globals [ pidt pide crabt crabe groupert groupere grouperspd

threshrng bestA bestB bestC bestD bestF bestBeta bestSpd

mortality success total nowsdev]

; Establish custom breeds for waypoints, stations, and different

; types of agents

breed [stations station]

stations-own [u v]

breed [waypoints waypoint]

breed [crabs crab]

;Crabs use tracking behavior to close on objective

crabs-own [speed pwr energy time stick ]

breed [pigeons pigeon]

;Pigeons use homing behavior to close on objective

pigeons-own[speed pwr energy time stick ]

breed [groupers grouper]

; Groupers use DORA behavior to close on objective

groupers-own[ B C D alpha beta speed stick pwr energy time ]

breed [markers marker]

markers-own[u v]

; Setup Configures the space according to preferences set in

; the GUI

to setup

clear-all

; Initialize preferences to zero

set bestB 0


set bestC 0

set bestD 0

set threshrng 0.5

ask patches [ set pcolor white ]

reset-ticks

set nowsdev sdev

create-stations 1

[

set xcor s0x

set ycor s0y

set u s0u

set v s0v

set color blue

set size 2.0

set shape "arrow"

ifelse u = 0 and v = 0 [ set heading 0][set heading atan u v]

]

create-stations 1

[

set xcor s1x

set ycor s1y

set u s1u

set v s1v

set color blue

set size 2.0

set shape "arrow"

ifelse u = 0 and v = 0 [ set heading 0][set heading atan u v]

]

create-stations 1

[

set xcor s2x

set ycor s2y

set u s2u

set v s2v

set color blue

set size 2.0

set shape "arrow"

ifelse u = 0 and v = 0 [ set heading 0][set heading atan u v]

]

create-waypoints 1

[

set shape "circle"

set color green

set size 1.5

set xcor wp0x

set ycor wp0y

]

create-waypoints 1

[

set shape "circle"

set color red

set size 1.5


set xcor wp1x

set ycor wp1y

]

create-crabs 0

[

set color red

set size 2.0

set xcor wp0x

set ycor wp0y

set speed spd

set pwr EnRate

set energy energycap

]

create-pigeons 0

[

set color yellow

set size 2.0

set xcor wp0x

set ycor wp0y

set speed spd

set pwr EnRate

set energy energycap

facexy wp1x wp1y

]

end

; Main Loop

to go

; Reduce the standard deviation for simulated annealing every

; 100 ticks

if (ticks mod 100 = 0)

[

set nowsdev nowsdev * 0.8

]

; create randomly initialized groupers until a solution is found

; then use simulated annealing to initialize new ones centered on

; best solution so far

create-groupers agents - ( count groupers )

[ set total total + 1

set energy energycap

set stick ticks

ifelse bestB = 0 and BestC = 0 and BestD = 0 and

BestF = 0 and bestBeta = 0

[

set B random-float 2.0 - 1

set C random-float 2.0 - 1

set D random-float 2.0 - 1

set speed spd ;random-float spd

]

[

set B random-normal bestB sdev

set C random-normal bestC sdev


set D random-normal bestD sdev

set speed spd ;random-normal bestSpd ( sdev * spd )

if speed > spd [set speed spd]

if speed < 0 [set speed 0]

]

if JustBest [ set B bestB set C bestC set D bestD ]

let bcdsum abs B + abs C + abs D

set B B / bcdsum

set C C / bcdsum

set D D / bcdsum

set color green

set size 2.0

set xcor wp0x

set ycor wp0y

set pwr EnRate * ( speed / spd ) ^ 3

facexy wp1x wp1y

]

; implement homing behavior by pigeons

ask pigeons

[

set time ticks * tempres

if time mod upInt = 0

[

facexy wp1x wp1y

]

let pidu speed * sin heading

let pidv speed * cos heading

let tgtx xcor + (pidu + lclu) * tempres

let tgty ycor + (pidv + lclv) * tempres

set energy ( energy - ( pwr * tempres) )

if energy <= 0 [die]

ifelse (min-pxcor < tgtx and tgtx < max-pxcor and min-pycor <

tgty and tgty < max-pycor) [setxy tgtx tgty][ die ]

if distancexy wp1x wp1y < threshrng [ set pidt ticks *

tempres set pide energy die]

]

ask crabs

[

set time ticks * tempres

let xwind lclspd * sin ( subtract-headings wphdg lclhdg )

let hdgoff 0

ifelse abs ( xwind ) < speed [ set hdgoff asin ( xwind / speed)

][set hdgoff 0]

if time mod upInt = 0

[

set heading wphdg + hdgoff

]

show heading

let crabu speed * sin heading

let crabv speed * cos heading

let tgtx xcor + (crabu + lclu) * tempres

let tgty ycor + (crabv + lclv) * tempres


set energy ( energy - ( pwr * tempres) )

if energy <= 0 [die]

if distancexy wp1x wp1y < threshrng [ set crabt ticks * tempres

set crabe energy die]

ifelse (min-pxcor < tgtx and tgtx < max-pxcor and

min-pycor < tgty and tgty < max-pycor) [setxy tgtx tgty][ die ]

]

; implement DORA behavior

ask groupers

[

set time ( ( ticks - stick ) * tempres )

let reciphdg ( ( wphdg + 180 ) mod 360 )

if time mod upInt = 0

[

let Bu B * ( lclspd / (lclspd + speed) ) * sin lclhdg

let Bv B * ( lclspd / (lclspd + speed) ) * cos lclhdg

let Cu C * sin heading

let Cv C * cos heading

let Du D * sin wphdg

let Dv D * cos wphdg

let totU Bu + Cu + Du

let totV Bv + Cv + Dv

set heading atan totU totV

]

let grouperu speed * sin heading

let grouperv speed * cos heading

let tgtx xcor + (grouperu + lclu) * tempres

let tgty ycor + (grouperv + lclv) * tempres

set energy energy - ( pwr * tempres)

if energy <= 0 [die]

ifelse (min-pxcor < tgtx and tgtx < max-pxcor and

min-pycor < tgty

and tgty < max-pycor)

[setxy tgtx tgty][ set mortality mortality + 1 die ]

let outlist ( list "B: " B " C: " C

" D: " D " beta: " beta )

ifelse TnotE

[

if ( distancexy wp1x wp1y <= threshrng ) and

( ( time < groupert ) or ( groupert = 0 ) )

[

set groupert time

set groupere energy

set grouperspd speed

set bestB B

set bestC C

set bestD D

set bestBeta beta

file-open "netlogo.log"

file-print ( list bestB bestC bestD bestF )

file-close


]

]

[

if ( distancexy wp1x wp1y <= threshrng ) and

( ( energy > groupere ) or ( groupert = 0 ) )

[

set groupert time

set groupere energy

set grouperspd speed

set bestB B

set bestC C

set bestD D

set bestBeta beta

file-open "DisModel.log"

file-write ( list bestB bestC bestD bestF )

file-close

]

]

if distancexy wp1x wp1y <= threshrng [ set success

success + 1 show ( time ) die ]

]

tick

end

to-report lclu

let lu 0.0

let sumu 0.0

let sumd 0.0

foreach sort stations

[

set sumd sumd + 1.0 / (distancexy [xcor] of ? [ycor] of ?)

]

foreach sort stations

[

set lu lu + ( 1.0 / distancexy [xcor] of ? [ycor] of ? ) /

(sumd) * [u] of ?

]

report lu

end

; return the local u and v components of velocity as a
; weighted average of the stations' values, weighted by inverse range

to-report lclv

let lv 0.0

let sumv 0.0

let sumd 0.0

foreach sort stations

[

set sumd sumd + 1.0 / (distancexy [xcor] of ? [ycor] of ?)

]

foreach sort stations

[


set lv lv + ( 1.0 / distancexy [xcor] of ? [ycor] of ? ) /

(sumd) * [v] of ?

]

report lv

end

; return the local heading of the currents

to-report lclhdg

if lclu = 0 and lclv = 0 [report 0]

report atan lclu lclv

end

; return the magnitude of the local current

to-report lclspd

report sqrt (lclu ^ 2 + lclv ^ 2)

end

; return the heading from the agent to the objective

to-report wphdg

report atan ( wp1x - xcor ) ( wp1y - ycor )

end
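
The lclu and lclv reporters above estimate the local current as an inverse-distance weighted average of the station values. The Python sketch below reproduces the same weighting outside NetLogo; the function name idw_velocity and the station layout are illustrative only, not part of the model:

import numpy as np

def idw_velocity(pos, stations):
    'Inverse-distance weighted (u, v); stations are (x, y, u, v) tuples'
    px, py = pos
    w = np.array([1.0/np.hypot(px-sx, py-sy) for (sx, sy, su, sv) in stations])
    uv = np.array([(su, sv) for (sx, sy, su, sv) in stations])
    return w.dot(uv)/w.sum()  # undefined if pos coincides with a station

# Three stations, mirroring the three create-stations blocks in setup
print idw_velocity((5.0, 5.0), [(0.0, 0.0, 0.1, 0.0),
                                (10.0, 0.0, 0.0, 0.2),
                                (5.0, 10.0, -0.1, 0.1)])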


Appendix C

Python Source Code

C.1 environment.py

import sys

sys.path.append(’./’)

import scipy.io.netcdf as ncio

import numpy as np

import navigation as nav

#ALL DISTANCES ARE IN METERS,

#VELOCITIES IN METERS PER SECOND, TIMES IN HOURS

REarth=6378.0*1000 #Radius of Earth in meters

# Data structure to store velocity field data

class VField:

def __init__(self,ncfname):

# Parse .nc file

if ncfname.split(’.’)[len(ncfname.split(’.’))-1]==’nc’:

ncfile=ncio.netcdf_file(ncfname,’r’)

self.u=np.array(ncfile.variables.get(\

’water_u’).data,np.float)

self.v=np.array(ncfile.variables.get(\

’water_v’).data,np.float)

self.lats=np.array(ncfile.variables.get(\

’lat’).data,np.float)

self.lons=np.array(ncfile.variables.get(\

’lon’).data,np.float)

self.times=np.array(ncfile.variables.get(\

’time’).data,np.float)

self.depths=np.array(ncfile.variables.get(\

’depth’).data,np.float)

for i in xrange(self.u.shape[0]):

for j in xrange(self.u.shape[1]):

for k in xrange(self.u.shape[2]):


for l in xrange(self.u.shape[3]):

if self.u[i,j,k,l]==-30000:

self.u[i,j,k,l]=np.nan

if self.v[i,j,k,l]==-30000:

self.v[i,j,k,l]=np.nan

#self.u=self.u*(1.0/1000.0)

#self.v=self.v*(1.0/1000.0)

self.times=self.times-np.nanmin(self.times)

# Parse a data set that is already converted to a zipped set of

# numpy arrays. Note: npz must contain u,v,lats,lons,times, and

# depths arrays

elif ncfname.split(’.’)[len(ncfname.split(’.’))-1]==’npz’:

vfdata=np.load(ncfname)

self.u=vfdata[’u’]

self.v=vfdata[’v’]

self.lats=vfdata[’lats’]

self.lons=vfdata[’lons’]

self.times=vfdata[’times’]

self.depths=vfdata[’depths’]

# Record some useful global properties for use in algorithms

non_nan=self.u.size-np.count_nonzero(np.isnan(self.u)) # number of valid samples

self.maxmag=np.nanmax(np.sqrt(np.nanmax(np.abs(self.u))**2+\

np.nanmax(np.abs(self.v))**2))

self.meanmag=np.sqrt( (np.nansum(np.abs(self.u))/non_nan)**2+\

(np.nansum(np.abs(self.v))/non_nan)**2)

self.meanVmag=np.sqrt( (np.nansum(self.u)/non_nan)**2+\

((np.nansum(self.v))/non_nan)**2)

self.minLat=np.nanmin(self.lats)

self.minLon=np.nanmin(self.lons)

self.minD=np.nanmin(self.depths)

self.minT=np.nanmin(self.times)

self.maxLat=np.nanmax(self.lats)

self.maxLon=np.nanmax(self.lons)

self.maxD=np.nanmax(self.depths)

self.maxT=np.nanmax(self.times)

self.minu=np.nanmin(self.u)

self.minv=np.nanmin(self.v)

self.maxu=np.nanmax(self.u)

self.maxv=np.nanmax(self.v)

self.latcount=self.lats.shape[0]

self.loncount=self.lons.shape[0]

self.tcount=self.times.shape[0]

self.dcount=self.depths.shape[0]

# Append another VField to the end of this one

def App(self,nVF):

'Concatenates along the Time Dimensions'
print 'App'

self.u=np.concatenate((self.u,nVF.u))

self.v=np.concatenate((self.v,nVF.v))

self.times=np.concatenate((self.times,nVF.times))

self.minLat=np.nanmin(self.lats)

self.minLon=np.nanmin(self.lons)


self.minD=np.nanmin(self.depths)

self.minT=np.nanmin(self.times)

self.maxLat=np.nanmax(self.lats)

self.maxLon=np.nanmax(self.lons)

self.maxD=np.nanmax(self.depths)

self.maxT=np.nanmax(self.times)

self.minu=np.nanmin(self.u)

self.minv=np.nanmin(self.v)

self.maxu=np.nanmax(self.u)

self.maxv=np.nanmax(self.v)

self.latcount=self.lats.shape[0]

self.loncount=self.lons.shape[0]

self.tcount=self.times.shape[0]

self.dcount=self.depths.shape[0]

# Linearly interpolate the velocity at a requested time,depth,lat,lon

def linterpTDLL(self,time,depth,lat,lon):

if lon<0:

lon=360.0+lon

# Check to make sure requested data is in dataset

if ( (self.minT <= time <=self.maxT)==False) or \

( (self.minD <= depth <=self.maxD)==False) or \

( (self.minLat <= lat <=self.maxLat)==False) or \

( (self.minLon <= lon <=self.maxLon)==False):

return np.array([np.nan,np.nan])

istep=(self.maxT-self.minT)/(self.tcount-1)

i = np.floor( (time-self.minT)/istep )

if i == self.times.shape[0]-1:

i=i-1

j = 0
# find j such that depths[j] <= depth < depths[j+1]
while(self.depths[j+1]<depth):
j+=1
jstep=self.depths[j+1]-self.depths[j]

kstep=(self.maxLat-self.minLat)/(self.latcount-1)
k = int(np.floor( (lat-self.minLat)/kstep ))
if k == self.latcount-1:
k=k-1
lstep = (self.maxLon-self.minLon)/(self.loncount-1)
l = int(np.floor( (lon-self.minLon)/lstep ))
if l == self.loncount-1:
l=l-1
# Do the quadrilinear interpolation for u; every step below is a
# linear interpolation of the form f0+(f1-f0)*fraction
latfrac=(lat-self.lats[k])/kstep
lonfrac=(lon-self.lons[l])/lstep
depfrac=(depth-self.depths[j])/jstep
timfrac=(time-self.times[i])/istep
ulo0d0t0=self.u[i,j,k,l]+(self.u[i,j,k+1,l]-self.u[i,j,k,l])*latfrac
ulo1d0t0=self.u[i,j,k,l+1]+(self.u[i,j,k+1,l+1]-self.u[i,j,k,l+1])*latfrac
ud0t0=ulo0d0t0+(ulo1d0t0-ulo0d0t0)*lonfrac
ulo0d1t0=self.u[i,j+1,k,l]+(self.u[i,j+1,k+1,l]-self.u[i,j+1,k,l])*latfrac
ulo1d1t0=self.u[i,j+1,k,l+1]+(self.u[i,j+1,k+1,l+1]-self.u[i,j+1,k,l+1])*latfrac
ud1t0=ulo0d1t0+(ulo1d1t0-ulo0d1t0)*lonfrac
ut0=ud0t0+(ud1t0-ud0t0)*depfrac
ulo0d0t1=self.u[i+1,j,k,l]+(self.u[i+1,j,k+1,l]-self.u[i+1,j,k,l])*latfrac
ulo1d0t1=self.u[i+1,j,k,l+1]+(self.u[i+1,j,k+1,l+1]-self.u[i+1,j,k,l+1])*latfrac
ud0t1=ulo0d0t1+(ulo1d0t1-ulo0d0t1)*lonfrac
ulo0d1t1=self.u[i+1,j+1,k,l]+(self.u[i+1,j+1,k+1,l]-self.u[i+1,j+1,k,l])*latfrac
ulo1d1t1=self.u[i+1,j+1,k,l+1]+(self.u[i+1,j+1,k+1,l+1]-self.u[i+1,j+1,k,l+1])*latfrac
ud1t1=ulo0d1t1+(ulo1d1t1-ulo0d1t1)*lonfrac
ut1=ud0t1+(ud1t1-ud0t1)*depfrac
u=ut0+(ut1-ut0)*timfrac
# Do the quadrilinear interpolation for v
vlo0d0t0=self.v[i,j,k,l]+(self.v[i,j,k+1,l]-self.v[i,j,k,l])*latfrac
vlo1d0t0=self.v[i,j,k,l+1]+(self.v[i,j,k+1,l+1]-self.v[i,j,k,l+1])*latfrac
vd0t0=vlo0d0t0+(vlo1d0t0-vlo0d0t0)*lonfrac
vlo0d1t0=self.v[i,j+1,k,l]+(self.v[i,j+1,k+1,l]-self.v[i,j+1,k,l])*latfrac
vlo1d1t0=self.v[i,j+1,k,l+1]+(self.v[i,j+1,k+1,l+1]-self.v[i,j+1,k,l+1])*latfrac
vd1t0=vlo0d1t0+(vlo1d1t0-vlo0d1t0)*lonfrac
vt0=vd0t0+(vd1t0-vd0t0)*depfrac
vlo0d0t1=self.v[i+1,j,k,l]+(self.v[i+1,j,k+1,l]-self.v[i+1,j,k,l])*latfrac
vlo1d0t1=self.v[i+1,j,k,l+1]+(self.v[i+1,j,k+1,l+1]-self.v[i+1,j,k,l+1])*latfrac
vd0t1=vlo0d0t1+(vlo1d0t1-vlo0d0t1)*lonfrac
vlo0d1t1=self.v[i+1,j+1,k,l]+(self.v[i+1,j+1,k+1,l]-self.v[i+1,j+1,k,l])*latfrac
vlo1d1t1=self.v[i+1,j+1,k,l+1]+(self.v[i+1,j+1,k+1,l+1]-self.v[i+1,j+1,k,l+1])*latfrac
vd1t1=vlo0d1t1+(vlo1d1t1-vlo0d1t1)*lonfrac
vt1=vd0t1+(vd1t1-vd0t1)*depfrac
v=vt0+(vt1-vt0)*timfrac
return np.array([u,v])

class HField:

def __init__(self,HFName):

alldata=np.load(HFName)

self.lats=alldata[’y’]

self.lons=alldata[’x’]

self.z=alldata[’z’]

self.minz=np.nanmin(self.z)

self.maxz=np.nanmax(self.z)

self.minlat=np.nanmin(self.lats)

self.maxlat=np.nanmax(self.lats)

self.minlon=np.nanmin(self.lons)

self.maxlon=np.nanmax(self.lons)

def linterpll(self,latlon):

lat=latlon[0]

lon=latlon[1]


if (self.minlat <= lat <= self.maxlat) == False:
return np.nan
if (self.minlon <= lon <= self.maxlon) == False:
return np.nan

latstep=(self.maxlat-self.minlat)/(self.lats.shape[0]-1)

lonstep=(self.maxlon-self.minlon)/(self.lons.shape[0]-1)

# The small offset guards against floating-point truncation when
# (lat-minlat) is an exact multiple of the grid step
k=int( (lat-self.minlat)/latstep+1e-12)
l=int( (lon-self.minlon)/lonstep+1e-12)

if k>self.z.shape[0]-2 or l>self.z.shape[1]-2:

return np.nan

zla0lo0=self.z[k,l]

zla0lo1=self.z[k,l+1]

zla1lo0=self.z[k+1,l]

zla1lo1=self.z[k+1,l+1]

zla0=(lon-self.lons[l])/lonstep*(zla0lo1-zla0lo0)+zla0lo0

zla1=(lon-self.lons[l])/lonstep*(zla1lo1-zla1lo0)+zla1lo0

z=(lat-self.lats[k])/latstep*(zla1-zla0)+zla0
return z

def grad(self,latlon,rng):

’return gradient in meters/meter with a resolution of\

\’range\’ in meters’

llW=np.array((latlon[0],latlon[1]-nav.pFromD(latlon[0],rng)))

llE=np.array((latlon[0],latlon[1]+nav.pFromD(latlon[0],rng)))

llN=np.array((latlon[0]+nav.lFromD(rng),latlon[1]))

llS=np.array((latlon[0]-nav.lFromD(rng),latlon[1]))

hW=self.linterpll((llW[0],llW[1]))

hE=self.linterpll((llE[0],llE[1]))

hN=self.linterpll((llN[0],llN[1]))

hS=self.linterpll((llS[0],llS[1]))

# central difference: the samples lie rng meters either side of latlon
return np.array( ( (hE-hW)/(2.0*rng),(hN-hS)/(2.0*rng) ) )
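
A minimal usage sketch for the two classes above, assuming the current and bathymetry files produced elsewhere in this appendix (SEGulf2012_05_01.npz and etopo1_gulf0.npz):

import environment as env

VF = env.VField('SEGulf2012_05_01.npz')   # ocean current field
HF = env.HField('etopo1_gulf0.npz')       # bathymetry

lat, lon = 24.5, -86.0
uv = VF.linterpTDLL(0.0, 2.0, lat, lon)   # (u, v) at t=0 h, 2 m depth
z = HF.linterpll((lat, lon))              # seabed elevation in meters
dzdll = HF.grad((lat, lon), 100.0)        # bottom slope at 100 m resolution
print uv, z, dzdll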

C.2 navigation.py

import numpy as np

REarth=6371008.770 #Radius of Earth in meters

def mxFromp(dlondegs,latdegs):

’Meters of Longitude from Degrees of Longitude’

return REarth*np.cos(np.radians(latdegs))*np.radians(dlondegs)

def myFroml(dlatdegs):

’Meters of Latitude from Degrees Lattitude’

dlat=np.radians(dlatdegs)

return REarth*dlat

def lFromD(dlatMeters):

’Degrees of Latitude from Meters of Latitude’


return np.degrees(dlatMeters/REarth)

def pFromD(Lat,dlonMeters):

’Degrees of Longitude from meters of Longitude’

return np.degrees(dlonMeters/(REarth*np.cos(np.radians(Lat))))

def CFromlp(l,p):

’Course From Change in Latitude and Change in Longitude’

return np.arctan2(p,l)

def lFromDC(D,C):

’Change in latitude due to traveling D meters on \

course C from current position’

return D*np.cos(C)

def pFromDC(D,C):

’Change in longitude due to traveling D meters on \

course C from current position’

return D*np.sin(C)

def gcRng(pt0,pt1):

’Calculate Great Circle Distance Between pt0 and pt1’

L1=np.radians(pt0[0])

L2=np.radians(pt1[0])

DL0=np.radians(pt1[1]-pt0[1])

if DL0>=np.pi:

DL0=2.0*np.pi-DL0

if DL0<= -1.0*np.pi:

DL0=DL0+2*np.pi

InTheFunc=(np.sin(L1)*np.sin(L2)+np.cos(L1)*np.cos(L2)*np.cos(DL0))
# Clip to [-1,1] so rounding error cannot push the argument of
# arccos outside its domain
InTheFunc=np.clip(InTheFunc,-1.0,1.0)
D=REarth*np.arccos(InTheFunc)
return D

def gcCrs(pt0,pt1):

’Calculate Initial Great Circle Course from pt0 to \

pt1 Returns Course in Radians’

L1=np.radians(pt0[0])

L2=np.radians(pt1[0])

DLo=np.abs((np.radians(pt1[1]-pt0[1])))

if DLo==0:

if L1>L2:

return np.pi

else:

return 0.0

A=np.pi/2-L2

B=np.pi/2-L1

cosC=np.cos(A)*np.cos(B)+np.sin(A)*np.sin(B)*np.cos(DLo)

C=np.arccos(cosC)

cosa=(np.cos(A)-np.cos(B)*np.cos(C))/(np.sin(B)*np.sin(C))

a=np.arccos(cosa)

if pt1[1]<pt0[1]:

C=2*np.pi-a

else:

C=a

return C


def plRngEq(pt0,pt1):

dl=pt1[0]-pt0[0]

dp=pt1[1]-pt0[1]

dl=np.radians(dl)

dp=np.radians(dp)

dl=dl*REarth

dp=dp*REarth

return np.sqrt(dl**2.0+dp**2.0)

def subHeadings(h1,h2):
# wrap the difference into [-180,180)
hDiff=(h1-h2+180.0) % 360.0-180.0
return hDiff

def headingUV(uv):

return np.degrees(np.arctan2(uv[0],uv[1]))
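
The helpers above convert between angular and linear measure and solve the great-circle problems used throughout the planners. A short sketch exercising them (coordinates arbitrary):

import numpy as np
import navigation as nav

pt0 = (25.76, -80.19)                      # lat, lon in degrees
pt1 = (25.72, -79.30)
rng = nav.gcRng(pt0, pt1)                  # great-circle range in meters
crs = nav.gcCrs(pt0, pt1)                  # initial course in radians
dlat = nav.lFromD(nav.lFromDC(rng, crs))   # degrees of latitude traversed
dlon = nav.pFromD(pt0[0], nav.pFromDC(rng, crs))
print rng, np.degrees(crs), dlat, dlon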

C.3 agents.py

import numpy as np

import navigation as nav

class Agent:

def __init__(self,pos,depth,heading,speed,tgt,rngThresh,energy,\

power,preuv=np.array((0.0,0.0)) ):

self.pos=pos

self.prepos=self.pos

self.heading=heading

self.speed=speed

self.tgt=tgt

self.rngThresh=rngThresh

self.energy=energy

self.power=power

self.posAge=0

self.headingAge=0

self.depth=depth

self.predepth=depth

self.preuv=preuv

def subHeadings(self,h1,h2):
hDiff=(h1-h2+180.0) % 360.0-180.0
return hDiff

def GCheadingTo(self,latlon):

return np.degrees(nav.gcCrs(self.pos,latlon))

def headingTo(self,latlon):

lp=np.array(latlon)-np.array(self.pos)

return np.degrees(np.arctan2(lp[1],lp[0]))

def headingUV(self,uv):

return np.degrees(np.arctan2(uv[0],uv[1]))

def updatePos(self,lcluv,bz,tstep):

if np.any(np.isnan(lcluv)):


lcluv=self.preuv

if self.depth*-1 < bz:

self.pos=(np.nan,np.nan)

else:

self.prepos=self.pos

aghdg=np.radians(self.heading)

aguv=np.array( (self.speed*np.sin(aghdg),\

self.speed*np.cos(aghdg)))

lcluv=np.array(lcluv)

uv=lcluv+aguv

dxdy = uv*tstep

self.pos=self.pos+np.array(( nav.lFromD(dxdy[1])\

,nav.pFromD(self.pos[0],dxdy[0])))

self.preuv=lcluv

def updateAll(self,lcluv,dzdll,bz,tstep,hRate):

if np.isnan(lcluv[0]) or np.isnan(lcluv[1]):

lcluv=self.preuv

if self.headingAge >= hRate:

self.updateHeading(lcluv,dzdll,bz)

self.updatePos(lcluv,bz,tstep)

self.headingAge=self.headingAge+tstep

self.energy=self.energy-self.power*tstep
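
Agent.updatePos above advances the position by the vector sum of the vehicle and current velocities over one time step. A single-step worked example with illustrative numbers:

import numpy as np
import navigation as nav

pos = np.array([24.0, -86.0])    # lat, lon
aguv = np.array([0.0, 1.5])      # vehicle (u, v) in m/s: due north
lcluv = np.array([0.3, 0.0])     # current (u, v) in m/s: due east
tstep = 60.0                     # seconds
dxdy = (aguv + lcluv) * tstep    # meters east, meters north
pos = pos + np.array([nav.lFromD(dxdy[1]),
                      nav.pFromD(pos[0], dxdy[0])])
print pos                        # ~ (24.0008, -85.9998)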

class crabAgent(Agent):

’An agent that uses an angle to compensate for local currents’

def __init__(self,pos,depth,heading,speed,tgt,\

rngThresh,energy,power):

Agent.__init__(self,pos,depth,heading,speed\

,tgt,rngThresh,energy,power)

def updateHeading(self,lcluv,dzdll,bz):

if np.any(np.isnan(lcluv)):

lcluv=self.preuv

lclspd=np.sqrt(lcluv[0]**2+lcluv[1]**2)

dhdg=self.subHeadings(self.headingUV(lcluv), \

self.GCheadingTo(self.tgt))

xwind=-1*lclspd*np.sin(np.radians(dhdg))

offset=0.0

if np.abs( xwind ) < self.speed:

offset=np.degrees( np.arcsin( xwind/(self.speed ) ))

self.heading = self.GCheadingTo(self.tgt) + offset

self.headingAge=0.0
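
The crab-angle compensation above steers up-current by arcsin(crosswind/speed). For instance, a 0.5 m/s beam current against a 1.5 m/s vehicle calls for roughly a 19.5 degree offset (values illustrative):

import numpy as np

xwind, speed = 0.5, 1.5                       # beam current, vehicle speed (m/s)
offset = np.degrees(np.arcsin(xwind/speed))   # ~19.47 degrees
print offset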

class pigeonAgent(Agent):

’An agent that homes in on its target’

def __init__(self,pos,depth,heading,speed,\

tgt,rngThresh,energy,power):

Agent.__init__(self,pos,depth,heading,\

speed,tgt,rngThresh,energy,power)

def updateHeading(self,lcluv,dzdll,bz):

if np.any(np.isnan(lcluv)):

lcluv=self.preuv

self.heading=self.GCheadingTo(self.tgt)

self.headingAge=0.0


class grouperAgent(Agent):

’An agent that chooses its heading based \

on preferences B,C, and D’

def __init__(self,A,B,C,D,pos,depth,heading,\

speed,tgt,rngThresh,energy,power):

Agent.__init__(self,pos,depth,heading,\

speed,tgt,rngThresh,energy,power)

ABCDsum= np.abs(A)+np.abs(B)+np.abs(C)+np.abs(D)

self.A=A / ABCDsum # Collision Coefficient

self.B=B / ABCDsum # Local Current Coefficient

self.C=C / ABCDsum # Agent Heading Coefficient

self.D=D / ABCDsum # Target Coefficient

def updateHeading(self,lcluv,dzdll,bz):

if np.any(np.isnan(lcluv)):

lcluv=self.preuv

’Update Grouper\’s heading’

spdwt=( 1 - self.speed/(self.speed +\

np.sqrt(lcluv[0]**2+lcluv[1]**2)) )

lclHdg=np.radians(self.headingUV(lcluv))

agentHdg=np.radians(self.heading)

tgtHdg=np.radians(self.GCheadingTo(self.tgt))

lclxy=np.array( (self.B*np.sin(lclHdg), \

self.B*np.cos(lclHdg)))

lclxy=lclxy*spdwt

agentxy=np.array( (self.C*np.sin(agentHdg), \

self.C*np.cos(agentHdg)) )

tgtxy = np.array((self.D*np.sin(tgtHdg),\

self.D*np.cos(tgtHdg)))

total=lclxy+tgtxy+agentxy

self.heading=self.headingUV((total[0],total[1]))

self.headingAge=0.0

class dorAgent(Agent):

’An agent that chooses its heading based\

on preferences A,B,C, and D’

def __init__(self,A,B,C,D,pos,depth,heading,\

speed,tgt,rngThresh,energy,power,M,T):

Agent.__init__(self,pos,depth,heading,\

speed,tgt,rngThresh,energy,power)

ABCDsum= np.abs(A)+np.abs(B)+np.abs(C)+np.abs(D)

self.A=A / ABCDsum # Collision Coefficient

self.B=B / ABCDsum # Local Current Coefficient

self.C=C / ABCDsum # Agent Heading Coefficient

self.D=D / ABCDsum # Target Coefficient

self.T=T #Temporal Threshold for Collision avoidance in seconds

self.M=M #Slope

def updateHeading(self,lcluv,dzdll,bz):

’Update Grouper\’s heading’

if np.any(np.isnan(lcluv)):

lcluv=self.preuv

lclmag=np.sqrt(lcluv[0]**2+lcluv[1]**2)

spdwt=(lclmag/(self.speed + lclmag) )

lclHdg=np.radians(self.headingUV(lcluv))


agentHdg=np.radians(self.heading)

tgtHdg=np.radians(self.GCheadingTo(self.tgt))

lclxy=np.array( (self.B*np.sin(lclHdg), \

self.B*np.cos(lclHdg)))

lclxy=lclxy*spdwt

agentxy=np.array( (self.C*np.sin(agentHdg), \

self.C*np.cos(agentHdg)) )

gradmag=dzdll[0]*np.sin(agentHdg)+dzdll[1]*np.cos(agentHdg)

t_c=(-1*self.depth-bz)/gradmag/self.speed #Time to Collision

gradwt0=1.0/(1.0+np.exp(self.M*t_c-self.T*self.M))

gradwt1=-1.0/(1+np.exp(1.0*t_c))+1.0

gradwt=gradwt0*gradwt1*self.A

gradxy=(dzdll)/(np.sqrt(dzdll[0]**2+dzdll[1]**2))

gradxy=gradxy*gradwt

tgtxy = np.array((self.D*np.sin(tgtHdg), \

self.D*np.cos(tgtHdg)))

total=gradxy+lclxy+agentxy+tgtxy

self.heading=self.headingUV((total[0],total[1]))

self.headingAge=0.0
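
The preference blend in grouperAgent.updateHeading (extended with the depth-gradient term in dorAgent) reduces to a weighted sum of unit vectors toward the current, the present heading, and the target. The sketch below reproduces one grouper update with arbitrary coefficients and headings:

import numpy as np

B, C, D = -0.2, 0.3, 0.5                  # already normalized: |B|+|C|+|D| = 1
speed = 1.5                               # vehicle speed, m/s
lcluv = np.array([0.5, 0.0])              # local current (u, v), m/s
lclmag = np.hypot(lcluv[0], lcluv[1])
spdwt = 1 - speed/(speed + lclmag)        # current weight, as in the class
hdg_lcl = np.arctan2(lcluv[0], lcluv[1])  # radians
hdg_agent, hdg_tgt = np.radians(45.0), np.radians(90.0)

def unit(h):
    return np.array([np.sin(h), np.cos(h)])

total = spdwt*B*unit(hdg_lcl) + C*unit(hdg_agent) + D*unit(hdg_tgt)
heading = np.degrees(np.arctan2(total[0], total[1]))
print heading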

C.4 asearch.py

import numpy as np

import environment as env

from navigation import *

# Class to record nodes in a planning graph

class pathNode:

def __init__(self,depth,lat,lon,heading=0,cost=0.0,parent=None):

self.root=False #True if node is the root.

#Currently unused

if parent==None:

self.root=True

self.time=None #This node’s time. Currently unused

self.cost=cost #The cost of reaching this node

self.heur=0.0 #This node’s heuristic value.

#Currently unused

self.depth=depth

self.lat=lat

self.lon=lon

self.heading=heading

self.parent=parent #Pointer to node’s parent.

#None if node is root

self.exp=False #True if this node has been expanded

self.childs=[] #Pointers to this node’s children

# Expands a node uniformly in space within the

# specified angle of heading and

# the given branching factor. All other arguments are


# used to calculate costs of reaching new nodes

def uExpSpace(self,depth,rng,spd,VF,tstep,tree,nodes=3,angle=90.0,\

descDepth=0.0,profiling=False,descAngle=0.0,HF=None):

’Expand Uniformly in space by adding <nodes> children \

distributed every 360/<nodes> degrees at <depth> meters\

from the surface and <range> horizontal meters from \

current location’

self.exp=True

for i in xrange(nodes):

course=self.heading-angle/2+i*angle/(nodes-1)

newlat=self.lat+lFromD(lFromDC(rng,np.radians(course)))

newlon=self.lon+pFromD(self.lat,pFromDC(rng,\

np.radians(course)))

newdepth=self.depth

addit=True

for j in xrange(len(tree)):

if(gcRng([newlat,newlon],[tree[j].lat,\

tree[j].lon])<0.25*rng):

if(self.cost+costT([[self.lat,self.lon,self.depth],\

[tree[j].lat,tree[j].lon,\

self.depth],spd,VF,tstep,\

self.cost,descDepth,\

profiling,descAngle,\

HF])>tree[j].cost):

addit=False

if addit==True:

newc=self.cost + costT([[self.lat,self.lon,self.depth],\

[newlat,newlon,newdepth],spd,VF,\

tstep,self.cost,descDepth,\

profiling,descAngle,HF])

if np.isnan(newc)==False:

self.childs.append(pathNode(newdepth,newlat,\

newlon,course,\

newc,self))

# Expand a node by adding a single node on a direct course

#between self and objective

def dExp(self,rng,spd,VF,tstep,target,descDepth,profiling,\
descAngle,HF=None):

tgtLat=target[0]

tgtLon=target[1]

tgtDep=target[2]

course=np.degrees(CFromlp(tgtLat-self.lat,tgtLon-self.lon))

newlat=self.lat+lFromD(lFromDC(rng,np.radians(course)))

newlon=self.lon+pFromD(self.lat,pFromDC(rng,np.radians(course)))

newc=self.cost + costT([[self.lat,self.lon,self.depth],\

[newlat,newlon,tgtDep],spd,VF,\

tstep,self.cost,descDepth,\

profiling,descAngle,HF])

if np.isnan(newc)==False:

self.childs.append(pathNode(tgtDep,newlat,newlon\

,course,newc,self))


# Calculate the cost of traveling from one node to another

def costT(args):

start=args[0] # Starting lat,lon,depth

end=args[1] # Ending lat,lon,depth

auvspd=args[2] # Vehicle Speed in m/s

VF=args[3] # Velocity Field

tstep=args[4] # time step in seconds

t0=args[5] # starting time in hours

descDepth=args[6] # Operating depth

profiling=args[7] # Boolean to control profiling

descAngle=args[8] # Descent Angle When Profiling

HF=args[9]

thresh=(auvspd+VF.maxmag)*tstep*np.sqrt(2) # Range from objective

# to consider objective reached

lcluvpre=np.array((0.0,0.0))

timeout=72.0*3.0 # Arbitrary timeout in hours

# to prevent getting stuck

# in infinite loop

stlat=np.float(start[0])

stlon=np.float(start[1])

# stdep=np.float(start[2])

enlat=np.float(end[0])

enlon=np.float(end[1])

#endep=np.float(end[2])

descent=True

zspd=auvspd*np.tan(np.radians(descAngle))

if profiling==True:

nowdep=0.0

else:

nowdep=descDepth

nowlat=stlat

nowlon=stlon

cost=0.0

prevrng=gcRng([stlat,stlon],[enlat,enlon])

RNG=prevrng

# Vehicle homes from start to objective in the current

# field, with model running at a temporal resolution of tstep

while(RNG > thresh):

prevrng=np.float(RNG)

auvC=CFromlp(enlat-nowlat,enlon-nowlon)

lcluv=VF.linterpTDLL(t0+cost,nowdep,nowlat,nowlon)

if np.any(np.isnan(np.array(lcluv))):

lcluv=lcluvpre

else:

lcluvpre=lcluv

###################################################

lclspd=np.sqrt(lcluv[0]**2+lcluv[1]**2)

dhdg=subHeadings(headingUV(lcluv),np.degrees(gcCrs((nowlat,\

nowlon),(enlat,enlon))) )

xwind= -1*lclspd*np.sin(np.radians(dhdg))

offset=0.0

if np.abs( xwind ) < auvspd:


offset=np.degrees( np.arcsin( xwind/(auvspd) ))

heading = np.degrees(gcCrs((nowlat,nowlon),\

(enlat,enlon))) + offset

auvC=np.radians(heading)

###########################

auvuv=[auvspd*np.sin(auvC),auvspd*np.cos(auvC)]

nowlon=nowlon+pFromD(nowlat,(auvuv[0]+lcluv[0])*tstep)

nowlat=nowlat+lFromD((auvuv[1]+lcluv[1])*tstep)

if profiling==True:

if nowdep==0:

descent=True

if nowdep==descDepth:

descent=False

if descent==True:

nowdep=np.min((nowdep+zspd*tstep,descDepth))

else:

nowdep=np.max((nowdep-zspd*tstep,0))

else:

nowdep=descDepth

cost=cost+tstep/3600.0

if np.isnan(lcluv[0]) or np.isnan(lcluv[1]) or np.any(\

np.isnan(np.array((nowlat,nowlon)))):

return np.nan

elif cost>timeout:

return np.nan

if HF!=None:

if HF.linterpll((nowlat,nowlon))>-1*nowdep:

return np.nan

RNG=gcRng([nowlat,nowlon],[enlat,enlon])

# Correct for the last bit of distance remaining when loop terminates

pmr=prevrng-RNG

if pmr!=0:

cost=cost+RNG/((pmr)/tstep)/3600.0

return cost

# Search for a direct route between st and en

def DirectSearch(st,en,legDist,t0,spd,tstep,VF,descDepth=0.0,\

HF=None,Q=None,profiling=False,descAngle=0.0):

Root=pathNode(st[2],st[0],st[1],np.degrees(\

CFromlp(en[0]-st[0],en[1]-st[1])))

Graph=[Root]

RNG=gcRng([Root.lat,Root.lon],[en[0],en[1]])

thresh=200

if RNG>thresh:

loopcon=True

else:

loopcon=False

while(loopcon):

Graph[len(Graph)-1].dExp(np.nanmin(np.array([legDist,RNG]))\

,spd,VF,tstep,en,descDepth,profiling,descAngle)

Graph=Graph+Graph[len(Graph)-1].childs


tgt=Graph[len(Graph)-1]

RNG=gcRng([tgt.lat,tgt.lon],[en[0],en[1]])

if RNG < thresh:

loopcon=False

tgt=Graph[len(Graph)-1]

tcost=tgt.cost

log=list()

while tgt.parent!=None:

log.append([tgt.lat,tgt.lon,tgt.depth])

tgt=tgt.parent

log.append([tgt.lat,tgt.lon,tgt.depth])

log.reverse()

log=np.array(log)

return [log,tcost]

# Dynamically build a tree and search it for a minimum time route

# between st and en

def AstarSearchTime(st,en,legDist,t0,spd,tstep,VF,descDepth=0.0,\

HF=None,timeout=None,Q=None,branchFactor=3,branchAngle=90.0\

,profiling=False,descAngle=0.0):

# Create the root of the tree

Root=pathNode(st[2],st[0],st[1],np.degrees(CFromlp(\

en[0]-st[0],en[1]-st[1])))

# Use a list to hold all of the nodes that are created

Graph=[Root]

# Use another list to hold the nodes that have not yet been visited

Candidates=[Root]

RNG=gcRng([Root.lat,Root.lon],[en[0],en[1]])

log=list()

thresh=200 # Arbitrary threshold of 200m from objective

# to stop planning

if RNG>thresh:

loopcon=True

else:

loopcon=False

while(loopcon):

# Create a list and store the cost+heuristic values of each

# of the candidate nodes

costs=list()

for i in Candidates:

costs.append(i.cost+gcRng([i.lat,i.lon],\

[en[0],en[1]])/(spd+VF.maxmag)/3600.0 )

try:
# timeout may be None, so guard before comparing
if timeout!=None and np.nanmin(costs)*3600.0>=timeout:
print 'Timeout Exceeded'
return [None,None]
exploc=costs.index(np.nanmin(costs))

except (ValueError):

print ’No route found.’

return [None,None]

# Choose to expand the node with the lowest cost+heuristic value

tgt=Candidates[exploc]


RNG=gcRng([tgt.lat,tgt.lon],[en[0],en[1]])

# Add the current node to the interprocess

# queue to update the GUI

if Q!=None:

Q.put([’<currentNode>’,tgt])

if RNG > thresh:

loopcon=True

Candidates[exploc].uExpSpace(tgt.depth,\

np.nanmin(np.array([legDist,RNG])),\

spd,VF,tstep,Candidates,branchFactor,\

branchAngle,descDepth,profiling,descAngle,HF)

# When the search is close to the end, always

# consider a direct route to the objective

if RNG<legDist:

Candidates[exploc].dExp(np.nanmin(\

np.array([legDist,RNG])),spd,VF,tstep,\

en,descDepth,profiling,descAngle,HF)

# Add the children of the visited node to the Candidates

# and Graph

# lists, and remove the visited node and its cost from the

# candidate list

Candidates=Candidates+Candidates[exploc].childs

Graph=Graph+Candidates[exploc].childs

Candidates.pop(exploc)

costs.pop(exploc)

else:

loopcon=False

tcost=tgt.cost

if timeout!=None and tcost*3600.0>timeout:

print ’Timeout Exceeded’

return [None, None]

#log.append([en[0],en[1],en[2]])

while tgt.parent!=None:

log.append([tgt.lat,tgt.lon,tgt.depth])

tgt=tgt.parent

log.append([tgt.lat,tgt.lon,tgt.depth])

#log.append([st[0],st[1],st[2]])

log.reverse()

log=np.array(log)

if Q!=None:

Q.put([’<intRoute>’,log])

np.save(’plotlog.npy’,log)

return [log,tcost]
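
A minimal invocation sketch for the planner above; the start and end points, leg distance, and speed are illustrative, and the current file is the Gulf set used elsewhere in this appendix:

import environment as env

VF = env.VField('SEGulf2012_05_01.npz')
st = [24.0, -86.5, 2.0]                   # lat, lon, depth
en = [24.8, -85.5, 2.0]
route, hours = AstarSearchTime(st, en, legDist=20000.0, t0=0.0,
                               spd=1.5, tstep=60.0, VF=VF,
                               descDepth=2.0, timeout=72.0*3600.0)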

# Export routes as kml for display in Google Earth

def Route2KML(routes,filename):

xmlout=open(filename,’w’)

xmlout.write(’<?xml version="1.0" standalone="yes"?>\n’)

xmlout.write(’<kml xmlns="http://earth.google.com/kml/2.2">\n’)

xmlout.write(’ <Document>\n’)

xmlout.write(’ <Folder id="Tracks">\n’)

for j in xrange(len(routes)):

xmlout.write(’ <Placemark>\n’)


xmlout.write(’ <MultiGeometry>\n’)

xmlout.write(’ <LineString>\n’)

xmlout.write(’ <altitudeMode>clampToGround</altitudeMode>\n’)

xmlout.write(’ <coordinates>\n’)

for i in xrange(0,len(routes[j])):

outstring=str(routes[j][i,1])+’,’+str(routes[j][i,0])+’,’+\

str(0)+’\n’

xmlout.write(outstring)

xmlout.write(’ </coordinates>\n’)

xmlout.write(’ <tessellate>1</tessellate>\n’)

xmlout.write(’ </LineString>\n’)

xmlout.write(’ </MultiGeometry>\n’)

xmlout.write(’ <Snippet></Snippet>\n’)

xmlout.write(’ <Style>\n’)

xmlout.write(’ <LineStyle>\n’)

# every track gets the same line color (KML aabbggrr: opaque green)
xmlout.write(' <color>FF00FF00</color>\n')

xmlout.write(’ <width>3</width>\n’)

xmlout.write(’ </LineStyle>\n’)

xmlout.write(’ </Style>\n’)

xmlout.write(’ </Placemark>\n’)

xmlout.write(’ <name>Tracks</name>\n’)

xmlout.write(’ <open>0</open>\n’)

xmlout.write(’ <visibility>1</visibility>\n’)

xmlout.write(’ </Folder>\n’)

xmlout.write(\

’<Snippet><![CDATA[Path planned by APPv0.2]]></Snippet>\n’)

xmlout.write(’ <name><![CDATA[Planned Path]]></name>\n’)

xmlout.write(’ <open>1</open>\n’)

xmlout.write(’ <visibility>1</visibility>\n’)

xmlout.write(’ </Document>\n’)

xmlout.write(’</kml>’)

xmlout.close()
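
The route returned by either search can then be written out for inspection in Google Earth (file name arbitrary):

Route2KML([route], 'planned_path.kml')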

C.5 nc2npz.py

import sys

import environment as e

import numpy as np

# Read the NetCDF file passed as the command line argument and save it

# as a numpy zip file

VF0=e.VField(sys.argv[1])

np.savez(sys.argv[2],u=VF0.u,v=VF0.v,lats=VF0.lats,\

lons=VF0.lons,depths=VF0.depths,times=VF0.times)


C.6 toponc2npz.py

import scipy.io.netcdf as ncio

import numpy as np

# Read topographic data in a NetCDF file from the

# following line and save it as a numpy zip file

ncfname=’/home/aj/Dropbox/EclipseWorkspace/GrouperNav/etopo1_gulf0.nc’

ncfile=ncio.netcdf_file(ncfname,’r’)

z=np.array(ncfile.variables.get(’Band1’).data,np.float)

z=np.flipud(z)

print z.shape

lats=np.zeros(181)

lons=np.zeros(301)

for i in xrange(301):
lons[i]=-88.5+1.0/60.0*i
for i in xrange(181):
lats[i]=23.1+1.0/60.0*i

np.savez(’etopo1_gulf0.npz’,x=lons,y=lats,z=z)

C.7 ncinfo.py

import sys

sys.path.append(’./’)

sys.path.append(’./resources’)

import scipy as sp

import numpy as np

import scipy.io.netcdf as ncio

if len(sys.argv)!=2:

print "Usage: python ncinfo.py FILENAME.nc"

else:

ncname=sys.argv[1]

ncfile=ncio.netcdf_file(ncname, ’r’)

ncvars=ncfile.variables.items()

print ’dimensions: ’+str(ncfile.dimensions)

print ncname+’ Contains The Following Variables:’

print 'Name\t|\tEntries\t|\tMin\t|\tMax'

print ’------------------------------------------------------’

for i in xrange(0,len(ncvars)):

varentry=list(ncfile.variables.get(ncvars[i][0]))

print ncvars[i][0]+’\t|\t’+str(sp.size(varentry))+\

’\t|\t’+str(sp.nanmin(varentry))+’\t|\t’+str(sp.nanmax(varentry))


C.8 GenWaypoints.py

import random

import numpy as np

import environment as env

import navigation as nav

WPFileName=’EWCarib150.csv’

VFileName=’Carib2012_05_01.npz’

HFileName=’ETOPO2.npz’

VF=env.VField(VFileName)

HF=env.HField(HFileName)

depth=2.0

trial=0

trials=100

fll=np.array( (14.5,-63.0,16.5,-59.5) )

Range=150000.0 # meters

Doff=np.degrees(Range/(nav.REarth*np.cos(\

np.radians((fll[0]+fll[2])/2.0))))

while(trial<trials):

valid=False

while valid==False:

temp0=(random.uniform(0.0,1.0)*(\

fll[2]-fll[0]) + fll[0],(fll[1]+\

fll[3])/2.0-Doff/2.0 ) #lat lon

temp1=(temp0[0],temp0[1]+Doff) # lat lon

if HF.linterpll(temp0)<(-1.0*depth) \

and HF.linterpll(temp1)<(-1.0*depth)\

and np.isnan(VF.linterpTDLL(0,depth,\

temp0[0],temp0[1])[0])==False\

and np.isnan(VF.linterpTDLL(0,depth\

,temp1[0],temp1[1])[0])==False:

valid=True

print "Depth at temp0: ",HF.linterpll(temp0),\

" Depth at temp1: ",HF.linterpll(temp1)," LCLUV Temp0: ",\

VF.linterpTDLL(0,depth,temp0[0],temp0[1])," LCLUV Temp1: ",\

VF.linterpTDLL(0,depth,temp1[0],temp1[1])

if random.uniform(0.0,1.0) >= 0.5:

spawn=temp0

target=temp1

else:

spawn=temp1

target=temp0

trial+=1

print trial

WPFile=open(WPFileName,’a’)

WPFile.write(str(spawn[0])+’,’+str(spawn[1])+’,’+str(target[0])+\

’,’+str(target[1])+’\n’)

WPFile.close()


C.9 GenWaypointsGulf.py

import random

import numpy as np

import environment as env

import navigation as nav

WPFileName=’Gulf100.csv’

VFileName=’SEGulf2012_05_01.npz’

HFileName=’etopo1_gulf0.npz’

VF=env.VField(VFileName)

HF=env.HField(HFileName)

depth=2.0

trial=0

trials=100

fll=np.array( (23.1,-88.5,26.1,-83.5 ))

Range=150000.0 # meters

Doff=np.degrees(Range/(nav.REarth*np.cos(np.radians(fll[2]))))

while(trial<trials):

valid=False

bearing=random.uniform(0.0,180.0)

DegLons=Doff*np.sin(np.radians(bearing))

DegLats=Doff*np.cos(np.radians(bearing))

while valid==False:

temp0=(random.uniform(23.1,26.1-DegLats),\

random.uniform(-88.5,-83.5-DegLons) )

DCrs=np.degrees(Range/(nav.REarth*np.cos(\

np.radians(temp0[0]))))

DegLons=DCrs*np.sin(np.radians(bearing))

DegLats=DCrs*np.cos(np.radians(bearing))

temp1=(temp0[0]+DegLats,temp0[1]+DegLons)

if HF.linterpll(temp0)<(-1.0*depth) \

and HF.linterpll(temp1)<(-1.0*depth)\

and np.isnan(VF.linterpTDLL(0,depth,\

temp0[0],temp0[1])[0])==False\

and np.isnan(VF.linterpTDLL(0,\

depth,temp1[0],temp1[1])[0])==False:

valid=True

print "Depth at temp0: ",HF.linterpll(temp0),\

" Depth at temp1: ",HF.linterpll(temp1)," LCLUV Temp0: ",\

VF.linterpTDLL(0,depth,temp0[0],temp0[1])," LCLUV Temp1: ",\

VF.linterpTDLL(0,depth,temp1[0],temp1[1])

if random.uniform(0.0,1.0) >= 0.5:

spawn=temp0

target=temp1

else:

spawn=temp1

target=temp0

trial+=1

print trial

WPFile=open(WPFileName,’a’)

WPFile.write(str(spawn[0])+’,’+str(spawn[1])+’,’+str(target[0])+’,’+\


str(target[1])+’\n’)

WPFile.close()

C.10 GrouperBehavior-P3d.py

import direct.showbase.ShowBase as psb

import direct.task as pt

from direct.gui.DirectGui import *

from panda3d.core import *

import random

import time as t

import numpy as np

import pylab as pl

import pygame

import agents

import environment as env

import navigation as nav

spawn=(-45.5,-0.5) # lat lon

target=(45.5,45.5) # lat lon

uv=(0.0000001,0.000001)

spd=130

hdg=np.degrees(nav.gcCrs(spawn,target))

energy=24*7.0*3600.0

power=1.0

timestep=60.0 # seconds

thresh=(spd+np.sqrt(uv[0]**2+uv[1]**2))*timestep # meters

missiontime=0.0

timeout=24*7.0*3600

Params=(0.0,-1.0,0.0,0.5)

tagent=agents.Agent()

tagent.pos=spawn

gcr=nav.gcRng(spawn,target)

meanlat=(spawn[0]+target[0])/2.0

meanlon=(spawn[1]+target[1])/2.0

gcr=gcr/nav.REarth*180.0/np.pi

fll=np.array((meanlat-gcr/2*1.5,meanlon-gcr/2*1.5,\

meanlat+gcr/2*1.5,meanlon+gcr/2*1.5))

fxy=np.array((-50,-50,50,50))

def lalo2xy(flalo,fxy,lalo):

x = (lalo[1] - flalo[1]) / (flalo[3]-flalo[1]) *\

(fxy[2]-fxy[0])+fxy[0]

y = (lalo[0] - flalo[0]) / (flalo[2]-flalo[0]) *\

(fxy[3]-fxy[1])+fxy[1]

return (x,y)

class Model(psb.ShowBase):

def __init__(self):

global k

psb.ShowBase.__init__(self)


render.setAntialias(AntialiasAttrib.MAuto)

self.setBackgroundColor(VBase3(1.0,1.0,1.0))

light=DirectionalLight(’Sun’)

light.setColor(VBase4(1.0,1.0,1.0,1.0))

lightNode=self.render.attachNewNode(light)

lightNode.setHpr(45.0,-45.0,0.0)

render.setLight(lightNode)

self.bestg=np.nan

self.bestp=np.nan

self.bestc=np.nan

self.startMark=self.loader.loadModel(’/home/aj/markerGreen.x’)

self.startMark.reparentTo(self.render)

txy=lalo2xy(fll,fxy,(spawn[0],spawn[1]))

self.startMark.setPos(txy[0],txy[1],0.0)

self.endMark=self.loader.loadModel(’/home/aj/markerRed.x’)

self.endMark.reparentTo(self.render)

self.bestgLabel=DirectLabel(text=’G: ’+\

str(self.bestg/3600),text_bg=(0.0,0.0,0.0,0.0),\

text_fg=(0.0,0.0,1.0,1.0),frameColor=(0.0,0.0,0.0,0.0)\

,text_scale=(0.08,0.08),text_pos=(-0.5,0.8))

self.bestpLabel=DirectLabel(text=’P: ’+str(self.bestp/3600),\

text_bg=(0.0,0.0,0.0,0.0),text_fg=(0.7,0.7,0.0,1.0),\

frameColor=(0.0,0.0,0.0,0.0),text_scale=(0.08,0.08),\

text_pos=(0.0,0.8))

self.bestcLabel=DirectLabel(text=’C: ’+str(self.bestc/3600),\

text_bg=(0.0,0.0,0.0,0.0),text_fg=(1.0,0.0,0.0,1.0),\

frameColor=(0.0,0.0,0.0,0.0),text_scale=(0.08,0.08),\

text_pos=(0.5,0.8))

txy=lalo2xy(fll,fxy,(target[0],target[1]))

self.endMark.setPos(txy[0],txy[1],0.0)

self.alist=list()

self.alist.append([agents.pigeonAgent(spawn,hdg,spd,target,\

thresh,energy,power),self.loader.loadModel(\

’/home/aj/agentYellow.x’)])

self.alist.append([agents.crabAgent(spawn,hdg,spd,\

target,thresh,energy,power),self.loader.loadModel(\

’/home/aj/agentRed.x’)])

self.alist.append([agents.grouperAgent(0,Params[1],\

Params[2],Params[3],spawn,hdg,spd,\

target,thresh,energy,power),\

self.loader.loadModel(’/home/aj/agentBlue.x’)])

self.taskMgr.add(self.moveAgents,"moveAgents")

self.missiontime=0

for i in self.alist:

i[1].reparentTo(self.render)

xy=lalo2xy(fll,fxy,i[0].pos)

i[1].setPos(xy[0],xy[1],0.0)

def moveAgents(self,task):

self.camera.setPos(0,0,200)

self.camera.setP(-90.0)

global bestg, bestParams, bestp, bestc,epoch,epochs,stdev


if self.missiontime>timeout or len(self.alist)==0:

self.alist.append([agents.pigeonAgent(spawn,hdg,spd,target,\

thresh,energy,power),self.loader.loadModel(\

’/home/aj/agentYellow.x’)])

self.alist.append([agents.crabAgent(spawn,hdg,spd,target,\

thresh,energy,power),self.loader.loadModel(\

’/home/aj/agentRed.x’)])

self.alist.append([agents.grouperAgent(0,Params[1],\

Params[2],Params[3],spawn,hdg,spd,target,thresh,\

energy,power),self.loader.loadModel(\

’/home/aj/agentBlue.x’)])

for i in self.alist:

i[1].reparentTo(self.render)

xy=lalo2xy(fll,fxy,i[0].pos)

i[1].setPos(xy[0],xy[1],0.0)

self.missiontime=0.0

else:

self.missiontime=self.missiontime+timestep

kills=list()

for i in range(len(self.alist)):

killi=False

self.alist[i][0].updateAll(uv,timestep,0)

if self.alist[i][0].__class__==agents.pigeonAgent or \
self.alist[i][0].__class__==agents.crabAgent:

print self.alist[i][0].heading,self.alist[i][0].pos

if (np.isnan(self.alist[i][0].pos[0]) or \

self.alist[i][0].energy <= 0.0 or ( ((fll[0]<\

self.alist[i][0].pos[0]<fll[2]) and \

(fll[1]<self.alist[i][0].pos[1]<fll[3]))!=True )):

killi=True

elif nav.gcRng(self.alist[i][0].pos,\

self.alist[i][0].tgt)<thresh:

drdt=(nav.gcRng(self.alist[i][0].prepos,\

self.alist[i][0].tgt)-nav.gcRng(\

self.alist[i][0].pos,self.alist[i][0].tgt))\

/timestep

dt=nav.gcRng(self.alist[i][0].pos,\

self.alist[i][0].tgt)/drdt

if self.alist[i][0].__class__==agents.crabAgent and\

((self.missiontime+dt < self.bestc) or \

(np.isnan(self.bestc))):

self.bestc = self.missiontime+dt

self.bestcLabel[’text’]=’C: ’+str(np.round(\

self.bestc/3600,2))

if self.alist[i][0].__class__==agents.pigeonAgent \

and ( (self.missiontime +dt < self.bestp ) or \

(np.isnan(self.bestp)) ):

self.bestp = self.missiontime+dt

self.bestpLabel[’text’]=’P: ’+str(np.round(\

self.bestp/3600,2))


if self.alist[i][0].__class__==agents.grouperAgent \

and ( (self.missiontime+dt < self.bestg ) or\

(np.isnan(self.bestg)) ):

self.bestg = self.missiontime+dt

self.bestgLabel[’text’]=’G: ’+str(\

np.round(self.bestg/3600,2))

killi=True

if killi==False:

xy=lalo2xy(fll,fxy,self.alist[i][0].pos)

self.alist[i][1].setH(-1*self.alist[i][0].heading)

self.alist[i][1].setPos(xy[0],xy[1],0.0)

else:

self.alist[i][1].removeNode()

kills.append(self.alist[i])

for i in range(len(kills)):

self.alist.remove(kills[i])

return pt.Task.cont

theProgram = Model()

theProgram.run()

C.11 GrouperView-P3d.py

import direct.showbase.ShowBase as psb

import direct.task as pt

from direct.gui.DirectGui import *

from panda3d.core import *

import random

import time as t

import numpy as np

import pylab as pl

import pygame

import agents

import environment as env

import navigation as nav

import xml.etree.ElementTree as et

depth=25.0

logfile=’Results/Carib_300cms/TrialLog_21Nov2012_2115.csv’

x=np.genfromtxt(logfile,delimiter=’,’)

tNumber=0

trial=x[:,0]

timeReq=x[:,1]

spnLat=x[:,2]

spnLon=x[:,3]

tgtLat=x[:,4]

tgtLon=x[:,5]

gcRng=x[:,6]

bestp=x[:,7]

bestc=x[:,8]


bestg=x[:,9]

bestEpoch=x[:,10]

bestA=x[:,11]

bestB=x[:,12]

bestC=x[:,13]

bestD=x[:,14]

groupers=x[:,15]

epochs=x[:,16]

startStdev=x[:,17]

pconv=x[:,18]

cconv=x[:,19]

gconv=x[:,20]

testcfg=et.parse(logfile.split(’.’)[0]+’.xml’)

spd=np.float(testcfg.find(’speed’).text)

energy=np.float(testcfg.find(’energy’).text)

power=np.float(testcfg.find(’power’).text)

timestep=np.float(testcfg.find(’timeStep’).text) # seconds

thresh=np.float(testcfg.find(’threshold’).text) # meters

timeout=np.float(testcfg.find(’timeout’).text)

VF=env.VField(’Carib2012_05_01.npz’)

HF=env.HField(’ETOPO2.npz’)

k=0

for j in trial:

if j==np.float(tNumber):

k=int(j)

spawn=(spnLat[k],spnLon[k]) # lat lon

target=(tgtLat[k],tgtLon[k]) # lat lon

hdg=np.degrees(nav.gcCrs(spawn,target))

bestParams=(x[k,11],x[k,12],x[k,13],x[k,14])

meanlat=(spawn[0]+target[0])/2.0

meanlon=(spawn[1]+target[1])/2.0

fll=np.array((23.1,-88.5,26.1,-83.5))

fxy=np.array((-75,-75,75,75))

def lalo2xy(flalo,fxy,lalo):

x = (lalo[1] - flalo[1]) / (flalo[3]-flalo[1]) * \

(fxy[2]-fxy[0])+fxy[0]

y = (lalo[0] - flalo[0]) / (flalo[2]-flalo[0]) * \

(fxy[3]-fxy[1])+fxy[1]

return (x,y)

def settNumber(newtNumber):

global tNumber

tNumber=newtNumber

class Model(psb.ShowBase):

def __init__(self):

global k

psb.ShowBase.__init__(self)

render.setAntialias(AntialiasAttrib.MAuto)

self.setBackgroundColor(VBase3(1.0,1.0,1.0))

light=DirectionalLight(’Sun’)

light.setColor(VBase4(1.0,1.0,1.0,1.0))

lightNode=self.render.attachNewNode(light)

lightNode.setHpr(45.0,-45.0,0.0)


render.setLight(lightNode)

self.bc=np.nan

self.bg=np.nan

self.bp=np.nan

self.startMark=self.loader.loadModel(’/home/aj/markerGreen.x’)

self.startMark.reparentTo(self.render)

txy=lalo2xy(fll,fxy,(spawn[0],spawn[1]))

self.startMark.setPos(txy[0],txy[1],0.0)

self.endMark=self.loader.loadModel(’/home/aj/markerRed.x’)

self.endMark.reparentTo(self.render)

self.Tentry=DirectEntry(text=’’,command=settNumber,\

initialText=str(tNumber),numLines=1,text_bg=\

(0.0,0.0,0.0,0.0),scale=0.08,width=4,frameColor=\

(0.0,0.0,0.0,0.0),text_pos=(-1.0,0.7))

self.bestgLabel=DirectLabel(text=’G: ’+str(np.round(\

0/3600,2)),text_bg=(0.0,0.0,0.0,0.0),text_fg=\

(0.0,0.0,1.0,1.0),frameColor=(0.0,0.0,0.0,0.0)\

,text_scale=(0.08,0.08),text_pos=(-0.5,0.8))

self.bestpLabel=DirectLabel(text=’P: ’+str(np.round(\

0/3600,2)),text_bg=(0.0,0.0,0.0,0.0),text_fg=\

(0.7,0.7,0.0,1.0),frameColor=(0.0,0.0,0.0,0.0)\

,text_scale=(0.08,0.08),text_pos=(0.0,0.8))

self.bestcLabel=DirectLabel(text=’C: ’+str(np.round(\

0/3600,2)),text_bg=(0.0,0.0,0.0,0.0),text_fg=\

(1.0,0.0,0.0,1.0),frameColor=(0.0,0.0,0.0,0.0),\

text_scale=(0.08,0.08),text_pos=(0.5,0.8))

self.trialLabel=DirectLabel(text=’Trial: ’+str(tNumber)\

,text_bg=(0.0,0.0,0.0,0.0),text_fg=(0.0,0.0,0.0,1.0)\

,frameColor=(0.0,0.0,0.0,0.0),text_scale=(0.08,0.08)\

,text_pos=(-1.0,0.8))

self.rangeLabel=DirectLabel(text=’Range: ’,text_bg=\

(0.0,0.0,1.0,0.0),text_fg=(0.0,0.0,0.0,1.0),frameColor=\

(0.0,0.0,0.0,0.0),text_scale=(0.08,0.08),\

text_pos=(-0.2,-0.8))

self.rngvalLabel=DirectLabel(text=str(nav.gcRng(\

spawn,target)),text_bg=(0.0,0.0,1.0,0.0),text_fg=\

(0.0,0.0,0.0,1.0),frameColor=(0.0,0.0,0.0,0.0),\

text_scale=(0.08,0.08),text_pos=(0.2,-0.8))

self.mtimeLabel=DirectLabel(text=’MTime: ’,text_bg=\

(0.0,0.0,1.0,0.0),text_fg=(0.0,0.0,0.0,1.0),frameColor=\

(0.0,0.0,0.0,0.0),text_scale=(0.08,0.08),\

text_pos=(-0.2,-0.7))

self.mtvalLabel=DirectLabel(text=str(0.0),text_bg=\

(0.0,0.0,1.0,0.0),text_fg=(0.0,0.0,0.0,1.0),\

frameColor=(0.0,0.0,0.0,0.0),text_scale=(0.08,0.08)\

,text_pos=(0.2,-0.7))

txy=lalo2xy(fll,fxy,(target[0],target[1]))

self.endMark.setPos(txy[0],txy[1],0.0)

self.alist=list()

self.alist.append([agents.pigeonAgent(spawn,depth,hdg,\

spd,target,thresh,energy,power),self.loader.loadModel(\

’/home/aj/agentYellow.x’)])


self.alist.append([agents.crabAgent(spawn,depth,hdg,spd,\

target,thresh,energy,power),self.loader.loadModel(\

’/home/aj/agentRed.x’)])

self.alist.append([agents.grouperAgent(bestParams[0],\

bestParams[1],bestParams[2],bestParams[3],spawn,\

depth,hdg,spd,target,thresh,energy,power),\

self.loader.loadModel(’/home/aj/agentBlue.x’)])

self.taskMgr.add(self.moveAgents,"moveAgents")

self.missiontime=0

for i in self.alist:

i[1].reparentTo(self.render)

xy=lalo2xy(fll,fxy,i[0].pos)

i[1].setPos(xy[0],xy[1],0.0)

def moveAgents(self,task):

global k,bestg, bestParams, bestp, bestc,\

epoch,epochs,stdev,fll,fxy,meanlat,meanlon\

,gcr,spawn,target,hdg,energy,power

xy=lalo2xy(fll,fxy,(meanlat,meanlon))

self.camera.setPos(xy[0],xy[1],200)

self.camera.setP(-90.0)

kills=list()

if len(self.alist)==0:

if self.trialLabel[’text’].split(’ ’)[1]!=str(tNumber):

self.trialLabel[’text’]=’Trial: ’+str(tNumber)

k=0

for j in trial:

if j==np.float(tNumber):

k=int(j)

spawn=(spnLat[k],spnLon[k]) # lat lon

target=(tgtLat[k],tgtLon[k]) # lat lon

txy=lalo2xy(fll,fxy,(spawn[0],spawn[1]))

self.startMark.setPos(txy[0],txy[1],0.0)

txy=lalo2xy(fll,fxy,(target[0],target[1]))

self.endMark.setPos(txy[0],txy[1],0.0)

hdg=np.degrees(nav.gcCrs(spawn,target))

bestParams=(x[k,11],x[k,12],x[k,13],x[k,14])

gcr=nav.gcRng(spawn,target)

meanlat=(spawn[0]+target[0])/2.0

meanlon=(spawn[1]+target[1])/2.0

fll=np.array((14.5,-63.0,16.5,-59.5))

fxy=np.array((-75,-75,75,75))

self.bc=np.nan

self.bg=np.nan

self.bp=np.nan

self.bestgLabel[’text’]=’G: ’+str(np.round(0/3600,2))

self.bestcLabel[’text’]=’C: ’+str(np.round(0/3600,2))

self.bestpLabel['text']='P: '+str(np.round(0/3600,2))

self.alist.append([agents.pigeonAgent(spawn,depth,hdg,spd\

,target,thresh,energy,power),self.loader.loadModel(\

’/home/aj/agentYellow.x’)])

self.alist.append([agents.crabAgent(spawn,depth,hdg,spd,\

target,thresh,energy,power),self.loader.loadModel(\


’/home/aj/agentRed.x’)])

if bestParams!=(0.0,0.0,0.0,0.0):

self.alist.append([agents.dorAgent(bestParams[0],\

bestParams[1],bestParams[2],bestParams[3],spawn\

,depth,hdg,spd,target,thresh,energy,power),\

self.loader.loadModel(’/home/aj/agentBlue.x’)])

for i in self.alist:

i[1].reparentTo(self.render)

xy=lalo2xy(fll,fxy,i[0].pos)

i[1].setPos(xy[0],xy[1],0.0)

self.missiontime=0.0

self.mtvalLabel[’text’]=str(np.round(self.missiontime/3600.0,2))

self.missiontime=self.missiontime+timestep

for i in xrange(len(self.alist)):

killi=False

self.alist[i][0].updateAll(VF.linterpTDLL((self.missiontime\

-timestep)/3600.0,depth,self.alist[i][0].pos[0],\

self.alist[i][0].pos[1]),HF.grad(self.alist[i][0].pos,\

100.0),HF.linterpll(self.alist[i][0].pos),timestep,0)

if self.alist[i][0].__class__==agents.crabAgent:

drdt=(nav.gcRng(self.alist[i][0].prepos,\

self.alist[i][0].tgt)-nav.gcRng(\

self.alist[i][0].pos,self.alist[i][0].tgt))/timestep

self.rngvalLabel[’text’]=str(np.round(nav.gcRng(\

self.alist[i][0].pos,self.alist[i][0].tgt),0))

if (np.isnan(self.alist[i][0].pos[0]) or \

self.alist[i][0].energy <= 0.0 or ( ((fll[0]<\

self.alist[i][0].pos[0]<fll[2]) and (fll[1]<\

self.alist[i][0].pos[1]<fll[3]))!=True )):

killi=True

elif nav.gcRng(self.alist[i][0].pos,self.alist[i][0].tgt)\

<thresh:

drdt=(nav.gcRng(self.alist[i][0].prepos,\

self.alist[i][0].tgt)-nav.gcRng(self.alist[i][0].pos\

,self.alist[i][0].tgt))/timestep

dt=nav.gcRng(self.alist[i][0].pos,self.alist[i][0].tgt)\

/drdt

if self.alist[i][0].__class__==agents.crabAgent and\

((self.missiontime + dt < self.bc) or \

(np.isnan(self.bc))):

self.bc = self.missiontime+dt

self.bestcLabel[’text’]=’C: ’+str(np.round(\

self.bc/3600,4))

if self.alist[i][0].__class__==agents.pigeonAgent and\

( (self.missiontime + dt < self.bp ) or \

(np.isnan(self.bp)) ):

self.bp = self.missiontime+dt

self.bestpLabel[’text’]=’P: ’+str(np.round(self.bp\

/3600,4))

if self.alist[i][0].__class__==agents.dorAgent and (\

(self.missiontime + dt < self.bg ) or (np.isnan(\

self.bg)) ):


self.bg = self.missiontime+dt

self.bestgLabel[’text’]=’G: ’+str(np.round(\

self.bg/3600,4))

killi=True

if killi==False:

xy=lalo2xy(fll,fxy,self.alist[i][0].pos)

self.alist[i][1].setH(-1*self.alist[i][0].heading)

self.alist[i][1].setPos(xy[0],xy[1],0.0)

else:

self.alist[i][1].removeNode()

kills.append(self.alist[i])

for i in xrange(len(kills)):

self.alist.remove(kills[i])

del kills

return pt.Task.cont

theProgram = Model()

theProgram.run()

C.12 popro.py

import numpy as np

def none2Nan(isnone):

if isnone==’None’:

return np.nan

else:

return np.float(isnone)

logroot=’/home/aj/Dropbox/EclipseWorkspace/GrouperNav/Results/’

lognames=[’Carib_300cms/Series0/TrialLogAStar_20k_0.csv’,\

’Carib_300cms/Series0/TrialLogAStar_10k_0.csv’,\

’Carib_300cms/Series0/TrialLog_S500_I50_0.csv’,\

’Carib_300cms/Series0/TrialLog_C_0.csv’,\

’Carib_300cms/Series0/TrialLog_P_0.csv’,\

’Carib_300cms/Series0/TrialLogZeroA_S50_I50_0.csv’,\

’Carib_300cms/Series0/TrialLogZeroA15_S50_I50_0.csv’,\

’Carib_150cms/Series0/TrialLog_S500_I50_0.csv’ ,\

’Carib_150cms/Series0/TrialLogAStar_0.csv’,\

’Carib_150cms/Series0/TrialLogAStar_20k_0.csv’,\

’Carib_150cms/Series0/TrialLog_C_0.csv’,\

’Carib_150cms/Series0/TrialLog_P_0.csv’,\

’Carib_150cms/Series0/TrialLogZeroA_S50_I50_0.csv’,\

’Carib_150cms/Series0/TrialLogZeroA15_S50_I50_0.csv’,\

’Carib_40cms/Series0/TrialLog_S500_I50_0.csv’,\

’Carib_40cms/Series0/TrialLog_S500_I50_1.csv’,\

’Carib_40cms/Series0/TrialLog_S500_I50_2.csv’,\

’Carib_40cms/Series0/TrialLogAStar_1.csv’,\

’Carib_40cms/Series0/TrialLogAStar_20k_0.csv’,\

’Carib_40cms/Series0/TrialLog_C_0.csv’,\

’Carib_40cms/Series0/TrialLog_P_0.csv’,\


’Carib_40cms/Series0/TrialLogZeroA_S50_I50_0.csv’,\

’Carib_40cms/Series0/TrialLogZeroA15_S50_I50_0.csv’,\

’Gulf_300cms/Series0/TrialLog_S500_I50_0.csv’,\

’Gulf_300cms/Series0/TrialLogAStar_0.csv’,\

’Gulf_300cms/Series0/TrialLogAStar_20k_0.csv’,\

’Gulf_300cms/Series0/TrialLog_C_0.csv’,\

’Gulf_300cms/Series0/TrialLog_P_0.csv’,\

’Gulf_300cms/Series0/TrialLogZeroA_S50_I50_0.csv’,\

’Gulf_300cms/Series0/TrialLogZeroA15_S50_I50_0.csv’,\

’Gulf_150cms/Series0/TrialLogAStar_0.csv’,\

’Gulf_150cms/Series0/TrialLogAStar_20k_0.csv’,\

’Gulf_150cms/Series0/TrialLog_S500_I50_0.csv’,\

’Gulf_150cms/Series0/TrialLog_C_0.csv’,\

’Gulf_150cms/Series0/TrialLog_P_0.csv’,\

’Gulf_150cms/Series0/TrialLogZeroA_S50_I50_0.csv’,\

’Gulf_150cms/Series0/TrialLogZeroA15_S50_I50_0.csv’,\

’Gulf_40cms/Series0/TrialLog_S500_I50_0.csv’,\

’Gulf_40cms/Series0/TrialLogAStar_20k_0.csv’,\

’Gulf_40cms/Series0/TrialLog_C_0.csv’,\

’Gulf_40cms/Series0/TrialLog_P_0.csv’,\

’Gulf_40cms/Series0/TrialLogZeroA_S50_I50_0.csv’,\

’Gulf_40cms/Series0/TrialLogZeroA15_S50_I50_0.csv’

]

res=list()
# one summary row per log, printed at the end of the loop below
print 'SucRate\tMtime(h)\teffspd\ts_effspd\ttreq\ttsuc\tFEpoch\ts_FEpoch'
for i in range(len(lognames)):
    res.append(np.genfromtxt(logroot+lognames[i],delimiter=','))

conv=0

treq=0.0

treqsuc=0.0

mtime=0.0

effspd=0.0

FEpoch=0.0

minFEpoch=None

maxFEpoch=None

for j in xrange(res[i].shape[0]):

if res[i].shape[1]==27: # GrouperNav optimizer logs

if res[i][j,26]==1.0:

conv+=1.0

if minFEpoch==None:

minFEpoch=res[i][j,17]

elif res[i][j,17]<minFEpoch:

minFEpoch=res[i][j,17]

if maxFEpoch==None:

maxFEpoch=res[i][j,17]

elif res[i][j,17]>maxFEpoch:

maxFEpoch=res[i][j,17]

FEpoch+=res[i][j,17]

treqsuc+=res[i][j,1]

mtime+=res[i][j,10]

effspd+=res[i][j,6]/res[i][j,10]

treq+=res[i][j,1]


if res[i].shape[1]==12: # earlier log layout (success flag in col 11)

if res[i][j,11]==1.0:

conv+=1.0

treqsuc+=res[i][j,1]

mtime+=res[i][j,10]

effspd+=res[i][j,6]/res[i][j,10]

treq+=res[i][j,1]

if res[i].shape[1]==13: # ANav A* logs

if res[i][j,12]==1.0:

conv+=1.0

treqsuc+=res[i][j,1]

mtime+=res[i][j,11]

effspd+=res[i][j,6]/res[i][j,11]

treq+=res[i][j,1]

if res[i].shape[1]==9: # PNav/CNav single-agent logs

if res[i][j,8]:

conv+=1.0

treqsuc+=res[i][j,1]

mtime+=res[i][j,7]

effspd+=res[i][j,6]/res[i][j,7]

treq+=res[i][j,1]

avgtreq=treq/res[i].shape[0]

avgtsuc=treqsuc/conv

avgmtime=mtime/conv

avgeffspd=effspd/conv

avgFEpoch=FEpoch/conv

# second pass: mean absolute deviations about the averages
sFEpoch=0.0
seffspd=0.0
conv=0.0

for j in xrange(res[i].shape[0]):

if res[i].shape[1]==27:

if res[i][j,26]==1.0:

conv+=1.0

sFEpoch+=np.sqrt( (res[i][j,17]-avgFEpoch)**2 ) # col 17, the first-solution epoch

seffspd+=np.sqrt( (res[i][j,6]/res[i][j,10]\

-avgeffspd)**2 )

if res[i].shape[1]==12:

if res[i][j,11]==1.0:

conv+=1.0

seffspd+=np.sqrt( (res[i][j,6]/res[i][j,10]\

-avgeffspd)**2 )

if res[i].shape[1]==13:

if res[i][j,12]==1.0:

conv+=1.0

seffspd+=np.sqrt( (res[i][j,6]/res[i][j,11]\

-avgeffspd)**2 )

if res[i].shape[1]==9:

if res[i][j,8]:

conv+=1.0

seffspd+=np.sqrt( (res[i][j,6]/res[i][j,7]\

-avgeffspd)**2 )

sFEpoch=sFEpoch/conv


seffspd=seffspd/conv

print '----'+lognames[i]+'----'
print conv/res[i].shape[0],avgmtime/3600,avgeffspd,seffspd,\
avgtreq,avgtsuc,avgFEpoch,sFEpoch,'\n'

C.13 HullProp.py
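HullProp.py sizes the hull of the surface platform from the waterline outline encoded in hf and ha: a parabolic forward waterline faired into a straight aft taper that ends at a transom of width $\chi$ (Chi). A reconstruction of that geometry, in the symbols of the code, is

\[ y_f(x)=\frac{B}{2}\left(1-\frac{x^{2}}{F^{2}}\right),\quad X_{fa}\le x\le F, \qquad y_a(x)=\frac{\chi}{2}-\frac{B\,X_{fa}}{F^{2}}\,(x+A),\quad -A\le x\le X_{fa}, \]

where $y_a$ is constructed to meet the parabola with matching slope at the tangency abscissa $X_{fa}$; imposing $y_a(X_{fa})=y_f(X_{fa})$ gives $X_{fa}=-A+\sqrt{A^{2}+(\chi-B)F^{2}/B}$, the closed form assigned to Xfa below. The rest of the script integrates this outline for waterplane area, displacement, shell weight, and the usual form coefficients, prints a table of section offsets, and plots the waterline.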

import numpy as np

import navigation as nav

import environment as env

import pylab as pl

import Image as im

import random

def hf(B,F,x):
    # forward waterline: parabolic half-breadth, B/2 at x=0, zero at x=F
    return -B/(2.0*F**2)*x**2.0 + B/2.0

def ha(B,F,A,Chi,x):
    # aft waterline: straight taper from the transom half-width Chi/2 at
    # x=-A, tangent to hf at the module-level Xfa defined below
    return -B*Xfa/F**2*x+Chi/2 - A*B*Xfa/F**2

WtBat=3.88/16.0 # battery weight, lbf
NBats=2.0
WtBats=WtBat*NBats
WtMot=2.0/16.0 # motor weight, lbf
wsp=62.4/(12.0**3) # specific weight of fresh water, lbf/in^3
A=12.0 # inches abaft the origin
F=14.0 # inches forward of the origin
B=8.0 # inches, maximum beam
Chi=4.0 # inches, transom width
LWL=A+F
xMid=LWL/2.0-A
T=0.75 # draft, inches
Bmid=2.0*hf(B,F,xMid)
Amid=Bmid*T
# tangency abscissa of the straight aft side with the forward parabola
Xfa=-A + F**2.0*np.sqrt(B/F**2.0*(B*A**2/F**2-B+Chi))/B
# waterplane area: straight aft piece plus the parabolic nose
Area=2.0*( -B*Xfa/(2.0*F**2)*Xfa**2 - B*A*Xfa/F**2*Xfa + Chi/2.0*Xfa\
    +B*Xfa/(2.0*F**2.0)*(-A)**2.0 + B*A*Xfa/F**2.0*(-A) - \
    Chi/2*(-A) )+2.0/3.0*B*F+B/(3*F**2.0)*(Xfa)**3.0-B*Xfa
print 'Area * T',Area*T,'\tLWL*B*T',LWL*T*B
print Area
thick=.25 # shell thickness, inches
VDisp=T*Area
VHull=Area*thick
WHull=1.2*wsp*VHull # shell weight; material taken as 1.2 times water
w=WtBats+WtMot+WHull
C0=1+(B*Xfa/F**2)**2
C1=1+(B/F)**2
# closed-form arc length of the parabolic side from Xfa to F
Lf=-1* F**2/(2.0*B)*( B/F*( (Xfa/F)*np.sqrt(C0) -np.sqrt(C1))\
    +np.log((-B/F + np.sqrt(C1)) / (-B*Xfa/F**2 \
    +np.sqrt(C0)) ) )
print 'WHull: ',WHull,'Draft: ',w/(wsp*Area),'Lf: ',Lf,np.sqrt(B**2\
    +F**2),Xfa
La=np.sqrt((Xfa+A)**2*C0) # length of the straight aft side
print 'Side Length: ',La+Lf
N=103
Lfnum=0.0
# numerical check of the parabolic arc length
for i in xrange(1,N+1):
    dx=(F-Xfa)/N
    x=dx*i+Xfa
    Lfnum+=np.sqrt(dx**2+(hf(B,F,x)-hf(B,F,x-dx))**2)
print Lfnum,La,np.sqrt( (ha(B,F,A,Chi,Xfa)-ha(B,F,A,Chi,-A))**2+\
    (Xfa+A)**2 )
shape=np.zeros((N+1,2))
for i in xrange(N+1):
    x=-A+(A+F)/N*(i)
    shape[i,0]=x
    # logistic blend of the two segments (superseded by the exact
    # piecewise assignment below)
    shape[i,1]=(1.0/(1.0+np.exp(100.0*x-Xfa)))*(-1.0/(1.0+np.exp(\
        100.0*x-A))+1)+(1.0/(1.0+np.exp(100.0*x-F)))*(-1.0/(1.0+\
        np.exp(100.0*x+Xfa))+1 )
    if x<=Xfa:
        shape[i,1]=(-B*Xfa/(F**2.0)) * x - A*B*Xfa/(F**2) + Chi/2.0
    else:
        shape[i,1]=-B/(2.0*F**2)*x**2.0 + B/2.0
    # distance aft of the bow (in) and half-breadth (cm)
    print 26-(shape[i,0]+12),np.round(2.54*shape[i,1],1)
print 'Bmid',Bmid,'\tB ',B
print 'C_WP: ',Area/(B*LWL)
print 'C_B: ',Area/(B*LWL) # wall-sided hull, so C_B equals C_WP
print 'C_P: ',VDisp/((T*Bmid*LWL))
print 'C_M: ',Amid/(Bmid*T)
print 'C_VP: ',1.0
pl.plot(shape[:,0],shape[:,1],'k',linewidth=3.5)
pl.plot(shape[:,0],-shape[:,1],'k',linewidth=3.5)
pl.plot((-A,-A),(-Chi/2,Chi/2),'k',linewidth=3.5)
pl.axes().set_aspect('equal')
pl.show()

C.14 FieldTest.py
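FieldTest.py converts a GPS log recorded during the 2013 field trials (series 2, log 314) into a KML route for display in Google Earth: columns 4 and 5 of each record hold latitude and longitude, the first sixteen records are discarded, and the remaining fixes are passed to asearch.Route2KML.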

import numpy as np
import asearch

# field-trial GPS log; columns 4 and 5 hold latitude and longitude
log=np.genfromtxt('/home/aj/Dropbox/FieldTrials/04092013-'
                  'Series2/LOG00314.TXT',delimiter=',')
route=list()
#route.append( (28.12610,-80.63946))
for i in range(16,log.shape[0]): # skip the first 16 records
    route.append([log[i][4],log[i][5]])
route=np.array(route)
asearch.Route2KML([route], 'FieldTest-1.kml')


C.15 csv2kml.py
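csv2kml.py performs the corresponding conversion for the trial waypoint files: each row of Gulf100.csv holds a spawn latitude/longitude pair followed by a target pair, and every pair becomes a two-point route in the output KML.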

import numpy as np
import asearch

WPFileName='Gulf100.csv'
rp=np.genfromtxt(WPFileName,delimiter=',')
routes=list()
# each row is a spawn lat/lon pair followed by a target lat/lon pair
#for j in range(rp.shape[0]):
for j in range(100):
    routes.append(np.array( [ (rp[j,0],rp[j,1]), (rp[j,2],rp[j,3]) ] ) )
for j in range(len(routes)):
    print '------------------'
    for i in range(len(routes[j])):
        print routes[j][i,1],routes[j][i,0] # lon,lat order
asearch.Route2KML(routes,'Gulf100.kml')

C.16 ANav.py
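ANav.py benchmarks the A*-based deliberative planner over the one hundred spawn/target pairs in the waypoint file, logging each trial's planning wall time and predicted transit time. The time step and arrival threshold are coupled so that an agent running at full speed with the strongest current in the field behind it cannot step over the target in a single update:

\[ \Delta t=\frac{180\ \mathrm{m}}{u}, \qquad r_{\mathrm{thresh}}=(u+v_{\max})\,\Delta t, \]

where $u$ is the agent speed (spd) and $v_{\max}$ the largest current magnitude in the velocity field (VF.maxmag).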

import random

import time

import numpy as np

import pylab as pl

import environment as env

import asearch as a

import navigation as nav

print 'Loading VField'
#VFileName='SEGulf2012_05_01.npz'
VFileName='Carib2012_05_01.npz'
#HFileName='etopo1_gulf0.npz'
HFileName='etopo1_carib0.npz'
#WPFileName='Gulf100.csv'
WPFileName='EWCarib150.csv'
VF=env.VField(VFileName)
print 'Velocity File Loaded.'
HF=env.HField(HFileName)
print 'Depth File Loaded.'
WPs=np.genfromtxt(WPFileName,delimiter=',')
print 'Waypoint File Loaded.'
print 'All Files Loaded.'

print VF.maxmag

fll=np.array( (14.5,-63.0,16.5,-59.5) ) #Carib

#fll=np.array( (23.1,-88.5,26.1,-83.5) ) #Gulf

Range=150000.0 # meters

random.seed(time.time())

timeout=24*7.0*3600

trials=100

spd=3.0


leg=10000.0 # meters (A* leg length)
tstep=360.0*0.5/(spd) # seconds
thresh=(spd+VF.maxmag)*tstep # meters
print 'Range Threshold: ',thresh

depth=2.0

for t in xrange(1):

logroot='Results/Carib_'+str(int(spd*100))+\
'cms/Series0/TrialLogAStar_'+str(int(leg/1000))+'k_'+str(t)
logname=logroot+'.csv'

for trial in range(trials):

########################## TRIAL BOUNDARY ############

spawn=(WPs[trial,0],WPs[trial,1],0.0)

target=(WPs[trial,2],WPs[trial,3],0.0)

t0=time.time()

mintime=a.AstarSearchTime(spawn,target, leg, 0, spd,\

tstep, VF,depth,HF,timeout)

t1=time.time()

if mintime[1]!=None:

outtime=mintime[1]*3600

else:

outtime=mintime[1]

aconv=0

if outtime!=None:

aconv=1

log=open(logname,'a')
outString=str(trial)+','+str(t1-t0)+','+str(spawn[0])\
+','+str(spawn[1])\
+','+str(target[0])+','+str(target[1])\
+','+str(nav.gcRng(spawn,target))\
+','+str(spd)\
+','+str(leg)\
+','+str(tstep)\
+','+str(timeout)\
+','+str(outtime)\
+','+str(aconv)+'\n'

print outString

log.write(outString)

log.close()

C.17 GrouperNav.py
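GrouperNav.py is the main driver for the deliberatively optimized reactive agents. Each trial runs in two phases. Exploration releases pigeon and crab baselines alongside doras dorAgents whose preference parameters A, B, C, and D are drawn uniformly from [-1, 1]; an epoch ends when every agent has reached the target or been killed for timing out, losing a valid position, exhausting its energy, or leaving the region of interest. Exploitation then repeatedly re-seeds doras/factor agents around the incumbent best parameter set, with a Gaussian spread that shrinks linearly over the remaining epochs. A condensed sketch of the resampling rule (the incumbent values here are hypothetical):

# Exploitation resampling: independent Gaussians centred on the best
# parameter set, with a linearly decaying standard deviation.
import random
stdevstart, Xepochs = 0.5, 10
bestDParams = (0.2, -0.4, 0.7, 0.1) # hypothetical incumbent best
for epoch in range(Xepochs):
    stdev = stdevstart*(1 - float(epoch)/Xepochs)
    params = tuple(random.gauss(p, stdev) for p in bestDParams)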

import random

import time

import numpy as np

import agents

import environment as env

import navigation as nav

#VFileName='SEGulf2012_05_01.npz'
VFileName='Carib2012_05_01.npz'
#HFileName='etopo1_gulf0.npz'
HFileName='etopo1_carib0.npz'
#WPFileName='Gulf100.csv'
WPFileName='EWCarib150.csv'
#fll=np.array( (23.1,-88.5,26.1,-83.5) )
fll=np.array( (14.5,-63.0,16.5,-59.5) )
VF=env.VField(VFileName)
HF=env.HField(HFileName)
WPs=np.genfromtxt(WPFileName,delimiter=',')

print 'File Loaded.'

print HF.maxz

Range=150000.0 # meters

Doff=np.degrees(Range/(nav.REarth*np.cos(np.radians(\

(fll[0]+fll[2])/2.0))))

random.seed(time.time())

trials=100

spd=0.4 # meters/sec

power=1.0

timestep=360.0*0.5/(spd) # seconds

thresh=(spd+VF.maxmag)*timestep # meters

print 'Range Threshold: ',thresh

timeout=24*7.0*3600

StrayFactor=2.0 # ratio of the maximum allowable mission time to the still-water, straight-line transit time

energy=np.nanmin((Range/spd*StrayFactor,timeout))#16#24*7*3600.0

Eepochs=5 #Maximum Epochs to search for a solution

Xepochs=10 #Maximum Epochs to optimize a solution

doras=500 #int(np.floor((groupers**(1.0/3.0))**4))

factor=10

depth=2.0

stdevstart=0.5

for t in (5,):

M=0.001#1.0/(10.0**float(m))

T=2700 #t*900/5-900+2700

logroot=’Results/Carib_’+str(int(spd*100))+\

’cms/Series0/TrialLog’+’_S’+str(doras)+’_I’+\

str(doras/factor)+’_’+str(t)

logname=logroot+str(’.csv’)

cfgout=open(logroot+’.xml’,’w’)

cfgout.write(’<?xml version="1.0"?>\n’)

cfgout.write(’<xml>\n’)

cfgout.write(’\t\t<Notes></Notes>\n’)

cfgout.write(’\t\t<velocityFile>’+VFileName+’</velocityFile>\n’)

cfgout.write(’\t\t<latlon>’+str(fll)+’</latlon>\n’)

cfgout.write(’\t\t<HeightFile>’+HFileName+’</HeightFile>\n’)

cfgout.write(’\t\t<M>’+str(M)+’</M>\n’)

cfgout.write(’\t\t<T>’+str(T)+’</T>\n’)

cfgout.write(’\t\t<range>’+str(Range)+’</range>\n’)

cfgout.write(’\t\t<speed>’+str(spd)+’</speed>\n’)

cfgout.write(’\t\t<energy>’+str(energy)+’</energy>\n’)

cfgout.write(’\t\t<power>’+str(power)+’</power>\n’)


cfgout.write(’\t\t<timeStep>’+str(timestep)+’</timeStep>\n’)

cfgout.write(’\t\t<threshold>’+str(thresh)+’</threshold>\n’)

cfgout.write(’\t\t<timeout>’+str(timeout)+’</timeout>\n’)

cfgout.write(’\t\t<Sepochs>’+str(Eepochs)+’</Sepochs>\n’)

cfgout.write(’\t\t<Oepochs>’+str(Xepochs)+’</Oepochs>\n’)

cfgout.write(’\t\t<doras>’+str(doras)+’</doras>\n’)

cfgout.write(’\t\t<stdev>’+str(stdevstart)+’</stdev>\n’)

cfgout.write(’\t\t<depth>’+str(depth)+’</depth>\n’)

cfgout.write(’</xml>\n’)

cfgout.close()

for trial in range(78,trials): # resumes an interrupted series at trial 78

########################## TRIAL BOUNDARY #################

spawn=(WPs[trial,0],WPs[trial,1])

target=(WPs[trial,2],WPs[trial,3])

hdg=np.degrees(nav.gcCrs(spawn,target)) # ROUND EARTH

epoch=0

bestp=None

bestc=None

bestd=None

firstbest=None

firstsolEpoch=None

firstsolParams=(0,0,0,0)

bestDEpoch=None

bestDParams=(0,0,0,0)

alist=list()

missiontime=0.0

t0=time.time()

############EXPLORATION

while (epoch<=Eepochs and bestDParams==(0,0,0,0)):

if len(alist)==0:

alist.append([agents.pigeonAgent(spawn,depth,hdg,\

spd,target,thresh,energy,power),None])

alist.append([agents.crabAgent(spawn,depth,hdg,spd,\

target,thresh,energy,power),None])

for i in range(doras):

alist.append([agents.dorAgent(random.random()\

*2-1,random.random()*2-1,random.random()\

*2-1,random.random()*2-1,spawn,depth,hdg\

,spd,target,thresh,energy,power,M,T),None])

missiontime=0.0

epoch=epoch+1

else:

missiontime=missiontime+timestep

kills=list()

for i in xrange(len(alist)):

killi=False

alist[i][0].updateAll(VF.linterpTDLL(\

(missiontime-timestep)/3600.0,depth, \

alist[i][0].pos[0], alist[i][0].pos[1]),\

HF.grad(alist[i][0].pos,10.0),\

HF.linterpll(alist[i][0].pos),timestep,0)

if (missiontime>=timeout or np.isnan(\


alist[i][0].pos[0]) \

or alist[i][0].energy <= 0.0 \

or ( ((fll[0]<alist[i][0].pos[0]<fll[2]) \

and (fll[1] < alist[i][0].pos[1]<fll[3]))\

!=True )):

killi=True

elif nav.gcRng(alist[i][0].pos,alist[i][0].tgt)\

<thresh:

drdt=(nav.gcRng(alist[i][0].prepos,\

alist[i][0].tgt)-nav.gcRng(\

alist[i][0].pos,alist[i][0].tgt))\

/timestep

dt=nav.gcRng(alist[i][0].pos,\

alist[i][0].tgt)/drdt

if drdt<0:

print 'WARNING DRDT IS NEGATIVE!! THIS CAN\'T HAPPEN!!'

if alist[i][0].__class__==agents.crabAgent\

and (missiontime + dt < bestc \

or bestc == None):

bestc = missiontime+dt

if alist[i][0].__class__==agents.pigeonAgent\

and (missiontime + dt < bestp \

or bestp == None):

bestp = missiontime+dt

if alist[i][0].__class__==agents.dorAgent \

and ( missiontime + dt < bestd \

or bestd == None ):

bestd = missiontime+dt

if bestDEpoch==None:

firstbest=missiontime+dt

firstsolEpoch=epoch

firstsolParams=(alist[i][0].A,\

alist[i][0].B,alist[i][0].C,\

alist[i][0].D)

bestDEpoch=epoch

bestDParams=(alist[i][0].A,\

alist[i][0].B,alist[i][0].C,\

alist[i][0].D)

killi=True

if killi==True:

kills.append(alist[i])

for i in xrange(len(kills)):

alist.remove(kills[i])

del kills

##################EXPLOITATION:

if bestDParams!=(0,0,0,0):

epoch=0

while (epoch < Xepochs):

if len(alist)==0:

stdev=stdevstart*(1-float(epoch)/Xepochs)

for i in range(doras/factor):


alist.append([agents.dorAgent(random.gauss( \

bestDParams[0],stdev),random.gauss( \

bestDParams[1],stdev),random.gauss(\

bestDParams[2],stdev),random.gauss(\

bestDParams[3],stdev),spawn,depth,\

hdg,spd,target,thresh,energy,power,\

M,T),None])

missiontime=0.0

epoch=epoch+1

else:

missiontime=missiontime+timestep

kills=list()

for i in xrange(len(alist)):

killi=False

alist[i][0].updateAll(VF.linterpTDLL((\

missiontime-timestep) \

/3600.0,depth, alist[i][0].pos[0],\

alist[i][0].pos[1]), \

HF.grad(alist[i][0].pos,10.0), \

HF.linterpll(alist[i][0].pos),\

timestep,0)

if (missiontime>=timeout or np.isnan(\

alist[i][0].pos[0]) \

or alist[i][0].energy <= 0.0 \

or ( ((fll[0]<alist[i][0].pos[0]\

<fll[2]) and (fll[1] < \

alist[i][0].pos[1]<fll[3]))!=True )):

killi=True

elif nav.gcRng(alist[i][0].pos,\

alist[i][0].tgt)<thresh:

drdt=(nav.gcRng(alist[i][0].prepos,\

alist[i][0].tgt)-nav.gcRng(\

alist[i][0].pos,\

alist[i][0].tgt))/timestep

if drdt<0:

print 'WARNING DRDT IS NEGATIVE!! THIS CAN\'T HAPPEN!!'

dt=nav.gcRng(alist[i][0].pos,\

alist[i][0].tgt)/drdt

if alist[i][0].__class__==\

agents.dorAgent and ( missiontime \

+ dt < bestd or bestd == None ):

bestd = missiontime+dt

bestDEpoch=epoch

bestDParams=(alist[i][0].A,\

alist[i][0].B,alist[i][0].C,\

alist[i][0].D)

killi=True

if killi==True:

kills.append(alist[i])

for i in xrange(len(kills)):

alist.remove(kills[i])


del kills

#######################TRIAL BOUNDARY####################

t1=time.time()

cconv=False

pconv=False

dconv=False

if bestc!=None:

cconv=True

if bestp!=None:

pconv=True

if bestd!=None:

dconv =True

log=open(logname,'a')
# the doubled comma after firstbest writes an empty field, giving the
# 27-column layout that popro.py expects
outString=str(trial)+','+str(t1-t0)+','+str(spawn[0])\
+','+str(spawn[1])\
+','+str(target[0])+','+str(target[1])\
+','+str(nav.gcRng(spawn,target))\
+','+str(doras)+','+str(bestp)\
+','+str(bestc)+','+str(bestd)\
+','+str(bestDEpoch)+','+str(bestDParams[0])\
+','+str(bestDParams[1])+','+str(bestDParams[2])\
+','+str(bestDParams[3])+','+str(M)\
+','+str(firstsolEpoch)+','+str(firstbest)+','\
+','+str(firstsolParams[0])\
+','+str(firstsolParams[1])\
+','+str(firstsolParams[2])\
+','+str(firstsolParams[3])\
+','+str(int(pconv))+','+str(int(cconv))\
+','+str(int(dconv))+'\n'

print 'TRIAL: '+str(trial)

print outString

log.write(outString)

log.close()

C.18 GrouperNav0A.py
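GrouperNav0A.py repeats the optimization with the first preference parameter pinned to zero: every dorAgent is constructed with A = 0.0, and A is never resampled during exploitation. It uses a smaller population (doras = 50, factor = 1), omits the baseline agents from the exploration phase, and hands whatever exploration budget remains after the first solution to the exploitation phase (Xepochs = Eepochs - epoch). As committed here it runs only the Caribbean region at 3.0 m/s (the commented loops cover both regions and all speeds), producing the TrialLogZeroA15 series read by popro.py.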

import random

import time

import numpy as np

import agents

import environment as env

import navigation as nav

#for i in ('G','C'):
for i in ('C',):
#for j in (0.4,1.5,3.0):
for j in (3.0,):

if i=='G':
    base='Results/Gulf_'
    WPFileName='Gulf100.csv'
    VFileName='SEGulf2012_05_01.npz'
    HFileName='etopo1_gulf0.npz'
    fll=np.array( (23.1,-88.5,26.1,-83.5) )
elif i=='C':
    base='Results/Carib_'
    WPFileName='EWCarib150.csv'
    VFileName='Carib2012_05_01.npz'
    HFileName='etopo1_carib0.npz'
    fll=np.array( (14.5,-63.0,16.5,-59.5) )
WPs=np.genfromtxt(WPFileName,delimiter=',')
VF=env.VField(VFileName)
HF=env.HField(HFileName)
print 'File Loaded.'

Range=150000.0 # meters

Doff=np.degrees(Range/(nav.REarth*np.cos(np.radians((fll[0]+\

fll[2])/2.0))))

random.seed(time.time())

trials=100

spd=j # meters/sec

power=1.0

timestep=360.0*0.5/(spd) # seconds

thresh=(spd+VF.maxmag)*timestep # meters

print 'Range Threshold: ',thresh

timeout=24*7.0*3600

StrayFactor=2.0 # ratio of the maximum allowable mission time to the still-water, straight-line transit time

energy=np.nanmin((Range/spd*StrayFactor,timeout))#16#24*7*3600.0

Eepochs=15 #Maximum Epochs to search for a solution

Xepochs=15 #Maximum Epochs to optimize a solution

doras=50 #int(np.floor((groupers**(1.0/3.0))**4))

factor=1

depth=2.0

stdevstart=0.5

for t in xrange(1):

M=0.001#1.0/(10.0**float(m))

T=2700 #t*900/5-900+2700

logroot=base+str(int(spd*100))\

+’cms/Series0/TrialLogZeroA15’+’_S’+str(doras)+’_I’+\

str(doras/factor)+’_’+str(t)

logname=logroot+str(’.csv’)

cfgout=open(logroot+’.xml’,’w’)

cfgout.write(’<?xml version="1.0"?>\n’)

cfgout.write(’<xml>\n’)

cfgout.write(’\t\t<Notes></Notes>\n’)

cfgout.write(’\t\t<velocityFile>’+VFileName\

+’</velocityFile>\n’)

cfgout.write(’\t\t<latlon>’+str(fll)+’</latlon>\n’)

cfgout.write(’\t\t<HeightFile>’+HFileName\

+’</HeightFile>\n’)

cfgout.write(’\t\t<M>’+str(M)+’</M>\n’)


cfgout.write(’\t\t<T>’+str(T)+’</T>\n’)

cfgout.write(’\t\t<range>’+str(Range)+’</range>\n’)

cfgout.write(’\t\t<speed>’+str(spd)+’</speed>\n’)

cfgout.write(’\t\t<energy>’+str(energy)+’</energy>\n’)

cfgout.write(’\t\t<power>’+str(power)+’</power>\n’)

cfgout.write(’\t\t<timeStep>’+str(timestep)\

+’</timeStep>\n’)

cfgout.write(’\t\t<threshold>’+str(thresh)\

+’</threshold>\n’)

cfgout.write(’\t\t<timeout>’+str(timeout)\

+’</timeout>\n’)

cfgout.write(’\t\t<Sepochs>’+str(Eepochs)\

+’</Sepochs>\n’)

cfgout.write(’\t\t<Oepochs>’+str(Xepochs)\

+’</Oepochs>\n’)

cfgout.write(’\t\t<doras>’+str(doras)+’</doras>\n’)

cfgout.write(’\t\t<stdev>’+str(stdevstart)+’</stdev>\n’)

cfgout.write(’\t\t<depth>’+str(depth)+’</depth>\n’)

cfgout.write(’</xml>\n’)

cfgout.close()

for trial in range(trials):

########################## TRIAL BOUNDARY #########

spawn=(WPs[trial,0],WPs[trial,1])

target=(WPs[trial,2],WPs[trial,3])

hdg=np.degrees(nav.gcCrs(spawn,target)) # ROUND EARTH

epoch=0

bestp=None

bestc=None

bestd=None

firstbest=None

firstsolEpoch=None

firstsolParams=(0,0,0,0)

bestDEpoch=None

bestDParams=(0,0,0,0)

alist=list()

missiontime=0.0

t0=time.time()

############EXPLORATION

while (epoch<=Eepochs and bestDParams==(0,0,0,0)):

if len(alist)==0:

for i in range(doras):

alist.append([agents.dorAgent(0.0,\

random.random()*2-1,random.random()\

*2-1,random.random()*2-1,spawn,\

depth,hdg,spd,target,\

thresh,energy,power,M,T),None])

missiontime=0.0

epoch=epoch+1

else:

missiontime=missiontime+timestep

kills=list()

for i in xrange(len(alist)):


killi=False

alist[i][0].updateAll(VF.linterpTDLL(\

(missiontime-timestep)/3600.0,\

depth, alist[i][0].pos[0], \

alist[i][0].pos[1]),\

HF.grad(alist[i][0].pos,10.0), \

HF.linterpll(alist[i][0].pos),\

timestep,0)

if (missiontime>=timeout or np.isnan(alist[i][0].pos[0]) \

or alist[i][0].energy <= 0.0 \

or ( ((fll[0]<alist[i][0].pos[0]<\

fll[2]) and (fll[1] < \

alist[i][0].pos[1]<fll[3]))!=True )):

killi=True

elif nav.gcRng(alist[i][0].pos,\

alist[i][0].tgt)<thresh:

drdt=(nav.gcRng(alist[i][0].prepos,\

alist[i][0].tgt)-nav.gcRng(\

alist[i][0].pos,alist[i][0].tgt\

))/timestep

dt=nav.gcRng(alist[i][0].pos,\

alist[i][0].tgt)/drdt

if drdt<0:

print 'WARNING DRDT IS NEGATIVE!! THIS CAN\'T HAPPEN!!'

if alist[i][0].__class__==\

agents.crabAgent and (missiontime\

+ dt < bestc or bestc == None):

bestc = missiontime+dt

if alist[i][0].__class__==\

agents.pigeonAgent and (missiontime\

+ dt < bestp or bestp == None):

bestp = missiontime+dt

if alist[i][0].__class__==\

agents.dorAgent and ( missiontime\

+ dt < bestd or bestd == None ):

bestd = missiontime+dt

if bestDEpoch==None:

firstbest=missiontime+dt

firstsolEpoch=epoch

firstsolParams=\

(alist[i][0].A,\

alist[i][0].B,\

alist[i][0].C,\

alist[i][0].D)

bestDEpoch=epoch

bestDParams=(\

alist[i][0].A,\

alist[i][0].B,\

alist[i][0].C,\

alist[i][0].D)

killi=True


if killi==True:

kills.append(alist[i])

for i in xrange(len(kills)):

alist.remove(kills[i])

del kills

##################EXPLOITATION:

if bestDParams!=(0,0,0,0):

Xepochs=Eepochs-epoch # spend the unused exploration budget on exploitation

epoch=0

while (epoch < Xepochs):

if len(alist)==0:

stdev=stdevstart*(1-float(epoch)/Xepochs)

for i in range(doras/factor):

alist.append([agents.dorAgent(0.0,\

random.gauss(\

bestDParams[1],stdev),\

random.gauss(\

bestDParams[2],stdev),\

random.gauss(\

bestDParams[3],stdev),\

spawn,depth,hdg,spd,\

target,thresh,energy,power\

,M,T),None])

missiontime=0.0

epoch=epoch+1

else:

missiontime=missiontime+timestep

kills=list()

for i in xrange(len(alist)):

killi=False

alist[i][0].updateAll(\

VF.linterpTDLL((missiontime-timestep) \

/3600.0,depth,\

alist[i][0].pos[0],\

alist[i][0].pos[1]),\

HF.grad(alist[i][0].pos,10.0),\

HF.linterpll(alist[i][0].pos),\

timestep,0)

if (missiontime>=timeout \

or np.isnan(alist[i][0].pos[0]) \

or alist[i][0].energy <= 0.0 \

or ( ((fll[0]<alist[i][0].pos[0]\

<fll[2]) and (fll[1] < \

alist[i][0].pos[1]<fll[3]))!=True)):

killi=True

elif nav.gcRng(alist[i][0].pos,\

alist[i][0].tgt)<thresh:

drdt=(nav.gcRng(\

alist[i][0].prepos,\

alist[i][0].tgt)-nav.gcRng(\

alist[i][0].pos,\

alist[i][0].tgt))/timestep


if drdt<0:

print 'WARNING DRDT IS NEGATIVE!! THIS CAN\'T HAPPEN!!'

dt=nav.gcRng(alist[i][0].pos,\

alist[i][0].tgt)/drdt

if alist[i][0].__class__==\

agents.dorAgent and(missiontime\

+ dt < bestd or bestd == None ):

bestd = missiontime+dt

bestDEpoch=epoch

bestDParams=(\

alist[i][0].A,\

alist[i][0].B,\

alist[i][0].C,\

alist[i][0].D)

killi=True

if killi==True:

kills.append(alist[i])

for i in xrange(len(kills)):

alist.remove(kills[i])

del kills

#######################TRIAL BOUNDARY####################

t1=time.time()

cconv=False

pconv=False

dconv=False

if bestc!=None:

cconv=True

if bestp!=None:

pconv=True

if bestd!=None:

dconv =True

log=open(logname,’a’)

outString=str(trial)+’,’+str(t1-t0)\

+’,’+str(spawn[0])\

+’,’+str(spawn[1])\

+’,’+str(target[0])+’,’+str(target[1])\

+’,’+str(nav.gcRng(spawn,target))\

+’,’+str(doras)+’,’+str(bestp)\

+’,’+str(bestc)+’,’+str(bestd)\

+’,’+str(bestDEpoch)\

+’,’+str(bestDParams[0])\

+’,’+str(bestDParams[1])\

+’,’+str(bestDParams[2])\

+’,’+str(bestDParams[3])+’,’+str(M)\

+’,’+str(firstsolEpoch)\

+’,’+str(firstbest)+’,’\

+’,’+str(firstsolParams[0])\

+’,’+str(firstsolParams[1])\

+’,’+str(firstsolParams[2])\

+’,’+str(firstsolParams[3])\


+’,’+str(int(pconv))+’,’+str(int(cconv))\

+’,’+str(int(dconv))+’\n’

print outString

log.write(outString)

log.close()

trial+=1

C.19 PNav.py
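PNav.py runs the pigeonAgent homing baseline alone: a single agent per trial and a single epoch, with the same update, kill, and arrival-extrapolation logic as the optimizer drivers. It sweeps both regions (Gulf and Caribbean) and all three cruise speeds (0.4, 1.5, and 3.0 m/s), appending one line per trial to the TrialLog_P series.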

import random

import time

import numpy as np

import agents

import environment as env

import navigation as nav

for i in (’G’,’C’):

for j in (0.4,1.5,3.0):

if i=='G':
    base='Results/Gulf_'
    WPFileName='Gulf100.csv'
    VFileName='SEGulf2012_05_01.npz'
    HFileName='etopo1_gulf0.npz'
    fll=np.array( (23.1,-88.5,26.1,-83.5) )
elif i=='C':
    base='Results/Carib_'
    WPFileName='EWCarib150.csv'
    VFileName='Carib2012_05_01.npz'
    HFileName='etopo1_carib0.npz'
    fll=np.array( (14.5,-63.0,16.5,-59.5) )
WPs=np.genfromtxt(WPFileName,delimiter=',')
VF=env.VField(VFileName)
HF=env.HField(HFileName)
print 'File Loaded.'

Range=150000.0 # meters

Doff=np.degrees(Range/(nav.REarth*np.cos(np.radians((fll[0]+\

fll[2])/2.0))))

random.seed(time.time())

trials=100

spd=j # meters/sec

power=1.0

timestep=360.0*0.5/(spd) # seconds

thresh=(spd+VF.maxmag)*timestep # meters

print 'Range Threshold: ',thresh

timeout=24*7.0*3600

StrayFactor=2.0 # ratio of the maximum allowable mission time to the still-water, straight-line transit time

energy=np.nanmin((Range/spd*StrayFactor,timeout))#16#24*7*3600.0


Xepochs=10 #Maximum Epochs to optimize a solution

depth=2.0

stdevstart=0.5

for t in xrange(1):

logname=base+str(int(100*spd))+’cms/Series0/TrialLog’+’_P_’\

+str(t)+’.csv’

for trial in range(trials):

spawn=(WPs[trial,0],WPs[trial,1])

target=(WPs[trial,2],WPs[trial,3])

hdg=np.degrees(nav.gcCrs(spawn,target)) # ROUND EARTH

epoch=0

bestp=None

bestc=None

alist=list()

missiontime=0.0

t0=time.time()

while (epoch<=1 ):

if len(alist)==0:

alist.append([agents.pigeonAgent(spawn,depth,\

hdg,spd,\

target,thresh,energy,power),None])

missiontime=0.0

epoch=epoch+1

else:

missiontime=missiontime+timestep

kills=list()

for i in xrange(len(alist)):

killi=False

alist[i][0].updateAll(VF.linterpTDLL((\

missiontime-timestep)/3600.0,depth,\

alist[i][0].pos[0], \

alist[i][0].pos[1]), \

HF.grad(alist[i][0].pos,10.0),\

HF.linterpll(alist[i][0].pos),\

timestep,0)

if (missiontime>=timeout or np.isnan(\

alist[i][0].pos[0]) \

or alist[i][0].energy <= 0.0 \

or ( ((fll[0]<alist[i][0].pos[0]<fll[2])\

and (fll[1] < alist[i][0].pos[1]<\

fll[3]))!=True )):

killi=True

elif nav.gcRng(alist[i][0].pos,\

alist[i][0].tgt)<thresh:

drdt=(nav.gcRng(alist[i][0].prepos,\

alist[i][0].tgt)-nav.gcRng(\

alist[i][0].pos,alist[i][0].tgt))\

/timestep

dt=nav.gcRng(alist[i][0].pos,\

alist[i][0].tgt)/drdt

if drdt<0:

print 'WARNING DRDT IS NEGATIVE!! THIS CAN\'T HAPPEN!!'

if alist[i][0].__class__==\

agents.crabAgent and (missiontime + dt\

< bestc or bestc == None):

bestc = missiontime+dt

if alist[i][0].__class__==\

agents.pigeonAgent and (missiontime\

+ dt < bestp or bestp == None):

bestp = missiontime+dt

killi=True

if killi==True:

kills.append(alist[i])

for i in xrange(len(kills)):

alist.remove(kills[i])

del kills

t1=time.time()

cconv=False

pconv=False

if bestc!=None:

cconv=True

if bestp!=None:

pconv=True

log=open(logname,’a’)

outString=str(trial)+’,’+str(t1-t0)+’,’+str(spawn[0])\

+’,’+str(spawn[1])\

+’,’+str(target[0])+’,’+str(target[1])\

+’,’+str(nav.gcRng(spawn,target))\

+’,’+str(bestp)\

+’,’+str(int(pconv))\

+’\n’

print outString

log.write(outString)

log.close()

trial+=1

C.20 CNav.py
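CNav.py is the companion driver for the crabAgent tracking baseline; it is identical in structure to PNav.py except that it spawns a crabAgent and writes the TrialLog_C series.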

import random

import time

import numpy as np

import agents

import environment as env

import navigation as nav

for i in (’G’,’C’):

for j in (0.4,1.5,3.0):

if i=='G':
    base='Results/Gulf_'
    WPFileName='Gulf100.csv'
    VFileName='SEGulf2012_05_01.npz'
    HFileName='etopo1_gulf0.npz'
    fll=np.array( (23.1,-88.5,26.1,-83.5) )
elif i=='C':
    base='Results/Carib_'
    WPFileName='EWCarib150.csv'
    VFileName='Carib2012_05_01.npz'
    HFileName='etopo1_carib0.npz'
    fll=np.array( (14.5,-63.0,16.5,-59.5) )
WPs=np.genfromtxt(WPFileName,delimiter=',')
VF=env.VField(VFileName)
HF=env.HField(HFileName)
print 'File Loaded.'

Range=150000.0 # meters

Doff=np.degrees(Range/(nav.REarth*np.cos(np.radians((fll[0]\

+fll[2])/2.0))))

random.seed(time.time())

trials=100

spd=j # meters/sec

power=1.0

timestep=360.0*0.5/(spd) # seconds

thresh=(spd+VF.maxmag)*timestep # meters

print 'Range Threshold: ',thresh

timeout=24*7.0*3600

StrayFactor=2.0 # ratio of the maximum allowable mission time to the still-water, straight-line transit time

energy=np.nanmin((Range/spd*StrayFactor,timeout))#16#24*7*3600.0

Xepochs=10 #Maximum Epochs to optimize a solution

depth=2.0

stdevstart=0.5

for t in xrange(1):

logname=base+str(int(100*spd))+’cms/Series0/TrialLog’+\

’_C_’+str(t)+’.csv’

for trial in range(trials):

spawn=(WPs[trial,0],WPs[trial,1])

target=(WPs[trial,2],WPs[trial,3])

hdg=np.degrees(nav.gcCrs(spawn,target)) # ROUND EARTH

epoch=0

bestp=None

bestc=None

alist=list()

missiontime=0.0

t0=time.time()

while (epoch<=1 ):

if len(alist)==0:

alist.append([agents.crabAgent(spawn,depth,hdg,spd,\

target,thresh,energy,power),None])

missiontime=0.0

epoch=epoch+1

else:

missiontime=missiontime+timestep


kills=list()

for i in xrange(len(alist)):

killi=False

alist[i][0].updateAll(VF.linterpTDLL((\

missiontime-timestep) \

/3600.0,depth, alist[i][0].pos[0],\

alist[i][0].pos[1]), \

HF.grad(alist[i][0].pos,10.0), \

HF.linterpll(alist[i][0].pos),\

timestep,0)

if (missiontime>=timeout or np.isnan(\

alist[i][0].pos[0]) \

or alist[i][0].energy <= 0.0 \

or ( ((fll[0]<alist[i][0].pos[0]<fll[2]) \

and (fll[1] < alist[i][0].pos[1]<fll[3]))\

!=True )):

killi=True

elif nav.gcRng(alist[i][0].pos,\

alist[i][0].tgt)<thresh:

drdt=(nav.gcRng(alist[i][0].prepos,\

alist[i][0].tgt)-nav.gcRng(\

alist[i][0].pos,alist[i][0].tgt))\

/timestep

dt=nav.gcRng(alist[i][0].pos,\

alist[i][0].tgt)/drdt

if drdt<0:

print 'WARNING DRDT IS NEGATIVE!! THIS CAN\'T HAPPEN!!'

if alist[i][0].__class__==\

agents.crabAgent and (missiontime + dt\

< bestc or bestc == None):

bestc = missiontime+dt

killi=True

if killi==True:

kills.append(alist[i])

for i in xrange(len(kills)):

alist.remove(kills[i])

del kills

t1=time.time()

cconv=False

if bestc!=None:

cconv=True

log=open(logname,’a’)

outString=str(trial)+’,’+str(t1-t0)+’,’+str(spawn[0])\

+’,’+str(spawn[1])\

+’,’+str(target[0])+’,’+str(target[1])\

+’,’+str(nav.gcRng(spawn,target))\

+’,’+str(bestc)\

+’,’+str(int(cconv))\

+’\n’

print outString

log.write(outString)


log.close()

trial+=1
