DEGREE PROJECT IN MATHEMATICS, SECOND CYCLE, 30 CREDITS
STOCKHOLM, SWEDEN 2017
Ant Colony Algorithms and its applications to Autonomous Agents Systems
DANIEL JARNE ORNIA
KTH ROYAL INSTITUTE OF TECHNOLOGY
SCHOOL OF ENGINEERING SCIENCES
Ant Colony Algorithms and its applications to Autonomous Agents Systems
DANIEL JARNE ORNIA
Degree Project in Systems Engineering (30 ECTS credits)
Degree Programme in Aerospace Engineering (120 credits)
KTH Royal Institute of Technology, year 2017
Supervisor at TU Delft: Manuel Mazo
Supervisor at KTH: Xiaoming Hu
Examiner at KTH: Xiaoming Hu
TRITA-MAT-E 2017:74
ISRN-KTH/MAT/E--17/74--SE
Royal Institute of Technology, School of Engineering Sciences
KTH SCI, SE-100 44 Stockholm, Sweden
URL: www.kth.se/sci
Abstract I
Med den senaste tidens utveckling inom autonoma agentsystem och teknologier finns ett ökat
intresse för utveckling av styralgoritmer och metoder för att koordinera stora mängder
robotenheter. Inom detta område visar användandet av biologiskt inspirerade algoritmer,
baserade på naturliga svärmbeteenden, intressanta egenskaper som kan utnyttjas i styrandet
av system som innefattar ett flertal agenter. Dessa är uppbyggda av simpla instruktioner och
kommunikationsmedel för att tillgodose struktur i systemet.
I synnerhet fokuserar detta masterexamensarbete på studier av Ant Colony-algoritmer,
baserade på stigmergy-interaktion för att koordinera enheter och få dem att utföra specifika
uppgifter. Den första delen behandlar den teoretiska bakgrunden och konvergensbevis medan
den andra delen i huvudsak består av experimentella simuleringar samt resultat. Till detta
ändamål har metriska parametrar utvecklats, vilka ansågs särskilt användbara när planeringen
av en enkel bana studerades. Huvudkonceptet som utvecklats i detta arbete är en tillämpning
av Shannon-Entropi, vilken mäter enhetlighet och ordning i ett system samt den viktade grafen.
Denna parameter har använts för att studera prestandan och resultaten hos ett autonomt
agentsystem baserat på Ant Colony-algoritmer.
Slutligen har denna styralgoritm modifierats för att utveckla ett händelsestyrt styrschema.
Genom att använda egenskaperna hos den viktade grafen (entropi) tillsammans med
sensorsystemet hos agentenheterna har en decentraliserad händelsestyrd metod implementerats,
testats och visat sig ge ökad effektivitet gällande utnyttjandet av systemresurser.
Abstract II
With the latest advancements in autonomous agent systems and technology, there is a growing
interest in developing control algorithms and methods to coordinate large numbers of robotic
entities. Following this line of work, the use of biologically inspired algorithms based on emerging
swarm behaviour presents some very interesting properties for controlling multiple agents.
They rely on very simple instructions and communications to develop a coordinated structure
in the system.
In particular, this master's thesis focuses on the study of Ant Colony algorithms based on
stigmergy interaction to coordinate agents and have them perform a certain task. The first part focuses on
the theoretical background and algorithm convergence proof, while the second part consists of
experimental simulations and results. For this, some metric parameters have been developed
and found to be especially useful in the study of a simple path planning test case. The main
concept developed in this work is an adaptation of Shannon Entropy that measures uniformity
and order in the system and the weighted graph. This parameter has been used to study the
performance and results of an autonomous agent system based on Ant Colony algorithms.
Finally, this control algorithm has been modified to develop an event-triggered control scheme.
Using the properties of the weighted graph (Entropy) and the sensing of the agents, a decentral-
ized event-triggered method has been implemented and tested, and has been found to increase
efficiency in the usage of system resources.
Acknowledgements
I would like to express my deepest gratitude first to Professor Manuel Mazo, who was more
than welcoming from the first moment and proposed this project to me (then at an extremely
early stage) when I first approached him. He guided me through the work and pushed me to
tackle the issues I was most reluctant to.
I want to thank as well the entire DCSC (Delft Center for Systems and Control) department
and TU Delft for having me and helping me out with anything I needed.
Also I thank Professor Xiaoming Hu from KTH for being my supervisor at my home university,
and the entire KTH institution not only for this work, but for the last years of studies that
exceeded the expectations (in every way) I had when I first got to Sweden two years ago.
Finally, I would like to thank my parents Sergio and Dolores for their support, their feedback
and second opinion whenever I sent them pieces of this work.
Contents

Abstract I
Abstract II
Acknowledgements
1 Introduction
1.1 Motivation and Concepts
1.2 Goals and Expectations
2 Background and Problem Description
2.1 Biological Inspiration and Background
2.2 Event-Triggered Control
2.3 Markov Process, Random Variables and Martingales
2.3.1 Markov Process
2.3.2 Martingales
2.4 Problem Description
2.5 Dynamic System Model
2.5.1 Solution Generation
2.5.2 Pheromone Distribution
2.5.3 Pheromone as a Random Variable
2.6 Entropy as a Metric Parameter
2.6.1 Shannon Entropy
2.6.2 Maximum and Minimum Entropy
2.6.3 Entropy Function and its Properties
2.6.4 Entropy Convergence
2.7 Graph Convergence Criteria
2.8 Entropy-Based Event Triggered Proposal
2.8.1 Proposal A: Entropy-Based Marking Frequency Shift
2.8.2 Proposal B: Entropy Based Pheromone Intensity Shift
2.9 Decentralized Entropy Trigger
2.9.1 Surrounding Entropy Estimation
2.10 Summary of Concepts
3 Experimental Analysis and Results
3.1 Algorithm Implementation
3.2 Other Metric Parameters
3.3 Convergence and Entropy Results
3.3.1 One and multiple optimal solutions
3.3.2 Parametric convergence analysis
3.3.3 Entropy Convergence
3.4 Convergence Validation Experiments
3.5 Entropy-Based Event Triggered Control
3.5.1 Proposal A: Results
3.5.2 Proposal B: Results
3.6 Decentralized Entropy Trigger: Results
3.6.1 Entropy Estimation: Examples
3.6.2 Decentralized Trigger - Proposal B
3.6.3 Decentralized Trigger Results
4 Conclusion
4.1 Summary of Results
4.2 Applications
4.3 Future Work
Bibliography
List of Tables

3.1 Standard parameters for simulations
3.2 Maximum Pheromone Values
3.3 Parameter results for simulation 1
3.4 Entropy limit values for simulation 1
3.5 Parameters for simulation 2
3.6 Maximum Pheromone Values 2
3.7 Parameter results for simulation 2
3.8 Entropy limit values for simulation 2
3.9 Parameters for Convergence Experiments
3.10 Experiment Results for n_ants = 15
3.11 Experiment Results for ρ = 0.03
3.12 Parameters for Proposal A
3.13 Cycle Parameter Values
3.14 Total amount of Pheromones added
3.15 Parameters for Decentralised Event Triggered Simulations
3.16 Results for standard simulations
3.17 Results for different slopes
List of Figures

2.1 Grid Structure
2.2 Agent walking between nodes
2.3 Agent Neighbourhood
3.1 Pheromone Graph evolution
3.2 Entropy Results for simulation 1
3.3 Pheromone Graph evolution
3.4 Entropy Results for simulation 2
3.5 Entropy Results for variable evaporation
3.6 Entropy Results for variable number of agents
3.7 Convergence in Entropy
3.8 Convergence to Optimal Solution
3.9 Entropy at 1500 s
3.10 Time plot of γ for all nodes
3.11 Time plot of γ ∈ (0, 5) for all nodes
3.12 Time plot of γ
3.13 Entropy Computation and Entropy Limit Example
3.14 Entropy for Event Triggered and Continuous Marking
3.15 Quadratic Event Trigger
3.16 Results for z = 1/1000
3.17 Results for z = 1/800 trigger slope
3.18 Entropy for Intensity Trigger
3.19 Global Entropy and Entropy Estimations
3.20 Average Estimated Entropy and Real Time Entropy
3.21 Entropy Results for Linear Threshold
3.22 Entropy Results for Quadratic Threshold
3.23 Entropy Results for several slopes
3.24 Cycle Parameter for Decentralized Simulations
3.25 Convergence Time and Pheromone Efficiency
Chapter 1
Introduction
1.1 Motivation and Concepts
In the past couple of decades there has been a growing interest in the systems engineering and
control field to apply computer science concepts and algorithms to engineering problems. From
biologically inspired optimization algorithms to AI and deep learning methods, these tools are
currently being applied to engineering fields such as robotics, transport automation and others.
The concept of Swarm Intelligence comes from combining biologically inspired algorithms
with machine learning techniques. It is an idea analogous to Artificial Intelligence: while AI
focuses on having a highly complex system learn to perform highly complex tasks, SI is based
on the idea of having several extremely simple agents or systems learning to perform a complex
task or solve a complex problem by collective emerging behaviour methods. This is why Swarm
Intelligence uses (in many of its forms and applications) biologically inspired systems and
algorithms; nature is full of swarm behaviour examples, from bee hives and ant nests to fish
schools and bird flocks.
Parallel to this, event-triggered control focuses on the active scheduling of control tasks in a
system. Instead of periodic feedback between sensors, processors and actuators, event-triggered
methods focus on defining threshold conditions for each subsystem, so that the control tasks
can be scheduled independently depending on some desired triggering function, greatly reducing
network loads, increasing battery life, etc. This method also enables us to decentralize control
tasks, enabling each set of sensors and actuators to act independently of a central processing
system.
The idea is that emerging behaviour algorithms, and in general most algorithms that go through
some kind of learning process, are usually considered to be a "black box". It becomes extremely
hard to understand the process that develops inside, and it is even harder to tune and modify
the behaviour of the algorithm to obtain different performances. Through the analysis and
application of the metrics developed in this work, we can better understand the process that
goes on when the system is evolving from a uniform and chaotic state to a coordinated and
structured behaviour.
Following this line of thought, the work in this thesis will be focused on ant inspired behaviour
applied to autonomous agent systems. In this case, Ant Algorithms, which have long been
applied to discrete optimization problems, can be used to control a physical system of simple
agents (robots, drones, etc.), focusing on a simple path planning robotic problem. To study
and validate these methods, some metric concepts will be developed and a deeper mathematical
reasoning will be laid out in order to argue and justify the validity of the work.
1.2 Goals and Expectations
The main goals in this work relate to the two lines of study mentioned before: Event-
Triggered Control and Swarm Intelligence. Hence, we can summarize the goals and expected
outcomes:
• Study and develop new methods and metrics for analyzing Swarm Intelligence Algorithms,
applied in particular to Ant Colony Algorithms, to further understand and represent the
behaviour and evolution process for these algorithms.
• Develop a control routine based on this algorithm for a path planning system of multi-
agents.
• Analyze the impact that the problem parameters have on the convergence and behaviour
of the system.
• Suggest and implement solutions and schemes for turning the algorithm into an event-
triggered system, relating the coordination of the agents to their interaction with
the environment (pheromone marking).
• Explore ways to decentralize the control tasks, so each agent can act independently.
Finally, the work will first focus on theoretical and simulated environments, leaving the
practical implementation for future work depending on the resources available.
Chapter 2
Background and Problem Description
In order to understand the emerging behaviour of the proposed set of autonomous agents
controlled by an ant inspired algorithm, we first cover the background and specific characteristics
of ant optimisation algorithms, and how they can be used to implement a control system for a
swarm of autonomous agents. Then the specific problem to be solved is described, as well as
the computational implementation of the algorithms for this particular problem. Finally, some
concepts such as Entropy and Martingale Theory are presented for later use in the algorithm
convergence and performance analysis.
2.1 Biological Inspiration and Background
In recent years there has been a growing interest in biologically-inspired algorithms for solving
a wide range of different problems. Ant colony algorithms, first formally described by M. Dorigo
([1], [2]), are a kind of insect inspired algorithm which (like bee-hive algorithms) relies on
emerging behaviour from a large group of simple agents.
Ant algorithms are very much based on real ant foraging behaviour ([3]). The
biological process that ants follow when foraging to find the shortest path to the food source
is called Stigmergy. The concept of Stigmergy is based on independent agents communicating
through environment interaction. Ants do this by leaving a trail of pheromones when looking
for food. Once the food is found, ants head back to the nest adjusting the pheromone marking
depending on food quality, etc. The implicit idea behind this behaviour is that since pheromones
evaporate with time, shorter paths will have on average a higher amount of pheromones. Other
ants then follow these pheromone trails, probabilistically choosing the ones with a bigger amount
of pheromones. Eventually, only ”short” paths prevail, hence reaching an optimal solution and
having most ants travelling through the shortest path.
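The shortest-path mechanism described above can be sketched in a minimal simulation (the two-route setup and every parameter value here are illustrative assumptions, not the model developed later in this chapter):

```python
import random

# Minimal stigmergy sketch: two routes between nest and food source.
# Shorter trips complete more often, so the short route is reinforced
# more frequently between evaporation events. All values are illustrative.
LENGTHS = {"short": 2, "long": 5}   # trip duration in time steps
RHO = 0.05                          # evaporation rate per step
DELTA_TAU = 1.0                     # pheromone deposited per completed trip

def simulate(steps=4000, n_ants=20, seed=1):
    random.seed(seed)
    tau = {"short": 1.0, "long": 1.0}
    ants = [("short", 0)] * n_ants  # (current route, steps left on trip)
    for _ in range(steps):
        for r in tau:               # evaporation: the implicit cost function
            tau[r] *= (1.0 - RHO)
        nxt = []
        for route, left in ants:
            if left > 0:
                nxt.append((route, left - 1))
            else:
                tau[route] += DELTA_TAU   # deposit on trip completion
                total = tau["short"] + tau["long"]
                r = "short" if random.random() < tau["short"] / total else "long"
                nxt.append((r, LENGTHS[r]))
        ants = nxt
    return tau

tau = simulate()
print(tau)
```

With these settings the short route ends up holding most of the pheromone, mirroring the foraging behaviour described above.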
Examples of ant inspired algorithms can be found for solving a wide range of optimization
problems (see references [4] and [5]). The idea in this case is to modify these algorithms to
apply them to controlling a swarm of autonomous agents, getting them to perform a certain task
in a coordinated way. Focusing first on ant foraging, these algorithms can be implemented so
that the ants walking around the domain are thought of as a swarm of autonomous agents.
The algorithm then enables the swarm to eventually converge to a certain organized solution,
starting from a chaos dominated behaviour. This would set the ground for applying these
methods as an event-triggered controller, where the swarm interacts with the environment
depending on the level of chaos or coordination required.
During the literature review, a gap was noticed in the previous work done in this
field. There is extensive research on using Ant Colony Algorithms to solve discrete optimization
problems. In fact, this was the first use of these algorithms since M. Dorigo ([1]) first introduced
them. Lately, these methods have been explored in practical applications (see [6], [7] and [8]).
This has raised some questions on how complex the agents have to be, and how large the
surrounding and monitoring systems must be in order to obtain the desired optimal motion. This
means that, in many cases, the algorithms have to be adapted to comply with the physical and
equipment limitations of the robotic agents.
This is one of the main differences between the version of the ant algorithm developed in this
work and others used in previous optimization problems (see [4]). In this case, the final goal of
the algorithm is to be implemented as a control method for an autonomous multi-agent (swarm)
system. Hence, the algorithm description is focused not on quality or speed of convergence for
this particular optimization problem, but on practicalities when applying this scheme to swarm
coordination and behavior of robotic systems.
The goal of this work is then to propose a continuous form of the algorithm that enables agents
to use it as a control routine, while keeping the resource requirements to a minimum, and
to gain a deeper understanding of these methods, keeping in mind possible future practical
applications and implementations in physical systems.
2.2 Event-Triggered Control
Control tasks in a sensor/actuator system are usually time-triggered; that is, the system
periodically checks the sensor signal and executes the control tasks accordingly. The idea of
event-triggered control is to schedule the control tasks according to the feedback on the sensors,
so that the actuators work only when the signal goes above a desired threshold. So in this case,
the control tasks are triggered by certain "events", instead of being scheduled periodically. This
optimizes resources in the system, by taking a more efficient approach to scheduling the control
tasks (for a more detailed description of event-triggered control and its benefits, see [9]).
In our case, this idea can be extended to the ant system. Seeing the environment marking
(pheromones) as a control task through which the system seeks stabilization or convergence,
the standard functioning would be equivalent to periodic time-triggered control: the agents mark
at each time step depending on the type of algorithm. The idea is then to design some kind
of event-triggering function that lets the agents schedule the control tasks more efficiently. So
instead of interacting continuously with the environment (and hence each other) the agents
will only do so when the system goes past a certain limit (triggering) condition. Finding and
applying this condition is hence one of the goals of the work presented here. For this, the
concept of entropy is presented in section 2.6.
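As a minimal sketch of this idea (the `should_mark` helper, the sample disorder trace and the threshold value are hypothetical placeholders; the actual triggering functions are developed in sections 2.8 and 2.9):

```python
# Sketch of event-triggered marking: the control task (pheromone marking)
# fires only while a disorder measure exceeds the desired limit, instead
# of firing at every time step as in time-triggered control.
def should_mark(disorder: float, threshold: float) -> bool:
    """Trigger the marking task only past the limit condition."""
    return disorder > threshold

# A hypothetical trace of a system settling from chaos to coordination.
trace = [0.9, 0.7, 0.55, 0.42, 0.38, 0.35]
events = [should_mark(h, 0.5) for h in trace]
print(events)  # [True, True, True, False, False, False]
```

Time-triggered control would execute the task on all six steps; here it executes only on the three steps where the condition holds.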
2.3 Markov Process, Random Variables and Martingales
In order to study the behaviour of the algorithm and to make sure the modifications to be
introduced in the system are valid, some proof of convergence needs to be developed. For that
we will make use of tools from random variables and probability theory, such as the concepts of
Markov Processes and Martingales.
2.3.1 Markov Process
For this problem it is useful to recall the Markov condition for a stochastic process.
Definition 2.1. [10] A stochastic process defined by the random variable X_n is said to be a
homogeneous Markov process if:

P(X_n = j | X_{n-1} = i, X_{n-2} = k, ..., X_0 = m) = P(X_n = j | X_{n-1} = i) = P_ij    (2.1)

Similarly, it is a non-homogeneous Markov process if:

P(X_n = j | X_{n-1} = i, X_{n-2} = k, ..., X_0 = m) = P(X_n = j | X_{n-1} = i) = P_ij(n)    (2.2)
These two conditions mean that, for a stochastic process to be considered a Markov process,
the future state can only depend on the current state, regardless of the previous ones. In the
non-homogeneous case, the transition probabilities depend on the time step (probabilities are
not constant).
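The Markov condition can be illustrated with a short sketch (an arbitrary two-state homogeneous chain, unrelated to the ant system): the next state is sampled using only the transition row of the current state, never the earlier history.

```python
import random

# Arbitrary homogeneous two-state chain illustrating Definition 2.1:
# the next state depends only on P[current], not on earlier states.
P = [[0.9, 0.1],
     [0.4, 0.6]]

def step(state: int, rng: random.Random) -> int:
    """Sample the next state from the transition row of `state`."""
    return 0 if rng.random() < P[state][0] else 1

rng = random.Random(0)
x = 0
trajectory = [x]
for _ in range(10):
    x = step(x, rng)
    trajectory.append(x)
print(trajectory)
```

A non-homogeneous chain (expression 2.2) would simply make `P` a function of the time step.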
2.3.2 Martingales
The concept of Martingale is useful in this particular case, as it will be explained in the following
sections, for its associated convergence theorems.
Definition 2.2. [11] A discrete time stochastic process X_n, n ≥ 1 is a Martingale if
E{|X_n|} < ∞ and it has the property:

E{X_n | X_{n-1}, X_{n-2}, ..., X_1} = X_{n-1}    (2.3)

Similarly, X_n is a sub(super)-martingale if:

E{X_n | X_{n-1}, ..., X_1} ≥ (≤) X_{n-1}    (2.4)
The concept of martingale comes from gambling theory, describing processes where bets depend
on the current record of accumulated wins or losses (See [11]). This concept is useful given the
existence of a set of Martingale convergence theorems.
Theorem 2.1. [11] (Martingale Convergence Theorem) Let (X_n)_{n≥1} be a submartingale (or a
non-negative supermartingale) such that sup_n E{X_n^+} < ∞. Then lim_{n→∞} X_n = X exists a.s.
(and is finite a.s.). Moreover, X is in L^1.
For proof and further details, see [11]. In practice, this theorem means that if a stochastic
process is a non-negative supermartingale, it will converge almost surely to a certain value.
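A quick numerical sketch of the theorem, using an arbitrary non-negative martingale rather than the ant system: the multiplicative process X_{n+1} = X_n · U_n with U_n uniform on (0, 2) satisfies E{X_{n+1} | X_n} = X_n, and converges almost surely (here typically to 0, since E{log U_n} < 0).

```python
import random

# Non-negative martingale X_{n+1} = X_n * U_n, U_n ~ Uniform(0, 2).
# By Theorem 2.1 it converges almost surely; a single sampled path
# illustrates the limit being approached.
rng = random.Random(42)
x = 1.0
path = [x]
for _ in range(2000):
    x *= rng.uniform(0.0, 2.0)
    path.append(x)
print(path[-1])  # late values sit near the almost-sure limit
```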
2.4 Problem Description
Overall, the idea behind this method is to achieve coordination in a set of autonomous agents
interacting with each other through the environment. This can be structured as a discrete
optimization problem, assuming there is a finite set of solutions and the agents walk the graph
trying to find the optimal ones. The behaviour of the process can be described by a dynamic
system model and as a random variable.
Consider an optimization problem described by the graph G = (V, L, W), which consists of
a set of vertices V, a set of links between vertices L, and a set of time-dependent weights
(pheromone values) W = f(t) associated to the links L.
One of the main differences with other optimization problems is that in this case the weight
field is part of the graph description but also part of the solution; solutions are constructed by
increasing pheromones on the desired links, starting at initial node p_i and finishing at node p_f.
The optimization problem to solve is a path planning or routing problem. Hence, the problem
is defined by (S, g), where :
• S is a set of all feasible solutions with |S| < ∞, defined in terms of node sequences s such
that s = {v_i, ..., v_k, ..., v_f} and fulfilling the conditions:
– v_i = p_i and v_f = p_f.
– l_{s(k),s(k+1)} ∈ L ∀ k ∈ [1, |s| − 1].
• g(s) is a cost function, g : S → R_{>0}.
It is assumed that a set S* ⊆ S of optimal solutions exists, defined by:

g(s*) = min{g(s_i)} , ∀ i ∈ [1, 2, ..., |S|]    (2.5)
The problem then consists in finding at least one of the solutions that fulfill condition 2.5 in a
finite amount of time. This is done by a number of agents N that walk the graph stochastically
constructing possible solutions, i.e. reaching the goal node having started from the initial node,
while following a set of plausible links connecting the visited nodes. In this case, focusing first
on a simple implementation, the graph G will be represented by a two-dimensional grid.
Each agent will follow a probabilistic model to construct a possible solution formed by a set of
solution components, which in this case are the subsequent set of links L′ ⊆ L connecting the
set of visited nodes P ′. Between two interconnected nodes (vi, vj) (i.e. for a given link lij ∈ L),
there is an assigned value τij of pheromones associated to that link. Hence, for each time step,
agents choose probabilistically between the connected nearby nodes following these pheromone
trails.
Agents then deposit a certain amount ∆τ on the links between walked nodes after each time
step. These pheromones have (as in real ant behavior) a certain evaporation factor ρ associated.
Hence, after every time step, the pheromones on the graph are updated by both the movement
of the agents and the evaporation of the pheromones. In this case the evaporation acts as a
form of implicit cost function; longer solutions take more time to be completed, so their
pheromones decay further than those on shorter paths with faster completion times.
Considering this, the cost function g(s) cannot be computed explicitly and used to actively
affect the behavior of the algorithm after each step. In other ant-inspired algorithms explicit
cost functions (such as path length, time cost, etc) are used to evaluate the quality of the
solutions that are being generated, and then adjust the pheromone values accordingly. In
this case, the pheromones are thought of as a physical trail or mark in the environment left
behind by an agent when moving. Since memory and information sharing is to be kept to
the minimum in each agent, and the idea is to build an algorithm as generalist as possible
(applicable to different kinds of physical systems) that relies on swarm coordination rather
than agent computing capabilities and precision, having this implicit cost function may (and
probably will) slow down the convergence, but it is a much more permissive method regarding
physical implementation.
2.5 Dynamic System Model
We can now model the problem as a discrete time dynamic system. First, let us define the
adjacency matrix A for this weighted graph, of size |V| × |V|, such that:

A_ij = 1 , ∀ i, j : ∃ l_ij ∈ L
A_ij = 0 , otherwise.
The values of Aij are then related to the existence of a valid link between nodes i and j. In
other words, if node i is connected to node j in the graph, Aij = 1. So, in this case, the diagonal
terms of the matrix Aii represent the possibility of enabling agents to stay in the node they are
in. Since for our problem this is not required, it will be set Aii = 0 for all i, so that agents are
not allowed to stay in the same node on consecutive time steps.
For this particular application, we will use a grid graph with equal length in each link. This
means the graph will look like a set of interconnected links such as the one in figure 2.1.

Figure 2.1: Grid Structure

If this was the entire system, the connectivity matrix would have the following structure:
A =
0 1 1 1 1
1 0 0 0 0
1 0 0 0 0
1 0 0 0 0
1 0 0 0 0
(2.6)
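For larger grids, a connectivity matrix of this kind can be assembled programmatically. The sketch below follows the conventions above (4-connectivity, A_ii = 0); flattening node (r, c) to index r·m + c is an illustrative choice, not the thesis implementation.

```python
# Build the adjacency matrix A of an n-by-m grid graph with
# 4-connectivity and A_ii = 0 (agents may not stay on a node).
def grid_adjacency(n: int, m: int):
    size = n * m
    A = [[0] * size for _ in range(size)]
    for r in range(n):
        for c in range(m):
            i = r * m + c
            # link to the up/down/left/right neighbours that exist
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < m:
                    A[i][rr * m + cc] = 1
    return A

A = grid_adjacency(3, 3)
print(sum(A[4]))  # the centre node of a 3x3 grid has 4 neighbours
```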
Following this framework, the problem can be written then as the following discrete time
dynamic system:
y(t+1) = ρ_2 y(t) ;  ρ_2 ∈ (0, 1) , y(0) = c , c ∈ R_{>0}
X(t+1) = ρ_1 (X(t) − y(t)A) + Σ_n B_n(t) + y(t+1)A ;  ρ_1 ∈ (0, 1) , X(0) = y(0)A
r_n(t+1) = f(X(t), r_n(t))    (2.7)
In the system described, y(t) is a "virtual" base pheromone value (not deposited by the agents)
with an assigned specific evaporation rate ρ_2 and initial value c. This is introduced so that the
graph is initially explorable for all ants, without the need of real pheromone spreading, with
ρ1 being the real pheromone evaporation rate. The state variables X(t) and rn(t) represent
the amount of pheromones in each link and the position of the agents. According to the graph
description:
X(t) ≡ W(t) , Xij(t) = τij(t)
In this way, we make sure that pheromones are only added to valid links in the graph, which
enables us (together with the artificial pheromones y(t)) to write the probability law in a very
simple form. Finally, B_n is a stochastic (random) |V| × |V| coefficient matrix generated by
agent n at every step, which only depends on the current state X(t). This represents the pheromone
increase function, and it has the values:
B_n,ij(t) = ∆τ , ∀ i, j : {r_n(t+1) = j | r_n(t) = i}
B_n,ij(t) = 0 , otherwise.
Here r_n(t) is the position in the graph of agent n at time t, a stochastic variable that
depends on the pheromone field. Finally, the probability law for the agent steps is described
as:
P(r_n(t+1) = j | r_n(t) = i) = X_ij(t) / Σ_k X_ik(t) , ∀ k : ∃ l_ik ∈ L    (2.8)
This means that the probability of an agent n moving from node i to node j is equal to the
pheromone value of link lij over the sum of pheromones for all possible links starting at node
i. Hence, the whole state of the system G is defined by the two variables:
G = f(X(t), rn(t)) (2.9)
Nevertheless, the variable rn(t) can be substituted by its inverse function, i.e. the amount of
agents in each node ai(t). This form becomes more useful when studying the evolution of the
graph, instead of individual agent positions.
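The step dynamics above can be sketched in a few lines of Python. This is an illustrative simplification (not the thesis implementation): it uses the star graph of Figure 2.1, arbitrary values for the evaporation rate, the deposit ∆τ and the number of agents, and a single pheromone field in place of the separate virtual layer y(t):

```python
import numpy as np

rng = np.random.default_rng(0)

def step(X, pos, rho=0.1, dtau=1.0):
    # One step of the dynamics: each agent n at node i moves to node j with
    # probability X[i, j] / sum_k X[i, k] (expression 2.8); the crossed links
    # receive dtau (the matrices B_n of expression 2.7) and X then evaporates.
    deposit = np.zeros_like(X)
    new_pos = np.empty_like(pos)
    for n, i in enumerate(pos):
        p = X[i] / X[i].sum()
        j = rng.choice(len(p), p=p)
        deposit[i, j] += dtau
        new_pos[n] = j
    return (1 - rho) * X + deposit, new_pos

# 5-node star graph of Figure 2.1: node 0 connected to nodes 1..4
A = np.zeros((5, 5))
A[0, 1:] = 1.0
A[1:, 0] = 1.0
X = A.copy()                   # uniform initial pheromone, X(0) proportional to A
pos = np.zeros(4, dtype=int)   # four agents start at the hub node
for _ in range(50):
    X, pos = step(X, pos)
```

Note that pheromone can only ever appear on valid links: entries of X that start at zero receive no deposit and stay at zero.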
We can now find a random variable that represents the system, in order to study its properties. First, we define the random variable Y as the pheromone distribution matrix X with the amount of agents in each node ai(t) appended as a last column:

Y(t) = [ X(t)  a(t) ] ,  a(t) = ( a1(t), ..., a|V|(t) )^T   (2.10)
Hence, our problem can be written as a random variable, and the expected value for a given
time t+ 1 only depends on the variable values at t.
Proposition 2.1. The ant algorithm dynamic system described in 2.4 is a non-homogeneous
Markov Process defined by the random variable Y (t).
Proof. Consider first the pheromone field X(t). In each time step, the values Xij(t) can only
change through two different mechanisms; evaporation and pheromone addition. Hence, the
probability for the values Xij(t+ 1) is:
P{ Xij(t+1) = ρXij(t) + ki∆τ } = ( Xij(t) / Σ_{k=1}^{|V|} Xik(t) )^{ki} ,  ∀ ki ∈ {0, 1, ..., ai(t)}   (2.11)
In this case ki is an integer value that represents the different possible states (the number of agents crossing the link). As can be seen, the pheromone field is a non-homogeneous Markov process. Finally, for the value of ai(t):
P{ai(t+ 1) = qi} = f(X(t), aNi (t)) (2.12)
Where aNi(t) is the amount of agents in the neighbourhood of i. This is easy to see when considering that the position of the agents rn(t) depends only on the pheromone field at time t (expression 2.8). With X(t) and ai(t) ∀i, the variable Y(t) is defined.
Proposition 2.2. The expected conditional value of the random variable E{Y (t+ 1) |H} only
depends on the values of the variable at time t, i.e:
E{Y (t+ 1) |Y (t)} = E{Y (t+ 1) |Y (t), ..., Y (0)} = f(Y (t)).
Proof. Following the same notation as previously described, X(t) is our pheromone weight
graph. And in this case, ak is the amount of agents placed in node k at time t. Hence,
the variable Y has the size |V| × (|V| + 1). For simplicity in the expressions, we will define
M = |V| + 1. We can now write the expected conditional value for the variable Y , by first
recalling the probability of agents moving from i → j in expression 2.8. Hence, the expected
value of agents at j is:
E{aj(t+1) | a(t), X(t)} = Σ_{l=1}^{|V|} al(t) Xlj(t) / Σ_{k=1}^{|V|} Xlk(t)   (2.13)
Since the pheromone trails between unconnected nodes are 0 by definition, we can write expression 2.13 as a sum over all nodes. Now, for the expected conditional value of Y we can use the classic definition of E{Y(t+1) | Y(t), ..., Y(0)}. This means that the expected conditional
value of Y (t+1) is equal to the different possible outcomes of the variable times the probability
of those outcomes. For the case of the pheromones, it is then:
E{Xij(t+ 1) |Xij(t), ai(t)} = Xij(t)(1− ρ) + p∗ijai(t)∆τ
In this case, p∗ij represents the probability of an agent traversing from i → j, and hence, using 2.8:

p∗ij = Xij / Σ_{k=1}^{|V|} Xik
Expression 2.13 means that the expected conditional value of agents at a certain node j and
time t+ 1 is equal to the amount of agents surrounding j times the probability of these agents
actually moving to j. It is interesting to see that this value does not depend on the current
amount of agents aj(t), since all the agents have to move at each time step. Also, note that
given the definition of the variable Y (t), we can write the number of agents a(t) in terms of Y :
ai(t) = YiM(t)
That is, the set of values contained in the last column of Y . With these expressions, and writing
everything in terms of the variable Y we get:
E{Yij(t+1) | Y(t)} =

Yij(t) ( (1−ρ) + YiM(t) ∆τ / Σ_{k=1}^{|V|} Yik(t) ) ,  ∀ j ∈ [1, 2, ..., |V|]

Σ_{k=1}^{|V|} Ykj(t) Yki(t) / Σ_{q=1}^{|V|} Ykq(t) ,  j = |V| + 1 = M

(2.14)
Therefore, the expected value for Y (t+ 1) only depends on the current state Y (t).
2.5.1 Solution Generation
In order to fully define the behaviour of the system presented in (2.7), its relation to the generation of valid solutions must be described. First of all, the agents will store the number of steps N-S and E-W they take (two integer values) from the initial node until they reach the goal node. We will introduce a kernel overlapping the pheromone field, such that:
Xij(t) = Xij(t) ·Qn if ∃t′ ≤ t : rn(t′) = vf (2.15)
Xij(t) = Xij(t) otherwise (2.16)
In condition (2.15), Qn is the kernel virtually modifying the surrounding weights given a certain
desired condition (i.e. the direction towards the nest) for a given agent n.
This kernel is the additional layer added to the algorithm in order to ensure a "back-and-forth" agent movement. Other options were considered, such as storing in each agent's memory all the steps followed until the goal node is reached, and then walking back that same path until the nest is reached. The angular kernel was implemented because it requires less memory and suits the geometry of the problem to be solved. In any case, this additional algorithm layer must be implemented to impose the desired behaviour conditions, i.e. a recurrent walk between goal and nest. Hence, in this case, the kernel is computed as follows:
• Each agent stores two integer values, one for the amount of north-south (N-S) steps and
another one for east-west (E-W).
• When the goal is reached for the first time, an angle θf is computed using these values, corresponding to the angular direction between the goal and the nest. This angle is used from then on to compare with the actual direction of the agent.
• From that point, the actual direction of the agent is compared after each step with the
angle θf , and the kernel Qn is computed accordingly.
In the first implementation, we will design this kernel with a simple relative angle difference:

Qn = |θn(t) − θf| / 2π   (2.17)
In this case, the angle differences are always computed in the closest arc. It must be noted
that the angle θn is the angle computed by the agent according to the N-S and E-W current
values, and it is relative to the previous limiting node encountered (goal or nest depending on
the cycle). Also, this kernel is designed in the simplest way possible considering the geometry of the problem.
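The kernel computation can be sketched as follows, under the assumption that the heading θn is recovered from the two step counters with atan2 (the exact angular convention is an assumption, not specified above), and always comparing over the closest arc:

```python
import math

def kernel_q(ns_steps, ew_steps, theta_f):
    # Q_n of expression (2.17): compare the agent heading theta_n, derived
    # from its net N-S and E-W step counters, with the goal-nest direction
    # theta_f, always taking the closest arc between the two angles.
    theta_n = math.atan2(ns_steps, ew_steps)  # assumed convention: E-W as x, N-S as y
    diff = abs(theta_n - theta_f)
    diff = min(diff, 2 * math.pi - diff)      # closest arc, so diff lies in [0, pi]
    return diff / (2 * math.pi)               # hence Q_n lies in [0, 1/2]
```

With the closest-arc rule the angle difference never exceeds π, so Qn takes values in [0, 1/2]: links aligned with θf leave the weights untouched the least, while opposite headings damp them the most.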
This method causes the agents, after the initial exploration phase, to be "pulled" into link sequences that start and end at the desired nodes. Then, evaporation ensures the convergence towards the shorter solution cycles.
2.5.2 Pheromone Distribution
The behaviour of the algorithm must first be described in more depth, to better understand the purpose and the results of the simulations and studies. Consider first a generic agent
constructing a solution following the procedure in expression 2.7. The agent will move between
interconnected nodes from the initial node to the final node in a stochastic way depending on
the pheromone value X between nodes. But for each step, the pheromone values are altered by
both the evaporation factor ρ and the pheromone deposited by the agent ∆τ.

Figure 2.2: Agent walking between nodes

It is useful now to define the average pheromone value X̄ in a solution with |l| links as follows:

X̄ = (1/|l|) Σ_{i=1}^{|l|} X_{l_i}
In this case, the agent is at the red node for the given time step sequence. Assuming a common
pheromone value τ0 at the beginning of the sequence, and taking |l| as the length of the solution,
we can write the average pheromone value in the given solution after the agent has completed
the cycle as:
X̄ = τ0 (1−ρ)^{|l|} + (∆τ/|l|) Σ_{i=1}^{|l|} (1−ρ)^{i−1}   (2.18)
It must be noted that this average value represents the whole set of links after the agent has completed the cycle. Each link will have a different pheromone value, with the stronger ones
being found at the last (most recent) links. It is interesting to see how in this case the "implicit" cost function of the problem becomes clear: longer solutions have, on average, lower pheromone values (and hence a lower chance of being walked again).
Following a similar procedure for a continuous flow of n ants, the pheromone value for a certain
link k at time t would be:
τk(t) = τ0 (1−ρ)^t + n∆τ Σ_{i=1}^{t} (1−ρ)^{i−1}   (2.19)
Now, taking the average over a set of links in a solution of length |l| that follows equation (2.19), we can define the following concept:
Proposition 2.3. For a constantly walked solution (set of links), with a number of n agents
and an increment of pheromones ∆τ , the maximum value of average pheromones is:
τmax = n∆τ / (ρ|l|)   (2.20)
Proof. Take expression (2.19), and consider a solution (set of links) l constantly walked in the same manner. Taking the average pheromone value over these links:

X̄(t) = (1/|l|) Σ_{k∈l} τk(t)
If there are n agents, we can consider them split equally among the links, so that we get n/|l| agents per link. Now, taking the limit in time, we get the maximum value:
τmax = lim_{t→∞} X̄(t) = lim_{t→∞} [ τ0 (1−ρ)^t + (n∆τ/|l|) Σ_{i=1}^{t} (1−ρ)^{i−1} ] = n∆τ / (ρ|l|)

since the geometric sum converges to 1/ρ.
This result is interesting because it gives us an upper bound to be expected in the frequently
walked solutions, i.e. the optimal solutions. This upper bound is then the maximum amount of
pheromones expected in the graph for the given parameters. The pheromones are then globally
bounded by τmax and τmin = 0.
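This bound is easy to verify numerically. A minimal sketch, assuming a steady regime in which n/|l| agents cross the link at every step (the parameter values here are arbitrary):

```python
rho, n, dtau, length = 0.05, 10, 1.0, 8    # evaporation, agents, deposit, |l|
tau = 1.0                                  # initial pheromone tau_0 on the link
for _ in range(2000):
    # each step: evaporation plus n/|l| agents depositing dtau on this link
    tau = (1 - rho) * tau + (n / length) * dtau
tau_max = n * dtau / (rho * length)        # expression (2.20): 25.0 here
assert abs(tau - tau_max) < 1e-6
```

The iteration converges to the fixed point (n/|l|)∆τ/ρ, which is exactly the bound of expression (2.20).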
2.5.3 Pheromone as a Random Variable
Now it is important to understand how these dynamics affect the pheromone field, considering
it to be a random variable (or a set of random variables).
Theorem 2.2. The pheromone values of links with agents on the nodes are sub-martingales, i.e.:

E{τij(t+1) | τij(t), ai(t)} > τij(t) ,  ∀ i : ai(t) ≠ 0

if ai(t)/Ti(t) > n/T0, where Ti(t) is the local sum of pheromones in the links connected to i.
Proof. First of all, consider the expected conditional value of a pheromone link τij in its general form:

E{τij(t+1) | τij(t), ai(t)} = τij(1−ρ) + ai(t)∆τ τij / Ti(t)   (2.21)
As described previously, ∆τ is the desired amount of pheromones to be added at every step by each agent. Consider now, for simplicity, a constant total amount of pheromones T0 in the graph; to ensure this, we impose ∆τ = ρT0/n (the benefits of this are further explained in the following sections). Considering this, we would get the following expression for the expected conditional value:

E{τij(t+1) | τij(t), ai(t)} = τij(1−ρ) + τij ρ ai(t)T0 / (n Ti(t)) = τij ( 1 − ρ + ρ ai(t)T0 / (n Ti(t)) )   (2.22)
Consider now the idea of having just one agent walking the graph (n = 1). This condition
should not affect any of the assumptions made until now. This means that now we can divide
the links between the ones without agents on the nodes, and the set of links that share the only
node i with ai = 1. The two expressions for the conditional expected value of τij then become:

E{τij(t+1) | τij(t), ai(t) = 1} = τij ( 1 + ρ (T0 − Ti)/Ti ) > τij(t)   (2.23)

E{τij(t+1) | τij(t), ai(t) = 0} = τij (1−ρ) < τij(t)   (2.24)
Finally, reflecting on the assumption that n = 1 and going back to expression (2.22), what we are in fact considering is:

1 − ρ + ρ ai(t)T0 / (n Ti) > 1  ⇔  ai(t)/Ti(t) > n/T0
The complexity lies in the fact that as the agent moves, the links switch modes, from super- to sub-martingale and back. Nevertheless, it can be argued that for links in the optimal solution set, the dynamics of the system will make agents walk over them with a higher frequency. Hence, what we have is a set of random variables where the ones contained in the optimal solution set are expected to be sub-martingales, while the rest are super-martingales. Furthermore, if we consider the system relative to the agent(s), we could argue we have a sub-martingale "relative" to the observer. From the agent perspective, all the links considered in its visible neighbourhood are in fact sub-martingales.
This means that the links that are being walked will in fact be sub-martingales only if the amount of pheromone added exceeds the amount being evaporated (evaporation is proportional to Ti and pheromone addition is proportional to ai). It can be argued that this is not the case at every time step, but it is the result to be expected given the dynamics of the system.
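Expression (2.22) and its threshold can be checked numerically; the values below are illustrative choices, not taken from the thesis:

```python
def expected_tau(tau_ij, rho, a_i, T0, n, T_i):
    # Expected pheromone on a link per expression (2.22), with dtau = rho*T0/n
    return tau_ij * (1 - rho + rho * a_i * T0 / (n * T_i))

tau, rho, T0, n = 2.0, 0.1, 100.0, 10
# sub-martingale exactly when a_i / T_i > n / T0 (= 0.1 here)
assert expected_tau(tau, rho, a_i=2, T0=T0, n=n, T_i=15.0) > tau   # 2/15 > 0.1
assert expected_tau(tau, rho, a_i=1, T0=T0, n=n, T_i=15.0) < tau   # 1/15 < 0.1
```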
We can in fact generalize the results in Theorem 2.2 if our graph is bidirectionally connected (as it is in our case).
Theorem 2.3. If the graph is bidirectionally connected (Aij = Aji = 1), the pheromone values are symmetric such that τij = τji, and τij is a sub-martingale, i.e.

E{τij(t+1) | τij(t), ai(t)} > τij(t) ,  if  ai(t)/Ti(t) + aj(t)/Tj(t) > n/T0

So Theorem 2.2 becomes a conservative condition for having a sub-martingale in a bidirectionally connected pheromone graph.
Proof. Take expression (2.21), and consider we have a number of agents aj(t) that may also add pheromones to the link:

E{τij(t+1) | τij(t), ai(t)} = τij(t)(1−ρ) + ai(t)∆τ τij / Ti(t) + aj(t)∆τ τij / Tj(t)   (2.25)
Applying again ∆τ = ρT0/n, and re-arranging terms:

E{τij(t+1) | τij(t), ai(t)} = τij(t) ( 1 − ρ + ρ ai(t)T0/(n Ti(t)) + ρ aj(t)T0/(n Tj(t)) )

So from this, we have a sub-martingale if:

E{τij(t+1) | τij(t), ai(t)} > τij(t)  ⇔  ai(t)/Ti(t) + aj(t)/Tj(t) > n/T0   (2.26)
And this shows that in fact, Theorem 2.2 is a conservative approach to checking if a pheromone
link behaves as a sub-martingale.
2.6 Entropy as a Metric Parameter
In this particular case of the algorithm, it is interesting to achieve coordination for a set of
autonomous agents moving between targets in a graph. It is then useful to define a metric
parameter to study this coordination. In physics and thermodynamics, entropy is a measure of the amount of disorder in a system. In general terms, chaotic systems have larger entropy values than systems with a higher degree of order. It can be seen how this concept is useful for this particular case: when studying the graph (and the agents) we are interested in achieving the highest possible degree of order, both in the pheromone values and the agent movement, which would result in a lower entropy value. For this, we will introduce the concept of Shannon Entropy.
2.6.1 Shannon Entropy
Shannon Entropy is a concept from information theory, analogous to thermodynamic entropy. The idea relies on the fact that the information gain (in bits) from observing an event depends on its probability: lower probabilities yield higher information gains, and the gain G(B|A) is higher the more unlikely B is with respect to A. Let us define the probability space A, such that A = {a1, ..., an} and each element has an associated probability pi.
Definition 2.3. [12] Given a probability space A consisting of a set of elements with associated probabilities Ai = (ai, pi), Shannon entropy is defined as:

H(A) = −Σ_{i=1}^{n} pi log2(pi)   (2.27)

Furthermore, since −log2 is strictly convex, we have for a given set size |A| = n:

H(A) ≤ log2(n) ,  with H(A) = log2(n) ⇔ p(ai) = 1/n ∀i   (2.28)
Concepts presented in expressions 2.27 and 2.28 are part of basic information theory (for further
details, see Shannon’s work [13]). This means that entropy of a given distribution of size n has
an absolute maximum when all probabilities are equal (i.e. the distribution is uniform). In our
case, for a given pheromone graph, we can build a probabilistic (proportional) distribution of
pheromones by:
pij = τij / T ,  T = Σ_L τij
Also, for our discrete time process (∆t = 1), the average entropy can be defined:
Definition 2.4. The average entropy for a discrete time process with ∆t = 1, from t = t0 to t = tf, is:

H̄ = ( 1/(tf − t0) ) Σ_{t=t0}^{tf} H(A(t))   (2.29)
And finally, from now on we will apply this concept to the pheromone graph X(t). Hence:
Definition 2.5. The entropy of a pheromone graph X(t) is defined as:

H(X(t)) = −Σ_L ( τij(t)/T ) log2( τij(t)/T )
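Definition 2.5 translates directly into code. A minimal sketch (not from the thesis), skipping zero-pheromone links, which amounts to the usual convention 0 · log 0 = 0:

```python
import numpy as np

def graph_entropy(X):
    # H(X) = -sum over links of (tau_ij / T) * log2(tau_ij / T)
    tau = X[X > 0]
    p = tau / tau.sum()
    return float(-(p * np.log2(p)).sum())

# a uniform pheromone graph with |L| = 6 directed links attains log2(6)
X = np.array([[0., 1., 1.],
              [1., 0., 1.],
              [1., 1., 0.]])
assert abs(graph_entropy(X) - np.log2(6)) < 1e-12
```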
2.6.2 Maximum and Minimum Entropy
It is easy to see now that in our problem, the entropy values for the pheromone graph X are bounded. First, the highest value of entropy is attained when all the links have the same pheromone value.
Proposition 2.4. The maximum entropy value for a given graph is:

Hmax(X) = log2(|L|)   (2.30)

Proof. Take Definition 2.5 with n = |L|; the result is a direct consequence of (2.28).
This definition means that the maximum entropy of the variable set X only depends on the size of the graph. Plus, in this case the pheromone distribution X is uniform at t = t0, so recalling (2.28) we have:

Hmax(X) = H(X(t0)) ≥ H(X(t)) ,  ∀ t > t0   (2.31)
It can be argued that in this case, the inequality is strict. After the first step of the agents, a certain amount ∆τ is added to the first surrounding links. From that moment on, considering the dynamics of the system 2.7, the evaporation rate is proportional to the current value of pheromones, while the amount of pheromone added is constant. Hence, it is virtually impossible to get a homogeneous matrix X for any t > t0, since that would imply that all the links go back to having the same pheromone values as each other.
Now, regarding the lower limit for Hmin, according to the principles behind Shannon Entropy, it
is easy to see that the absolute theoretical minimum for H(X) happens when all the pheromones
are concentrated in one link, with the rest being zero.
Remark 2.1. The absolute minimum entropy for a given graph pheromone distribution X is:

Hmin(X) = −Σ_{i=1}^{|L|} pi log2(pi) = −1 · log2(1) = 0   (2.32)
Even though in our algorithm this situation is a pathological case, the graph can present values
close to zero if, for example, all the agents end up trapped in a small number of circling links. It
is now useful to define the case of a group of agents converging to a solution s. If the pheromone
graph perfectly converges to this solution, it means that:
Xij = 0 ∀i, j /∈ s (2.33)
By definition in our problem, a solution is optimal if it exists in the set of feasible solutions S and has minimum cost (length). Assuming the pheromone values on the solution links are (almost) constant when the algorithm has converged, we have that for a given solution s, the probability distribution is pi = 1/|s| ∀i.
Definition 2.6. We define the entropy of a graph Xs that has converged to a certain solution s as:

H(Xs) = −Σ pi log2(pi) = −|s| (1/|s|) log2(1/|s|) = log2(|s|)   (2.34)

Furthermore, if a solution is optimal:

H(X∗s) ≤ H(Xs) ,  ∀ s∗ ∈ S∗, s ∈ S
This result means that the entropy is minimum if the solution is optimal. So, if the algorithm
converges to a feasible solution, the quality of the solution can be evaluated using the final
entropy; the closer it is to the minimum value, the better the solution is.
We will also define the concept of normalized entropy for a graph that behaves according to
these conditions.
Definition 2.7. The normalized entropy h of a given pheromone graph X is:

h(X) = ( H(X) − H(X∗s) ) / ( Hmax − H(X∗s) )   (2.35)

Where Hmax and H(X∗s) are the boundary values for the entropy as defined previously. With this, the normalized entropy goes from 1 (completely uniform graph) to 0 (converged optimal solution).
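A sketch of this normalization, assuming the number of links |L| and the optimal solution length |s∗| are both known (as the text below requires):

```python
import math

def normalized_entropy(H, n_links, opt_len):
    # h = (H - H(X*_s)) / (Hmax - H(X*_s)), with Hmax = log2|L| (Proposition 2.4)
    # and H(X*_s) = log2|s*| (Definition 2.6)
    H_max = math.log2(n_links)
    H_opt = math.log2(opt_len)
    return (H - H_opt) / (H_max - H_opt)

assert normalized_entropy(math.log2(40), n_links=40, opt_len=10) == 1.0  # uniform graph
assert normalized_entropy(math.log2(10), n_links=40, opt_len=10) == 0.0  # optimal solution
```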
To apply this parameter we need to know H(X∗s ), which means that it can only be computed
on problems where the length of the optimal solution is known. It can be seen that with this
definition, we will obtain negative normalized entropy values in the case that the algorithm
converges to an unfinished solution (ants trapped in links without reaching the goal node).
It must be stated that this is not the first time Shannon entropy is used to analyse stochastic processes (see [14] for example), but it becomes especially interesting in this case. Finally, as a conclusion:
• Shannon Entropy can be used to measure disorder in a pheromone graph.
• These entropy values are theoretically bounded, and only depend on the problem size.
• The maximum entropy is found at the beginning of the simulation (when X = X0).
• In a set of converged pheromone graphs Xs, the ones converged to the optimal solution will have lower entropy than the rest, i.e.:

H(X∗s) = min H(Xs) ,  ∀ Xs ∈ Xs   (2.36)
2.6.3 Entropy Function and its Properties
Now let us evaluate the entropy response of the graph with respect to the pheromone addition process ∆τ. First, consider the case where at a certain t = t′ we stop adding pheromones to the walked links.
Proposition 2.5. For a given pheromone graph in discrete time X(t), the entropy will remain
constant if no pheromones are added to the graph.
Proof. Let time t′ be the stopping time for the pheromone adding process. The entropy at t′+1 will then be:

H(X_{t′+1}) = −Σ_{i=1}^{|L|} pi(t′+1) log2( pi(t′+1) )

= −Σ_L [ τij(t′)(1−ρ) / (T(t′)(1−ρ)) ] log2( τij(t′)(1−ρ) / (T(t′)(1−ρ)) )

= −Σ_L ( τij(t′)/T(t′) ) log2( τij(t′)/T(t′) ) = H(X_{t′})

(2.37)
Since no new pheromones are added to the graph and evaporation is proportional for all links,
the entropy then remains constant.
Proposition 2.6. If evaporation is set to 1 (ρ = 1), the entropy value is then bounded by
H(X(t)) ∈ [0, log2(n)], where n is the total number of agents.
Proof. In this case the expressions for T (t+ 1) and τij(t+ 1) can be written as:
E{T (t+ 1) |T (t), ..., T (0)} = T (t)(1− ρ) + n∆τ = n∆τ
E{τij(t+ 1) | τij(t), ..., τij(0)} = τij(t)(1− ρ) + p∗ijn∆τ = p∗ijn∆τ
In this case, p∗ij represents the probability of the ants adding pheromone to τij, since it is a random variable that depends on the movement of the ants. So now, we can write the expected value of H for a given instant (t+1):

E{H(X_{t+1})} = −Σ_L ( p∗ij n∆τ / n∆τ ) log2( p∗ij n∆τ / n∆τ ) = −Σ_L p∗ij log2(p∗ij)   (2.38)
Now, the entropy H is bounded between two cases: all the ants adding pheromones to the same link (p∗ij = 1), or all the ants adding pheromones to different links (p∗ij = 1/n):

min{H(X_{t+1})} = −1 · log2(1) = 0

max{H(X_{t+1})} = −n (1/n) log2(1/n) = log2(n)
So, in the case of evaporation close to 1, the entropy is bounded by a tighter condition than the one presented in Proposition 2.4. And it can be further argued that the entropy will be closer to 0 than to the upper bound; when all other links have pheromone values very close to 0, agents will tend to accumulate in each other's trails. Hence, since it is possible to have more than one agent per node, it is expected that agents will accumulate in a smaller number of nodes, since that would yield higher pheromone values.
2.6.4 Entropy Convergence
Convergence has been proven on previous occasions (see [15], [16], [17]) for different Ant Colony algorithms. In this case, given that we have a different (continuously running) version of an Ant Colony algorithm, we are interested in using entropy to show that the graph will always end up in a stable distribution of pheromones. First, the amount of added pheromones will be chosen following the next proposition.
Proposition 2.7. If the pheromone addition parameter is set to ∆τ = ρT0/n, the total sum of pheromones in the graph remains constant, where T0 = Σ_L τij(0) and n is the number of agents.
Proof. The total amount of pheromones in two consecutive time steps is:

T(t) = Σ_L τij(t)

T(t+1) = Σ_L τij(t)(1−ρ) + n∆τ = (1−ρ)T(t) + n∆τ

If the amount of pheromones has to be constant for all time steps, T(0) = T(t) = T(t+1) = T0, so finally:

(1−ρ)T0 + n∆τ = T0  ⇒  ∆τ = ρT0/n
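A small numerical check of Proposition 2.7; the graph size, rates and the (unstructured) deposit pattern below are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
rho, n, size = 0.2, 5, 6
X = rng.random((size, size))
np.fill_diagonal(X, 0.0)
T0 = X.sum()
dtau = rho * T0 / n                        # Proposition 2.7

for _ in range(100):
    deposit = np.zeros_like(X)
    for _ in range(n):                     # each of the n agents crosses one link
        i = int(rng.integers(size))
        j = (i + 1 + int(rng.integers(size - 1))) % size   # any node other than i
        deposit[i, j] += dtau
    X = (1 - rho) * X + deposit
    assert abs(X.sum() - T0) < 1e-9        # total pheromone stays at T0
```

Regardless of which links the agents choose, each step removes ρT(t) and adds back n∆τ = ρT0, so the total is preserved.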
Now, the following conjecture has not been proved in this work. It remains to be studied in future work, with confidence that it can in fact be proved; this would ensure that the algorithm modifications are valid and do not affect the convergence and stability of the system.
Conjecture 2.1. The entropy H(X(t)) for a discrete time pheromone graph following the dynamics presented in this algorithm is a super-martingale with respect to the pheromone graph and the set of agents ak in each node k, and will always converge to a value H∞ in finite time.
The following lemmas are those properties that have been proven in relation to Conjecture 2.1.
Lemma 2.1. In a given pheromone graph, evaporation will cause a decrease in entropy in the non-walked links if τij(t)/T < 1/e.
Proof. Consider the entropy contribution of one single link where the pheromones are evaporating:

H(t) = −( τij(t)/T ) log( τij(t)/T )

H(t+1) = −( τij(t)(1−ρ)/T ) log( τij(t)(1−ρ)/T )

(2.39)
For simplicity, let us define K = τij(t)/T. We now want to study the parameter relation necessary to obtain H(t+1) < H(t). Therefore, we have:

H(t+1) < H(t)  ⇔  K(1−ρ) log( K(1−ρ) ) > K log(K)  ⇔  (1−ρ) log(1−ρ) > ρ log(K)

Applying logarithm rules, this yields:

K < (1−ρ)^{(1−ρ)/ρ}   (2.40)
This result means that there is a limit value under which the entropy will decrease as a result of evaporation. And in fact, taking the limit for ρ → 0 we obtain the lower boundary:

lim_{ρ→0} (1−ρ)^{(1−ρ)/ρ} = 1/e   (2.41)
Furthermore, it is easy to check that the obtained function K(ρ) is convex, positive and strictly
increasing in the interval ρ ∈ [0, 1]. So, if it is guaranteed that the relative pheromone values
of the links stay under 1/e at all times, the contribution to the graph entropy is proven to
decrease for those non-walked links.
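Both the threshold of Lemma 2.1 and the limit (2.41) can be checked numerically (illustrative values):

```python
import math

def k_threshold(rho):
    # relative pheromone level below which evaporation lowers a link's
    # entropy contribution: K < (1 - rho)^((1 - rho) / rho)
    return (1 - rho) ** ((1 - rho) / rho)

# the threshold is increasing in rho and tends to 1/e as rho -> 0
assert abs(k_threshold(1e-6) - 1 / math.e) < 1e-4
assert k_threshold(0.1) < k_threshold(0.5) < k_threshold(0.9)

# below the threshold, one evaporation step lowers the contribution -K log K
h = lambda K: -K * math.log(K)
K, rho = 0.2, 0.5                  # threshold is 0.5 here, so K is below it
assert h(K * (1 - rho)) < h(K)
```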
Lemma 2.2. The entropy for a discrete time pheromone graph H(X) is a super-martingale and will converge to a value in finite time if it can be ensured that:

|l| > e ,  ai(t)/Ti(t) > n/T0

Where |l| is the length of the solution to the graph and Ti(t) is the local pheromone sum for the set of links around node i.
Proof. Let us first state the following inequality for conditional expected values:

E{ −pi log(pi) | H } ≤ −E{ pi | H } log( E{ pi | H } )   (2.42)
For a proof of this inequality, see Jensen’s Inequality (Theorem 23.9) in [11].
E{Xij(t+1) | X(t), ak(t)} =

τij(t)(1−ρ) ,  ∀ i : ai(t) = 0

τij(t)(1−ρ) + ai(t)∆τ τij(t) / Σ_{k∈Ni} τik ,  else

(2.43)
Expression 2.43 comes from the results in expression 2.14: it represents the expected value of the pheromone levels for the links, depending on whether there are any agents in the neighbouring nodes.
Now, we can write the entropy H(X_{t+1}) splitting the sum in two terms: the first one for the set of links with no agents, LA, and the second one for the set of links that have an agent in the common node, LB (as done in 2.43). This way, the expected conditional value for the entropy results:

E{H(t+1) | X(t), ak(t)} = E{HA(t+1) | X(t), ak(t)} + E{HB(t+1) | X(t), ak(t)} ≤

≤ −Σ_{LA} [ τij(t)(1−ρ)/T ] log( τij(t)(1−ρ)/T ) −

− Σ_{LB} [ τij(t)( 1−ρ + ai(t)∆τ/Σk τik )/T ] log( τij(t)( 1−ρ + ai(t)∆τ/Σk τik )/T )

(2.44)
Now we can apply the previous Lemma as follows. Expression 2.41 provides a maximum value
for the ratio K = τij/T to have the entropy contribution of the first sum term to decrease.
Recalling the expression for τmax:

τmax = n∆τ/(ρ|l|) = n ρ T0/(n ρ |l|) = T0/|l|
And applying the results in Lemma 2.1:

τmax/T0 = 1/|l| ≤ 1/e  ⇒  |l| ≥ e
This means that, if we have a graph where our optimal solution is longer than e ≈ 2.718 links, the expected conditional value for the non-walked links LA is:

E{HA(X(t+1)) | X(t), ak(t)} ≤ HA(X(t))   (2.45)
Now for the second sum, we can evaluate the entropy for a set of links Li with a node i in common that has ai(t) agents. We will use the variables Ti(t) for the pheromone sum in these links and HL,i for the entropy contribution of these links. Hence, the expected value for the
contribution of these links to the global entropy becomes:

E{HL(t+1) | X(t), ak(t)} = −Σ_{j∈Li} [ τij(t)( 1−ρ + ai(t)∆τ/Ti(t) )/T ] log( τij(t)( 1−ρ + ai(t)∆τ/Ti(t) )/T ) =

= [ ( 1−ρ + ai(t)∆τ/Ti(t) )/T ] ( −Σ_{j∈Li} τij(t) log( τij(t)( 1−ρ + ai(t)∆τ/Ti(t) )/T ) ) =

= [ ( 1−ρ + ai(t)∆τ/Ti(t) )/T ] ( −Σ_{j∈Li} τij(t) [ log(τij(t)) + log( ( 1−ρ + ai(t)∆τ/Ti(t) )/T ) ] ) =

= ( 1−ρ + ai(t)∆τ/Ti(t) ) ( −(1/T) Σ_{j∈Li} τij(t) log(τij(t)) − (Ti(t)/T) log( 1−ρ + ai(t)∆τ/Ti(t) ) + (Ti(t)/T) log(T) )

Now let us study the leading factor. Assuming as before the value ∆τ = ρT/n, and referring to this factor as β, we have:

β = 1 − ρ + ai(t)∆τ/Ti(t) = 1 − ρ + ρ ai(t)T/(n Ti(t)) = 1 + ρ ( ai(t)T/(n Ti(t)) − 1 )
Now, we want to check if E{HL(t+1) | X(t), ak(t)} ≤ HL(t). It is easy to show that HL(t) can be written as:

HL(t) = −(1/T) Σ_{j∈Li} τij(t) ( log(τij(t)) − log(T) )
And therefore:

E{HL(t+1) | X(t), ak(t)} ≤ HL(t) ⇔

⇔ β ( −(1/T) Σ_{j∈Li} τij(t) log(τij(t)) − (Ti(t)/T) log(β) + (Ti(t)/T) log(T) ) ≤ −(1/T) Σ_{j∈Li} τij(t) log(τij(t)) + (Ti(t)/T) log(T) ⇔

⇔ HL(t)(β − 1) − β (Ti(t)/T) log(β) ≤ 0

(2.46)
Take now expression (2.46). If we re-arrange it, and assuming β ≠ 1, we get:

HL(t) ≤ [ β/(β − 1) ] (Ti(t)/T) log(β)
Hence, since the entropy is always H ≥ 0, the right side of the inequality must be bigger than zero. This means:

log(β) > 0  ⇒  β > 1  ⇒  ai(t)T − n Ti(t) > 0  ⇒  ai(t)/Ti(t) > n/T
In this case, ai(t) is a conditioning value for the expectation. But in fact, this value is part of the global random variable, and given the dynamics of the system, we expect to have a larger amount of agents in those regions with higher pheromones (simply because, given a set of links, the agents will move in probability towards those links with higher values).
If we can ensure these conditions hold, then it means:

E{HA(X(t+1)) | X(t), ak(t)} ≤ HA(X(t))

E{HB(X(t+1)) | X(t), ak(t)} ≤ HB(X(t))

E{H(X(t+1)) | X(t), ak(t)} ≤ H(X(t))
These results need some reasoning to be completed. In fact, the convergence proof relies on the fact that ai/n > Ti(t)/T for the links that are being walked. It is interesting to see that the same conclusion was reached in Section 2.5 when proving that the pheromone random variables were sub- and super-martingales. The convergence properties then rely on the fact that as the system evolves, the inequalities forced by exploration and solution construction will make the agents oscillate less between sets of nodes. The links that have not been walked for a long period of time will have a smaller chance of jumping from a super-martingale state to a sub-martingale state, and the opposite happens for the walked links.
These inequalities between links are actually what ensures convergence to a certain distribution. If we consider a perfectly uniform interconnected graph with the same amount of agents on every node, it becomes harder to imagine how the system could evolve to a certain distribution (even though eventually the uniformity would break, and it would probably converge to a random distribution of values). But what we have here is an interconnected graph where (due to having a limited amount of agents) the amount of connected nodes in the graph slowly decreases. This reduces the scope of links that agents can traverse, and forces agents to concentrate on highly walked links, making these highly walked links sub-martingales in terms of pheromone values (and making the non-walked links permanent super-martingales).
2.7 Graph Convergence Criteria
Before the exploration and evaluation of the results and simulations, it must be defined what it means to converge to a solution, how to measure it, and how it relates to the entropy convergence relations presented previously. Since agents are constantly walking the graph and the moves are stochastic, we must define convergence within a probabilistic range.
Definition 2.8. We say a certain graph (or agent system) has converged to a solution (or set of links) s ∈ S after a certain time tc if for t > tc we have:

Xkl(t) > (1/α) Xij(t) ,  ∀ {k, l, i, j} : ∃ lij, lkl ∈ L ,  lkl ≠ lij ,  lkl ∈ s   (2.47)

With α ∈ (0, 1) being the desired convergence sensitivity.
This means that the pheromone in every link contained in the solution (or solutions) has to be
larger than the pheromone in the rest of the graph by a factor of 1/α. Depending on the problem
size and complexity this α could be adjusted as a formality, but given the structure of the
system a good indication would be α ≤ 0.05. In practice, this means that the agents have at
most a 5% probability of deviating from the solution they have converged to.
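As an illustration, this criterion is straightforward to check on a pheromone map. The sketch below is a minimal Python interpretation of (2.47); the dictionary-based graph representation and the function name are ours, not part of the thesis implementation:

```python
def has_converged(X, solution, alpha):
    """Check Definition 2.8: every link in the solution set must carry more
    than 1/alpha times the pheromone of every link outside it (eq. 2.47)."""
    non_solution = [x for link, x in X.items() if link not in solution]
    if not non_solution:
        return True
    # worst (smallest) solution link vs. best (largest) outside link
    return min(X[link] for link in solution) > max(non_solution) / alpha
```

With α = 0.05 this requires every solution link to hold at least twenty times the pheromone of any other link.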
This graph convergence can be related to the entropy convergence concept presented in previous
sections. Strictly speaking, it cannot be proven that entropy convergence in time implies
graph convergence in time (i.e. that a pheromone graph remains constant in value and
distribution over all links whenever the entropy is constant), because theoretically possible
situations exist where this condition is not met. It can, however, be shown that entropy
convergence is related to graph convergence for a set of cases that are useful for this problem.
Definition 2.9. We say the entropy of a graph H(X) has converged to a value H∞ at t = tf if:

(1/∆t) ∫ from tf−∆t to tf of H(X(t)) dt = H∞ , |H(X(t)) − H∞| < ε ∀t ∈ [tf − ∆t, tf]   (2.48)

Where ∆t is a desired time window (related to the length of the simulation) and ε is the
desired error threshold.
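On a sampled entropy trace, a discrete version of this definition (replacing the integral with the mean of the samples in the window) could be checked as follows; the function name and the sampling assumption are ours:

```python
def entropy_converged(trace, window, eps):
    """Discrete check of Definition 2.9: average the last `window` samples of
    the entropy trace and require every sample to stay within eps of it.
    Returns the converged value H_inf, or None if not converged."""
    if len(trace) < window:
        return None
    tail = trace[-window:]
    h_inf = sum(tail) / window
    return h_inf if all(abs(h - h_inf) < eps for h in tail) else None
```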
2.8 Entropy-Based Event Triggered Proposal
In this section, different practical proposals to turn the algorithm into an event-triggered
controller are presented, to be further analysed in Chapter 3. The event triggering will be
based on the entropy levels of the graph as the algorithm evolves. Considering the properties
shown in section 2.6, the entropy levels will be used to determine how close the algorithm is
to convergence, and to set the triggering conditions. The different methods will be tested on
the basic case presented before.
We can then define a trigger function Hlim to be applied as threshold for the event triggered
routine.
Definition 2.10. The entropy limit function Hlim to be used as a trigger threshold for the event
triggered control method is defined as:
Hlim(t) = Hmax − z·t , ∀t ∈ (0, Hmax/z)   (2.49)

Or, in its normalised form:

hlim(t) = 1 − z·t , ∀t ∈ (0, 1/z)   (2.50)
With this, we can set the threshold such that if the system’s entropy H(X(t)) > Hlim(t), the
agents switch between control modes following the desired event triggered proposal.
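The threshold itself is trivial to implement; a small sketch of (2.49) and the resulting mode check (the function names are illustrative, not from the thesis):

```python
def h_lim(t, z, h_max):
    """Linear entropy threshold of eq. (2.49), valid for t in (0, h_max / z)."""
    return h_max - z * t

def full_marking(entropy, t, z, h_max):
    """Event trigger: entropy above the limit keeps the normal control mode."""
    return entropy > h_lim(t, z, h_max)
```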
2.8.1 Proposal A: Entropy-Based Marking Frequency Shift
This first method is based on switching between marking frequencies at a certain entropy
threshold. The basic idea is to set an entropy boundary function that triggers the frequency
switch between the normal mode (all ants mark at all times) and the reduced marking mode
(ants mark only a fraction f of the time). This method can be written as a condition on the
marking process:

P(∆τijk = 0 | rk(t+1) = j, rk(t) = i) = { 0 if H(X(t)) > Hlim(t) ; 1 − f otherwise } , f ∈ (0, 1)   (2.51)

In this case f is the desired marking frequency for the system when it is below the entropy
threshold. This means that for a given agent k, the probability of the marking being set to
zero is non-zero only if the entropy is below the function Hlim. In practice, this means the
algorithm will mark less frequently when the entropy levels are below the desired limit.
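A minimal sketch of this marking rule, interpreting (2.51) per agent and per step (the helper name and the RNG handling are our own choices):

```python
import random

def deposits_pheromone(entropy, h_lim_t, f, rng):
    """Proposal A (eq. 2.51): above the entropy limit an agent always marks;
    below it, it marks only with frequency f, so P(no marking) = 1 - f."""
    if entropy > h_lim_t:
        return True
    return rng.random() < f
```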
2.8.2 Proposal B: Entropy Based Pheromone Intensity Shift
The second proposal shifts the amount of pheromone ∆τ deposited, following the same rules as
previously presented: if the entropy is under the desired value, the agents mark less
intensely, simply by applying a proportional factor p ∈ (0, 1). This can be described as follows:

∆τijk(t) = p·∆τij ⟺ H(X(t)) < Hlim(t)   (2.52)

Hence, if the entropy is below the entropy limit, the agents mark less intensely.
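The corresponding deposit rule can be sketched in the same style (again with illustrative names):

```python
def pheromone_deposit(entropy, h_lim_t, delta_tau, p):
    """Proposal B (eq. 2.52): below the entropy limit the deposit is scaled
    down by the factor p; above it the agent marks with full intensity."""
    return p * delta_tau if entropy < h_lim_t else delta_tau
```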
2.9 Decentralized Entropy Trigger
When using entropy as an event-triggered scheduling function in this problem, the main
drawback is that it requires a centralized control system. The agents cannot measure the
entropy of the whole graph by themselves, so the method has to rely on some kind of central
controller to switch between marking modes.
The next step in this line of thought is then to decentralize the entropy triggering mechanism,
so agents can operate independently (i.e. each agent decides for itself when to switch
marking frequency).
2.9.1 Surrounding Entropy Estimation
One way to decentralize this triggering system is to have every agent estimate the entropy of
the graph using the pheromones in its immediate surroundings. The agents can measure the
pheromones in the links of their close neighbourhood (the same values used for their
stochastic walk). Hence, the agents can calculate the entropy of their surrounding links, as
if these formed an independent graph. Consider the possible surrounding
(a) Inner Domain (b) Wall (c) Corner
Figure 2.3: Agent Neighbourhood
configurations for an agent in the graph presented in figure 2.3. Every link has a certain
pheromone value τi. Starting with the inner domain (a), and following the entropy expression
2.27 presented in Chapter 2, it is easy to calculate the entropy for the neighbour set of links N as:

H(N) = −[(τ1/T)·log(τ1/T) + (τ2/T)·log(τ2/T) + (τ3/T)·log(τ3/T) + (τ4/T)·log(τ4/T)]   (2.53)
Where, same as in previous expressions, T is the sum of the pheromones in the considered links.
Now, let us evaluate this value in the minimum and maximum entropy situations. As defined
previously, the maximum entropy is found for an equal amount of pheromones in every link.
We will consider the minimum entropy in this case to be the one where two of the links have
τi = 0 and the other two have the same amount. The reason is that this is the configuration
to be expected in a graph that has converged to a unique solution: the ant is always in a
neighbourhood of two strongly marked links and two links marked with nearly zero (see figure
3.1 for a converged graph). Hence, the minimum and maximum values are:
Hmax(N ) = log(4) = 2
Hmin(N ) = log(2) = 1
The lower limit can in this case be crossed: if an agent steps into a nearly-zero set of links,
the link it last walked will have a slightly higher pheromone value, which can result in a
value H ≤ 1. But it does represent the minimum to be expected in a graph that has converged
to only one solution. Now, considering the cases (b) and (c) in figure 2.3, one or two links
will be virtually added to the entropy calculation, and we will assign them a value τ* so that:
τ ∗ = max{τi}
That is, to compute the entropy in a case where the amount of surrounding links is fewer than
four, we assign to the virtual links a pheromone value equal to the maximum value found
around the agent. This is done since entropy results depend on the size of the graph, and
taking max{τi} as the assigned value yields a more conservative entropy estimate. This will
cause an over-estimation of the entropy in certain cases, but it also preserves the results
for a converged solution. This method will then be tested in Chapter 3.
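Putting the section together, the local estimate with virtual-link padding could be sketched as follows, using base-2 logarithms so that the inner-domain bounds log(4) = 2 and log(2) = 1 are reproduced (the minus sign follows the standard Shannon form; the function name is ours):

```python
import math

def local_entropy(taus):
    """Surrounding-entropy estimate of section 2.9.1 (base-2 logs, so the
    inner-domain maximum is log2(4) = 2). Wall and corner cases (2 or 3
    real links) are padded with virtual links carrying tau* = max(taus)."""
    taus = list(taus) + [max(taus)] * (4 - len(taus))
    total = sum(taus)
    return -sum(t / total * math.log2(t / total) for t in taus if t > 0)
```

For a uniform inner neighbourhood this returns 2, and for a converged one (two equal links, two empty) it returns 1, matching the bounds above.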
2.10 Summary of Concepts
After introducing and expanding the background and necessary elements to understand and
analyze the problem, we present here the key points for the foundations of this work.
• First of all we presented the main structure of the algorithm to be studied, as well as the
background concepts of event-triggered control and stochastic theory necessary to fully
understand the analysis to be performed.
• In sections 2.4 and 2.5, the logical description of the algorithm and its expression as a
dynamic system was developed, to gain a better understanding of the behaviour of the
problem.
• To study this problem, the algorithm and its evolution, some metric concepts were presented.
The most relevant one, given the structure of the problem, is the application of Shannon
entropy (slightly modified from the original concept). This idea was fully defined and
explored, including how it relates to the weighted graph of this particular problem. Its
relation to the convergence of the problem and the resulting implications were also
presented.
• Finally, the convergence of the graph to a certain solution was (partially) proven using
entropy as a random variable. Other metric parameters related to the convergence of the
graph and the efficiency of the solution-generating behavior were presented and reasoned,
setting the base for the studies and analysis to be performed. The proposals for event-
triggered methods were also presented, to be further analyzed in the experiments.
Chapter 3
Experimental Analysis and Results
In this chapter, more detailed information on the implementation of the algorithm is laid out,
as well as the simulation, performance and different case results for the algorithm.
The work will be focused on the following aspects:
• Describe the implementation of the algorithm, and present a baseline case to better
understand its behaviour, analyzing it in terms of the metrics presented previously.
• Study the convergence of the algorithm in terms of the entropy results.
• Suggest and study methods to implement an event-triggered routine into the algorithm.
• Study the impact this event-triggered method has on convergence and resource efficiency
(pheromones), and discuss its implementation.
3.1 Algorithm Implementation
According to the properties described previously, the algorithm runs as follows:
for time = 1, 2 ... end {
    for ant = 1, 2 ... N {
        ant checks for surrounding pheromones
        if ant has found goal ⇒ Xij = Xij
        compute new ant position following (2.8)
        rn(t+1) = new ant position
        Bn = ∆τ for the link connecting rn(t) and rn(t+1)
    }
    ⇒ global update of X following 2.7
}
This process is repeated for a set amount of time steps tmax. Hence, the shortest paths
connecting the nest with the goal node will slowly develop a higher pheromone amount compared
to the rest of the graph (since they take less time to complete). This attracts more agents
towards those paths, and finally the behavior of the agents converges to an optimal solution.
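A minimal runnable sketch of this loop, assuming an undirected graph given as an adjacency dictionary and replacing the directional kernel of (2.8) with a plain pheromone-proportional choice (the goal-reached bookkeeping is omitted for brevity, and all names here are illustrative, not from the thesis implementation):

```python
import random

def run_colony(adj, nest, n_ants=10, rho=0.02, steps=500, seed=1):
    """Walk / mark / evaporate loop: each step, every ant moves to a neighbour
    chosen proportionally to the surrounding pheromones and records its
    deposit; the whole graph X is then updated globally (evaporation plus
    the deposits of this step)."""
    rng = random.Random(seed)
    tau = {tuple(sorted((i, j))): 1.0 for i in adj for j in adj[i]}
    # deposit sized so the global pheromone sum stays constant (delta = rho*T0/n)
    delta_tau = rho * sum(tau.values()) / n_ants
    positions = [nest] * n_ants
    for _ in range(steps):
        deposits = {}
        for k in range(n_ants):
            i = positions[k]
            neighbours = adj[i]
            weights = [tau[tuple(sorted((i, j)))] for j in neighbours]
            j = rng.choices(neighbours, weights=weights)[0]
            link = tuple(sorted((i, j)))
            deposits[link] = deposits.get(link, 0.0) + delta_tau
            positions[k] = j
        for link in tau:  # global pheromone update
            tau[link] = (1.0 - rho) * tau[link] + deposits.get(link, 0.0)
    return tau
```

On a small cycle graph this reproduces the constant-global-sum property of the update rule.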
3.2 Other Metric Parameters
It might be useful to define other parameters to better understand the performance of the
algorithm. One of these parameters can be related to how efficiently the agents walk from
start to target and back: for example, a parameter counting the amount of "perfect cycles" the
algorithm has performed. A perfect cycle is, in this case, a full loop starting from the initial
node, reaching the goal and returning to the initial node in the optimal amount of steps.
Considering these concepts, a possible expression for this kind of parameter would be:

c = (#cycles × cycle length*)/(nants × Time)   (3.1)

Note that this parameter requires prior knowledge of the length of the optimal solution.
Hence, it can only be used to analyze and post-process results, not to intervene in the algorithm
behaviour. The algorithm keeps count of the amount of cycles or loops the agents perform.
Multiplying the amount of cycles by the optimal time for a cycle gives an indication of how
much time the algorithm would have run in perfect optimal coordination. Dividing this by the
amount of ants and the total time gives a parameter c ∈ [0, 1], with c = 0 for the case where
not a single full cycle has been completed, and c = 1 where the algorithm has walked optimal
solutions for the entire time.
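As a sketch, the parameter is a one-line computation once the cycle count is tracked (the function name is ours):

```python
def cycle_parameter(n_cycles, optimal_cycle_length, n_ants, total_time):
    """Eq. (3.1): fraction of the total agent-time spent completing perfect
    (optimal-length) nest-goal-nest cycles; ranges over [0, 1]."""
    return n_cycles * optimal_cycle_length / (n_ants * total_time)
```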
The algorithm goes through an exploratory phase in the beginning, while the agents find the goal
node and the pheromones in the graph redistribute. This means the value of the parameter
is expected to be well under 1; we cannot expect agents to walk optimal solutions from the
start. It also means that the value of the parameter depends on how long we run the simulation
for: a longer run will probably yield higher values, since it extends the solution construction
phase for the agents. Nevertheless, it is useful to compare simulations with the same running
time, since in that case the difference in parameter values will be strictly
due to a bigger amount of cycles (or shorter cycles) completed, and hence a bigger efficiency in
solution construction.
3.3 Convergence and Entropy Results
Now we present different cases to be solved by the algorithm. We will compare the performance
of different versions of the algorithm on different base cases in order to analyze the system,
both in terms of the agents' behaviour and the entropy results.
3.3.1 One and multiple optimal solutions
First we will study the differences between a case with only one optimal solution in the graph
and a case that has several solutions of the same length. We first define the set of standard
parameter values presented in table 3.1.

nants | G Size | ρ    | ∆τ    | y(0)
10    | 8×8    | 0.02 | ρT0/n | 1

Table 3.1: Standard parameters for simulations

Here, as presented in previous sections, the pheromone increase is such that it keeps the
global sum over the graph constant. Hence, in this case T0 = |L| = 112, y(0) = 1 and the
pheromone increase is ∆τ = 0.224. First, we present in figure 3.1 the results in pheromone
for a simulation where
the optimal solution is a straight line in the graph, hence there is one unique solution. It can
be seen how, by iteration 200, the agents have mostly converged to the optimal solution.
Also, we can now use the τmax expression found in 2.20 to compare the maximum values found
in the simulation, τ′max, with the theoretical values. As shown, the maximum pheromone values
found in the graph rapidly approach the calculated theoretical maximum as the iterations
increase. We can now present other metric parameters, such as the average entropy shown in
expression 2.29 and the cycle parameter related to the amount of optimal solutions walked
(expression 3.1).
τmax | τ′max(50) | τ′max(100) | τ′max(200) | τ′max(500)
16   | 2.25      | 3.14       | 8.49       | 15.69

Table 3.2: Maximum Pheromone Values
And finally, the entropy along the whole simulation is shown in figure 3.2. In this case it can
be seen very clearly how the entropy converges to a minimum value, which
(a) t=50s (b) t=100s
(c) t=200s (d) t=500s
Figure 3.1: Pheromone Graph evolution
corresponds to the graph only having the optimal solution marked. We can compare this to
the minimum solution entropy as defined in expression (2.34), shown in table 3.4.
Hence, the entropy when the simulation finishes is in fact very close to the optimal solution
entropy value. This could of course fluctuate, especially at the early stages of the optimal
solution construction, since agents are not uniformly distributed over the solution and there
can be disparities between the existing pheromone values.
We can now compare the algorithm performance for a different solution case. Using the same
parameters, the initial and goal nodes are now placed in opposite corners of the graph. For a
better comparison, we change the amount of agents proportionally to the optimal solution
length. This gives the same homogeneity to the distribution of pheromones over the final
solution: in frequency, the same number of agents cross the optimal links. The parameters are
then presented in table 3.5.
This causes two main differences with simulation 1:
H    | c
3.62 | 0.501

Table 3.3: Parameter results for simulation 1
Figure 3.2: Entropy Results for simulation 1
• The length of the optimal solution is now more than double the size of the one in simu-
lation 1 (16 links). This will affect the maximum amount of pheromones to be found in
the optimal trail.
• In this case, given that the graph is a Cartesian set of nodes with only two possible
directions (vertical and horizontal), the amount of optimal solutions is much larger than one
(there are many paths connecting the corners in 16 steps).
The results for this simulation are presented in figure 3.3. Some conclusions can now be drawn
about the behaviour of the algorithm. First of all, as would be expected, longer solutions
take longer to converge. Also, combined with how the directional kernel is applied, the
longer the solution the bigger the margin for error, especially around the middle point of the
solution. It can be seen how in the center of the graph there are several links being walked.
Interestingly, all those links are contained in optimal solutions (since for this case, the entire
graph is full of optimal solution links). Also, given the kind of kernel applied (based on the
angular orientation between goal and ”nest”), the agents converge to the set of solutions closer
to the straight line connection between nodes. This is entirely caused by the kernel applied. A
H(X∗s) | H(X′s)
2.8074 | 2.8072

Table 3.4: Entropy limit values for simulation 1
nants | G Size | ρ    | ∆τ    | y(0)
22    | 8×8    | 0.02 | ρT0/n | 1

Table 3.5: Parameters for simulation 2
stronger kernel would narrow the solutions close to the diagonal even further.
Regarding the parameters presented in simulation 1, the results for simulation 2 are shown in
table 3.6.
τmax | τ′max(50) | τ′max(200) | τ′max(500) | τ′max(1000)
7    | 2.17      | 2.72       | 3.96       | 4.63

Table 3.6: Maximum Pheromone Values 2
These results show how the convergence for this case is much harder to analyse. Not only is
the solution longer (more than double the size), but there are also several optimal solutions
within the set. As can be seen in figure 3.3, the agents have not converged to a unique
solution, but instead to a set of optimal solutions around the diagonal of the graph. This
causes the entropy values to be much higher than those expected in the optimal case.
To see this in further detail we can plot the entropy extending the simulation to t = 3000s.
It can be seen in figure 3.4 how the entropy kept decreasing after t = 1500s. The graph had
clearly not converged, and even at t = 3000s we cannot be sure that it has converged to the
current level.
Considering the rest of the results, the cycle parameter value indicates that the agents have
completed fewer solution loops compared to the previous simulation. In any case, the number
indicates that the agents have at some point started to walk efficient solutions, probably at
later times of the simulation (it can be seen that until t = 500s the graph had not really
converged to the expected result).
And finally, comparing the entropy level at t = 1500s in figure 3.4 to the theoretical minimum
for an optimal solution indicates that the graph is far from that configuration. This makes
sense when considering the pheromone plots: even at later times we do not find a single
solution marked, but instead a set of solutions around the diagonal connecting the goal nodes.
This clearly has an impact on the entropy, causing a much higher value than the one expected
in a single-solution convergence.
(a) t=50s (b) t=200s
(c) t=500s (d) t=1000s
Figure 3.3: Pheromone Graph evolution
H    | c
6.13 | 0.216

Table 3.7: Parameter results for simulation 2
H(X∗s) | H(X′s(1500s))
4      | 5.73

Table 3.8: Entropy limit values for simulation 2
3.3.2 Parametric convergence analysis
We will now set up an experimental framework to study the influence of the parameters on the
convergence and behaviour of the algorithm. First, we will consider the algorithm characteristics
presented in table 3.9 to be fixed. The graph will then be similar to the one presented in figure
3.1. Then, the parameters to study are ρ and nants. We will do so by studying the impact
of both independently, running the algorithm 10 times for each combination of values and
averaging results. Also, to facilitate the task of comparing results, from now on the normalized
entropy described in 2.35 will be used as the standard metric, instead of the global entropy.
G Size |S∗| ∆τ y(0)8× 8 7 ρT0/n 1
Table 3.9: Parameters for Convergence Experiments
First, taking a standard value for nants = 15, the results of the study for the different values
of ρ are presented in table 3.10. The results are presented in terms of the average normalized
entropy (10 runs) of the simulation h, the normalized entropy after 500 time steps h(500), the
normalized entropy at the end (1500s) of the simulation hf , the cycle performance parameter c
and the convergence rate. The convergence rate simply measures the fraction of runs that
converged to an optimal solution following the convergence criteria presented in chapter 2.
ρ     | h     | h(500) | hf    | c     | Converg. (%)
0.005 | 0.556 | 0.678  | 0.229 | 0.365 | 20
0.01  | 0.354 | 0.409  | 0.044 | 0.411 | 100
0.02  | 0.208 | 0.170  | 0.001 | 0.438 | 100
0.03  | 0.171 | 0.136  | 0.016 | 0.442 | 90
0.04  | 0.161 | 0.108  | 0.016 | 0.440 | 80
0.05  | 0.134 | 0.048  | 0.016 | 0.445 | 90
0.06  | 0.178 | 0.099  | 0.092 | 0.403 | 30
0.07  | 0.206 | 0.122  | 0.128 | 0.386 | 30
0.08  | 0.359 | 0.318  | 0.281 | 0.228 | 30
0.09  | 0.280 | 0.228  | 0.216 | 0.275 | 20
0.10  | 0.245 | 0.199  | 0.192 | 0.291 | 30

Table 3.10: Experiment Results for nants = 15
A lot of interesting information can be extracted from this data. First, analyzing the
convergence values, it can be seen that for ρ ∈ [0.01, 0.05] the algorithm converges with
high probability. After this threshold, the convergence rates drop dramatically, probably
because for higher evaporation rates agents can get trapped in unwanted regions of the graph,
increasing entropy and preventing convergence. Following the same logic, for
Figure 3.5: Entropy Results for variable evaporation
lower evaporation rates the quality of the obtained results decreases significantly. For really
low evaporation rates the agents are more likely to find the goal node and explore different
solutions, but convergence is heavily slowed down, causing agents to walk off the optimal
solution path more often.
To compare the entropy results, they are plotted in figure 3.5. It can be seen how the final
entropy values are very close to each other for the interval ρ ∈ [0.01, 0.05], and, recalling
the results in table 3.4, that the values lie around the lower entropy boundary. This indicates that in
entropy and the entropy at 500s can be used to analyze the speed of convergence. It can be
seen that for higher evaporation rates the graph converges faster, but after ρ = 0.05 it starts
to diverge from the optimal values. We can repeat this analysis for a variable number of ants.
Taking as a standard value ρ = 0.03, we obtain the following results presented in table 3.11.
It is interesting to see how in this case there does not seem to be a clear correlation between
the number of ants and the quality of the solutions generated. This can be explained by the
size of the problem: the grid may be small enough that having five agents already ensures
convergence. In fact, the most unstable result in terms of convergence happens for a low
amount (6) of agents. It does seem, however, that the simulations have overall better
convergence and lower entropy results for an intermediate amount of agents (10 to 13). Even
so, when the amount of agents is increased we do not observe a significant decrease in
performance.
nants | h     | h(500) | hf    | c     | Converg. (%)
5     | 0.206 | 0.141  | 0.065 | 0.435 | 90
6     | 0.154 | 0.074  | 0.025 | 0.410 | 70
7     | 0.219 | 0.170  | 0.041 | 0.438 | 100
8     | 0.220 | 0.208  | 0.055 | 0.428 | 90
9     | 0.207 | 0.154  | 0.060 | 0.434 | 100
10    | 0.167 | 0.105  | 0.012 | 0.439 | 80
11    | 0.158 | 0.098  | 0.000 | 0.439 | 100
12    | 0.185 | 0.110  | 0.028 | 0.441 | 100
13    | 0.182 | 0.105  | 0.026 | 0.443 | 100
14    | 0.167 | 0.072  | 0.008 | 0.432 | 90
15    | 0.193 | 0.126  | 0.029 | 0.432 | 90
16    | 0.191 | 0.132  | 0.045 | 0.423 | 90
17    | 0.163 | 0.096  | 0.007 | 0.438 | 90
18    | 0.176 | 0.089  | 0.024 | 0.431 | 100
19    | 0.151 | 0.059  | 0.005 | 0.442 | 100
20    | 0.165 | 0.085  | 0.000 | 0.440 | 90

Table 3.11: Experiment Results for ρ = 0.03
Figure 3.6: Entropy Results for variable number of agents
3.3.3 Entropy Convergence
Now it is interesting to see the parametric influence on the convergence of the entropy. Figures
3.7, 3.8 and 3.9 show results for a spectrum of parameter combinations, with ρ ∈ (0, 1) and
nants ∈ [5, 65]. The upper limit to the ant number is chosen since it is approximately the
amount of nodes in the graph (one ant per node). It can be seen how the entropy converges for almost
Figure 3.7: Convergence in Entropy
all the cases. Convergence is in this case defined with a 1% upper and lower boundary. Even in
the cases with lower convergence statistics, this is caused by the entropy ending up
oscillating around an average value; for low numbers of ants these oscillations can be larger
than 1%. This means that, regardless of the solution generation or the accuracy of the
algorithm behaviour, the entropy seems to converge (on average) in all cases. This empirically
validates the theorems in chapter 2, where we stated the convergence in entropy for a
weighted graph following this behaviour.
The optimal solution convergence graph (figure 3.8) shows that the algorithm clearly works for
low evaporation (as expected), with a threshold around 10% beyond which the performance
decreases dramatically. This performance is of course measured with respect to the unique
optimal solution existing in this particular problem. It can also be related to the size of the
domain: the bigger the domain, the lower the evaporation needed. The algorithm also reaches
better results for a higher number of ants rather than a lower one. This conclusion was to be
expected: the algorithm performs better and more efficiently for low evaporation and a high
number of
Figure 3.8: Convergence to Optimal Solution
agents. These two characteristics ensure a smoother transition between the maximum entropy
state and the optimal solution coordinated behaviour.
The entropy at 1500s shows that for low evaporation (higher than 0, lower than 0.15), the
entropy ends at a relatively low value, in fact close to the optimal solution entropy value.
After that it increases, matching the results for optimal solution convergence. It can be seen
that for a low number of ants and high evaporation the entropy values are also relatively low;
this has to do with ants accumulating in a few links when most of the graph has entirely
evaporated. But as can be seen in figure 3.8, for these cases we do not obtain an optimal
solution configuration.
We can then conclude that the optimal solution construction is strongly related to the
evaporation rate relative to the size of the graph. At some point, increasing evaporation
affects the agents' ability to trace their paths back to the starting node, and the performance
of the algorithm drops. At the same time, the algorithm seems more stable for higher numbers
of ants, as seen in figures 3.7 and 3.8. This is related as well to the way the pheromone
increase is designed: the fewer agents, the more pheromone each agent adds, and the bigger
the disturbances after each step.
3.4 Convergence Validation Experiments
In this section, we will try to experimentally validate the results obtained in Theorem 2.2
and Lemma 2.2. For this, the value of ai(t)/Ti(t) has been checked in a standard simulation
(conditions presented in table 3.1).
Recalling the results of Lemma 2.2, it is interesting to check whether the resulting condition
holds in the simulations:

ai(t)/Ti(t) > n/T0

For a better understanding of the results, we will use the condition in its normalised form:

γ = ai(t)·T0/(Ti(t)·n) > 1   (3.2)
First, figure 3.10 shows a scatter plot of the parameter values for a 1500s simulation. In this
case the simulation converged to an optimal solution quite early (as can also be seen from the
oscillation of the γ values). The plot only includes the nodes and times where ai(t) ≠ 0.
Figure 3.10: Time plot of γ for all nodes
The first thing to notice is that the condition in (3.2) does not hold at all times in all
nodes. This may have to do with the fact that the links jump between sub- and super-martingale
states periodically (see Theorem 2.2). Figure 3.11 shows the same results zoomed in between
γ = 0 and γ = 5. It can be better seen how some nodes actually have γ values below 1, and others
Figure 3.11: Time plot of γ ∈ (0, 5) for all nodes
over. Finally, we can average these results per time step to see the general trend on the
graph. We can define the average γ as:

γ̄ = (T0/n) · Σi ai(t)/Ti(t) , ∀i : ai(t) ≠ 0   (3.3)
Figure 3.12 shows some interesting results. It can be seen how, when the algorithm is far from
converging, the average value is usually well above 1: the system behaves strongly as a super-
martingale. As the graph starts to converge, some values get close to (or slightly below) 1,
and by the end of the simulation γ̄ oscillates around 1. This is actually a reasonable result,
since for γ = 1 the system is a martingale, and the expected conditional value at t + 1 is
equal to the current value (it remains stable).
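For post-processing, the normalised ratio of (3.2) and its per-time-step average can be computed directly from the node occupancies ai(t) and pheromone sums Ti(t). The sketch below averages the ratios over the occupied nodes, which is the stated intent of the average γ (an assumption on our part, since (3.3) is printed as a sum; the function name is ours):

```python
def gamma_bar(a, T, T0, n):
    """Average of the normalised ratio gamma_i = a_i * T0 / (T_i * n) of
    eq. (3.2) over all nodes currently holding agents (a_i != 0)."""
    ratios = [a_i * T0 / (T_i * n) for a_i, T_i in zip(a, T) if a_i != 0]
    return sum(ratios) / len(ratios)
```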
In fact, when considering the results in Theorem 2.2, we could argue that it is a conservative
approach given the structure of our problem: in our case, τij(t) = τji(t). This means that the
expected conditional value of τij(t) also depends on the agents aj(t) (as shown in Theorem
2.3). Hence, we are not considering a part of the expression that would yield an even bigger
expected conditional value.
After these results we can at least confirm that Lemma 2.2 and Theorem 2.2 about the system's
convergence are experimentally validated; finding a formal proof remains future work.
3.5 Entropy-Based Event Triggered Control
Following the methods presented in Section 2.8, here we present the results of applying both
proposals.
3.5.1 Proposal A: Results
Now, using the same conditions as in 3.1, the parameter values set for the simulations of this
method are presented in table 3.12.
nants | G Size | ρ1   | ρ2   | ∆τ    | y(0)
10    | 8×8    | 0.02 | 0.02 | ρT0/n | 1

Table 3.12: Parameters for Proposal A
In this case we can compute the entropy of the pheromone graph, and choose a simple entropy
boundary for the triggering condition. This is shown in figure 3.13. The dashed line represents
Figure 3.13: Entropy Computation and Entropy Limit Example
the entropy boundary; when the entropy goes above the line the algorithm will start to mark
with frequency 1, and with frequency f when it’s below. Considering these results, we can now
implement an event triggered control method for this case, choosing a value for f in expression
2.51. For this simulation, we choose f = 0.1. We run the algorithm five times for each case
and average the entropy results for better visualization. The results are then shown in
figure 3.14.
As can be seen, the entropy follows the limit function up to a certain point. This is to be
expected since, recalling the results in 2.6, the entropy remains constant when not marking.
Then, for a low value of f such as the one chosen, the system is expected to oscillate around
the chosen limit function, increasing or decreasing the marking frequency as it crosses from
one side to the other. It is interesting to see that at around t = 500s the entropy cannot
decrease any faster even with continuous marking, so given the limit function chosen, after
that point the algorithm marks continuously.
Figure 3.14: Entropy for Event Triggered and Continuous Marking
We can also compare both cases using other metrics. Using the parameter (3.1) described in
section 3.2, we get the following performance:
cevent | ccont
0.4617 | 0.4908

Table 3.13: Cycle Parameter Values
As expected, the continuous marking performs slightly better (around 6% more efficient in the
amount of solutions). We can also compare the amount of pheromones the agents have spread
around the graph on average each step, to see how the event-triggered system has
52 Chapter 3. Experimental Analysis and Results
∆τ event ∆τ cont0.053 0.13
Table 3.14: Total amount of Pheromones added
influenced in the agent interaction with the environment. Hence, on average, the algorithm has
used around 60% less pheromone marking. We can also see the behaviour if, instead of a linear
limit, we apply a quadratic triggering function, as shown in figure 3.15.
Figure 3.15: Quadratic Event Trigger
3.5.2 Proposal B: Results
For this method, the simulation parameters are the same as presented in table 3.12. We would
expect this method to be more stable and more flexible to different combinations of parameters:
unlike in proposal A, agents never stop marking entirely, they just mark with less intensity.
This causes smaller and fewer jumps in the pheromone values, and a smoother evolution.
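Proposal B's intensity switch can be sketched in the same spirit (again an illustrative sketch; the function name is an assumption, and p = 0.1 is the pheromone reduction parameter used later in the simulations):

```python
def pheromone_increment(entropy: float, entropy_limit: float,
                        delta_tau: float, p: float = 0.1) -> float:
    """Proposal B: full deposit above the entropy limit, reduced deposit below it.

    Agents never stop marking, only scale the deposit down by p, so the
    pheromone graph evolves smoothly instead of in steps.
    """
    return delta_tau if entropy > entropy_limit else p * delta_tau

# Above the limit the full ∆τ is deposited; below it only a fraction p.
assert pheromone_increment(0.8, entropy_limit=0.5, delta_tau=1.0) == 1.0
assert pheromone_increment(0.3, entropy_limit=0.5, delta_tau=1.0) == 0.1
```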
The results for two different slopes in the entropy boundary (linear) are presented in figures
3.16 and 3.18.
Figure 3.16: Results for z=1/1000
It is interesting to see how in this case the solutions behave more stably compared to
proposal A. It can also be seen that, for the case with a 1/1000 trigger slope, the
convergence speed at some point surpasses that of the standard simulation. This can be
explained by the fact that in the early stages of the simulation we force a slower evolution
of the pheromone graph, enabling the agents to explore further and faster, which pays off
after a certain point, converging quicker.
The main difference in behaviour between proposals A and B is the fact that entropy remains
constant when no pheromones are added to the graph. This means that, in the case of proposal
A, the evolution of the graph is much more abrupt; every time the trigger function is crossed,
there are several agents that are not adding any pheromones to the environment, hence main-
taining the entropy constant. As soon as the marking is turned on again, this causes a step
decrease in the entropy. Hence, given the improvements in response and stability of the
behaviour, we will use proposal B to proceed with further studies.

Figure 3.17: Results for z=1/800 trigger slope

Figure 3.18: Entropy for Intensity Trigger
3.6 Decentralized Entropy Trigger: Results
After the results in the previous section, proposal B is implemented with a decentralized scheme
based on the concepts in Section 2.9.
3.6.1 Entropy Estimation: Examples
In this case, we will apply the normalized entropy concept described in 2.35. The reason to
use a ratio as an indicator instead of the raw entropy value is that we want to estimate the
entire graph entropy from the local entropy value; since the local and global entropies differ
simply because the two sets differ in size, it is useful to use a ratio that indicates how
close the neighbourhood is to minimum entropy. Computing this parameter for a simulation with
the same conditions as presented in table 3.1, we obtain the results in figure 3.19 for the
global entropy and the agents' entropy-estimation ratios.

Figure 3.19: Global Entropy and Entropy Estimations
It is interesting to see how the entropy estimation of the agents follows a similar trend to
the real entropy of the graph. It can also be seen how (as expected) at some point agents
estimate a lower entropy than the minimum entropy parameter. Nevertheless, it is useful to see
that agents can estimate the trend of the global entropy just by computing the entropy around
them. Most importantly, the agents' estimation ratio goes to zero at the same time as the
global entropy is converging. That makes sense since, when the agents are following a perfectly
marked path, the graph has converged and the entropy goes to the minimum value.
This becomes even clearer if we compute the average entropy estimated between all agents and
compare it to the normalized results for the global entropy. We define the average entropy as:
h_avg = (1/n_ants) ∑_{i=1}^{n_ants} h_i        (3.4)
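These entropy computations (Shannon entropy of the normalized pheromone weights, its ratio to the maximum, and the average of expression 3.4) can be sketched as follows; the function names are illustrative, not the thesis implementation:

```python
import math

def shannon_entropy(weights):
    """Shannon entropy (bits) of pheromone weights, normalized to a distribution."""
    total = sum(weights)
    probs = [w / total for w in weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)

def normalized_entropy(weights):
    """Entropy ratio in [0, 1]: how close the set is to maximum disorder.

    Using a ratio makes the local (neighbourhood) and global values
    comparable despite the two sets having different sizes."""
    return shannon_entropy(weights) / math.log2(len(weights))

def average_estimated_entropy(local_estimates):
    """Expression (3.4): h_avg = (1/n_ants) * sum_i h_i."""
    return sum(local_estimates) / len(local_estimates)

# Uniform weights are maximally disordered; concentrating weight lowers entropy.
assert shannon_entropy([1, 1, 1, 1]) == 2.0
assert normalized_entropy([1, 1, 1, 1]) == 1.0
assert normalized_entropy([8, 1, 1, 1]) < 1.0
assert average_estimated_entropy([1.0, 3.0]) == 2.0
```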
With this, we can plot the average estimated entropy and the real measured entropy in the
graph.
Figure 3.20: Average Estimated Entropy and Real Time Entropy
As can be seen in figure 3.20, the average of the estimated values is extremely similar to the
real computed entropy. This result was in fact to be expected: the more agents we have, the
bigger the area we use for estimating the entropy, and therefore the closer the values get.
It can also be seen that the estimated entropy slightly underestimates the real entropy. The
reason may be that agents tend to move in regions of lower local entropy, and therefore along
strongly marked paths.
This method can in fact spark a more general discussion about complex interconnected systems,
and how their state can be estimated with a limited amount of observations scattered around
the system. In this case, sensing agents are able to estimate the level of chaos in the graph
by using their immediate neighbourhood. It is interesting to understand that in this case,
given the dynamics of the system, this chaos estimation will always yield lower values than the
full system's. The agents concentrate with higher probability in areas with smaller "chaos"
levels. In fact, general disorder or chaos enables agents to move freely throughout the graph,
whereas highly marked paths (low entropy, low chaos) trap agents and prevent them from moving
to other regions.
3.6.2 Decentralized Trigger - Proposal B
Following the line of thought presented in the previous section, we will implement a decentral-
ized entropy trigger to evaluate how the system responds to this method, and study possible
efficiency trade-offs. First, we will use the same reference conditions as in table 3.1. In this
case, for better stability and results, we will use proposal B, shifting the amount of pheromones
added in each step according to the entropy threshold. For these simulations we chose p = 0.1
as the pheromone reduction parameter. The entropy results for this decentralized trigger
method are presented in figures 3.21 and 3.22.

Figure 3.21: Entropy Results for Linear Threshold

As can be seen, in both cases the real measured entropy seems to follow the trend imposed by
the trigger. It is also noticeable that the entropy estimations, although quite unstable,
provide a good overall image of the graph and seem precise enough to implement this trigger.
It must be noted that in this case, a second threshold was implemented at H = 0.1, to evaluate
the capacity of this method to keep the entropy constant if desired. In the case of the linear
trigger, this seems to have worked accurately.

Figure 3.22: Entropy Results for Quadratic Threshold

In following simulations, this lower threshold will
be disabled and the algorithm will be allowed to converge to zero. With this, we will study this
method using the linear threshold, and we will compare parametric results for different slopes
(speeds) in the entropy limit.
3.6.3 Decentralized Trigger Results
To study the impact of the applied method, we ran several simulations for a range of
limit-function slopes. The idea is to study the impact that the slope (and thus the
convergence speed) has on both the pheromone amount used and the convergence time.
First, we apply the standard conditions for the simulations, as seen in table 3.15. We chose
to slightly increase the number of ants compared to previous cases in order to obtain results
as stable as possible. Table 3.16 presents average values for the convergence time and the
total pheromone added, as obtained from a set of 10 simulations without the event-triggered
system.
n_ants   G Size   ρ1     p      ∆τ       y(0)
20       8×8      0.02   0.1    ρT0/n    1

Table 3.15: Parameters for Decentralised Event Triggered Simulations

t_conv    Σ∆τ
872.5     3360

Table 3.16: Results for standard simulations
These values will serve as a reference for comparing the event-triggered results. First,
figure 3.23 presents the graph-entropy response for different linear trigger slopes,
accelerating or decelerating the convergence; the trigger functions are shown as dashed lines.
It can be seen that the entropy turns out slightly higher in all cases than the imposed limit
function. This is because the triggering relies on the decentralized entropy estimations of
all the agents independently, which slightly underestimate the global entropy.
Following these results, table 3.17 shows the convergence time and amount of pheromone saved
compared to the standard results in table 3.16. It can be seen that smaller slopes yield
longer convergence times (as seen in figure 3.23) and also a bigger cut in pheromone use. It
is interesting to remark that the gap in convergence time is relatively small from
z⁻¹ = 1300 to z⁻¹ = 1500, yet the pheromone saving grows by almost 50%. It is hard to justify
this fact, but looking at figure 3.23 we can see that the entropy value is higher in the
second case. This indicates that the algorithm has converged fewer times (or converged to the
wrong solution).
Figure 3.23: Entropy Results for several slopes

z⁻¹     t_conv   τ_saved (%)
600     1045     5.48
700     1209     6.52
800     1208     10.27
900     1397     10.29
1050    1503     13.46
1200    1598     18.85
1300    1801     21.06
1500    1820     31.08

Table 3.17: Results for different slopes

In figure 3.24 we can see the results for the cycle parameter developed in Chapter 2. It must
be noted that in this case the parameter is calculated for a common simulation time of 3000
steps. As can be seen, smaller slopes (and longer convergence times) yield a smaller cycle
parameter; in practice, the slower the algorithm converges, the fewer full cycles it is able
to complete. This result is logical; a slower convergence means
a more spread out pheromone graph, and hence a more chaotic agent behaviour. Finally, figure
3.25 shows a plot of the convergence time versus the amount of pheromones saved, with a
linear interpolation to illustrate the general trend. These results show that it is possible
to establish a trade-off between pheromone usage and convergence time, a trade-off that
appears almost linear over the studied range of parameters. They also show how using the
estimated entropy instead of the real computed entropy affects the response of the graph: the
entropy is underestimated, so the measured results sit slightly above the designed trigger
threshold. Nevertheless, the threshold imposes the overall behaviour trend, slowing down the
convergence and increasing the overall efficiency.
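The near-linear trend of figure 3.25 can be reproduced from the data in table 3.17 with a simple least-squares fit; this is a plain-Python sketch using only the tabulated values:

```python
# Data from table 3.17: convergence time (steps) and pheromone saved (%).
t_conv = [1045, 1209, 1208, 1397, 1503, 1598, 1801, 1820]
tau_saved = [5.48, 6.52, 10.27, 10.29, 13.46, 18.85, 21.06, 31.08]

# Least-squares line illustrating the near-linear trade-off in figure 3.25.
n = len(t_conv)
mean_t = sum(t_conv) / n
mean_s = sum(tau_saved) / n
slope = (sum((t - mean_t) * (s - mean_s) for t, s in zip(t_conv, tau_saved))
         / sum((t - mean_t) ** 2 for t in t_conv))
intercept = mean_s - slope * mean_t
print(f"tau_saved ~ {slope:.4f} * t_conv + {intercept:.2f}")

# Longer convergence times trade for larger pheromone savings.
assert slope > 0
```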
Figure 3.24: Cycle Parameter for Decentralized Simulations
Figure 3.25: Convergence Time and Pheromone Efficiency
Chapter 4
Conclusion
4.1 Summary of Results
After the results and analysis presented in Chapter 3 and the framework described in Chapter
2, several conclusions can be drawn. The work done can be summarized as follows:
• In Chapter 2, the necessary background concepts related to Ant Colony algorithms, event-
triggered control and stochastic theory were presented and related accordingly to the
work.
• The problem framework and principles were described for the kind of biologically inspired
algorithms that were to be used, and how these concepts relate to weighted graphs (in
our case, a pheromone graph).
• The concept of entropy was presented and justified, relating it to the idea of Shannon
entropy. An extensive reasoning was given in Chapter 2 to explain why and how this concept
is useful to our study. We are evaluating a sort of converging (or learning) algorithm based
on emergent behaviour: agents begin to operate in a completely uniform (chaotic) manner, and
slowly converge to a coordinated pattern using very simple instructions and tools (in this
case, the stigmergic interaction with the environment). It is easy to see why the entropy
parameter becomes useful here; entropy as a general concept measures order and disorder in
a system. In particular, we found Shannon entropy to be almost directly applicable to our
problem, so it became the main metric used to analyze the algorithm.
• Reasoned analysis was presented in Chapter 2 to justify the convergence and stability of
this kind of algorithm. Even though we did not obtain a solid proof of this convergence
(Conjecture 2.1) for our particular case, given the results in Theorem 2.2 and Lemma 2.2
(as well as the experimental results in section 3.4) we remain confident that our
modifications to the algorithm did not alter the main convergence properties established
for other Ant Colony algorithms.
• Having laid out the background and secondary concepts, in Chapter 3 we presented the
particular description and implementation of our algorithm, together with a set of test
results, to further understand how the algorithm behaves, to demonstrate how the concept of
entropy describes the coordination and convergence of the system, and to show how different
configurations and solutions impact the metrics used.
• Considering these foundations, we proceeded to implement an event-triggered scheme to
the algorithm. Two methods were tested, both using entropy as a defining function of the
state of the system: first a triggered switch between marking frequencies, and second a
triggered switch between marking intensities. The second option was found to be the more
stable and the one that yields better results. This is to be expected considering the global
evolution of the algorithm and the fact that, in general, smoothness in the graph evolution
leads to better solutions.
• Finally, this method was implemented and analyzed using the metric parameters pre-
sented in previous chapters, and one important modification was introduced. In event-
triggered control it is interesting to decentralize the scheduling of control tasks. In our
particular case this means letting each agent "decide" when to trigger the control task
without having to rely on a centralized state computation (entropy) and a centralized
command. It can be seen how in the case of autonomous agents, decentralized schemes
become even more interesting. For this, the concept of estimated local entropy was in-
troduced, letting each agent extrapolate the global entropy in the graph using only its
neighbourhood links. This turned out to yield better results than expected, and the
amount of resources saved was finally presented, relating it to convergence time. As ex-
pected in a system of these characteristics, a longer convergence time leads to a bigger
saving in resource usage.
4.2 Applications
There are different levels of possible application of this work, from the metric parameters
developed to the understanding of the algorithm itself. These can be summarized in the
following points:
• First, the idea of using entropy (in this case Shannon entropy) to analyze learning or
emergent-behaviour algorithms. This becomes particularly useful when evaluating autonomous
agent systems that evolve from a chaotic state to a coordinated and structured pattern.
Some particular criteria must be met in order to apply this concept, but we believe it
could be applied to other autonomous agent applications and could provide interesting
insight into the characteristics of such systems.
• Another application, further down the line, would be to use these modified Ant Colony
algorithms as control schemes for large autonomous agent systems. Work has already
been done in this direction (see [6], [7]), and it would be interesting to extend it to more
complex behaviour and more complex systems (swarms of drones being a natural step).
• In this same line of thought, our results implementing decentralized event-triggered
schemes could also have applications in swarm robotics systems. As seen in the results
presented in Chapter 3, our proposal directly relates convergence time to resource
optimization. When implementing these algorithms in larger swarms of smaller agents,
resource optimization will become a key performance aspect (here, resource optimization
can refer to the amount of data exchanged, network load, battery usage, autonomy
loss...).
• Finally, on a more conceptual level, this work could have direct applications to the
coordination of automated transport systems. It could, for example, be the foundation for a
traffic coordination system in cities, calculating and modifying routes while also
controlling each vehicle independently as an agent.
4.3 Future Work
There are some important aspects that could be the subject of further and deeper work. First,
it would be important to develop a solid formal proof of the convergence of these algorithms.
As mentioned before, there are proofs that certain kinds of Ant Colony algorithms converge,
and converge to an optimal solution, but it would be interesting to find a similar proof for
our modified version. The reasoning presented in Chapter 2 is just an indication of the
likelihood of convergence and cannot be considered a solid proof.
Then, following the previous conclusions on possible applications, it would be interesting to
start implementing this version on physical autonomous agent systems, especially incorporating
the event-triggered schemes, and to see how they impact the behaviour of a robotic swarm. This
would be the first step towards further applications in this area. This line of work was
initially planned for this Master Thesis, but given the limited time available it was not
possible to start working on it.
Finally, regarding the algorithm itself, it could be useful to start applying modifications
with a final implementation in mind, and to study the changes in efficiency and response. For
example, experimenting with a variable ∆τ and letting agents interact more heavily with the
graph, cancelling or modifying links at once, could lead to better performance. The
exploration of other biologically inspired algorithms, applying the metrics developed here,
could also lead to a better control scheme, one that incorporates characteristics of
different algorithms in a sort of efficient hybrid scheme.
Bibliography
[1] Dorigo M. Optimization, learning and natural algorithms. PhD Thesis, Politecnico di
Milano, Italy, 1991.
[2] Dorigo M, Maniezzo V, and Colorni A. Ant system: Optimization by a colony of cooper-
ating agents. IEEE Trans Syst Man Cybernet Part B, 26(1):29–41, 1996.
[3] Traniello J F A. Foraging strategies of ants. Ann. Rev. Entomol., 34:191–210, 1989.
[4] Blum C. Ant colony optimization: Introduction and recent trends. Physics of Life Reviews,
2(4):353–373, 2005.
[5] Marco Dorigo and Luca Maria Gambardella. Ant colony system: a cooperative learning
approach to the traveling salesman problem. IEEE Transactions on Evolutionary Computation,
1(1):53–66, 1997.
[6] Alers S, Tuyls K, Ranjbar-Sahraei B, Claes D, and Weiss G. Insect-inspired robot coordi-
nation: Foraging and coverage. Artificial Life, 14:761–768, 2014.
[7] Alers S, Claes D, Tuyls K, and Weiss G. Biologically inspired multi-robot foraging. In
Proceedings of the 2014 international conference on Autonomous agents and multi-agent
systems, pages 1683–1684. International Foundation for Autonomous Agents and Multia-
gent Systems, 2014.
[8] Sjriek Alers, Daan Bloembergen, Daniel Hennes, Steven De Jong, Michael Kaisers, Nyree
Lemmens, Karl Tuyls, and Gerhard Weiss. Bee-inspired foraging in an embodied swarm. In
The 10th International Conference on Autonomous Agents and Multiagent Systems-Volume
3, pages 1311–1312. International Foundation for Autonomous Agents and Multiagent
Systems, 2011.
[9] Paulo Tabuada. Event-triggered real-time scheduling of stabilizing control tasks. IEEE
Transactions on Automatic Control, 52(9):1680–1685, 2007.
[10] Robert G Gallager. Discrete Stochastic Processes. Springer Science + Business Media,
New York, 1996.
[11] J. Jacod and P. Protter. Probability Essentials. Springer, Berlin, 2000.
[12] Robert M Gray. Entropy and information theory. Springer Science & Business Media,
2011.
[13] Shannon C.E. A mathematical theory of communication. The Bell System Technical
Journal, 27:379–423, 1948.
[14] Timo Mulder and Jorn Peters. Entropy rate of stochastic processes. 2015.
[15] Walter J Gutjahr. A graph-based ant system and its convergence. Future Generation
Computer Systems, 16(8):873–888, 2000.
[16] Walter J Gutjahr. ACO algorithms with guaranteed convergence to the optimal solution.
Information Processing Letters, 82(3):145–153, 2002.
[17] Thomas Stützle and Marco Dorigo. A short convergence proof for a class of ant colony
optimization algorithms. IEEE Transactions on Evolutionary Computation, 6(4):358–365,
2002.