Achieving Connectivity Between Wide Areas Through
Self-Organising Robots Using Embodied Evolution
Erik Aaron Hansen
Thesis submitted for the degree of
Master in Network and system administration 30 credits
Department of Informatics
Faculty of mathematics and natural sciences UNIVERSITY OF OSLO
Spring 2018
Achieving Connectivity
Between Wide Areas Through Self-Organising Robots Using
Embodied Evolution
Erik Aaron Hansen
© 2018 Erik Aaron Hansen
Achieving Connectivity Between Wide Areas Through Self-Organising Robots Using Embodied Evolution
http://www.duo.uio.no/
Printed: Reprosentralen, University of Oslo
Abstract
Internet connectivity gets more and more important as the era of IoT increasingly embraces both aspects of individuals lives’ and the industry.
By integrating the devices in our environment with humans so they are at the tip of our hands we can more easily know the state of our whereabouts and also manipulate them to our interest, whether this is for personal use or out in the industry. To enable us with this possibility to full extent requires a functioning backbone communication infrastructure for relaying information from humans to and from the devices. Abruptions to the underlying infrastructure happens occasionally, where manual dedicated personell will go out to fix the interruptions, restoring communication abilities. However, sometimes this can be dangerous to the personell carrying out the task, which can be the case in war situations, environmental disasters like earthquakes or toxic spills or in the occurrence of fire. Therefore, human casualties can be minimised if autonomous robots are deployed that can achieve the same outcome:
to establish a communication link between two previously distant but connected sites. This can be done if we deploy mobile ad hoc robots which relay traffic between them. The robots must be located in such way that packets can flow between the distant sites by multi-hops among the robots. In order to get the robots to locate themselves appropriately, we take inspiration from self-organisation and emergence in artificial life, where a common overall goal may be achieved if the correct local rules on the agents in system are invoked. In addition, suiting the robots with the correct controllers is crucial for this to happen. A lot of research exist on controller logic and also more specifically on evolution of controller logic to better be optimised for the task at hand, but such methods are often pre- evolved controllers in simulation, and not done in an embodied fashion on the robots while they are deployed. In addition, there doesn’t seem to exist a lot of research on obtaining connectivity the way mentioned here through the means of self organisation and embodied evolution. Thus, in this thesis we integrate the aspect of connectivity between two sites into the multi- robot simulation platform known as JBotEvolver. In addition, we compare three heuristics, of which one uses neuroevolution to show how self- organisation and embodied evolution can be used within the integration.
Our use of embodiment in robotic controllers shows promising results and provide solid knowledge and guidelines for further investigations.
i
Contents
1 Introduction 1
1.1 Motivation . . . 1
1.2 Problem statement . . . 2
1.3 Thesis outline . . . 4
2 Background 5 2.1 Evolutionary Algorithms (EA) . . . 5
2.1.1 The Evolutionary Algorithm Scheme . . . 6
2.2 Artificial Neural Networks . . . 10
2.3 NeuroEvolution of Augmenting Topologies . . . 12
2.4 Evolutionary Robotics . . . 15
2.4.1 Embodied evolution . . . 15
2.4.2 First Use of Embodied Evolution . . . 16
2.4.3 odNEAT Algorithm . . . 17
2.5 Classification of Evolving Robotic Controllers . . . 19
2.5.1 Online Decentralised NeuroEvolution of Augment- ing Topologies (odNEAT) . . . 20
2.5.2 odNEAT Algorithm . . . 21
2.6 Swarm intelligence . . . 22
2.7 JBotEvolver — A Simulation Platform for Evolutionary Robotics . . . 23
2.8 Orientation . . . 26
3 Experimental Methodology 29 3.1 Assumptions . . . 29
3.2 Methodology . . . 31
3.2.1 Random Walk Controller . . . 31
3.2.2 Pre-programmed Approach (“longest home route”) . 32 3.2.3 Evolvable Approach Using odNEAT . . . 33
3.3 Experiment Setup . . . 33
3.3.1 The Robot . . . 34
3.3.2 The Environment . . . 34
3.3.3 Random Walk Controller . . . 35
3.3.4 Pre-programmed Controller . . . 35
3.3.5 odNEAT Controller . . . 36
3.3.6 Visualisation of Neural Network Topologies . . . 37 iii
4 Results 39
4.1 Environment with dimensionality 4x4 . . . 39
4.2 Environment with dimensionality 2x5 . . . 40
4.3 Visualisation of Topological Structure . . . 44
5 Discussion 47 5.1 Interpretation of Results . . . 47
5.2 Visualisation of Topological Structure . . . 48
5.3 Future Work . . . 48
5.3.1 Post-Analysis . . . 48
5.3.2 Hardware Implementation . . . 48
6 Conclusion 51
A Source code 57
List of Figures
1.1 Illustration of overall goal . . . 3
2.1 Representations in evolutionary algorithms [3] . . . 6
2.2 General scheme of Evolutionary Algorithms . . . 6
2.3 Visualisation of the cycle of an EA . . . 7
2.4 A fitness landscape involving both a local maximum (dot) and global maximum (star) . . . 9
2.5 Crossover between genes of two parents . . . 9
2.6 A simple McCulloch and Pitt’s neuron . . . 10
2.7 Illustration of a single perceptron . . . 10
2.8 A Multi Layer Perceptron network (MLP) . . . 12
2.9 Genotype representation of neural network . . . 12
2.10 The two mutation operators add connection and add link . . 13
2.11 Recombination of two parent genomes . . . 14
2.12 The Competing Conventions Problem . . . 15
2.13 The evolutionary algorithm can be executed offline or online 19 2.14 Encapsulated vs. distributed maintaining of genomes . . . . 20
2.15 Possible variation operators in embodied evolution . . . 21
2.16 Simple neural network controlling two wheels. . . 22
2.17 JBotEvolver configuration view . . . 25
2.18 JBotEvolver results view . . . 25
2.19 Overview of the different complex systems sciences . . . 27
3.1 Illustration of components involved in task being solved. . . 31
3.2 Initial placement in a 4x4 size environmnet . . . 35
3.3 Initial placement in a 2x5 size environmnet . . . 36
4.1 Time taken before full connectivity, environment 4x4 . . . . 42
4.2 Distances travelled, environment 4x4 . . . 42
4.3 Time taken before full connectivity, environment 2x5 . . . . 43
4.4 Distances travelled, environment 2x5 . . . 43
4.5 Controller Neural Network Topology of one of the robots at thebeginningof the simulation . . . 44
4.6 Controller Neural Network Topology of one of the robots at theendof the simulation . . . 45
v
List of Tables
2.1 Simulation platform comparisons . . . 24 3.1 Overview of experiments to run . . . 35
vii
Preface
As soon as I realised the task I wanted to investigate I knew from the beginning this would be quite an ambitious project. Many pitfalls presented themselves, but there were just as many valuable hindrances to overcome.
I feel great gratitude towards both the University of Oslo and Oslo Metropolitan University for hosting this master program. These two years have been both challenging and uplifting, and the teachers involved seem to always put faith into their students, which have been a huge motivation factor for me.
I would therefore like to give out a special thank you to my main supervisor, Stefano Nichele, who freely let me embark on this ambitious mission, but gave invaluable input when needed. In addition I would like to thank my other co-supervisors in support of my thesis Anis Yazidiand Asieh A. Abolpour. Their voices buzzed in my head throughout my thesis, and will continue to do so in my future career.
It is better to shoot for the stars and land in the mud, than to shoot for the mud and make it.
Unknown Author,
Erik Aaron Hansen
ix
Chapter 1
Introduction
1.1 Motivation
The advent of Internet of Things (IoT) have sparked many new fields of interests, ranging from small hobby projects like hacking basically anything we can find around us, to larger scale solutions for solving complex problems. We are transcending into a new era where more and more different kinds of objects around us are being enhanced for communicating the state and environment of the objects, as well as providing interfaces for how humans can manipulate them. Examples of such objects could for instance be heating elements with built-in thermostats for remotely regulating heat and providing temperature feedback, or more industrially applied solutions like humidity sensors deployed within a plantation.
No matter the IoT devices’ purpose, what is common for them is that they require an underlying backbone connection in order to fully serve their purpose. Without an established link there is no way of communicating with the intended user (s), whether it is for receiving instructions or giving sensory feedback. This might be because of the way the system is designed, where the devices are not supposed to be connected at all times, or if there is a failure somewhere in the network infrastructure intended for the device. In any case, the link must somehow be established.
Although this could be done manually by humans, in some cases there exist motivation for doing this in a more autonomous fashion. This could be the case where the environment has toxic characteristics, or as an aid for first- responders after an urban disaster as part of a search and rescue mission.
Ad-hoc networks show promising results for solving some of these issues, but this requires the agents to be mobile to cover unconnected areas.
Swarm robotics can help us create a backbone for communication exchange where the robots act as intermediate relay nodes for large area coverage.
By using self-organising robots we can achieve an autonomous solution to providing network availability in dangerous environments, averting humans coming to harm.
1
1.2 Problem statement
How can we create a self-organising network of robots to obtain connectivity between unconnected end-points using embodied evolution?
Aself-organising network of robotsis a network of mobile agents where the agents are capable of interacting with each other and their environment.
The robotic agents are equipped with sensory capabilities which they can use actuators to make decisions of. The actuators compromises any action the robots can take, e.g. making sound if it’s equipped with speakers and steering and moving in the environment if it has wheels or other forms of mobile actuators. With a network of robot we achieve a swarm of robots which collectively can carry out missions better than what any single robot can achieve, by collaboration.
Such a mission can be establishing a backbone connection with a distant area by creating a mobile ad-hoc network(MANET), so that we can communicate and read sensory data with potential IoT devices at the target area, as attempted illustrated in figure 1.1. In a MANET there is no pre- existing network infrastructure, meaning there is no central access point each node in the network must connect to in order to establish connectivity.
This makes the network a decentralised wireless network, where each node is participating in transmitting packets, as a relay network node. If we imagine each node being a robot who are aware of one another and of the environment around them, by dispersing the robots throughout the environment we should be able to establish a network connectivity from the original point of dispersion to a target area if the robots collaborate.
This will in turn hopefully form a multi-hop routing network connecting the original deployment area with target area.
For the robots to carry out such a task, the robots will performembodied evolution to adjust their controllers. According to Embodied Evolution in Collective Robotics: A review[1], embodied evolution is a paradigm where we have the concept ofevolutionin multi-robotic systems that are:
• Decentralised: There is no leader coordinating the evolutionary process of generating and selecting offspring. It is up to the robots themselves to execute this locally, therefore the evolution isembodied within the robots themselves.
• On-line: Evolution happens while deployed and not off-line, mean- ing they are not preprogrammed to carry out the task.
• Parallel: We have a population of robots performing actions, evolving at the same time and place.
• Mating: An action where two or more robots decide to send and/or receive genetic material.
Original distribution area
Target area with IoT devices
Figure 1.1: Illustration of overall goal
3
1.3 Thesis outline
• Chapter 1: Introduction. Explains the motivation behind the thesis and the problem statement.
• Chapter 2: Background. Discusses the theoretic background used in the thesis.
• Chapter 3: Experimental Methodology. Explains our methods, how they are implemented and tested.
• Chapter 4: Results and analysis. Presents results.
• Chapter 5: Discussion. Here we answer our problem statement and present the possible roadmap for future work.
• Chapter 6: Conclusion. Summary of the contributions in the thesis.
Chapter 2
Background
To understand what motivates the use of embodied evolution we must first look at the parts involved. To do this we will begin by explaining essential concepts within evolutionary algorithms (EA). We will then go over to talk about neural networks. Lastly we will talk about swarm robotics, and tie all these fields together which constitutes the field of embodied evolution.
2.1 Evolutionary Algorithms (EA)
It proves that optimisations on species survival abilities are done naturally by itself. Any animal possessing treats considered favourable in a given environment, increases its offspring’ further chances of survival.
Darwin called this phenomenonthe survival of the fittestor simplynatural selection[2]. Such treats could be height of plants, where windy conditions have made it challenging for high growing plants and trees to remain rooted, or speed of animals which is a beneficial trait when hunting prey.
Treats like this which describes the property of the animals are what we call the phenotype representation of the property, something that we can see or observe. What actually makes the property, is what we call the genotype representation. The genotype is a low-level representation of the parts necessary to yield the phenotype, and it is important to transform the phenotype into a precise genetic encoding. A rough model of a genotype can be seen in figure 2.1 on the following page.
The importance here is not how evolution works in nature, but how we have adopted ideas from natural selection into solving tasks in scientific computing. By using evolutionary algorithms complex problems can be solved sufficiently good in a short matter of time. Roughly we imagine different individuals each contributing to a problem in their own way. By the way, an individual in this sense, is what we actually refer to as asolution, so the terms might be interchanged. We can think of a problem that can be divided into many sub-problems, and if we combine one individual having a good solution for one sub-problem with another having a good solution to another sub-problem, we end up with a new individual solving the main problem better than the two parents.
5
SRC
DST
Initialisation
Termination
Parents
Offspring Population
Parent selection
For variaton:
Recombination mutation
Offspring selection
0 ...
0
1 0
n
0 1 2
locus
allele gene
chromosone / genotype
phenotype
Figure 2.1: Representations in evolutionary algorithms [3]
SRC
DST
Initialisation
Termination
Parents
Offspring Population
Parent selection
For variaton:
Recombination mutation
Offspring selection
Figure 2.2: General scheme of Evolutionary Algorithms
2.1.1 The Evolutionary Algorithm Scheme
However we need to do this in an orderly fashion to reach reasonable good results, so here we show a general scheme for doing an EA. For a visual go-to when reading this, see figure 2.2 and 2.3 on the facing page. For a thorough walk-through about each of the topics mentioned, Eiben and Smith’sIntroduction to Evolutionary Computing[4] is highly recommended.
The first thing we need to do after finding a proper genotype represent- ation of what we want to optimise, is to find a way to evaluate each chromo- some, so with the chromosome as input, we will get afitnessor evaluation score out of it. This is what we commonly refer to as the fitness function of the algorithm, how valued our individual/solution/chromosome is. Since the fitness is dependent on the traits that we are seeing, this actually forms a hiddenfitness landscape. We will soon see what importance this has, but an example of an imagined fitness landscape for one quantifiable trait can be seen in figure 2.4.
After we have a fitness function to evaluate our solutions comes the initialisation, which is often done by stochastic means. This yields a population of individuals, and among them we will select the best candidates to be parents in this iteration, or generation as we can say in the context of EAs. The selection favours those individuals with a high fitness score, meaning if we were to apply the fitness function on the chromosomes
SRC
DST
Initialisation
Termination
Parents
Offspring Population
Parent selection
For variaton:
Recombination mutation
Offspring selection
0 ...
0
1 0
n
0 1 2
locus
allele gene
chromosone / genotype
phenotype
crossover slice
Parent B Parent A
Offspring B Offspring A
0 0
1 1 0 0 1 0 0 1 0 0
0 0
1 1 0 0 1 0 0 1 0 0
1 0
0 1 1 0 0 0 1 1 1 1
1 0
0 1 1 0 0 0 1 1 1 1
Parent selection
Mutation Recombination Population
Offspring Survivor
selection
Figure 2.3: Visualisation of the cycle of an EA
of the individuals in the population, we would favour the best of these individuals. But there’s a small catch: the selection has a stochastic element in itself, in the way that there’s a higher chance of selecting individuals with a high fitness score, but solutions with lower score also have a slight chance.
Now, there exists many different ways to do the selection, like roulette wheel selection, rank-based selection, and in turn further variations of these: stochastic universal sampling and exponential ranking selection, all depending on how the selection pressure should be regulated, but for a more in depth description see [4].
Over to something slightly different, remember our imagined fitness landscape from before? In evolutionary algorithms, each sub-population investigates the search space around it, trying to achieve better than already existing solutions. We see this as the red dot in 2.4, where this has become a local maximum (in this particular image it’s imagined that we are maximising something, but the opposite can easily be achieved by inverting the fitness function). This is known asexploitation.
By looking at the graph, we see that by simply rolling upwards we need to be “lucky” where we start not to end up on the left-side hilltop. Of course we can re-run the algorithm a couple of times and hope we start in a space with good potential using exploitation, but we can do better by invoking another factor in our algorithm: exploration. With exploration we try to search different areas in our search space, so we don’t get stuck in a potential low-fitness local maximum. So how do we explore the fitness landscape? For some this might be one of the more interesting parts of the EA cycle, and we do this by using recombination and mutation techniques
7
on the genomes. We say that we are applying variation operators on the genomes, to achieve diversity or novelty among our population. This is an opposite force from the selection mechanism and important to escape local maxima.
In the first of the two variation operators, recombination, we try to recombine the genome of selected parents into new offspring. By doing this, we hope to achieve a new solution with traits from both of the parents. This process is done stochastically, meaning that at a random point in one of the parent’s genome we do a slice, as seen in 2.5. We call this a 1-point crossover, but other and more sophisticated methods exist, like partially mapped crossover and two-point crossover. We need to remember that our recombination operator needs to adhere to the phenotypic representation. Sometimes a generated phenotype isn’t allowed, other times it doesn’t make sense or introduce change, for instance when adjacency is of importance.
In the other important variation operator, mutation, we introduce small random changes to a small number of the offspring’s genomes. Again, this is done stochastically and we usually see a small mutation rate, meaning that we might give each offspring a 5% chance of mutating. The mutation itself depends on the the phenotypic representation. If the alleles are bits, the mutation would likely be flipping some of the bits in the chromosome, or if the adjacency between the genes is of importance, a mutation could be to move the locus of some of the genes.
After we have done the variation operations on our selected parents and the produced offspring, we are left with some new recombinations of the parents, where some of them have been mutated. We now need to evaluate our new candidates, to determine which will survive. Since we are actually doing an insertion into the population pool, this step is also called replacement. There exist many different strategies for doing this, like age- based or fitness based replacement. According to [4] a common strategy is the (µ= population size,λ= number of offspring)-strategy, where we have a surplus of offspring. For each generation, all parents are discarded, so it also has an age factor within it.
Finally we have completed one cycle in our EA. We started out initially with a population, performed a parent selection among our individuals, applied variation operations and then survivor selection. With the new population at hand, we need to ask ourselves if we can stop the evolution, or if we should continue doing more iterations so we can hopefully achieve even better solutions. Different heuristics are used to determine the stop predicate in EAs, some being age, clock cycles, time or a certain threshold for required increase in average fitness the last generations. Quite often however the number of generations or cycles performed seem to be a reasonable choice, but the discussion is out of scope for this thesis.
To round up, we should have visited a best solution throughout the generations, and the fitness function of this solution is considered as the best result in the evolution.
Figure 2.4: A fitness landscape involving both a local maximum (dot) and global maximum (star)
SRC
DST
Initialisation
Termination
Parents
Offspring Population
Parent selection
For variaton:
Recombination mutation
Offspring selection
0 ...
0
1 0
n
0 1 2
locus
allele gene
chromosone / genotype
phenotype
crossover slice
Parent B Parent A
Offspring B Offspring A
0 0
1 1 0 0 1 0 0 1 0 0
0 0
1 1 0 0 1 0 0 1 0 0
1 0
0 1 1 0 0 0 1 1 1 1
1 0
0 1 1 0 0 0 1 1 1 1
Figure 2.5: Crossover between genes of two parents
9
2.2 Artificial Neural Networks
Evolutionary computation isn’t the only well-known methodology in- spired by nature which is used for solving complex problems. A different technique involving artificial neural networks (ANN) which mimicked the way biologically neurons function proved to be promising at certain tasks.
Some applications can be, but are not limited to, optical character recogni- tion, where an ANN can recognise hand-written characters, stock market prediction, within medicine to detect cancer, or for banks to decide whether or not to grant loans [5]. Whether all uses is in the best interest with the public is another matter, but ethical discussions is outside the scope for this thesis.
Already back in 1943 we were introduced what would become the roots of today’s state-of-the-art technologies, with McCulloch and Pitt’s neurons [6]. They presented how neurons with binary threshold activation functions can solve first order logic sentences, like and, or and not. By setting the correct threshold value, for instance 2.0 in the case of and, the neuron would fire if both inputs were 1. However, the model had its limitations. It only had binary inputs, there were no way of putting weight to the synapses, the edges connecting the neurons, and it couldn’t learn. Over the years McCulloch and Pitt’s artificial neuron was gradually improved to overcome these limitations, and we went from the simple model we see in 2.6 to what we now call theperceptron, seen in 2.7.
x0
x1
Figure 2.6: A simple McCulloch and Pitt’s neuron
f x0
x1 x2 x3 xm
w0
w1
w2 w3 wm
o ...
Figure 2.7: Illustration of a single perceptron
With the perceptron we could give weights to each input, or synapse, to the neuron. So by summing the perceptron’s weighted inputs we can use the activation function to decide whether the neuron fires or not. So if we are looking at given neuron i, we multiply each input with their corresponding synaptic weights, so we get x0×wi0,x1×wi1, . . . ,xj×wij, which essentially is the dot product of a perceptron’s incoming weights and values. And after we have summed all of these, we will use the activation
function to see if there’s enough potential for the neuron to fire.
h=
∑
m i=0xiwi o =
(1 h≥θ 0 h<θ
(2.1) Two more things worth mentioning about the single perceptron without getting too detailed: One is that the activation output function f in Equation 2.1 doesn’t have to be binary. Instead it can give a real value as an output. A lot of research has been done regarding this, and some such activation functions are the hyperbolic tangent tanh(x), the sigmoid 1/(1+e−x)and the softsign x/(1+|x|)function to mention a few [7]. In addition there is worth mentioning another input to the perceptron, the bias node, often set with a value of -1 or 1. This node can shift the activation function to the left or right, which can be useful for instance if you want a neuron to fire even though the inputs are zero. A neuron firing in this case means the neuron giving some output other than zero figuratively speaking.
So far, our single perceptron doesn’t do anything exciting, apart from giving an output. In order to use this for solving complex problems, we need to add more perceptrons which will form a network of connected neurons. One such structures can be found in figure 2.8. Here each synapse is indicated by an arrow, and one neuron can have both inputs and outputs coming from other neurons. The biases are also shown. Depending on the structure of the neural network, like how many hidden layers are between the inputs and the outputs, and how many perceptrons each layer has, the neural network is able to solve problems like classification, clustering and regression with varying performance. Deciding on which topology to go for is often of interest when designing a neural network, and we will come back to this later.
We need also to mention one important part about ANN, namely learning. It is crucial for the network to have sensible weight values in order for the neural network to perform. Therefore, in conventional ANNs something known asback-propagationis being used to calculate the weight values. We are going to use a slightly different approach which will be explained more detailed in next section, where we evolve the weights through evolutionary cycles. Thus the concept of back propagation is out of scope for this thesis, but for reference see [8].
11
Bias Bias
Input
Hidden
Output
Figure 2.8: A Multi Layer Perceptron network (MLP)
2.3 NeuroEvolution of Augmenting Topologies
In 2002 Stanley and Miikkulainen proposed a methodology named Neur- oEvolution of Augmenting Topologies (NEAT) [9]. Here each connection between perceptrons are explicitly represented in the encoding scheme (the genotype). Initially the neural network starts out like a regular ANN, but later new perceptrons may spawn and new connections between per- ceptrons may arise, and some be taken away. With time the topology might increase in complexity suiting the task at hand. Another interesting thing about NEAT is that since the network is constantly evolving, new tasks can be learnt while retaining learnt behaviour for previous tasks.
Initially in NEAT a random genome is generated, with the only connection being from every input neuron to every output neuron. Each gene describes a relationship between two neurons, in the following way:
• From-neuron: The source neuron in the synapse
• To-neuron: The destination neuron in the synapse
• Innovation number: Historical information of when the link was created, unique for each gene. When a new node is added, a new
Genome
Neuron 1 Input From 1 To 5 Weight 0.3 Innov. 1
Neuron 2 Input From 2 To 5 Weight 0.5 Innov. 2
Neuron 3 Input From 3 To 4 Weight 0.1 Innov. 3 Disabled
Neuron 4 Output From 5 To 4 Weight 0.7 Innov. 5
Neuron 5 Hidden Neuron
genes Gene
connections From 2
To 4 DISABLED Innov. 4 4
3 2 1
5
Phenotype
Figure 2.9: Genotype representation of neural network
gene is added at the end of the genome and given the next available innovation number
• Weight value: Strength of the synapse
• Whether the gene is enabled
If we recall back to Section 2.1 we talked about mutations and recombination as essential variation operators. We will now explain how NEAT utilises these principles to optimise artificial neural networks.
Mutations can happen by both adjusting weight values and the topology. The topology can be changed by the two mutation operators add connection, or add neuron as shown in Figure 2.10
4
3 2 1
5 1
1 → 4
2 2 → 4
3 3 → 4 DISABLED
4 3 → 5
5 5 → 4
6 2 → 5
Connection Added
6 1
1 → 4 2
2 → 4 3
3 → 4 DISABLED
4
3 → 5 5
5 → 4 6
2 → 6 7
6 → 4
Neuron Added
4
3 2 1
5
Figure 2.10: The two mutation operators add connection and add link
Recombinationis the other variation operator. The genes are aligned according to their innovation numbers as explained below. An illustration of how the genes are aligned can be seen in Figure 2.11. When recombining the two parent genomes into the offspring, a random gene is selected where parent alleles are matching. Non-matching genes are either disjoint or excess, depending on whether a particular gene within or outside the range of the other parent’s innovation numbers.
13
1 2 3 6 4
5
9 3 → 5 1
1 → 4 2 2 → 4
DIS.
3 3 → 4
4 2 → 5
5 5 → 4
DIS.
6 5 → 6
7 6 → 4
10 1 → 6
Parent 1
Parent 2
Disjoint
Disjoint Disjoint Excess Excess
Parent 1 Parent 2
Offspring
1 2 3
6 4
5
5 5 → 4 4 2 → 5 3 3 → 4 2 2 → 4
DIS.
1
1 → 4 8
1 → 5 1
1 → 4 2 2 → 4
DIS.
3 3 → 4
4 2 → 5
5 5 → 4
8 2 → 5
1 2
5 4
3
9 3 → 5
10 1 → 6 8
1 → 5 7 6 → 4 1
1 → 4 2 2 → 4
DIS.
3 3 → 4
4 2 → 5
5 5 → 4
DIS.
6 5 → 6
9 3 → 5 10
1 → 6 7
6 → 4 1
1 → 4 2 2 → 4
DIS.
3 3 → 4 4
2 → 5 5 5 → 4
DIS.
6 5 → 6
Figure 2.11: Recombination of two parent genomes
Competing Conventions and Historical Markings
In evolutionary algorithms two individuals can have the same or very similar behaviour, but expressed in the genome in a different way. This is known as the competing conventions problem, illustrated in Figure 2.12.
The two topologies are equal, but the ordering of the hidden nodes are dissimilar, which would have been expressed in the genome as incompatible parts when doing crossover. Nature’s solution is homology, where two genes are homologous if they are alleles of the same trait. For instance, the bacteria E. coli utilise a special protein RecA to perform a process called synapsis to line up two genes before crossover [10].
In NEAT, this is solved by using incrementing global innovation numbers to align the genomes, so information don’t have to be lost. Each neuron added through mutations are therefore given a unique innovation number. By global we mean the increment is shared among the solutions
in the population. By doing it this way, only differing components in the topology can be exchanged.
Speciation
When new structures in the topology occur by mutations the fitness score will likely go down. The new topology will require some time to optimise its weight since it is unlikely that the new introduction suddenly shows a good feature. Therefore, when competing with the other individuals in the population it can be hard to survive. To protect the innovations, we need a mechanism to give new traits a chance to better themselves.
Therefore, speciation, also known as niching is used within NEAT.
Speciation can be used to find multiple optima in multimodal functions, so that each species optimises their own peak. The main idea is todivide the population into species such that similar topologies are in the same species. It is induced from the use of historical markers that we can use how disjoint two genes are, to tell their evolutionary history correlation. In addition, the differences in average weights values and number of excess genes.
Therefore we get a compatibility distanceδbetween two genes δ = c1E
N + c2D
N +c3W
1 2 3
C B A
4
1 2 3
A B C
4
Figure 2.12: The Competing Conventions Problem
, wherec1,c2andc3 are coefficients determining the importance of the three factors, andNthe number of genes in the larger of the two genomes.
2.4 Evolutionary Robotics
2.4.1 Embodied evolution
One of the many challenges within swarm robotics is maintaining versatil- ity regarding the different environments the robots will function in. Robots
15
can be trained to do a task in a particular environment well, but if changes in the surroundings are introduced, for instance if new obstacles appear or the swarm is deployed into another setting, the swarm might not be able to generalise in a sufficient manner. In addition, if the robot experiences some sort of failure, for instance motor fault, it will in many cases be unable to compensate. This motivates the use of embodied evolution, where control- lers can be updated online while the robot is deployed. To rephrase from what was stated in the introduction, embodied evolution can be defined as following:
• Decentralised: There is no leader coordinating the evolutionary process of generating and selecting offspring. It is up to the robots themselves to execute this locally, therefore the evolution isembodied within the robots themselves.
• On-line: Evolution happens while deployed and not off-line, mean- ing they are not preprogrammed to carry out the task.
• Parallel: We have a population of robots performing actions, evolving at the same time and place.
• Mating: An action where two or more robots decide to send and/or receive genetic material.
2.4.2 First Use of Embodied Evolution
As to our knowledge the first use of embodied evolution in robotics can be found in [11], where they trained a robot controller to navigate to a lamp emitting light centered in the middle of an arena, a task commonly known asphototaxis. The network was a simple neural network consisting of one input, a predicate saying which of the two proximity sensors located on the robot had the highest value, as seen in Figure 2.16. The outputs were mapped to the two motors of the robot, where each output controlled one wheel. This way, given that sensible weight values were present a robot was able to move to the light emitting source. The robots started with a certain level of virtual energy independently of each other.
Over time, the energy would accumulate or degrade depending on each robot’s performance. In more accurate terms, we can say that the fitness functiondetermines thefitness score, or performance of the robot. Therefore, adjusting the fitness score of the robot in a good manner is central to control the robot so we get desirable behaviour.
Several robots were deployed, and they transmitted genetic material between each other based on a probabilistic model, where more successful controllers were more likely to be broadcast. The algorithm was named thereafter: PGTA for Probabilistic Gene Transfer Algorithm. Other receptive robots could then retrieve the genome and potentially deploy the controller. This was a novel approach to designing an artificial neural network that could be used for controlling a robot, but there were room for improvements. As stated earlier, the architecture of the neural network,
meaning the number of hidden layers and how many perceptrons each layer has is of importance when designing the neural network. Pseudo- code of the algorithm can be seen in Algorithm 1
Algorithm 1Pseudo-code of PGTA initialise_genes()
energy←min_energy loop
ifexcited?then
send(genes[random(num_genes)] +mutation) end if
ifhas_received?ANDreceptive?then
genes[indexof(received)] =valueof(received) end if
do_task_speci f ic_behaviour()
energy←limit(energy+reward−penalty) end loop
2.4.3 odNEAT Algorithm
odNEAT executes locally on each robot, making it a decentralised ap- proach. The general algorithm can be found in Algorithm 2, and a visu- alisation in figure 2.15.
17
Algorithm 2Pseudo-code of odNEAT genome←create_random_genome() controller← assign_as_controller(genome) energy←default_initial_energy
loop
ifbroadcast?then
send(genome,robots_in_communication_range) end if
ifhas_received?then
for allcinreceived_genomesdo
iftabu_list_approves()and population_accepts(c)then add_to_population(c)
adjust_population_size() adjust_species_f itness() end if
end for end if
operate_in_environment() energy←update_energy_level()
ifenergy≤min_energy_thresholdandnot(in_maturation_period?)then add_to_tabu_list(genome)
offspring←generate_offspring() update_population(offspring)
genome←replace_genome(offspring) controller←assign_as_controller(genome) energy←default_initial_energy
end if end loop
2.5 Classification of Evolving Robotic Controllers
Online evolution of robotic controllers can be classified into three different categories, according towhen,whereandhowit happens [12]:
• when: off-line vs. on-line, while deployed
• where: on-board vs. off-board
• how: encapsulated or centralised manner vs. distributed manner In off-line evolution the controllers are developed before the robot is deployed, meaning that the evolutionary operators are only applied before deployment. On-line evolution is the opposite, here controllers are adjusting while deployed. It is also possible to do both, where for instance controllers can be pre-evolved within simulation before deployment in real hardware. Recall however that on-line evolution is a prerequisite for the method to be “embodied”.
Evolution can take place both off-board and on-board, as illustrated in Figure 2.13. In an off-board situation, an external computer can process fitness-data sent from on-line robots. In this case the execution of the evolutionary algorithm like selection, mutation and recombination isn’t performed by the robot themselves, but fitness data obtained within the robots is sent to the external machine. After an evolutionary cycle is complete, the external computer injects the robots with new, updated controllers.
Figure 2.13: The evolutionary algorithm can be executed offline or online Taken from [13]
There are mainly three ways how evolutionary operators are managed:
encapsulated,distributedand a hybrid approach between the two, illustrated in Figure 2.14. In the encapsulated version, each robot maintains an
19
internal population of genomes. Certainly, at any given moment only one genome is selected to operate as the controller. The evolutionary process involves the gene pool that resides within each robot, separately. There is in other words no interactions with other robots and therefore no genome exchange. In thedistributedversion, we have local interactions among the robots, meaning they can learn from each other. However, there is only one single genome in every robot’s gene pool, so if the environment is big and there are few interactions, the robots will suffer since no optimisations are done. Therefore, ahybridapproach can involve both a bigger internal gene pool like found in the encapsulated version, in addition to exchanging genomes with interacting robots. Also here, if we recall back to the definition of embodied evolution, we need to have the prospect ofmatingin order for the method to be embodied, so a distributed or hybrid approach is required in that sense.
Figure 2.14: Encapsulated vs. distributed maintaining of genomes Taken from [13]
2.5.1 Online Decentralised NeuroEvolution of Augmenting To- pologies (odNEAT)
There are different variations of NEAT, and for the purpose of multi-robot systems the adapted NEAT methodology Online Distributed NeuroEvolu- tion of Augmenting Topologies (odNEAT) is particularly worth looking into [14]. odNEAT is a distributed and decentralised neuroevolution al- gorithm that evolves both weights and network topology in autonomous robots. In the original paper odNEAT is used to solve three tasks: aggreg- ation, integrated navigation and obstacle avoidance, and phototaxis, ap- proximating the performance of rtNEAT, a centralised method and outper- forming IM-(µ+ 1), a decentralised neuroevolution algorithm. The neural network topology in each robot is augmented progressively, starting out from the simplest possible topology where all inputs are connected to all neurons. Each robot manages an internal population of robotic controllers,
I1 I2 I5
I6 I7
I3 I4
Mutation
I1 I2 I5
I6 I7
I3 I4 I1 I2 I5
I6 I7
I3 I4
I1 I2 I5
I6 I7
I3 I4
I1 I2 I5
I6 I7
I3 I4 I1 I2 I5
I6 I7
I3 I4
I1 I2 I5
I6 I7
I3 I4
I1 I2 I5
I6 I7
I3 I4 I1 I2 I5
I6 I7
I3 I4
I1 I2 I5
I6 I7
I3 I4
I1 I2 I5
I6 I7
I3 I4 I1 I2 I5
I6 I7
I3 I4
I1 I2 I5
I6 I7
I3 I4
Recombination
Figure 2.15: Possible variation operators in embodied evolution
having only one deployed at a time. With time, the controllers will com- plexify, and should suit the task at hand.
In order to avoid similar controllers performing poorly, a tabu list is used. The tabu list keeps track of recent deployed controllers, so only controllers who are dissimilar in topology will be added.
Terminology
The “brain” of the robot, what controls the decisions for what the robot should do next, i.e. how to drive, can be seen in many ways. Therefore, before we explain the approach in its fullness a potential confusion of terms needs to be cleared out. Traditionally we use the word “controller”, however in the context of evolving robotic controllers other terms might be used. We can say chromosome or genome, and since one chromosome or genome is a list of genes which in turn represents an artificial neural network, this might also be used. Furthermore, onegeneis a description of the relationship between two neurons/perceptrons, which the terminology uses bothsynapseandlinkinterchangeably. It depends on the context, when a robot is transmitting genetic material, it might make more sense to talk about genomes than neural networks, and when we are analysing a robot’s behaviour, it makes more sense to talk about the controller or the topology of the ANN, or even the robot itself, since that is what is actually moving.
2.5.2 odNEAT Algorithm
odNEAT executes locally on each robot, making it a decentralised ap- proach. The general algorithm can be found in Algorithm 2, and a visu- alisation of variation operators in figure 2.15.
21
Figure 2.16: Simple neural network controlling two wheels.
2.6 Swarm intelligence
It turns out that nature have also perfected how individuals can solve quite complex problems through collaborating in groups. Initially this might sound obvious since it is common knowledge that humans can collaborate to solve many complex tasks. But we don’t need to look into humans to find examples of said tasks, in fact such examples can be discovered by looking at some of the “simpler” species we can think of. For instance, it appears that in one ant species, its colony’s foraging activities are regulated by the rate of ants returning to the nest [15]. In the colony ants are occupying small caves, and when an ant returns to the nest, the waiting ant receives feedback that a harvester has returned. If there are many returning harvesters, the waiting ants “know” food is easy to come by, so the colony will send out more harvesters since the cost ratio is favourable.
The ants interact with each other by antennas touching, so the colony’s harvesting decisions are actually guided by these small local interactionsas opposed to a central authority commanding the tribe when to look for food.
Such systems are therefore also distributed, and it is through this collective behaviour we achieve what we callemergence: the phenomena where small local interactions within simple systems together can constitute into large scale complex systems [16].
Within the concept of evolution it is well known that food is a drive factor, often generalised into energy, for evolving individual traits within species. For instance, there exists an hypothesis that through mutations within brown bears’ genetic material some bears developed traits which made them lose pigments in their fur, giving them better camouflage when hunting prey in polar environments. These animals would become what we today know as white polar bears, and came to persist as a result of natural selection. No matter the trueness behind the hypothesis, evolution can also target species’ ability to perform collective behavioural traits on a macroscopic level, like how nature has perfected ants foraging abilities. In other words, emergent behaviour in species can be thought of as a result of evolution in itself.
So how is this relevant according to the field of computer science?
If we go back to the ant colony example, the rate of returning ants were the deciding factor for whether to send out new foragers. This is actually a parallel to a computational problem within congestion control in the networking protocol TCP/IP, where the returning ants act as acknowledging packets (ACKs). Since it appears that ants have been doing congestion control already for many years, the same authors as in [15]
credited the ants with the funny term: the Anternet [17]. In addition, ants aren’t the only species studied for their collective behaviour, among all things we have seen slime mold doing network routing [18], plants performing collective decision making [19] and cells doing cycle transitions to solve approximating majority [20]. A more comprehensive list can be found in [21].
2.7 JBotEvolver — A Simulation Platform for Evolu- tionary Robotics
JBotEvolver is a versatile simulation platform for evolutionary robotics written in Java [22]. The platform is open source and currently maintained under https://github.com/BioMachinesLab/jbotevolver. The frame- work provides a graphical user interface (GUI) for configuring evolutions as seen in Figure 2.17 and viewing results, seen in Figure 2.18. A huge ad- vantage of the platform is that classes are loaded dynamically. This means that if we are debugging the executable through any integrated develop- ment environment (IDE) supporting hot code replace like for instance Ec- lipse or Visual Studio Code, we can start a simulation to observe robots’
behaviours, set a breakpoint within the controller and make some changes, and then continue the program execution. We can see the changes imme- diately without restarting the debug session, providing us with the oppor- tunity of quick and incremental development.
There are mainly two ways to do simulations: Through the GUI, or by an addition called Evolution Automator1. The GUI is convenient to get familiar with the framework, but if setting up and running several different experiments where many of the parameters must be tweaked, the Evolution Automator comes in handy. The latter will be used to set up the experiments.
Running evolutions using the GUI
In the configuration tab we can set numerous parameters to use. Not all parameters are compatible with each other. This requires both domain and implementation knowledge of the code base. The main parameters are:
• Robot: set parameters for the robots, like the colour, size between wheels, number of robots and so on.
• Sensor: We can add additional sensors.
1https://github.com/BioMachinesLab/jbotevolver/wiki/Evolution-Automator
23
Features | Simulator Simbad Webots ARGoS Player Enki JBotEvolver
2D/3D 3D 2D+3D 2D+3D 2D+3D 2D 2D
Open-source yes no yes yes yes yes
Dependencies Java 3D multiple multiple multiple multiple none Platforms win/mac/linux win/mac/linux mac/linux mac/linux mac/linux win/mac/linux
Language Java multiple* C++ C++ C++ Java
Learning curve low intermediate high high intermediate low
Distributed evolution no no no no no yes
GUI rich rich medium medium basic rich
Table 2.1: Simulation platform comparisons
• Actuator: We can add actuators to the robots, for instance wheels, LED lights, speakers.
• Controller: Set the controller controlling the robot.
• NeuralNetwork: Set if the robot should use a neural network as the controller. Possible implementations use NEAT, Continuous time recurrent neural networks (CTRNN) and Multilayer Perceptron Networks.
• Population: Related to managing evolved controllers.
• TaskExecutor: Whether to run simulations in parallel, synchronously or by using a distributed platform called Conillon.
• Evolution: Which strategy to use to evolve individuals in the population.
• EvaluationFunction: Which evaluation function to use to determine the performance of the robots deployed.
Running experiments using the Evolution Automator
[22] also present a table with comparisons of different simulation platforms, located in Table 2.1. Since we will need a platform for simulating self- organising robots in 2D space which also is fast to learn, the choice falls on JBotEvolver as framework for running experiments.
In addition to JBotEvolver, there is an extension implementing odNEAT (see Section 2.5.1) in [23] for online evolving of robotic controllers.
The codebase provides a common interface for real Thymio II robots2 and simulated robots within JBotEvolver. The codebase is open source and can be found published on github at https://github.com/fgsilva/
thymios [24]. The common interface makes running experiments easier, since a large portion of the code can be reused independently whether the experiment is done in simulation or on hardware (real Thymios).
2https://www.thymio.org/en:thymio
Figure 2.17: JBotEvolver configuration view
Figure 2.18: JBotEvolver results view
25
2.8 Orientation
The last thing we will do before we move on to designing our system, is to round up with an orientation of where we are positioned in the scientific landscape. So far we have mentioned many different paradigms like artificial neural networks, swarm robotics, self-organisation and embodied evolution. A nice taxonomy of the complex systems science can be found in [25], where they visually break down different subtopics, as can be seen in Figure 2.19. We can say that since we are using different forms of evolution, creating an adaptive wireless network through self-organisation, the three subgroups to the right, Collective Behaviour, Networks, and Evolution and Adaptation are the most relevant groups according to this organisational map.
Figure 2.19: Overview of the different complex systems sciences
27
Chapter 3
Experimental Methodology
The task is to connect two distant access points using an ad-hoc network of mobile robots, creating a relay network for communication between the two points. An illustration of the task can be found in 3.1. In the figure we can see different kind of geometric shapes. In the corners there are two boxes which represent stationary connection points, denoted asHome andSink. Near to Home is where the robots start initially, and it is up to them, or more specifically, their controllers to make them self-organise so that they move into place, together relaying information between the two distant points. Furthermore, each robot is equipped with a networking interface so that it can transmit packets containing information with other network devices in range of the robot. This gives us a dynamic system with moving and communicating agents. We also see that the robots are within a square, which represents walls, meaning the robots are limited regarding where they can move. With the walls in place, this sort of environment can be referred to as an arena environment, with a given width and height.
To determine whether we can accomplish our goal, we will present three methods depending on what controllers the robots are equipped with:
• method using random-walk controllers
• pre-programmed controller
• variant using embodied evolution as the controller
These will be tested in simulation only, leaving the hardware implementa- tion as future work.
3.1 Assumptions
A simulation requires many abstractions and simplifications compared to the real world, after all the purpose of a simulation is to speed up many
29
time consuming aspects. For instance, in a simulation robots can move faster, so we can quickly get an intuition whether or not something being tested will work. However, this comes with a cost, where elements tested might prove promising in simulation but not in the real world.
In this section we will go over some simplifications made regarding the simulation.
Concept of range and signal strength
We need the robots to coordinate themselves among each other to connect the two access points of interest. Since they relay messages through wireless communication, two robots need to be withinsignal rangewith one another. Therefore, in the real world we would maybe be more interested in thesignal strengthbetween the robots than the actual range. In reality, a direct network connection between two agents is continuous, meaning that the further away a distant robot is located from a given robot, the weaker the signal strength between the two. We are more interested in the self- organisation aspect than network technicalities, so which protocols can be used in an ad-hoc network, how routing would be done in said network, and so on is certainly of interest, but out of scope for this thesis. Thus the main focus will be on the placements of the individual robots according to each other more than how they communicate.
A robot’s perception
Each robot doesn’t know anything else than what it can sense from the outside world, although each robot is allowed to relay communication traffic from connected robots, and use this in their decision making. One thing a robot knows from the outside world is which robots it is adjacent with, i.e. within communication range.
In addition, each robot has arouteto both home and sink. Every robot has their own route to home and sink, and we will refer to these as the home route and sink route. These routes are linear lists, with information about the adjacencies of the contained robots. This means given a robot’s routeRh={home,robot1,robot2,robot3}, robot1 is connected to robot2, and robot2 is connected to robot3. In other words, an element is adjacent with the one before and after it. The route is learnt through adjacent robots. It compares it’s own route home, and if a neighbour has a shorter route than itself, it updates it’s route home, so that there is a shortest path first (SPF) connection with home.
Related to connectivity, a robot knows the following:
• which neighbours it has
• the route home
• the route to the sink.
Sink Home
Figure 3.1: Illustration of components involved in task being solved.
3.2 Methodology
We will look into three different self driving controllers for the robots:
• Random Walk controller
• Pre-programmed controllers
• Evolvable controllers
3.2.1 Random Walk Controller
The first of the three controllers we are going to run on the robots is the random walk controller. It’s behaviour is quite intuitive: Every x0th time step, pick a random direction between left, right, forward and backward, and apply output to the wheels accordingly. Algorithm 3 shows the basics in the algorithm. By driving left or right in this case doesn’t necessarily mean: apply appropriate speed on the two wheels so that the robot rotates 90 degrees from current standpoint. It just means: run for a given time leftwards or rightwards. This might end up making the robot turn partially to any direction. The main idea is simply to make the robot drive around randomly. Given the random behaviour of the robots and the fact that the environment is within a closed area, we will get a somewhat evenly distribution of the robots throughout the environment over time. If
31