Adapting Gait Patterns using Evolutionary Algorithms and Intelligent Trial and Error
Fredrik Sævland
Master’s Thesis Spring 2016
Adapting Gait Patterns using Evolutionary Algorithms and Intelligent Trial and Error
Fredrik Sævland 18th May 2016
Abstract
Using traditional optimization algorithms, the robot may be sensitive to changes in the environment. Adapting the robot is usually not a viable alternative due to the amount of processing power required to optimize a new solution for the changes in environment.
In this thesis a new solution, based on [9], is proposed as a means to adapt the robot for changes in the environment. The technique builds a feature- space using an illumination algorithm called MAP-Elites, and performs search using Intelligent Trial and Error. Using these techniques, it should be possible to adjust the features of a currently evolved robot, and adapting instead of evolving for new changes.
Contents
1 Introduction 1
1.1 Goal . . . 2
2 Background 3 2.1 Evolutionary Algorithms . . . 3
2.1.1 Terminology . . . 3
2.1.2 Process of Genetic Algorithm (GA) . . . 4
2.1.3 Generation . . . 7
2.1.4 Overfitting . . . 7
2.2 Evolutionary Robotics . . . 8
2.2.1 The Reality Gap . . . 9
2.3 MAP-Elites . . . 9
2.3.1 Illumination algorithms and optimization algorithms 11 2.4 Intelligent Trial and Error . . . 12
2.5 Curse of Dimensionality . . . 12
3 Software 15 3.1 Robdyn . . . 15
3.2 Sferes2 . . . 16
3.3 Limbo . . . 16
3.4 Required computational power . . . 16
4 Implementation 17 4.1 Robot . . . 17
4.1.1 Controller system . . . 18
4.2 Using MAP-Elites and IT&E to adapt gait patterns . . . 21
4.2.1 Genome . . . 22
4.2.2 Operators and Parameters . . . 22
4.3 IT&E - Adapting through trial-and-error . . . 23
4.3.1 Stopping criteria . . . 24
4.4 Simulator . . . 25
4.4.1 Placing obstacles . . . 25
4.4.2 Fitness-function: Measuring performance . . . 26
4.4.3 Feature-function: Defining the behavior . . . 27
4.4.4 Visualization . . . 27
4.4.5 Defining the environment . . . 27
5 Results 31
5.1 MAP-Elites archive . . . 31
5.1.1 Alpha cutoff . . . 32
5.1.2 Fitness correlation of the Y-axis . . . 33
5.1.3 Fitness correlation of the X-axis . . . 34
5.1.4 Location of fitness . . . 36
5.2 Building the MAP-Elites archive . . . 36
5.2.1 Parameters . . . 37
5.3 Adapting using IT&E . . . 39
5.3.1 How IT&E searches the archive . . . 40
5.3.2 Solutions found by IT&E . . . 42
5.3.3 Experiment 1 - IT&E Adaptation vs. Best Individual . 47 5.3.4 Experiment 2 - IT&E Adaptation vs. Best Individual trained with obstacles . . . 49
6 Conclusion 53 6.1 Discussion . . . 53
6.1.1 Performance of the MAP-Elites archive . . . 53
6.1.2 Performance of the IT&E-optimization . . . 53
6.1.3 MAP-Elites and IT&E vs. Traditional optimization . . 54
6.1.4 Future Work . . . 56
6.2 Conclusion . . . 57
6.3 Running the software . . . 60
List of Figures
2.1 Flowchart of GA . . . 4
2.2 Pseudocode of the general procedure of evolutionary algo- rithm[14] . . . 4
2.3 Plot of Equation 2.1 . . . 5
2.4 Illustration of a simple single-point crossover . . . 6
2.5 Example of what overfitting might look like in statistical classification . . . 8
2.6 Pseudocode of a simple MAP-Elites algorithm[24] . . . 10
4.1 Composing of physical robot . . . 18
4.2 Photo of physical robot . . . 19
4.3 Schematic of physical robot . . . 20
4.4 Lower- and upper-bands of parameters . . . 20
4.5 Graph of controller-function . . . 21
4.6 Genome . . . 23
4.7 Simplified flowchart of IT&E . . . 24
4.8 Simulation with 0_150_15 configuration . . . 25
4.9 Simulation with 0_0_0 configuration . . . 26
4.10 Testing archive . . . 29
4.11 Testing phenotypes from archive in simulation after running archived value for two runs . . . 29
4.12 Comparison of adding influence of randomness into the sampling . . . 30
5.1 Archive at generation 2300 . . . 31
5.2 Position of test points . . . 32
5.3 Areas of high fitness . . . 33
5.4 Phenotype(12, 32)exploiting the back legs to lift the front legs . . . 35
5.5 MAP-Elites archive in early generations . . . 36
5.6 MAP-Elites archive through generations . . . 37
5.7 Best fitness per generation . . . 38
5.8 Mean of archive per generation . . . 38
5.9 Phenotypes in archive, per generation . . . 39
5.10 gatest.cpp - The parameters . . . 40
5.11 IT&E searching for the best solution, see Table 6.3 for data . 41 5.12 Scatterplot of solutions from two scenarios . . . 46
5.13 Scatterplot of solutions from two scenarios . . . 47
5.14 Plot of incline and decline from 0.19 to 0.21 and -0.19 to -0.21 With forced stopping criteria of MaxIterations set to 20 . . . 48 5.15 IT&E vs. Best performance individuals . . . 51 6.1 Mean fitness of ELITE/ELITE_O vs. IT&E . . . 58
Preface
Acknowledgments
I would like to thank all the people that helped me, Matthew Greening for helping me proofread the thesis, and a special thanks to my supervisor Kyrre H. Glette for great support and guidance.
Chapter 1
Introduction
When using robots for everyday tasks, it is important for them to be able to adapt to unpedictable situations in the chaotic real world. One of the major drawbacks of Evolutionary Robotics (ER) is that they often involve expensive computation that may be sensitive to changes in the environment they are evolved in. Using what is referred to as optimization algorithms, they typically optimize for a single environment with a single solution to perform well in the given scenario. In [24] Mouret et al.
proposes a different approach to evolving robots, which is referred to as illumination algorithms, that illuminates performance rather than just optimizing for it. The general idea is to map out a feature-space where each best performing "elite" is mapped onto a multi-dimensional matrix according to the characteristics of how the robot performs as well as a performance-measure which can tell the user how well a solution with certain characteristics can become.
The algorithm used for this application, called MAP-Elites, is used to create such a feature-space where each adjustment of features have a dif- ferently evolved solution. This feature-space, mapor archive, can then be used to search for a solution with different characteristics defined by the user, such as height of lift in the legs or time the legs spend in contact with the ground. The user can then adjust these features to adapt or change. This approach is an alternative to evolving new strategies every time the robot encounters a different challenge. With something as simple as adjusting some features, the robot may overcome the new challenge by only adjust- ing the features of an already evolved solution. Using the IT&E-algorithm1 created inRobots can adapt like animals[9], it is possible to search this MAP- Elites archive in a fast and efficient way using simple trial-and-error to look in areas of the feature-space (archive) to narrow in on optimal solutions.
This thesis takes the method proposed in [9] to see if it is similarity viable for other applications. Cully et al. found very good results in their imple- mentation of intelligent adaptation, where the robot will adapt new gaits according to the damage or disability of a joint. This application uses the same idea, only applied to a slightly different area. This may prove that the technique is flexible and applicable to a broader area in Evolutionary
1Intelligent Trial and Error
Robotics. All of this while being just as effective and robust as when adapt- ing for broken or disabled legs.
An important aspect of the technique of adapting the gait pattern is that, by using a pre-computed archive, the computation of the archive is separated from the search for solutions and away from the application on the robot. This means that the archive can be developed on powerful computers, then loaded onto a smaller embedded system, where the relatively lightweight IT&E-algorithm can find the optimal solution in a short time. This will not only make the systems faster, but also make a physical robot cheaper, lighter and more flexible, as the need for large batteries and powerful on-board processing-power dissipates. This will make the robots a generally better solution, but also make the robots attractive for commercial uses where cost and ease of use is important.
Instead of requiring the user to do massive computations for the robot to learn to walk, they can instead have a low-cost system where the robot will download a pre-computed archive from a central server that does all the heavy processing required by evolutionary algorithms.
1.1 Goal
The goal of this thesis is to prove that MAP-Elites and IT&E (Intelligent Trial and Error) can be applied to other areas of Evolutionary Robotics, as well as perform adaption of gait patterns wile the environment changes in terms of incline/decline of the terrain, and obstacle shapes and sizes. The idea is that the same MAP-Elites archive, which defines a set of features, should provide a solution where it can adjust the features, like leg lift and leg sweeping2, to maximize the performance in everything from flat ground to rocky terrain with a steep incline.
2The movement of the legs back and forwards
Chapter 2
Background
2.1 Evolutionary Algorithms
Evolutionary Algorithms (EAs) is a type of population-based metaheuristic [6] optimization algorithm that uses the mechanics of biological evolution as inspiration. Evolutionary algorithms are a subset of evolutionary com- putation [1], which contains classes such as, genetic algorithms, genetic programming and evolutionary strategies [2]. This research will only take genetic algorithms into consideration.
The idea of evolutionary algorithms stems from the natural evolution, where the species that make up the world have adapted strength and fit- ness against the environment to solve complex issues by the natural pro- cess of selection. Evolutionary algorithms try to capture this strength of evolution as a way of problem solving, which is by using trial-and-error, encouraging the spread of good individuals and eliminating the weak in- dividuals that doesn’t offer a good solution.
2.1.1 Terminology
As evolutionary algorithms is inspired by biological evolution, the algo- rithms share the same terminology. In this thesis we will use many of these terminologies to describe different parts of the algorithms, as well as the different results.
Evolutionary algorithms can be summed up as an environment with a pop- ulation of individuals that strive for survival and reproduction, where fac- tors are determined by the fitness of each individual. Fitness is measured by a fitness-function which evaluates how well the individual can perform the particular task that is optimized for. This fitness represents the chance of survival and reproduction for the individual, which implies that for each generation of the population, the strong traits will remain, while the weak traits will dissipate in a metaphoricalsurvival of the fittest.
The inspiration from biology also carries the terminology over to EA, mean- ing that the metaphorical representation in the EA uses the same name to describe the process as used in biological evolution. The genetic algorithm
uses the name individual to represent a solution that is encoded by the genome, or chromosome. Genetic Algorithms (GA) takes these individuals through the process of evolution.
2.1.2 Process of Genetic Algorithm (GA)
Figure 2.1: Flowchart of GA
1 BEGIN
2 INITIALISE p o p u l a t i o n with random c a n d i d a t e s o l u t i o n s ; 3 EVALUATE each c a n d i d a t e ;
4 REPEAT UNTIL (TERMINATION CONDITION i s s a t i s f i e d ) DO 5 1 SELECT p a r e n t s ;
6 2 RECOMBINE p a i r s o f p a r e n t s ; 7 3 MUTATE t h e r e s u l t i n g o f f s p r i n g ; 8 4 EVALUATE new c a n d i d a t e s ;
9 5 SELECT i n d i v i d u a l s f o r t h e nex t g e n e r a t i o n ;
10 OD
11 END
Figure 2.2: Pseudocode of the general procedure of evolutionary algo- rithm[14]
Evolutionary algorithms and all the underlying classes share the idea behind the process of natural selection, or survival of the fittest. The process is based on a given population of individuals that struggles for the limited resources in the given environment. In order to survive and spread their genomes, the individuals will compete for survival and reproduction.
Genetic algorithm (GA) is a class under evolutionary algorithm that uses this principle to optimize for a solution. Figure 2.1 shows the process of
evolution in such a genetic algorithm. The algorithm starts of by initializing a random population that goes through the steps of evaluation, selection, crossover and mutation, which will give offsprings for a new generation, where the process repeats until it is terminated by a termination criteria, or the user.
Evaluation
Evaluation is what is referred to as a fitness-function. This is the requirement of survival for the individuals, which is the measure of how well an individual performs and what improvements means. When performing an evaluation, the individual is asked to perform a test; the results from which determines the fitness of this individual, or how strong it is to do the task that the user wants to optimize for.
One example of an evaluation is to maximize the fitness function 2.1 so that the function outputs the highest result with a genome consisting of the valuex. If for instance the genome of an individual is11011the value that is inserted into the fitness function would be decoded (11011 → 27) evaluating the fitness function as 100− 271002 =92.71, which is a fairly good individual. On the other hand an individual with a genome of, for instance, 1011001 will be decoded (1011001 → 89), and evaluated in the fitness function as 100− 891002 = 20.79, meaning that the fitness of this individual is lower than the previous. This is because the values of the function desires genomes representing values as close to 0 as possible, which can be seen in Figure 2.3.
In the case of the fitness function 2.2 the fitness of the individual will be in correlation with the value of the decoded genome. This squaring of values will cause the evaluation to prefer as large values ofxas possible.
Figure 2.3: Plot of Equation 2.1
f(x) =100− x
2
100 (2.1)
f(x) =x2 (2.2)
Selection
Parent selection, mate selection or survivor selection are methods of preferring better individuals with higher fitness values to reproduce and keeping the population at a predetermined population. The process of selection can be done in many ways, where the goal is usually selecting the individuals that performs well, like in nature where the stronger individuals with higher fitness have a bigger advantage when it comes to reproduction and survival. Typical selection strategies may be tournament, truncation, linear ranking and exponential ranking. See [4]
for a comparison of selection schemes.
Crossover
Crossover is the first step of reproduction, in where the selected parents mixes their genes into what is called an offspring, which will belong to the new generation. There are different methods of crossover that can be done, such as single-point crossover (see Figure 2.4), double-point crossover and Partially Matched Crossover (PMX) for permutation representation.
See [14] for details about crossover strategies. The job of the crossover- procedure is to create the new generation of individuals, which may bring the good traits from the parents over to the next generation. This crossover is a metaphorical way of inheriting traits from the parents.
Figure 2.4: Illustration of a simple single-point crossover
Mutation
After the reproduction-procedure has applied a crossover to create an offspring, the offspring will be mutated in order to introduce some element of randomness into the new generation, create diversity and help the population escape a local maxima. The individuals are optimizing towards a peak performance that only increases in the local area, and is not searching for a new and better maxima.
Mutation can be a simple procedure of flipping some bits in a binary genome, or it can change one or more values in the genome into another value. Mutation is trying to mimic the biological mutation that happens in nature, where the phenotypes gain some unique traits in the genome.
This is great for evolution where new traits are explored, and could either be good for the individual or prove fatal. Similarity in genetic algorithms, mutation will introduce new information into the population, whether this is good or bad for the individuals.
2.1.3 Generation
When the Evolutionary Algorithm has gone through one iteration of the evolutionary process of evaluation, selection, crossover and mutation, it is referred to as one generation of the population. The next time the Evolutionary Algorithm starts a new iteration, the population will have the offspring produced from the individuals from the previous generation.
As the parents gets older or inferior, they will naturally die off for the new generations to replace them.
2.1.4 Overfitting
Overfitting is when a model describing the pattern of data also describes noise and variance. If the model is too specific to the exact training data, then the sensitivity to changes can lead to loss of accuracy on out-of-sample data. [5]
In order to prevent overfitting the model needs to be somewhat generic, which can be achieved by simply training the model less, or design an environment where it is difficult to let the genetic algorithm overfit to a specific niche.
In the application of adapting a gait pattern, overfitting can for instance be caused by the gait being specifically suited for obstacles placed in some configuration. Instead of evolving robustness of obstacles, the robot is evolving the perfect leg-placement instead. By doing something as simple as randomizing the placement of obstacles, overfitting for obstacles can be prevented. Another approach to prevent overfitting is called early stopping [27], where the training is stopped when the out-of-sample error increases.
Figure 2.5: Example of what overfitting might look like in statistical classification
2.2 Evolutionary Robotics
Evolutionary Robotics (ER) is a relatively unexplored field of evolution- ary computation. Only in more recent years has it started to become more mainstream in the field of robotics and artificial intelligence. The concept of ER is to create autonomous robots using the techniques of Evolutionary Algorithms, where the robots develop their own skills to adapt and maxi- mize performance in a given environment when performing some kind of task. Evolutionary Robotics typically use techniques such as Neural Net- works and Genetic Algorithms. [26]
Evolutionary Robotics are rapidly gaining popularity due to their indepen- dence from human influence, where the optimization for a given task may be defined very abstractly and unspecific, enabling the robot to find solu- tions that may be untraditional and unintuitive for a human. These unin- tuitive solutions are often based on exploitation of the environment.
Although Evolutionary Algorithms are often very robust against local max- ima in applications such as ER, the solutions found are rarely the optimal one due to the large and complex search space. ER works in a similar man- ner to the other types of Evolutionary Algorithms. The user creates a pop- ulation of robots that go through a process of evolution, where the strong spread their genes through reproduction, while the weak die off. Fitness
is measured in how well the robot can perform the desired task, such as walking or jumping, whether the goal is speed, robustness or both. They typically define how the robot can do a task through a genome, then de- fines the fitness-criteria which benchmarks how well the robot is doing the task.
As the trial-and-error nature of evolutionary algorithms require a large amounts of evaluations, this is unfeasible on a physical robot because of the time and effort required. This is why ER is typically done in a simula- tion, and transferred over to the real world when a solution is found. [7]
However this transfer proposes a new set of challenges, which is referred to asThe Reality Gap.
2.2.1 The Reality Gap
When working with Evolutionary Robotics, training the robot in a simula- tor is often the only viable alternative, as doing real world evaluations can take too much time and effort. The issue that arises from doing evaluations in a simulation is the disparity between the behavior in the simulation and the behavior in the real world. Simulated environments often have unre- alistic aspects, inaccuracies and weaknesses that can be exploited by the robot. Since the robot will always try to maximize the fitness, one such exploitation will lead the robot into a deceptional performance, where the algorithm is optimizing for a behavior that is not possible in reality.
This issue of discrepancy between simulation and reality is what is referred to asThe Reality Gap. [17–19, 23]
Koos et al. in [19] proposes a solution, calledThe Transferability Approach where two main objectives are optimized via a multi-objective, pareto front MOEA [10] that evolves what is referred to as a transferability-measure, as well as the fitness-measure of the robot. Previous approaches to the re- ality gap has been reality-based optimization approaches with optimiza- tion fully or partially on a real robot, simulation-based optimization with the entire process in simulation, or robot-in-the-loop simulation-based op- timization that evolves in the simulation and does transfer experimentation during the evolution. [19]
2.3 MAP-Elites
MAP-Elites is a technique described in [24]. This type of algorithm is heavily based on genetic algorithms (GAs) [2], which is used as a part of Intelligent Trial and Error (IT&E) as described in [9]. The MAP-Elites is what was coined by Mouret et al. as an illumination algorithm, which differs from traditional optimization algorithms, such as the NSGA-II [13], a standard Multi-Objective Evolutionary Algorithm (MOEA), or simple single-objective evolutionary algorithms.
MAP-Elites is presented as a new illumination algorithm that searches for the highest performing performing solution at each point in the space. The search-space in itself can be high-dimensional, where as the feature-space
is in a lower dimension defining only the characteristics that is sought after.
[24]
MAP-Elites is conceptually easy. The user will design a fitness-function
1 BEGIN
2 INITIALIZE map o f e l i t e s with N−dimension 3 REPEAT FOR ( I i t e r a t i o n s ) DO
4 I F i t e r < G THEN
5 x ’← r a n d o m _ s e l e c t i o n ( )
6 ELSE
7 x← r a n d o m _ s e l e c t i o n ( X )
8 x ’← random_variation ( x )
9 b ’← f e a t u r e _ d e s c r i t o r ( x ’ )
10 p ’← performance ( x ’ )
11 I F P ( b ’ ) = NULL OR P ( b ’ ) < p ’ THEN
12 P ( b ’ ) ← p ’
13 X ( b ’ ) ← x ’
14 OD
15 RETURN f e a t u r e−performance map ( P and X )
G Number of random solutions X Map of elites x Elite from map X x’ Offspring from x p’ Performance or fitness P Performance map b’ Feature descriptor
Figure 2.6: Pseudocode of a simple MAP-Elites algorithm[24]
f(x) that evaluates the fitness of an individual x, similar to a standard evolutionary algorithm. One such fitness-function, or performance- measure, could be how fast the robot can move, or how far the robot can get in a set amount of time or iterations. The user will also define the N dimensions that will define the feature space of the archive, such as how high the robot lifts the legs, or how long each leg is in contact with the ground as used in [9]. The dimensions are chosen based on what the user requires to be defined in the fitness-performance map, as well as the computational resources required (see Section 2.5). The MAP- Elites algorithm will search for the best solution for each of the cells in the N dimensional fitness-performance map, each cell being defined by the desired resolution of the archive (e.g. 128 by 128). For instance, the MAP-Elites algorithm can search for the best individual that has the specific feature of lifting the legs and sweeping the legs a set amount, then placing them in their respective feature-describing position in the map. There are two types of spaces in the MAP-Elites algorithm, the first is the search space, where all possible values of x resides. The second space is the fitness-performance map,P, where evaluated individuals are mapped out by their feature descriptor,b0, that is describing the best phenotype that has the characteristics required for that cell in the fitness-performance map, or archive. x is described by the genome or genotype, as well as well phenotypePx. f(x)is the fitness function that describes the performance of eachx, where as a feature functionb(x)is what is describing the value in the fitness-feature space in the N-dimension. This can for instance be a calculation ofb(x)that measures how long the legs touch the ground and report back some value to describe this feature, or to run theb(x)in the real-
world to measure the power consumption of the robot. Another example could be to simply return parts of the genome as a direct correlation to the features. This is useful when the genome represents values to a controller- function (see Section 4.1.1).
Process
As mentioned, the MAP-Elites algorithm, shown in Figure 2.6 is relatively simple. MAP-Elites starts with an initialization process by generating G random genomes, and running the performance and feature evaluation on each of them. The random genomes are then mapped out in the archive in their appropriate positions (calculated by b(x)). If there are multiple individuals in the archive that have the same feature description, the one with the highest fitness-value will dominate over the other individuals competing to occupy the cell. The procedure then follow the steps of:
Choosing a random cell in the map and creating an offspring from the individual in that cell via mutation and/or crossover. Then the algorithm will evaluate the fitness and behavior of the new offspring and place it in the appropriate cell. If the cell is previously occupied, the individual with the highest performance will dominate the other one and will be the occupier of the cell. This process continues until some termination criteria, such as time, is reached. It can also terminate at a certain generation or when the resulting archive has a certain characteristics.
2.3.1 Illumination algorithms and optimization algorithms Traditional search algorithms, also known as optimization algorithms, have a goal of returning the highest solution in the search space. Illumi- nation algorithm is a terminology coined by Mouret et al. [24] as a different way of finding good solutions to a problem. Illumination algorithms finds solutions that are located in the search space, then maps out the highest performing solution by the features and characteristics of that particular solution. The user then adjusts the features of the robot by finding a so- lution in the feature-space that corresponds to the selection. Another ad- vantage of the illumination algorithm is that the fitness at each point in the feature-space can give an overview of how the fitness changes along with the features of the robot. This focus on returning the highest-performing solution at each point will find areas of high fitness,illuminatingthe fitness potential. In more recent years, there has been a change where the promot- ing of diversity has been a higher priority than the focus on maximizing performance. Some of these algorithms have been designed with the goal of returning a repertoire of solution that expands across related objectives.
[8, 20, 21, 24]
2.4 Intelligent Trial and Error
Intelligent Trial and Error (IT&E) is an algorithm that makes the robot adaptable to changes by seaching the behavior-performance map gener- ated by the MAP-Elites algorithm. IT&E will use the knowledge of the behavior-performance map, or archive, generated by MAP-Elites to search for compensatory behaviors to alternative objectives by adjusting the fea- tures of the solution. This prior knowledge can be used to select solutions in the feature-dimensions based on what is expected to perform well, and find a new solution in a very short time. Guided by this map, the IT&E- algorithm tests different behavior with different feature-characteristics, then starts predicting where the best solutions in the map are located by using the assumption that alternative objectives well suited for a new solu- tion are found by adjusting features. The users of the robot will only need to describe the dimensions of the map as features along with describing a fitness measure, through the requirements of the MAP-Elites algorithm. [9]
In [9] the IT&E-algorithm is used as a way to search in a behavioral feature- space over how the robot moves the legs, such as for how long each leg touches the ground per dimension, and with the fitness measure of how far the robot can get. If one of the legs then gets damaged, the robot can search in the behavioral feature-space to find solutions that can compensate well, slowly narrowing in on the features that describe a behavior where the use of the damaged leg is minimized. This solution is found by trial-and-error, as the name suggests.
Process
The IT&E will simply perform trial-and-error based on the confidence- value of the individuals. The archive will start out with a low overall confidence value, and the IT&E begins with the most promising value in the archive, i.e. the cell that holds the individual with the highest fitness- value. The individual is evaluated in the desired environment and is then assigned a confidence value. If it evaluates to a good performance, then the confidence is set to a high level, whereas evaluations that show poor performance will be assigned a lower level of confidence. The algorithm continues this "select-test-update process", as they call it in [9], and the process does not stop until a termination criteria is reach. Hopefully by then the IT&E has found a compensatory behavior that performs well in the environment that it tested for.
2.5 Curse of Dimensionality
Curse of Dimensionality, described in [30], is a term introduced by Bellman in [3] to describe the problem caused by the exponential increase of vol- ume in association with adding extra dimensions to Euclidean space. This means that as more dimensions are added to the data set, the more sparsely
the finite data will be distributed in the space. This has a range of implica- tions in the field of artificial intelligence, due to the difficulty of gathering sufficient data along with the required processing power to generate such data.
This relates to the MAP-Elites archive in that the resolution and dimen- sions of the archive has a serious impact on the quality, implied by the massive increase in required computational power. In the case of this ap- plication, the performance-feature map is a dimension of 128x128, meaning that it is mapping out the two features, lift and sweep, as two dimension with 128 samples each. This two-dimensional space makes it easy to visu- alize the map, as well as minimizing the required computation needed for a fully populated archive with solutions for every variations of the desired features. Just adding one more feature would increase the feature-space by 128 times, making the computation 128 times more intensive in order to provide the same density of data in the feature-space.
Reducing the feature-space has some advantages, such as reducing computational complexity, increasing density of data, and keeping it in three dimensions or less. This can make it easier to visualize, as well as help the users in understanding the data. In [9] the computation of a six- dimensional feature space required roughly two weeks on one multi-core computer, making a resolution of 128 have 1286 = 4.3980×1012 samples given that the archive is fully populated. Having one less dimension will reduce the feature-space by 1286−1285 = 4.3636×1012 which results in 1285 =3.4359×1010.
Chapter 3
Software
OpenSceneGraph 3.2.3
OpenDynamicsEngine 0.11.1 (Older version)
C++ C++11 gcc (Debian 5.3.1-14) 5.3.1 20160409
Robdyn https://github.com/resibots/robdyn (Tested on 2/5/16) Sferes2 https://github.com/sferes2/sferes2 (Tested on 2/5/16) Limbo https://github.com/resibots/cully_2015_nature (Older version)
3.1 Robdyn
Robdyn is the framework in which the simulation is built upon. This is a wrapper around the popular OpenSceneGraph1 graphics toolkit, together with the Open Dynamics Engine2 as its physics engine counterpart. This wrapper is made to make the simulation of robots a much easier task.
Robdyn is not a framework, as opposed to Sferes2 (Section 3.2) or Limbo (Section 3.3). Robdyn being a wrapper, it helps to define two components of a robot simulation: First it is the robot, where a robot is built using primitive geometric shapes and joints with the properties of real servos, which in this case is made to mimic the Dynamixel AX-123. The second component of the simulation is the environment. This is the virtual world where the defined robot resides. The environment can be tweaked by parameters, such as adding geometric shapes as obstacles, creating a sloped plane, or changing the gravity. The environment is defined and created with the configuration of the robot inserted into the simulator. It is then run for a set amount of steps until it exits.
Robdyn is developed by the Resibots Team as a part of their research on evolutionary robotics. This is the same simulator that is used in for instance [9].
1Homepage OpenScenegraph: openscenegraph.org
2Homepage Open Dynamics Engine: ode-wiki.org
3Dynamixel Manual: http://support.robotis.com/en/product/dynamixel/ax_series/dxl_ax_actuator.htm
3.2 Sferes2
Sferes2 is a high-performance C++ framework for evolutionary computa- tion, and is the back-end for the implementation of MAP-Elites [24]. The Sferes2-framework is the component that handles the evolutionary compu- tation. The user defines the algorithm in a template, then runs it through Sferes2, or uses one of the many supplemented algorithms such as NSGA-II [13],e-MOEA [12] or CMA-ES [16]. Sferes2 is divided into three part: The framework, the optional modules, and the user experiments. The frame- work provides the binary string genome, as well as a real number genome with the supplemented operators: Gaussian mutation, Uniform mutation, Polynomial mutation [11], and SBX4crossover [11]. The user is free to de- fine any additional modules for Sferes2. [25]
3.3 Limbo
Limbo is designed as a lightweight framework for Bayesian optimization and is mainly created with the purpose for research with novel algorithms.
Built upon the same modularity and flexibility of Sferes2, it can either be used with the supplemented modules, or the user can define their own. In the case of IT&E it is added to the Limbo-framework as an external module, although Limbo was written by the same authors. [22]
3.4 Required computational power
Performance in terms of computational speed is not what makes MAP- Elites a better alternative, being that the initial computation is equally demanding as traditional optimization algorithm. Training an archive for 2300 generations, such as the one used in Section 5.1, will require 2-3 days of computation on a standard desktop computer, and computation on a two core server will produce approximately 350 generations in 24 hours. This is with a configuration of 300 individuals and a step increment of 0.006 (See Section 3.1).
Doing these kinds of calculations on-board a robot, or during operation, is regarded as infeasible. This is where the strength of archiving comes in, where the computation can be done once on an external computer, such as a computing cluster.
4Simulated binary crossover
Chapter 4
Implementation
Based upon the work of Mouret et al. in [24] and Cully et al. in [9]i, the implementation will use these algorithms in a different area in order to see if MAP-Elites (Section 2.3) and IT&E (Section 2.4) perform just as well in the application of adapting gait patterns.
In this implementation of MAP-Elites and IT&E the approach is different in terms of application, and implementation.Robots can adapt like animalsuses a six-dimensional space where each dimension represents the proportion of time theith leg spends time in contact with the surface, which is useful for an application where the goal is to adapt to disabled legs, where the IT&E-algorithm can search in solutions that offer varying degrees of leg- contact with the ground. [9]
As mentioned in Section 2.4, the usage of these techniques depends on describing the features that are desired in searching for behavior.
This thesis takes upon the challenge of adapting a gait pattern using the techniques described in Chapter 2 which involves some changes, as described in Section 4.2
4.1 Robot
The robot used in this implementation is based on a robot developed using evolutionary algorithms itself, as described in [29]. The robot is being built in the simulator using simple geometric shapes, that are linked together with joints called motors, or servos. Figure 4.1 illustrates each individual component of the robot.
The head is what is called the main body. When running the simulation the graphical interface will create a tracer from that body, as illustrated by the red dot in Figure 4.1. The reason why the head is the main body is mainly due to simplifications in the design, as the head is the first part that is created, and the robot is designed around this part. The main body is only used as a reference of the position of the robot in the virtual space.
Looking at 4.1 the joints are numbered in the order of creation when running the build-process of the robot. The robot has four legs with two joints each. The upper joints (1, 3, 5, 7) have a sweeping-movement,
meaning that the servos on those joints move in a plane horizontal to the ground. This movement is referred to as sweep. The lower joints (2, 4, 6, 8) have a dihedral movement, which means that the robot lifts the joints from the ground. This is referred to as lift. The middle joints, between the head and mid, and mid to tail, do a sweeping-motion like the upper joints of the legs.
Some simplifications are done in the design, such as using rectangles instead of capped cylinders for the body parts. This should not have an impact on the functionality of the robot, but does make it much easier to create the robot.
The robot, also referred to asrobot 4in [29], has the measurements described in Figure 4.3, which is replicated in the simulation. The robot has some rotated joints, meaning that the legs are angled forwards on the front legs, with φ indicating the dihedral rotation, and ψ indicating the horizontal rotation around the Y-axis in a euclidean space, where Y-axis is the upwards direction relative to the ground. The back legs of the robot are slightly less rotated.
The physical robot (Figure 4.2) is not used in this implementation, but the physical robot is still used as a reference for a well-performing robot.
Future work may include a transfer to the real world on the real robot (See Section 6.1.4)
Modified image from Real-World Reproduction of Evolved Robot Morphologies:
Automated Categorization and Evaluation by Samuelsen et. al[29]
Figure 4.1: Composing of physical robot
4.1.1 Controller system
One of the key points of having the robot fully adapt is to let the robot develop the open loop gait pattern itself, as opposed to predefining and parameterizing an already working gait pattern. There are two reason for this. The first reason is because different behaviors, such as high lift, may
From Real-World Reproduction of Evolved Robot Morphologies: Automated Categorization and Evaluation by Samuelsen et al.[29]
Figure 4.2: Photo of physical robot
require a different style of walking to keep the performance at a reasonable level, and balance is key for a robot well suited to difficult terrain. The second reason for fully adapting the gait pattern is so that the MAP-Elites algorithm can be benchmarked as an alternative to optimization algorithms such as NSGA-II [13].
The robot, as described in Section 4.1, has a total of 10 joints with two for each leg and two in the body. These joints are servos that are controlled by Equation 4.1. This control-system is based on the system used in [15], us- ing a similar controller-function to parameterize the open-loop controller.
The controller-function is modified for an additional parameter f, making the robot capable of adjusting the oscillation-speed of a joint in order to in- crease the stability in demanding situations such as extreme lift in the legs, or climbing a steep incline. The controller function takes in the parameters α,θ,βand f, which are defined by the genome (see Section 4.2.1).αdefines the amplitude of the phase, meaning that a higher value of alpha will lead to more movement in the joint.
θis the offset of the phase, meaning thatθ-value of 0.5 will offset the phase by half a period, which leads to the phase being opposite of a phase having θ-value of 0 or 1. Theθ-parameter can give asynchronous joints in the open loop control-system.
βis the offset of the angle, meaning that the value will move the position where the phase will operate between.β-value of 10 can for instance make the servo phase between 30◦and−10◦, whereas withβ-value of 0, the servo would phase between 20◦and−20◦.
The last parameter is f, which controls the frequency of the phasing. Lower values of f results in slower movement of the joint, where as a high fre- quency will cause the joint to move faster. It is important to note that the
From Real-World Reproduction of Evolved Robot Morphologies: Automated Categorization and Evaluation by Samuelsen et al.[29]
Figure 4.3: Schematic of physical robot
frequency is limited by the speed of the AX-12 servos, in the real world as well as the simulated ones in Robdyn. This means that the phase will usu- ally has a limited reaction, especially with a largeα-value. This results in the delay smoothing out the angle, making the phase more sinusoidal than what the controller-function actually equates to.
The search-space of the MAP-Elites algorithm can be reduced by introduc- ing symmetry to the legs. (see Section 4.2.1)
The parameters of the robot are defined within bands. These bands are fairly arbitrary in terms of value, and are adjusted to maximize mobility, whilst preventing the body pats intersecting (clipping) each other and col- liding in a real-world scenario.
Lower Upper
Amplitude/α 0 40
Phase-offset/θ 0 1
Angle-offset/β -20 20
Frequency/f 0 2
Spread/s 0 40
Figure 4.4: Lower- and upper-bands of parameters
cα,θ,β,f(t) =αtanh(4sin(fπ×t+θ)) +β (4.1) Alpha cutoff threshold
Alpha cutoff threshold is what is often called a dead zone, meaning that values below a certain threshold are ignored. In order to make the robot
Figure 4.5: Graph of controller-function
capable of disabling joints, the controller-function has what is called an alpha cutoff threshold, meaning that α-values below the value of 5.0 will be interpreted as the joint being fully disabled and not moving during the loop. The first reason for this is give the robot the advantage of controlling this feature, where it may find an advantage in disabling some of the joint.
Another reason is due to very low α-values causing needless twitching, resulting in additional wear on real-world servos, as well as reducing the stability of the robot in general.
4.2 Using MAP-Elites and IT&E to adapt gait patterns
In [9], IT&E is used as a way of adapting to damage. This is done by searching in a behavioral-performance map that describe the solutions dimensions represents how long theith leg is in contact with the surface. In order to make the robot capable of adapting and traversing difficult terrain with obstacles, the gait pattern needs to be adjusted in terms of how high the robot lifts the legs, along with how far it sweeps one leg in front of the other. These features are simply described as lift and sweep. Having only two features means low computational complexity, and allows the fitness-performance map to be visualized in two dimensions. Adding more complexity makes it much more difficult to calculate a good result, making it more difficult to visualize and understand the generated archive from the MAP-Elites algorithm.
4.2.1 Genome
The controller-function parameters (Section 4.1.1) for each of the joints are represented in the genome, and this is what makes up a genotype for the robot. The genome is made up of 20 real-number values from 0 to 1, in the EvoFloat1-representation for Sferes2, which is a genome using 32-bit floating point values. The robot evolves by adjusting the parameters as ratios between the lower- and upper-band of the robot (see Figure 4.4), with a percentage from 0% to 100% represented by the real number values 0 to 1.
The genome is simplified in order to reduce the search space of the robot.
By making the assumption that stable gait patterns have the legs move in symmetry, the genome can be reduced by 40%, removing the need for the genome to define 4 additional joints. Instead the right legs are made identical to the left legs, only at the opposite phase, reducing the joints controlled from 10 to 6 along with the parameters from 10×3 = 30 to 6×3=18, which is a 40% reduction.
The 20 values of the genome consist of 18 joint-specific parameters, three parameters per servo, for the six controlled joints. This is listed in Figure 4.6. The two last values in the genome are referred to as global parameters, which are parameters applied to all the joints. Frequency (f) is a global value in order to reduce the search space, meaning that there is a trade- off in controlling the whole robot as means to increase stability, instead of increasing the genome from 20 to 26 (23.08% increase) by adjusting each of the joints individually. The second global parameter is referred to as spread (s). This is also the result of a simplification, where the value will adjust how far the legs are spread from each other, as opposed to letting the robot adjust this with the controller-function. This is a simplification both in terms of making the behavior more defined, as well as making it easier to define the upper- and lower-band of the joints. Not having the spread of the legs limit the boundaries that the controller can operate within, making the Spread-value independent from the upper- and lower-band.
4.2.2 Operators and Parameters
MAP-Elites implements many of the same operators and parameters as regular genetic algorithms. This application uses the same crossover-, mutation- and representation-type as [9].
Crossover-type: SBX [11]
Mutation-type: Polynomial [11]
Representation: Real numbers
Crossover-rate: 0.25 Mutation-rate: 0.1
η-m: 15 η-c: 10
Population size: 300 Parameter range: 0→1.0
1https://github.com/sferes2/sferes2/wiki/Reference-manual#evofloat
H: Head-joint
1: Left/right front-upper joint 3: Left/right rear-upper joint
T: Tail-joint
2: Left/right front-lower joint 4: Left/right rear-lower joint
G: Global
H T 1 2 3 4 G
α 0 3 6 9 12 15 S/18 θ 1 4 7 10 13 16 f/19 β 2 5 8 11 14 17
α(alpha): Amplitude of phase θ(theta): Offset of phase β(beta): Offset of angle S: Spread of legs
f: Frequency
Figure 4.6: The Genotype
SBX - Simulated Binary Crossover and Polynomial Mutation
Simulated Binary Crossover (SBX) was proposed in [11]. This crossover- type was designed with respect to the one-point crossover properties in binary-coded GA, where the average of the decoded parameter values stay the same before and after the operation. The SBX takes one parameterη- c, which defines the probability for creating solutions that are very similar to the parents, while small values allow for more distant solutions. The polynomial mutation is coupled with SBX. The polynomial mutation takes an index parameter η-m that controls the polynomial distribution of the mutation.
4.3 IT&E - Adapting through trial-and-error
The IT&E will read in an archive-file that is generated from the MAP-Elites algorithm. This archive-file is used to reconstruct the archive from the Sferes2 MAP-Elites, to the Limbo framework. The archive-file is a clear text.dat-file that is one line per cell, with 23 values per line, separated by spaces. The first value is the index of the file, which is discarded in this implementation. This is followed by two feature-descriptors defining the position in the archive, one for lift-values and one of sweep-values. The last section of the line is the genome that consists of 20 elements.
With the archive being loaded in the Limbo framework, the process of IT&E can start. The IT&E is a module written for Limbo, and the internal functionality is isolated from the user. The user will have to implement the evaluation-function, however. This evaluates the performance of the candidate that is being tested. In this implementation, the fitness-
evaluation is similar to the fitness-function used in the MAP-Elites algorithm, where the genome is simulated in Robdyn, and the performance is measured in how far the robot can get in a limited amount of iterations.
4.3.1 Stopping criteria
A stopping criteria will manage the trial and error by telling when the algorithm has found a satisfactory result, or has exhausted the predetermined amounts of tries that is allowed. The stopping criteria used in this application are the same ones used in the implementation of [9], namely the MaxPredictedValue- and the MaxIterations-criteria.
Figure 4.7: Simplified flowchart of IT&E
MaxPredictedValue
This criterion is defined as finding satisfactory results. In order to invoke the MaxPredictedValue, the evaluated performance - the perceived value - must be within a range of the estimated value. This is what is referred to as
the archived value, the fitness-performance that was archived by the MAP- Elites algorithm. The MaxPredictedValue operates with the parameter calledratio, that is set to 0.9 in the results presented in Chapter 5. This ratio is the tolerance of how well the newly adapted value has to perform in comparison to the best observed value in the archived solutions, checking if the best observed value is greater than f itnessvalue×ratio.
MaxIterations
The second criterion is the MaxIterations-condition, which is a limit of the amount of trials that the IT&E can perform before it has exhausted the search for a solution that reaches the MaxPredictedValue-criterion. If then- iteration parameter of the MaxIteration is set to 20, as done in Chapter 5, the IT&E will iterate over the procedure 20 times. If the MaxPredictedValue- criterion is not reached within those 20 trials, the IT&E-procedure will end, returning the best observed value. Typically 20 iterations should be enough explore key areas of the feature-space and find a decent result.
4.4 Simulator
4.4.1 Placing obstacles
Obstacles are used in the simulation to both encourage more robust gait patterns, as well as creating different scenarios in which the adaptation- capabilities can be tested. By defining the two parameters size and amount in the simulation, it should be possible to adjust the difficulty of the environment. The obstacles are flat blocks, which is a simplification done
Figure 4.8: Simulation with 0_150_15 configuration
to make it easier to define in a simulator, as well as making it easier to replicate in the real-world.
The blocks are being placed out on in a small area of the environment, with
Figure 4.9: Simulation with 0_0_0 configuration
a two-dimensional Gaussian distribution in order to make the obstacles increment in size and quantity as the robot moves forward, and in this way make it possible for the weaker individuals in the evolutionary algorithm to gain some fitness, and make it more challenging as the fitness increases and the robot manages to move further ahead.
/ / 2D g a u s s i a n
f l o a t b s i z e = a * exp ( −( (pow ( ( x−xc ) , 2 ) / ( 2 *pow( s , 2 ) ) ) + (pow ( ( y−yc ) , 2 ) / ( 2 *pow( s , 2 ) ) ) ) ) ; }
ode : : O b j e c t : : p t r _ t b
(new ode : : Box ( * env , Eigen : : Vector3d ( x , y , bs iz e /2 + ( tan ( t i l t )*−x ) ) , 1 0 , b s i z e * 4 , b si ze * 4 , b si ze ) ) ;
b−>s e t _ r o t a t i o n ( 0 . 0 f , −t i l t , 0 . 0 f ) ;
4.4.2 Fitness-function: Measuring performance
Performance is measured through the Robdyn-wrapper by using the head of the robot, described in Section 4.1. The simulator will constantly report the position of the robot as it moves in the environment. By measuring how far the robot has managed to move in the forwards direction, it can be used as a simple and robust fitness-measure, where fitter individuals are capable of moving further ahead in a limited amount of iterations. When measuring performance, the measuring is done within a set amount of iterations. The reason why performance is measured in terms of how far the robot gets in a set amount of iterations instead of time, is due to time not being constant in the simulation. The step increments discussed in Section 4.4.5 are used both to adjust the simulation speed as well as the quality of
the physics simulation, making the the adjustment of speed nonviable due to the altering of the results. By using a set amount of iterations instead, the measurements will be unaffected by the speed the simulation is run at, and each step increment will always count up to the limit of iterations, whether it is a fast simulation with coarse increments of the physics, or a slow and accurate simulation with small increments. The value of fitness measured by the fitness-function is used arbitrarily and only in reference to other fitness-values, which means that there isn’t any real world analogy for how well the robot performs, or any benchmark of speed. The fitness- function in this implementation can be summarized by measuring how far the robot gets in a set amount of iterations.
4.4.3 Feature-function: Defining the behavior
The feature-function in this implementation is very simple. The feature- function is an average α (amplitude) of all the lifting joints as the first feature, and the sweeping joints as a second feature. Since α is defined as a parameter in the genome (see Section 4.2.1), the feature-function is simply taking the four values from the genome, calculating the average, and applying it as the definition of the behavior.
/ / g a t e s t . cpp
s t d : : v e c t o r <f l o a t> data ;
data . push_back ( ( ind . gen ( ) . data ( 6 ) + ind . gen ( ) . data ( 1 2 ) ) / 2 ) ; data . push_back ( ( ind . gen ( ) . data ( 9 ) + ind . gen ( ) . data ( 1 5 ) ) / 2 ) ;
Note the values 6, 12, 9 and 15, which can then be seen in Figure 4.6 as theα of joint 1 and 3 (front and rear upper joint), and joint 2 and 4 (front and rear lower joint), which translates to the sweep- and lift-motion of the robot.
4.4.4 Visualization
The simulation has the graphical visualization as an option in order to reduce unnecessary computation when training the archive, done in Section 5.1. Due to the graphical visualization requiring additional processing power, as well as tying the simulation speed to the frame rate of the rendering. The graphical visualization is disabled for building the MAP-Elites archive, and is only applied when necessary, such as when inspecting a phenotype or alternatively when wanting to inspect the IT&E- optimization,
4.4.5 Defining the environment
In running a simulation, the following items need to be defined:
• Genotype/Genome
• Tilt of ground
• Count of obstacles
• Size of obstacles
• Step increments
The first parameter that needs to be input into the simulation is the genotype that makes up the phenotype being created and simulated. This is an array of floating point values, which defines the parameters to controller functions for each joint. This parameter is described in more detail in Section 4.1.1. Tilt is the parameter that defines the slope of the terrain, which rotates the ground plane on which the simulated robot stands upon. Size and count defines the roughness of the terrain in which the difficulty of the simulation can easily be adjusted by increasing or decreasing these values. Step increments is a technical parameter, and adjusts the resolution of the simulation, or the physics engine to be specific.
A value of 0.008 will simulate the robot twice as fast as 0.004, but will lack the precision and stability of a lower step increment value. This value is to compensate for accuracy and computational intensity. Increasing the step increments from 0.004 to 0.008 can cut training-time in half, which is very significant if trying to compute a multi-dimensional MAP-Elites archive, where the Curse of Dimensionality (Section 2.5) is applicable.
Multiple passes
One issue with doing evolutionary algorithms in a simulation is that the robot may exploit the weaknesses of the simulation in order to maximize fitness. Such an issue was discovered in this implementation of MAP-Elites, where the individuals in the archive had a measured fitness-performance that was not possible to replicate. This is evident in Figure 4.10 wherearchiveis the measured performance upon creation, and perceivedis the effort to replicate the fitness in an identical simulation. As Figure 4.10 show there is a large degree of falseness/inaccuracies to the fitness of the individuals, due to the archive being completely misguided by the few individuals that managed to exploit the simulation.
Unfortunately the cause of this exploitation was not found, due to it being relatively rare in the millions of evaluations done during the construction of the archive. The temporary solution for the problem was to perform the evaluation in two passes with a small element of randomness involved, then take the result with the lowest fitness. By doing this, the chances of such an exploitation is virtually non-existent. This, however, at a cost of essentially making the evaluation twice as computationally expensive. The clear improvement can be seen in Figure 4.11. Perceived value is when the archived solution is evaluated in the simulation to replicate the same fitness-value.
Influence of randomness
In order to create unique sampling of test-data for the results, an influence of randomness is added to the evaluations. The randomness is created by slightly tilting the terrain by a random factor. This small change creates
Phenotype (64,32) (64,64) (64,96) (63, 32) (63,64) (63,96) Archive: 2.5143 2.5732 2.3111 2.6909 2.9658 3.4922
Perceived: 0.2644 0.5886 0.3840 F F F
Delta: 2.2499 1.9846 1.9271 F F F
F: Robot flipped during simulation, and fitness was void Figure 4.10: Testing phenotypes from archive in simulation
(64,32) (64,64) (64,96) (63, 32) (63,64) (63,96) Archive: 1.0880 1.0221 0.9745 1.0327 1.1805 1.0979 Perceived: 1.1016 1.0355 1.0223 1.0492 1.1893 1.1039 Delta: -0.0136 -0.0134 -0.0478 -0.0165 -0.0088 -0.0060 Figure 4.11: Testing phenotypes from archive in simulation after running archived value for two runs
what Edward Lorenz calledThe Butterfly Effect2, where a tiny influence will have a large impact in the larger scale. Such a small and insignificant factor of tilt will make sure that every evaluation done by the simulator is unique like it would be in the chaotic real world.
2http://www.scholarpedia.org/article/Butterfly_effect
(a) ELITE before adding influence of randomness
(b) ELITE after adding influence of randomness
Figure 4.12: Comparison of adding influence of randomness into the sampling
Chapter 5
Results
4. The results are divided into two components. The first component is the MAP-Elites archive, and the second component is the IT&E- optimization being done using the same archive that was discussed as the first component. The goal of this chapter is to provide the proof required in building a conclusion to whether or not the IT&E works for this type of application, as well as helping to understand what benefits are related to this approach.
5.1 MAP-Elites archive
Figure 5.1: Archive at generation 2300
The results from the MAP-Elites algorithm, discussed in Section 2.3, will be analyzed in this Section as the first component of the results. The resulting archive is shown in Figure 5.1, whee this archive is the subject for analysis and testing. The purpose of analyzing the outcome from the MAP-
Figure 5.2: Position of test points
Elites algorithm is to better understand some aspects of the result, how well it is working, along with the strengths and weaknesses of different individuals. Using the figures of the visualized archives, it is then easy to notice the interesting aspects and features of the archive, and to analyze the particular individuals in areas of interest, to further investigate the cause of performance a The simulation is done by choosing the individual of interest, then running it in Robdyn under the same conditions that it was trained in. The analysis done in this section is done using the0_0_0 scenario (see Table 6.2), the same condition that the archive was trained under.
5.1.1 Alpha cutoff
The first interesting feature in the archive is the dark line at the bottom, where the fitness is rapidly drops to virtually zero. This is observed to be the effect from the cutoff-threshold of theα(amplitude), which is discussed in Section 4.1.1, where any alpha-values below a certain threshold results in the joint being disabled. In order to investigate this area of low fitness, and why the thresholds have such an impact on the fitness of the individual, the individual at the position(127, 0), as well as some of the neighboring individuals are observed and evaluated in the simulator. Individual(127, 0)is located in the absolute bottom-right of archive visualized in Figure 5.2, and is one of the individuals in the archive that has the lowest fitness-value.
Looking at the simulation of the poorly performing individual (127, 0), as well as the surrounding neighbors, it is clear that the lack of lift in the legs causes the robot to shuffle around on the spot, pushing itself around using the joints in the body and the upper joints of the legs. The robot
does not have a determined movement in the forwards direction, and any minor forms of fitness from these particular individuals can be disregarded as noise from the robot shuffling around and by random chance gaining fitness. The dark area at the bottom points to a feature where a joint on all four legs are disabled, due to the average of the alpha-values being below the given threshold. This disabling of these essential joints causes the robot to perform poorly.
5.1.2 Fitness correlation of the Y-axis
Watching the fitness in correlation to the Y-axis introduces a few interesting aspects of the archive. The fittest individuals do not seem to be heavily influenced by the Y-axis, and as soon as the legs are out of the alpha cutoff area, the robot immediately starts gaining performance. This evidently means that the robot is capable of developing a well-performing solution, regardless of the dihedral movement of the legs. In the area after the alpha cutoff threshold, the archive quickly start producing well performing individuals. The fitness then tapers off as we move up the archive, until it reaches another, larger, area that produces individuals with comparatively good fitness. Looking at Figure 5.3, the high fitness areas are marked. Even though there are areas where the fitness is generally higher, the fluctuations within those areas are still very high within the close neighbors. The localized areas of high fitness are a natural consequence of how the MAP-Elites algorithm works, and the areas may be merged together, or overtaken, by a new maxima found in later generations of the archive.
Figure 5.3: Areas of high fitness
Analysis 1
In order to observe if there are any differences in terms of behavior and performance as the average amplitude of the lower joints changes, several individuals have been simulated to see how they behave compared to each other. With individual (100, 41) which is archived as one of the best performing individuals, there is a clear and well performing gait with moderate lift in the legs. As a compensation to the lower lift in the legs, this individual uses twisting of the body in order to maximize the lifting, as seen in Figure 5.4.
Analysis 2
By observing two individuals on an equal X-axis, in correlation to the area of high fitness on the Y-axis, the behavior may give some insight into why there are some areas which perform better than other. Phenotype (50, 96) and individual (100, 96) both seem to have a synchronized and well performing gait patterns that are capable of giving the robot fast movement, which translates to high fitness. The fitness does not seem to be due to the specific features of these individuals, but rather that the individuals are based on a different maxima, making the robot use a trait that gives good performance. In this analysis there are signs of individual (50, 96)substituting the lack of movement in its legs with movement in the body-joints as well, which means that the robot is twisting the head- and tail-joints as a compensation too low sweeping-motion in the upper joints of the legs.
Analysis 3
Individual(100, 125)is one example of a individual that is not perform- ing as well as the neighbors, despite being in a generally good area. This can change over the next few generations as this discrepancy may be due to the later generations not having found a replacement individual that has the features needed to occupy the cells from an older generation. Simulat- ing this individual, it is clear that there is a determined gait pattern, but that it is stumbling due to the legs not being perfectly in synchronization, and with the lifting of the legs being high, it will easily throw itself off course and diverge off to the sides, meaning that there is lost fitness.
5.1.3 Fitness correlation of the X-axis
The X-axis seems to have a much stronger correlation to fitness compared to the Y-axis of the archive. With small values in the average amplitude of upper joints, the fitness is generally very low. The robot then starts to slowly gain fitness as the average amplitude increases. Some areas, such as the high fitness area tested in Section 5.1.2 Analysis 3, will gain fitness more rapidly than areas with other values of average amplitude of the lower joints, which is what is causing the areas of high fitness illustrated in Figure 5.3. The difference in these areas of high fitness may be closely related to