
Advanced Biped Locomotion in Real/Simulated Humanoid Robots

Master Thesis

Rolando Rodas

Microengineering Section

June 24, 2011

Supervisor: Jesse van den Kieboom

Biorobotics Laboratory (BIOROB)

Prof. Auke Jan Ijspeert

EPFL

School of Engineering (STI)

Institute of Microengineering (IMT)

Abstract

The goal of this project is to study the control of biped locomotion for a humanoid robot. A CPG-based controller was designed so that the robot can walk at different speeds. The controller has to be robust against perturbations, meaning that the robot should not fall if someone pushes it a little.

To produce a more human-like gait, a toe was added to the feet of the robot. The toe was realized in two variants: with linear and with non-linear stiffness and damping profiles.

Acknowledgements

First of all, I would like to thank my supervisor Jesse van den Kieboom, who gave me plenty of support and advice. Thank you also to Prof. Auke J. Ijspeert for his suggestions and support. Finally I would like to thank Alexander Spröwitz for the suggestions he gave me while I was working on the toe.

Contents

1 Introduction

2 Central Pattern Generators for Biped Locomotion
  2.1 Arbitrary Waveform Oscillator
    2.1.1 Convergence
    2.1.2 Output Waveforms for Joints Trajectories
  2.2 Network Topology
  2.3 Driving the Joints of the Robot

3 Framework
  3.1 Simulator
    3.1.1 Starting Posture Initialization
    3.1.2 Ground Contact Model
  3.2 Optimizer
    3.2.1 Particle Swarm Optimization
    3.2.2 Parameters Encoding
    3.2.3 Simulation Parameters

4 Optimizing a Reference Controller
  4.1 Performance Evaluation
  4.2 Results
  4.3 Discussion

5 Improving the Reference Controller
  5.1 Trial One: Implicit Fitness Function
    5.1.1 Robustness Parameters
    5.1.2 Results and Discussion
  5.2 Trial Two: Staged Fitness PSO
    5.2.1 Stages Description
    5.2.2 Additional Considerations
    5.2.3 Results
  5.3 Influence of the Robustness Stages on the Controllers
    5.3.1 Influence of the Ground Clearance Stage
    5.3.2 Influence of the Torso Inclination Stage
    5.3.3 Influence of the Energy Consumption Stage
  5.4 Discussion

6 In Need for a Toe
  6.1 Motivations
  6.2 Adding a Toe to the HOAP-2 Webots Model
  6.3 Controller Optimization with Additional Toes
    6.3.1 Linear Stiffness/Damping Profiles
    6.3.2 Non-Linear Stiffness and Damping Profiles
  6.4 Discussion

7 Stability
  7.1 Stability Testing Protocol
  7.2 Part One: Inherent Stability
  7.3 Part Two: Step Perturbation Force
  7.4 Part Three: Linear Perturbation Force
  7.5 Discussion

8 Conclusion
  8.1 Discussion and Conclusion
  8.2 Future Works

Bibliography

1 Introduction

Biped locomotion is an active research topic. Several robotic platforms have been developed over the years to try to replicate human locomotion. Honda’s ASIMO is certainly one of the best-known biped robots. Other (smaller) platforms have been developed, such as the NAO from Aldebaran Robotics, the QRIO from Sony or the HOAP-2 from Fujitsu.

The usual approach to biped locomotion comes from control theory. The ZMP (zero moment point) criterion [1] is certainly one of the most widely used stabilization methods when it comes to biped locomotion. The zero moment point is the point on the ground where the resultant reaction forces act. By ensuring that the ZMP is always within the foot support area of the robot, stable locomotion is possible. In the biological world, however, central pattern generators [2] are predominant when it comes to locomotion. CPGs are neural circuits which can produce rhythmic patterns without any rhythmic input. When transferred into mathematical models, they can be used as building blocks for robotic locomotion. Using a sine-based controller, van den Kieboom [3] successfully generated a controller for the HOAP-2 robot. However, compared to the trajectories observed in the human gait, it is evident that the sine-based controller is only an approximation. While the approximation is reasonable for the hip and the ankle, it is not the case for the knee.

Typically robots walk with the soles of their feet parallel to the ground. In humans, however, the weight is first absorbed by the heel, then transferred to the sole, and finally to the toe. To develop a fully human-like gait, the robot should therefore be able to perform heel-contact and toe-off motions. An additional degree of freedom, such as the toe of the WABIAN-2R robot (figure 1.2), would need to be added to the HOAP-2 robot feet. Indeed, like those of the majority of small robots, the HOAP-2’s feet consist of a single plate and do not possess any degree of freedom for the toe.

The goals of this thesis are therefore to:

1. Design a better-suited controller for the HOAP-2 robot with a minimum level of robustness.

2. Model a toe for the robot’s feet.

3. Test the actual level of robustness of the controllers.


Figure 1.1: Human gait patterns. [3]

Figure 1.2: Toe mechanism of the WABIAN-2R humanoid robot foot. Adapted from [4].


2 Central Pattern Generators for Biped Locomotion

Central pattern generators (CPGs) are neural circuits found in animals that have the capacity to produce rhythmic patterns of neural activity without receiving rhythmic inputs. They present several interesting properties such as distributed control, the ability to deal with redundancies, fast control loops, and the modulation of locomotion through simple signals [2].

2.1 Arbitrary Waveform Oscillator

There are several types of oscillators which are able to generate sine wave output patterns, such as the amplitude-controlled phase oscillator and the Hopf oscillator. However, using more complex waveforms can be interesting. What is needed is an oscillator able to produce arbitrary output signals while maintaining the same behavior as the Hopf oscillator with respect to phase coupling and perturbation rejection. One such oscillator, developed by J. van den Kieboom [5], is presented below. Its output pattern, explicit form and derivative are analytically known. The coupling equation (Eq. (2.1)) describes a phase oscillator whereas the arbitrary output is defined by Eq. (2.2).

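\[ \dot{\theta}_i = \omega + \sum_j \sin(\theta_j - \theta_i - \phi_{ij}) \tag{2.1} \]

with

• $\theta_i$, the phase of the oscillator i

• $\omega$, the angular frequency

• $\phi_{ij}$, the phase bias of the coupling between two oscillators i and j

The objective is to use the phase $\theta_i$ to drive a specific function $f_i(\theta_i)$ as the oscillator output.

\[ \dot{x}_i = \gamma_i \left( f_i(\theta_i) - x_i \right) + \frac{df_i}{d\theta_i} \cdot \dot{\theta}_i + K_i \tag{2.2} \]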

with

• $f_i(\theta_i)$, the arbitrary waveform

• $x_i$, the oscillator output

• $\gamma_i$, a constant

• $K_i$, an arbitrary perturbation

Eq. (2.2) can be broken down into three distinct parts: the driver, the attractor and the perturbation. The driver or feed-forward term corresponds to the derivative of the function $f_i(\theta_i)$ with respect to $\theta_i$ ($\frac{df_i}{d\theta_i}$) and can be expressed as:

\[ \dot{x}_i = \frac{df_i}{dt} = \frac{df_i}{d\theta_i} \cdot \frac{d\theta_i}{dt} \tag{2.3} \]

The second part is the attractor or feedback term (Eq. (2.4)). The attractor determines the speed of convergence through the constant $\gamma_i$.

\[ \dot{x}_i = \gamma_i \left( f_i(\theta_i) - x_i \right) \tag{2.4} \]

Finally the remaining part is an arbitrary perturbation $K_i$.
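To make the roles of the driver and attractor terms concrete, the following is a minimal sketch (not the thesis implementation) that integrates Eqs. (2.1) and (2.2) with a forward Euler scheme for a single oscillator. It assumes no coupling and no perturbation (K = 0), uses a plain sine as the waveform, and borrows the 2 ms time step and the 0.75 Hz frequency mentioned elsewhere in this document; the function name and its arguments are hypothetical.

```python
import math


def simulate_oscillator(f, df, omega, gamma, x0, theta0=0.0, dt=0.002, t_end=2.0):
    """Forward-Euler integration of Eqs. (2.1)-(2.2) for one oscillator.

    Assumptions of this sketch: no coupling terms and no perturbation (K = 0).
    f and df are the waveform and its derivative with respect to the phase.
    """
    theta, x = theta0, x0
    for _ in range(int(round(t_end / dt))):
        theta_dot = omega                                        # Eq. (2.1), uncoupled
        x_dot = gamma * (f(theta) - x) + df(theta) * theta_dot   # Eq. (2.2), K = 0
        theta += dt * theta_dot
        x += dt * x_dot
    return theta, x


omega = 2.0 * math.pi * 0.75          # 0.75 Hz expressed in rad/s
theta, x = simulate_oscillator(math.sin, math.cos, omega, gamma=7.49, x0=1.0)
tracking_error = abs(x - math.sin(theta))
```

The output starts well away from the waveform (x0 = 1 while f(0) = 0), and the attractor term pulls it back onto f(θ) while the driver term keeps it there.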

2.1.1 Convergence

The controller drives the robot by generating the new angular positions of the joint trajectories at each time step. But what happens during the transition from standing still to walking? The actual positions of the joints can lie outside the joint trajectory, for instance. To solve this problem the attractor gain $\gamma$ (Eq. (2.4)) needs to be sized appropriately. To do so the analytical solution of Eq. (2.2) given by [5] is used:

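\[ x(t) = g(t) + e^{-\gamma t} C_x \tag{2.5} \]

with

\[ g(t) = f(\theta(t)) \tag{2.6} \]

\[ \theta(t) = \omega t + C_\theta \tag{2.7} \]

\[ C_x = x_0 - g(0) \tag{2.8} \]

\[ C_\theta = \theta_0 \tag{2.9} \]

Substituting Eqs. (2.6) to (2.9) into Eq. (2.5) and rearranging gives

\[ e^{-\gamma t} = \frac{x(t) - f(\omega t + \theta_0)}{x_0 - f(\theta_0)} \tag{2.10} \]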

To be able to solve Eq. (2.10) for γ, some assumptions have to be made. A 5% error on the signal after a period of 400 ms is considered acceptable. Mathematically this can be expressed as follows:


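\[ x(t = T) = 0.05 \left( x_0 - f(\theta_0) \right) + f(\omega T + \theta_0) \tag{2.11} \]

By inserting (2.11) into (2.10) we obtain

\[ e^{-\gamma T} = 0.05 \tag{2.12} \]

which, with T = 0.4 s, leads to

\[ \gamma = -\frac{\ln(0.05)}{0.4} \simeq 7.49 \tag{2.13} \]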
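The sizing rule of Eqs. (2.12) and (2.13) can be checked numerically. The short sketch below, with hypothetical function names, computes the gain for a chosen residual error and settling time, and evaluates the remaining fraction of the initial offset predicted by Eq. (2.5).

```python
import math


def attractor_gain(residual=0.05, settle_time=0.4):
    """Gain gamma such that exp(-gamma * settle_time) equals `residual`."""
    return -math.log(residual) / settle_time


def remaining_offset_fraction(gamma, t):
    """Fraction of the initial offset x0 - f(theta0) left at time t (Eq. 2.5)."""
    return math.exp(-gamma * t)


gamma = attractor_gain()                            # about 7.49 for 5% after 0.4 s
fraction_at_400ms = remaining_offset_fraction(gamma, 0.4)
```

A tighter tolerance or a shorter settling time gives a larger gain; for example, requiring 1% after 0.4 s would give a gain of about 11.5.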

2.1.2 Output Waveforms for Joints Trajectories

The oscillators are used to generate the trajectories which drive the robot’s joints. Several types of trajectories are possible, such as sine-based trajectories [3] or the trajectories supplied with the HOAP-2. In both cases the trajectories do not provide the best possible gait. On the one hand the sine-based trajectories are rough approximations of both human and robot gait cycles. On the other hand the trajectories supplied with the robot result in a gait which is both slow and clumsy. Hence a new kind of trajectory is needed. However, as it is not known in advance what the most suitable trajectories would look like, they are optimized at the same time as the controller parameters.

The oscillator from Eqs. (2.1) and (2.2) can output any arbitrary waveform f(θ) as long as the explicit forms of f(θ) and df/dθ are known. Thus an Nth order polynomial is well suited to the task. In addition this kind of polynomial can easily be interpolated from several data points, which simplifies the pattern optimization itself. To generate a curve for use in the CPG network, the coordinates of four data points are generated, extended (so that the resulting interpolated function is periodic) and then interpolated by a monotonic piecewise cubic polynomial [6].

There are two types of data points used for interpolation. The first kind consists of four randomly generated points. The x-coordinates are defined in such a way that they belong to the [0, 1] interval (the function is 1-periodic). The y-coordinates are chosen inside the amplitude boundaries of every joint (see table 2.1). One example of such an interpolated function can be seen in figure 2.2(a). This kind of arbitrary curve is used for the knee and frontal ankle joints. For this type of function the offset and the amplitude of the curve are directly encoded in the data points.

The second kind of data points consists of a single point. The main idea is to produce a trajectory which is a sine-like curve but with a small plateau at its minimum and maximum values. The four data points are reconstructed from the original point by applying two symmetries. The x-coordinate of the data point is bounded to [0, 0.25] while the y-coordinate is limited to the joint amplitudes like the other data points. The first line of symmetry is the vertical line passing through x = 0.25, which yields the second data point (red cross in figure 2.1). The remaining data points are generated by a rotation of π around the x-axis followed by a symmetry about the vertical line passing through x = 0.5 (green crosses in figure 2.1). While more constrained than the previous type of function, this “mirror” function imposes a needed symmetry between the right and left limbs, a property that the arbitrary waveform would not necessarily present. If the x-coordinate of the data point gets close to 0, the generated trajectory will be a step. On the other hand, if it gets closer to 0.25, the resulting curve will be a sine. Physically the axes of symmetry ensure that the stance phase is as long as the swing phase.

[Plot: x-axis from 0 to 1, y-axis: angle [deg] from -8 to 8]

Figure 2.1: By successive symmetries of the data point given by the optimizer (black cross), three other points to be interpolated are generated. The red cross is obtained by mirroring the data point about the vertical axis passing through 0.25. The green crosses are obtained by a combination of rotation and symmetry of the first two data points.

Mathematically the coordinates of the data points can easily be calculated. Knowing that the curve period (T) is 1, the coordinates of the data points can be written as follows:

\[ \begin{aligned}
p_1 &= (x_1, y_1) \\
p_2 &= (x_2, y_2) = (T/2 - x_1,\; y_1) \\
p_3 &= (x_3, y_3) = (x_1 + T/2,\; -y_1) \\
p_4 &= (x_4, y_4) = (x_2 + T/2,\; -y_1)
\end{aligned} \]

To extend the data points, two more points are added before p1 (p−1 and p0) and two after p4 (p5 and p6). Those four data points are calculated as follows:

\[ \begin{aligned}
p_{-1} &= (x_4 - T,\; y_4) \\
p_0 &= (x_3 - T,\; y_3) \\
p_5 &= (x_1 + T,\; y_1) \\
p_6 &= (x_2 + T,\; y_2)
\end{aligned} \tag{2.14} \]

Once extended and interpolated the resulting trajectories can be seen in figure 2.2.
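The construction of the sine-like data points can be sketched in a few lines. The code below is an illustration under stated assumptions rather than the thesis implementation: the function names are hypothetical, and the four extension points of Eq. (2.14) are returned in increasing x order so that the list can be handed directly to an interpolator.

```python
def mirror_points(x1, y1, period=1.0):
    """Build p1..p4 from the single optimized point (x1 in [0, period / 4])."""
    half = period / 2.0
    x2 = half - x1                       # mirror about the line x = period / 4
    return [(x1, y1), (x2, y1), (x1 + half, -y1), (x2 + half, -y1)]


def extend_points(points, period=1.0):
    """Add the four points of Eq. (2.14), sorted by x, around p1..p4."""
    p1, p2, p3, p4 = points
    before = [(p3[0] - period, p3[1]), (p4[0] - period, p4[1])]
    after = [(p1[0] + period, p1[1]), (p2[0] + period, p2[1])]
    return before + list(points) + after


base = mirror_points(0.1, 5.0)
extended = extend_points(base)
```

The extended list could then be passed to a monotone piecewise cubic interpolator, for example SciPy's `PchipInterpolator`; whether that matches the scheme of [6] exactly has not been verified here.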

(a) Waveform pattern (red dashed line) interpolatedfrom four different data points (black crosses).

(b) Waveform pattern (blue dashed line) interpolatedfrom a single data point.

Figure 2.2: Different types of waveforms used by the CPG controller to drive the joints. Two types are considered: a monotonic piecewise cubic interpolated arbitrary waveform (a) and a sine-like waveform (b).

2.2 Network Topology

Once the oscillator is chosen, the network topology can be defined. The topology is usually designed in such a way that it closely matches the architecture of the robot's joints, which is also the approach used in this document. Every joint in the network has an associated oscillator. However, not all 25 degrees of freedom of the HOAP-2 are necessary for biped locomotion. The upper-body joints, for instance, are left untouched (except for initialization purposes), whereas the lower-body joints are driven by the network.


Figure 2.3 shows the distribution of the joints along the robot skeleton. The CPG network makes use of both ankle joints and of the knee joint, but only of two out of the three hip joints. The yaw rotation was not considered a requirement for biped locomotion and as such was not activated. The network topology is represented in figure 2.4. The final network is divided into two parts: the frontal and lateral joints. The frontal joints are linked from hip to ankle by stages. Laterally the hips and ankles are linked together in order to maintain the torso in a vertical position. The frontal knees and frontal ankles are also linked together although it is not strictly necessary. The main benefit is a faster response to a phase difference between both legs. However, as the network is phase-locked (used in open loop), these links have no effect: the coupling term in Eq. (2.1) vanishes when (θj − θi) = φij.

Figure 2.3: HOAP-2 humanoid robot 25 joints schematic.

2.3 Driving the Joints of the Robot

The last remaining step consists in driving the joints of the robot. The joints are position-controlled and use a standard P-controller whose gain is equal to 10 (the default Webots gain for a servo). The angular position is calculated at each time step using Eq. (2.15).

\[ d_i = A_i x_i(t) + O_i \tag{2.15} \]

with:

• di, the angular position of joint i

• Ai, the joint i amplitude, always set equal to 1


[Diagram: oscillator network with rows for hip, knee and ankle, and columns for the lateral and frontal joints of the left and right legs]

Figure 2.4: Central pattern generator network topology. Each of the 10 joints (DOF) is modeled by a distinct oscillator.

Joint           Bias            Amplitude       Offset          Function type
frontal hip     (0, π)          [-60°, 60°]     [-30°, 30°]     sine-like
lateral hip     (π/2, π/2)      [-20°, 20°]     –               sine-like
frontal knee    (0, π)          [0°, 120°]      –               arbitrary
frontal ankle   (0, π)          [-30°, 30°]     –               arbitrary
lateral ankle   (−π/2, −π/2)    [-20°, 20°]     –               sine-like

Table 2.1: CPG network parameter values (bias and function type) and parameter boundaries (amplitudes and offsets).

• Oi, the joint i offset, usually set equal to 0

• xi, the joint i oscillator output

For all joints except the frontal hip, the offset is zero. The amplitude is set to 1 as it is already taken into account through the coordinates of the data points. The remaining joint parameters (biases, amplitudes, and offsets) are defined in table 2.1.

To summarize, the frontal knee and ankle joints use arbitrary trajectories, as it is not obvious how to “guess” what they should look like prior to optimization. The frontal hip, lateral hip and lateral ankle joints all use the same sort of sine-like function to ensure the presence of symmetry in the controller. Moreover, in order to keep the robot’s trunk constantly vertical, both the lateral hip and lateral ankle joints are driven by the same trajectory and use the same bias.


3 Framework

The framework is divided into two main components: the simulator and the optimizer.The simulator used is the mobile robots simulation software Webots [7]. The HOAP-2robot was modeled by Pascal Cominoli [8].

3.1 Simulator

The world in which the robot interacts is a flat 100 m by 100 m arena. The time step of the simulator is set to 2 ms. The world coordinate system used by Webots is not oriented in the conventional way: the y-axis, rather than the z-axis, points upwards.

3.1.1 Starting Posture Initialization

At the very beginning of the simulation the robot is set into a special posture whose goal is to ease the transition from standing still to walking. The main idea is to move the center of mass slightly forward. This is achieved by pivoting the torso forward by 14° and by contracting both legs. Finally the left leg is also moved upwards until the left foot no longer touches the ground. The transition lasts 300 ms, after which the CPG controller is activated and takes over. A step by step illustration of the transition between a standing still position and the initialization posture can be seen in figure 3.1.

(a) (b) (c) (d)

Figure 3.1: At the very beginning of the run the robot is set into its initial posture. Thetorso is moved forward while both legs are contracted. The left leg is movedslightly upward to ease the transition from a static posture into walking.


3.1.2 Ground Contact Model

The ground contact model was modified in order to make the ground more “spongy”. The contact model was defined such that the robot feet would penetrate 0.2 mm into the ground. The “softCFM” and “softERP” parameters of the foot were then set to 0.005 and 0.5 respectively.

3.2 Optimizer

The framework described in this chapter offers the possibility to use a CPG-based controller on a biped humanoid robot in the Webots simulator. However, the parameters of the controller are unknown. A possible technique to find suitable parameter values is to use an optimization algorithm. The algorithm generates a set of parameter values which are loaded into the framework to configure the robot controller. The simulation is then started and the performance of the controller is evaluated by means of a fitness function, which attributes a numerical score to the performance of a given controller. The fitness score is then used to generate a new set of parameter values, which are fed to the framework and evaluated again, and so on until either the stop condition of the framework or the maximum number of iterations of the algorithm is reached.

All the optimizations carried out in the present work used a particle swarm optimization algorithm.

3.2.1 Particle Swarm Optimization

Particle swarm optimization is a machine learning technique loosely inspired by birds flocking in search of food. To look for a solution to a given problem, a swarm of particles is generated which collectively moves towards the global optimum [9]. Each particle is characterized by its position ($x_i$) and its velocity ($v_i$). The particles move according to their local distribution, they communicate with their neighbors, they remember their best position so far, and they also know the position of the neighbor with the best position of all.

The particle position at each time step is given by:

\[ x_i^{t+1} = x_i^t + v_i^{t+1} \tag{3.1} \]

with

• $x_i^t$, the particle's old position

• $v_i^{t+1}$, the particle's new velocity

• $x_i^{t+1}$, the particle's new position

while the particle velocity is:

\[ v_i^{t+1} = \underbrace{a\, v_i^t}_{f_a} + \underbrace{b\,(x_i^p - x_i^t)}_{f_b} + \underbrace{c\,(x_j^t - x_i^t)}_{f_c} \tag{3.2} \]


Parameter               Value
boundary-condition      bounce
boundary-damping        0.95
cognitive-factor        2.05
constriction            0.729
convergence-threshold   0
convergence-window      10
social-factor           2.05

Table 3.1: PSO parameters used for the optimization.

where

• $x_i^p$ is the best position so far

• $x_j^t$ is the neighbors' best position

• a, b, c are constants which represent the importance given to each velocity fraction

• $f_a$ is the velocity fraction whose direction is the same as in the previous time step

• $f_b$ is the velocity fraction whose direction is towards the best position so far

• $f_c$ is the velocity fraction whose direction is towards the best neighbor

The PSO algorithm implementation used for all the simulations was developed internally at the Biorob laboratory. The PSO parameters used can be found in table 3.1.
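For illustration, the update rules of Eqs. (3.1) and (3.2) can be sketched as follows. This is not the Biorob implementation, whose internals are not described here. The sketch assumes the common constriction form, in which a equals the constriction factor and b and c are the constriction factor times the cognitive and social factors, each scaled by a uniform random number. It also assumes a global-best neighborhood and interprets the "bounce" boundary condition as clamping the position and reversing the damped velocity. All names are hypothetical.

```python
import random


def pso_maximise(fitness, bounds, n_particles=30, n_iter=200,
                 chi=0.729, c1=2.05, c2=2.05, damping=0.95, seed=0):
    """Maximise `fitness` over the box `bounds` (assumed global-best topology)."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * len(bounds) for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal best positions
    best_fit = [fitness(p) for p in pos]
    g = max(range(n_particles), key=best_fit.__getitem__)
    swarm_pos, swarm_fit = best_pos[g][:], best_fit[g]

    for _ in range(n_iter):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                # Eq. (3.2) with a = chi, b = chi*c1*r1, c = chi*c2*r2 (assumption)
                vel[i][d] = chi * (vel[i][d]
                                   + c1 * r1 * (best_pos[i][d] - pos[i][d])
                                   + c2 * r2 * (swarm_pos[d] - pos[i][d]))
                pos[i][d] += vel[i][d]             # Eq. (3.1)
                if not lo <= pos[i][d] <= hi:      # assumed "bounce" handling
                    pos[i][d] = min(max(pos[i][d], lo), hi)
                    vel[i][d] *= -damping
            fit = fitness(pos[i])
            if fit > best_fit[i]:
                best_pos[i], best_fit[i] = pos[i][:], fit
                if fit > swarm_fit:
                    swarm_pos, swarm_fit = pos[i][:], fit
    return swarm_pos, swarm_fit


# Toy usage: the maximum of -(x^2 + y^2) is 0 at the origin.
best, score = pso_maximise(lambda p: -(p[0] ** 2 + p[1] ** 2),
                           [(-5.0, 5.0), (-5.0, 5.0)])
```

In the thesis setting, the fitness callable would wrap a full Webots run and the bounds would come from the parameter boundaries; the toy quadratic above only stands in for that expensive evaluation.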

3.2.2 Parameters Encoding

The main features to optimize are the trajectories of the different joints. Several pairs of coordinates have to be generated in order to interpolate the data points and create the joint trajectories. Depending on the type of curve, sine-like or arbitrary, a different number of points is required. For sine-like curves, a single data point is needed. The x-coordinate can be anything between 0 and T/4, which for 1-periodic functions is 0.25. The y-coordinate values are contained within the boundaries indicated in table 3.2 and the three missing data points are recreated using the technique from section 2.1.2.

Arbitrary functions, however, make use of four distinct data points. The y-coordinates are limited in the same way as they are for sine-like functions. The x-coordinates, on the other hand, do not represent the absolute position of a given data point on the x-axis. Instead they represent the distance between consecutive data points, so the actual positions of the points on the x-axis have to be computed separately. The distances between the data points are generated randomly but have to satisfy the constraint given by Eq. (3.3):


\[ \sum_{i=1}^{4} x_i = 1 \tag{3.3} \]

where $x_i$ is the distance along the x-axis between data point i and data point i−1. It sometimes happens that $x_i$ is equal to 0. This is problematic as it means that two points share a single x-coordinate. In that case the data point is simply removed and the trajectory interpolation is carried out on three data points instead of four.
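One possible way to turn the encoded distances into absolute x-coordinates is sketched below. The text does not state how the constraint of Eq. (3.3) is enforced nor where the first point sits, so two assumptions are made and should be read as such: the distances are normalized by their sum, and each absolute position is the running sum of the distances up to that point. The function name is hypothetical.

```python
def decode_points(distances, ys, eps=1e-9):
    """Convert encoded distances into (x, y) data points on [0, 1].

    Assumptions of this sketch: distances are normalized so that they sum
    to 1 (Eq. 3.3), and absolute positions are running sums of distances.
    A point whose distance is (numerically) zero is dropped, as it would
    share its x-coordinate with the previous point.
    """
    total = sum(distances)
    if total <= 0.0:
        raise ValueError("at least one distance must be positive")
    points, x = [], 0.0
    for d, y in zip(distances, ys):
        d /= total
        if d <= eps:
            continue                  # duplicate x-coordinate: remove the point
        x += d
        points.append((x, y))
    return points


four = decode_points([0.25, 0.25, 0.25, 0.25], [10.0, 40.0, 20.0, 5.0])
three = decode_points([0.5, 0.0, 0.25, 0.25], [10.0, 40.0, 20.0, 5.0])
```

With a zero distance in second place, the second data point is discarded and only three points remain for interpolation, which mirrors the behavior described above.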

3.2.3 Simulation Parameters

With respect to the simulation itself, the optimization process is conducted with a population of 100 particles over 400 iterations. The run duration is limited to 10 s in order to keep the duration of the optimizations acceptable.


Parameter          Description         Boundaries     Unit
f2_x               lateral hip         [0, 0.25]      –
f2_y               lateral hip         [-20, 20]      °
f3_x               frontal hip         [0, 0.25]      –
f3_y               frontal hip         [-60, 60]      °
f4_x1              knee                [0, 1]         –
f4_y1              knee                [0, 120]       °
f4_x2              knee                [0, 1]         –
f4_y2              knee                [0, 120]       °
f4_x3              knee                [0, 1]         –
f4_y3              knee                [0, 120]       °
f4_x4              knee                [0, 1]         –
f4_y4              knee                [0, 120]       °
f5_x1              frontal ankle       [0, 1]         –
f5_y1              frontal ankle       [-30, 30]      °
f5_x2              frontal ankle       [0, 1]         –
f5_y2              frontal ankle       [-30, 30]      °
f5_x3              frontal ankle       [0, 1]         –
f5_y3              frontal ankle       [-30, 30]      °
f5_x4              frontal ankle       [0, 1]         –
f5_y4              frontal ankle       [-30, 30]      °
O3                 frontal hip offset  [-30, 30]      °
f                  frequency           [0.5, 1.25]    Hz
springConstant     spring constant     [0, 2.4]       N/rad
dampingConstant    damping constant    [0, 0.0003]    Ns/rad
sf_y1              spring constant     [0, 2.4]       N/rad
sf_y2              spring constant     [0, 2.4]       N/rad
sf_y3              spring constant     [0, 2.4]       N/rad
df_y1              damping constant    [0, 0.0003]    Ns/rad
df_y2              damping constant    [0, 0.0003]    Ns/rad
df_y3              damping constant    [0, 0.0003]    Ns/rad

Table 3.2: Standard set of parameters used by PSO for the optimization process. The parameter list contains the coordinates of the data points used to generate the joint trajectories. In addition an offset for the frontal hip joint and a frequency parameter are also used. The spring and damping constants are used for the experiment from chapter 6.


4 Optimizing a Reference Controller

A reference controller was optimized for comparison purposes as well as to validate the software developed. Hence, to start with, the controller is made as simple as possible. The optimization process is conducted without any robustness considerations whatsoever. It is based on the set of parameters described in table 2.1. The trajectories used for both lateral joints (hip and ankle) are the same and based on a sine-like pattern. Moreover the hip bias is π/2 whereas the ankle bias is −π/2 in order to maintain the robot’s trunk in a vertical position. The frontal hip also uses a sine-like waveform but with the addition of an offset. The remaining frontal knee and ankle joints are driven by an arbitrary pattern. The biases of the frontal left leg joints are set to 0 while they are set to π for the right leg, hence providing a correct coordination between both legs. Finally the controller frequency is fixed to 0.75 Hz.

For the optimization itself the parameters were set as described in section 3.2. The run duration ($t_T$) was set to a maximum of 10 s, and a run is stopped prematurely should the robot fall down before that. The CPG controller is activated once the initialization phase is finished (see section 3.1.1), leaving the robot free to move by itself. The PSO algorithm is run with a population of 100 particles over 350 iterations. All the remaining parameters are set as per table 3.2. The optimization process is repeated five times.

4.1 Performance Evaluation

Evaluating the performance, here the gait, produced by a controller is an important part of the optimization process. As it is not possible to evaluate the gait as a whole directly, some indirect criteria must be used. One such criterion is the distance walked by the robot during the run. Evaluating the fitness with this criterion has several interesting properties, such as ensuring that the robot is moving forward and favoring higher velocities. Mathematically the distance is evaluated using the Euclidean distance, which also takes the lateral distance into account:

\[ f = D_e = \sqrt{(x - x_0)^2 + (z - z_0)^2} \tag{4.1} \]

where:

• x is the walked lateral distance along the x-axis i.e. x(t = tT )

• x0 is the initial x position i.e. x(t = 0)

• z is the walked distance along z-axis i.e. z(t = tT )

• z0 is the initial z position i.e. z(t = 0)


• tT is the run duration

Also, should the robot move backwards, the distance would become negative. Preliminary tests using only the Euclidean distance ($D_e$) as fitness function mainly produced controllers with an undesirable tendency to move laterally. To prevent this type of behavior the fitness function was slightly modified by adding a penalty when the robot moves laterally.

First the lateral distance needs to be evaluated:

\[ D_x = \sqrt{(x - x_0)^2} \tag{4.2} \]

which is then used to penalize the controller fitness score:

\[ f = D_e - D_x = \sqrt{(x - x_0)^2 + (z - z_0)^2} - \sqrt{(x - x_0)^2} \tag{4.3} \]
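As a small worked illustration of Eq. (4.3), the fitness can be written as a short function of the start and end positions. The function name and argument order are hypothetical, and the sign convention for backward motion mentioned above is not modeled.

```python
import math


def penalized_fitness(x0, z0, x, z):
    """Eq. (4.3): Euclidean distance walked minus the lateral distance."""
    d_e = math.hypot(x - x0, z - z0)   # Eq. (4.1)
    d_x = abs(x - x0)                  # Eq. (4.2), since sqrt(a^2) = |a|
    return d_e - d_x


straight = penalized_fitness(0.0, 0.0, 0.0, 3.0)   # walks 3 m along z only
drifting = penalized_fitness(0.0, 0.0, 3.0, 4.0)   # 4 m along z, 3 m sideways
sideways = penalized_fitness(0.0, 0.0, 3.0, 0.0)   # purely lateral motion
```

A run that drifts 3 m sideways while advancing 4 m scores 2, lower than the 3 obtained by a straight 3 m walk, and a purely lateral run scores 0.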

4.2 Results

The optimizer converges towards a (sub-)optimal solution in around 50 iterations (see figure 4.1(b), once the maximum run time is reached). From that point on the controller is reliable enough to move the robot during the whole run duration of 10 s. As the run time is capped, further improvements in the fitness score are mainly due to an increase in the velocity of the robot, as can be seen in figure 4.1(a).

The results are quite consistent between the five repetitions: four out of the five optimizations converged towards a solution. The gait produced can be seen in figure 4.2. The transition of the gait from stance to swing can be broken down into two distinct phases. First, the swinging leg moves upwards and stretches at the same time (see figures 4.2(a) to 4.2(e)). Once the foot is at its highest position, and with the help of the stance leg ankle, the weight of the robot makes it swing so that the foot which is in the air can touch the ground. The forward position of the trunk helps to move the center of mass forward so that the transition is easier to perform. The distance between both feet is large and the center of mass is quite low (see figure 4.2(f)). The transition itself goes from heel to toe, although the impact of the heel on the ground is stiff. In the second phase the right leg is brought back close to the body while the robot raises itself, and the cycle starts again to complete the full stride (see figures 4.2(g) to 4.2(i)).

The trajectories of the controller are presented in figure 4.3. The functions used to drive the joints of the robot are reconstructed by interpolating the data points generated by the optimizer (black crosses in figure 4.3). The most striking difference between a purely sine-based controller and the reference controller described here is the lack of symmetry of the latter. The frontal hip and lateral joints are symmetrical, but that is by definition. The knee and ankle joints, on the other hand, can generate completely arbitrary waveforms, and both show a non-symmetric pattern for swing and stance.

The trajectories of the lateral joints present a small amplitude, slightly above 9°, around half the allowed range of 20°. Their shape lies between a step and a sine curve. The frontal hip, however, is closer to a sine. Its amplitude is 52.1°, which is close to the boundaries of the allowed range (120° in total) and explains the large steps made by the robot while walking. The additional offset of 29.62° explains the forward position of the trunk that can be seen in figure 4.2. The knee peak-to-peak amplitude is limited to 28.25°; the bending of the knee is thus quite reasonable. Finally the ankle peak-to-peak amplitude is 52.32°, which is close to the 60° allowed amplitude range. This high amplitude is necessary in order to provide a heel-to-toe transition when the feet touch the ground, because of the large steps of the gait.

[Plot "PSO Optimization": fitness f = De − Dx [m] versus iteration (0 to 350); curves: max fitness value, mean fitness value]

(a) Evolution of the fitness score of the optimized solution, i.e. the evolution of Eq. (4.3). In blue are the fitness scores of the best solution for each iteration while in red is the average performance of all solutions.

[Plot "PSO Optimization": time [s] versus iteration (0 to 350); curves: max time, mean time]

(b) Evolution of the run duration of the solutions during the optimization process. In blue is the run duration of the best solution for each iteration while in red is the average duration of all solutions.

Figure 4.1: Evolution of the optimization process. The convergence point is reached after (only) 50 iterations.

[Image sequence: panels (a) to (i)]

Figure 4.2: Reference controller gait. The transition from a standing still posture into half of the stride is broken down into steps. Due to the fixed frequency the robot has to make large steps to increase its velocity. The transition of the feet on the ground goes from heel to toe. A video of the gait can be seen on the Biorob laboratory website: http://biorob.epfl.ch/page-65543.html.

[Figure 4.3: four panels titled "lateral hip and ankle", "frontal hip", "frontal knee" and "frontal ankle"; each plots the joint angle [°] against an x-axis running from 0 to 1.]

Figure 4.3: Plot of the trajectories made by the joints of the robot during a reference optimization. The fitness function is set by Eq. (4.3). Run duration is limited to 10 s and the CPG frequency is 0.75 Hz. The optimization was conducted with a population of 100 particles during 350 iterations. Both lateral hip and ankle joints use the same trajectories in order to maintain the trunk in a vertical position. Both the lateral and the frontal hip joints follow sine-like trajectories whereas the frontal knee and ankle follow arbitrary patterns. The biases described in table 2.1 are not taken into account here.

4.3 Discussion

The optimization of a reference controller has been successfully achieved. After around 50 iterations the optimizer converged towards solutions able to reach the maximum run duration (10 s). Due to the choice of the fitness function, the resulting controllers were optimized for high velocities. For instance, the first solution where the robot was able to walk for 10 s had a velocity of 0.326 m/s, which is already quite fast considering the height of the robot. The fittest solution, i.e. the solution whose controller was presented in the previous section, had a velocity of 0.536 m/s, more than 50% faster! However the influence of such a high speed on the gait is non-negligible. The trunk of the robot oscillates between 13° and 47° and, as the gait is not suitable for running, the only way to continue to increase speed is to make larger steps.

5 Improving the Reference Controller

The reference controller from chapter 4 proved that the framework developed was fully functional and could sustain a working, minimal gait. However the resulting controller was not without its downsides. First of all, the controller was indirectly optimized for speed, thus promoting large steps, and the trunk oscillated with a non-negligible amplitude (34°). In addition the ground clearance was minimal. Hence the special emphasis of this chapter on the definition of a new fitness function which can prevent such behaviors.

To improve this situation some robustness parameters must be integrated in the optimization process. Two different approaches were tried to achieve this goal: one using an implicit fitness function and the other using sequential fitness functions. Both experiments used the same parameters as the reference controller. The first experiment was set to a fixed frequency of 0.75 Hz, just like the reference controller. The second experiment, however, left the frequency as an open parameter, with a range between 0.5 and 1.25 Hz.

5.1 Trial One: Implicit Fitness Function

In order to improve the robustness of the optimized controller, the optimization process was modified in such a way that undesired behaviors would penalize the fitness score. The main idea is to use the same fitness function as the reference controller (Eq. (4.3)). However, this time several robustness parameters are checked during the evaluation run. If one of these parameters is not within its boundaries, the run is stopped prematurely, as if the robot had fallen down. The inspiration for this process was that the ultimate goal of the controllers is to transfer successfully to the physical robot. If the controller cannot withstand the conditions imposed in the simulation, it will not be possible to make it work on the real robot. It was therefore considered that, like in reality, the robot would fall on the ground should one of the tested criteria not be satisfied. The fitness score can thus be heavily penalized should the controller not behave in a robust way.

5.1.1 Robustness Parameters

Torso Inclination
During the exploration phase the evolved controllers had a strong tendency to produce torso leaning. The idea is to try to minimize the maximum angle made by the torso during the whole run. The angle is calculated by taking the orientation matrix of the robot at the very first time step as a reference. The orientation matrix contains the rotation of the robot origin node in the world coordinate system. Moreover, the columns of the matrix are pairwise orthogonal unit vectors forming an orthonormal basis; therefore the rotation of the robot between two measurements can easily be determined by calculating the cosine of the angle between two such vectors. The vectors of interest are $\vec{e}_{y_0}$, the y-axis vector at time step 0, and $\vec{e}_{y_t}$, the same vector at time step t. As the robot is standing perfectly still at the very beginning of the run, $\vec{e}_{y_0}$ is aligned with the world coordinate system and can be used as reference. Hence by calculating the angle between those two vectors the torso inclination is determined.

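$$\begin{pmatrix} \vec{e}_x & \vec{e}_y & \vec{e}_z \end{pmatrix} = \begin{pmatrix} x_1 & y_1 & z_1 \\ x_2 & y_2 & z_2 \\ x_3 & y_3 & z_3 \end{pmatrix} \tag{5.1}$$

The torso inclination angle $\alpha_t$ at the time step t is calculated by Eq. (5.2).

$$\cos\alpha_t = \frac{\langle \vec{e}_{y_t}, \vec{e}_{y_0} \rangle}{\|\vec{e}_{y_t}\| \cdot \|\vec{e}_{y_0}\|} \tag{5.2}$$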

The torso inclination is measured at each time step. Hence should the torso lean by more than 15°, the run is stopped and the fitness function becomes the distance walked so far, as defined by Eq. (4.3).
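The computation of Eq. (5.2) and the 15° stop criterion can be sketched as follows. This is a minimal illustration and not the code of the physics plug-in: the function names, the row-wise 3x3 list layout of the orientation matrix and the helper `rot_x` (used only to build example input) are assumptions made for this sketch.

```python
import math

def y_axis(matrix):
    """Second column of a 3x3 orientation matrix stored as three rows (layout of Eq. 5.1)."""
    return [row[1] for row in matrix]

def torso_inclination_deg(e_y0, e_yt):
    """Angle in degrees between the reference y-axis e_y0 and the current y-axis e_yt (Eq. 5.2)."""
    dot = sum(a * b for a, b in zip(e_yt, e_y0))
    norm = math.sqrt(sum(a * a for a in e_yt)) * math.sqrt(sum(b * b for b in e_y0))
    # Clamp to [-1, 1] to guard against rounding errors before calling acos.
    cos_alpha = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_alpha))

def first_violation(matrices, limit_deg=15.0):
    """Index of the first time step whose inclination exceeds limit_deg, or None."""
    e_y0 = y_axis(matrices[0])  # reference taken at the very first time step
    for step, matrix in enumerate(matrices):
        if torso_inclination_deg(e_y0, y_axis(matrix)) > limit_deg:
            return step
    return None

def rot_x(deg):
    """Rotation about the x-axis; hypothetical helper used only to build example input."""
    c, s = math.cos(math.radians(deg)), math.sin(math.radians(deg))
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

if __name__ == "__main__":
    run = [rot_x(0), rot_x(10), rot_x(16)]
    print(first_violation(run))
```

In this sketch the run would be stopped at the returned time step and the distance walked so far would be used as fitness score.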

Ground Clearance
Providing enough ground clearance is one of the most important features of a reliable controller. However, a definition of ground clearance must first be found. This work defines it as the height between the foot and the ground during the swing phase. The measurement is made when both feet are parallel, i.e. when the distance between their toes or between their heels is less than 10 cm. The ground clearance value is the height of the corner of the swinging foot which is closest to the ground.
To prevent the ground clearance from being measured at the end of the initialization posture, i.e. where the distance from the feet to the ground is lower than the threshold value, the measurements only start 600 ms after the beginning of the run. A ground clearance of at least 1 cm is considered satisfactory.

Feet Inclination
The main idea behind measuring the feet inclination is to “force” the robot to walk with the feet as parallel as possible to the ground. Clearly this kind of constraint can make the optimizer converge to an unnatural gait. However, as each foot of the robot is made of a single piece, hence without toe articulations, the robot should be more stable if the feet are parallel to the ground.
Calculating the feet inclination is extremely simple through the Webots physics plug-in. The inclination is directly available through the real part of each foot quaternion, where the inclination angle is the angle $\varphi$ in:

$$Q = \cos\varphi \cdot 1 + x \sin\varphi \cdot i + y \sin\varphi \cdot j + z \sin\varphi \cdot k \tag{5.3}$$
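As an illustration, the angle of Eq. (5.3) can be recovered from the real part of a quaternion as sketched below. The function name and the (w, x, y, z) argument order are assumptions of this sketch and do not come from the plug-in. Note also that in the common unit-quaternion convention the real part equals the cosine of half the rotation angle, in which case the physical rotation would be twice the angle returned here.

```python
import math

def inclination_from_quaternion(w, x, y, z):
    """Return the angle phi of Eq. (5.3), in degrees, from a quaternion (w, x, y, z)."""
    norm = math.sqrt(w * w + x * x + y * y + z * z)
    if norm == 0.0:
        raise ValueError("a zero quaternion does not describe an orientation")
    # Normalize first so that the real part is a valid cosine, then clamp
    # against rounding errors.
    cos_phi = max(-1.0, min(1.0, w / norm))
    return math.degrees(math.acos(cos_phi))
```

A foot that has not rotated with respect to its reference orientation has the quaternion (1, 0, 0, 0), for which the function returns 0.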


5.1.2 Results and Discussion

After extensive testing, what originally seemed a reasonable idea turned out to be difficult to put into practice. Out of around 40 different optimizations only two produced reasonable results. This difficulty in optimizing suitable controllers can be explained by the increased difficulty of the problem to solve. With the reference controller the optimizer had a large freedom for experimentation and received the corresponding feedback through the fitness score (as given by Eq. (4.3)). With the implicit fitness functions, however, the optimizer is more constrained than in the reference experiment, as the fitness criteria are all applied in parallel. For instance, a potential solution which is not suitable from the very beginning, although potentially promising, can be given the same score as a totally unsuitable one. Therefore the only way to generate a (sub-)optimal solution is to find it directly from the beginning, as candidates cannot be refined at each iteration. A better fitness evaluation system must be found.

5.2 Trial Two: Staged Fitness PSO

To solve the issues presented in the previous section while still checking the robustness parameters, a new system to evaluate the different solutions must be used. Previously all the criteria were tested in parallel. Two solutions with the same fitness score could therefore be very different. For example, a solution which could be promising but which did not satisfy a single robustness criterion could have exactly the same score as an inherently unstable controller.

A nicer procedure would be to select solutions step by step, i.e. to test the advanced properties of the controllers only after ensuring that they can perform the basic ones. Using the biped walking robot controller as an example, would it make sense to test the ground clearance if the robot cannot walk more than a few centimeters? Not really. However, once the walked distance is no longer the limiting factor, testing the gait robustness becomes really interesting. Using several fitness functions sequentially is an elegant solution to the problem posed by implicit fitness functions. Sequential fitness functions are divided into several stages, each associated with a condition for moving to the next stage. Once the condition of a stage is fulfilled, the fitness function of the next stage is used. This is the whole purpose of staged fitness PSO. Staged fitness PSO (or stage PSO) is an extension of PSO developed by J. van den Kieboom internally at the Biorob Laboratory, EPFL. Stage PSO is written on top of a standard Particle Swarm Optimization (SPSO) algorithm. The main difference is that, instead of using all the particles together as in SPSO, the particles of each stage are gathered in distinct neighborhoods where their respective fitness functions are applied. Depending on the crossing conditions between stages, the particles can move from one stage to another.

5.2.1 Stages Description

The optimizer uses five stages; the first two stages ensure that a reasonable walking gait exists, while the remaining stages are present to improve the controller robustness and performance. The robustness stages are based on the same criteria that were used with the implicit fitness function, i.e. the ground clearance and the torso inclination. However, the feet inclination was not used, as it would no longer make sense once a toe is added to the feet (see chapter 6). The five stages are as follows:

Stage One: Reaching the Target Speed
The drawbacks of a fitness function based on the euclidean distance were presented in chapter 4. While it was possible to optimize reasonable gaits with such a criterion, improvements are possible. With a slightly different approach it is still possible to optimize for distance, although indirectly, while preventing the optimizer from pushing the controller exclusively towards higher velocities. To do so the fitness function is divided into two distinct stages. Firstly the controller must reach a target speed. Next, it is optimized for time. Therefore the controller is still optimized for distance, but its speed remains within the defined boundaries.
The robot’s average velocity is calculated as the euclidean distance ($D_e$, Eq. (4.3)) walked until the run is finished, divided by the run duration ($t_T$).

$$v = \bar{v} = \frac{D_e}{t_T} \tag{5.4}$$

However, using the velocity directly as a fitness function is not sufficient, as the optimization process needs to be “guided” to the right solution. Therefore a specially designed Gaussian function (Eq. (5.5)), centered around the desired speed (b), is used.

$$f(v) = a \cdot \exp\left(-\frac{(v - b)^2}{2c^2}\right) \tag{5.5}$$

Where:
• a is set to 1 to bound f(v) in (0, 1]
• b is the desired velocity or target speed
• c sets the width of the Gaussian (referred to as the variance in this work; strictly speaking c is the standard deviation and c² the variance)

A graphical representation of the fitness functions of stage 1 is shown in figure 5.1. The main advantage of this kind of fitness function is that it becomes trivial to remove undesired solutions, simply by adjusting the variance of the Gaussian. For instance, the Gaussian is set in such a way that the fitness tends to zero for negative velocities and gets closer and closer to 1 as the speed approaches the target. The experiment is repeated using three different target speeds: 0.14 m/s, 0.28 m/s, and 0.42 m/s. The robot’s velocity is considered close enough to the target speed once it reaches $v_t \pm v_e$, with $v_t$ and $v_e$ from table 5.1. The condition to move from stage 1 to stage 2 is that $|v - v_t| < v_e$.

[Figure 5.1: plot titled "Stage 1 Fitness Criteria"; x-axis: velocity [m/s] from -0.2 to 0.8; y-axis: fitness value from 0 to 1; one curve per target velocity: v = 0.14 m/s, v = 0.28 m/s and v = 0.42 m/s.]

Figure 5.1: Stage 1 fitness functions. The Gaussian functions matching target velocities of 0.14 m/s, 0.28 m/s, and 0.42 m/s.

v_t        v_e        a    b      c
0.14 m/s   0.02 m/s   1    0.14   0.0577
0.28 m/s   0.04 m/s   1    0.28   0.0816
0.42 m/s   0.06 m/s   1    0.42   0.1000

Table 5.1: Stage 1 fitness function. Parameters of the Gaussian fitness functions. The variance, or c, parameter is equal to $\sqrt{v_e/6}$.
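The stage 1 fitness of Eq. (5.5), the parameters of table 5.1 and the transition condition to stage 2 can be reproduced with the short sketch below. The function names are chosen for this illustration only; the numerical values are those of table 5.1.

```python
import math

# (target speed v_t, tolerance v_e) pairs from table 5.1, in m/s.
TARGETS = [(0.14, 0.02), (0.28, 0.04), (0.42, 0.06)]

def gaussian_width(v_e):
    """Parameter c = sqrt(v_e / 6), as stated in the caption of table 5.1."""
    return math.sqrt(v_e / 6.0)

def average_velocity(distance, duration):
    """Eq. (5.4): euclidean distance walked divided by the run duration."""
    return distance / duration

def stage1_fitness(v, v_t, v_e, a=1.0):
    """Eq. (5.5) with b = v_t and c derived from v_e."""
    c = gaussian_width(v_e)
    return a * math.exp(-((v - v_t) ** 2) / (2.0 * c ** 2))

def stage1_passed(v, v_t, v_e):
    """Condition to move from stage 1 to stage 2: |v - v_t| < v_e."""
    return abs(v - v_t) < v_e

if __name__ == "__main__":
    for v_t, v_e in TARGETS:
        # Prints the target, its c parameter and the fitness of a robot standing still.
        print(v_t, round(gaussian_width(v_e), 4), stage1_fitness(0.0, v_t, v_e))
```

Rounded to four decimals, `gaussian_width` gives back the c column of table 5.1, and the fitness of a backwards-walking solution is already negligible for small negative velocities.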


Stage Two: Time Maximization
During stage 1 the controller was able to make the robot reach the defined target velocity. In stage 2 the optimizer’s goal is to increase the time the robot is able to walk. Thus, the fitness function is simply the maximization of the time:

$$f(t) = t \tag{5.6}$$

where t is the run duration. When stages 1 and 2 are combined, the optimization process effectively tries to maximize the walked distance.
The transition between stages 2 and 3 is made once the robot is able to walk for the whole run duration, i.e. 10 s. Initially, the condition was set to 5 s, which ensured a reasonable performance distance-wise. However, once the controller was able to reach the last stage (energy consumption optimization), a trade-off appeared and the controllers were selected in such a way that they would not walk for more than 5 s while still respecting the conditions of the other stages. Obviously, by walking only half the time the energy consumption was lower. This small example illustrates nicely why extreme care should be taken to prevent undesired trade-offs when designing fitness functions.

Stage Three: Ground Clearance
The objective of this stage is to select and optimize the controllers to increase their robustness. Even if the controller can produce an interesting walking gait, the swinging leg can be too close to the ground, adding potential sources of instability. The ground clearance is calculated in the same way as in section 5.1.1, i.e. when both feet are parallel, the corner of the foot whose distance to the ground is the lowest is used as the ground clearance value $h_{gc}$. As the ground clearance is reduced when the robot moves from its initial standing still position into the starting posture, the ground clearance measurements are only considered after 600 ms. The lowest ground clearance value measured during the run is then used directly as fitness function:

$$f = \min_{t \ge 600\,\mathrm{ms}} h_{gc_t} \tag{5.7}$$

The transition criterion to the next stage is to obtain a ground clearance value of at least 1 cm.
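Eq. (5.7) and the 1 cm criterion can be sketched as follows. The data layout, a list of (time in ms, clearance in m) pairs with one entry per measurement taken while both feet are parallel, and all function names are assumptions made for this illustration.

```python
def clearance_of_foot(corner_heights):
    """Ground clearance of the swinging foot: height of its lowest corner, in metres."""
    return min(corner_heights)

def ground_clearance_fitness(samples, start_ms=600.0):
    """Eq. (5.7): lowest ground clearance measured from start_ms onwards.

    samples is an iterable of (time_ms, clearance_m) pairs.
    Returns None when no measurement falls inside the time window.
    """
    valid = [h for t, h in samples if t >= start_ms]
    return min(valid) if valid else None

def stage3_passed(fitness, threshold_m=0.01):
    """Transition criterion: a ground clearance of at least 1 cm."""
    return fitness is not None and fitness >= threshold_m

if __name__ == "__main__":
    # Example measurements; the first one is ignored because it is taken
    # before 600 ms, i.e. while the robot moves into its starting posture.
    samples = [(100.0, 0.002), (650.0, 0.015), (1300.0, 0.0114), (2000.0, 0.02)]
    fitness = ground_clearance_fitness(samples)
    print(fitness, stage3_passed(fitness))
```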

Stage Four: Torso Inclination
The objective of stage 4 is to reduce the torso leaning present in the reference controllers. The trunk rotation is calculated as described in section 5.1.1, by calculating the rotation of the robot origin node in the world coordinates at each time step. If $\vec{e}_{y_0}$ and $\vec{e}_{y_t}$ are the vectors representing the y-axis at the time steps 0 and t respectively, then the cosine of the torso’s rotation angle $\alpha_t$ is:

$$\cos\alpha_t = \frac{\langle \vec{e}_{y_t}, \vec{e}_{y_0} \rangle}{\|\vec{e}_{y_t}\| \cdot \|\vec{e}_{y_0}\|} \tag{5.8}$$

The idea is to minimize the maximum angle made by the torso during the whole run. The torso’s inclination is measured at every time step. The objective of the fitness function is to minimize the highest angle measured during the run, thus:

$$f = \left(\max_{0 \le t \le t_T} \alpha_t\right)^{-1} \tag{5.9}$$

An inclination of no more than 15° is assumed to be reasonable and is therefore used as the transition condition to stage 5.

Stage Five: Energy Consumption
Finally, if the controller was suitable enough to fulfill all the previous robustness criteria, it is optimized for energy consumption in this last stage. Unfortunately the robot’s energy consumption cannot be measured directly in the simulator. However, using some assumptions, a reasonable estimation can be made. For instance, in a DC motor, the energy consumed during a given period of time t is given by:

$$E = Pt = UIt \tag{5.10}$$

where P is the motor power, U the motor voltage, and I the motor current. Moreover, since a DC motor is considered, the voltage is constant, meaning that the parameters which can influence the energy consumption are the time t and the current I. Also, the motor equation gives a direct relationship between the torque T and the current I through the motor constant $K_w$:

$$T = K_w I \tag{5.11}$$

Therefore, it is possible to write with some manipulations:

$$E = UIt = \underbrace{U \frac{1}{K_w} t}_{K'} \, T = K' T \simeq T \tag{5.12}$$

Eq. (5.12) shows that the torque developed by a DC motor is a loose approximation of the energy consumption. This is reasonable, as the utilization time of the motors is the same for all joints. Hence summing the torques developed by all the actuated joints during the whole run gives a reasonable idea of the energy consumed by the robot.

$$T_{tot} = \frac{1}{t_T} \sum_i \sum_j T_{ij} \cdot dt_i \tag{5.13}$$

where $T_{ij}$ is the torque measured on joint j during time step i and $t_T$ is the total run duration. However, Eq. (5.13) is not without limitations. Firstly, the motors do not all have the same size, therefore it is not really correct to consider that the constant K′ from Eq. (5.12) is the same for every motor. Also, the robot CPU and sensors do consume energy, but it is reasonable to consider that their consumption is similar for every run, as long as the working period of time is consistent. Besides, the main goal of this approximation is not to estimate the real energy consumption but to obtain an estimation against which the energy consumption of the different solutions produced by the optimizer can be evaluated. The fitness function of stage 5 is thus the minimization of the energy consumption as per Eq. (5.14):

$$f = 1/T_{tot} \tag{5.14}$$

As stage 5 is the last one, the stop condition is reached once all the iterations are finished.
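The estimate of Eq. (5.13) and the fitness of Eq. (5.14) can be sketched as below. The nested-list layout of the torque samples and the function names are assumptions of this illustration. The sketch also sums torque magnitudes, which is an assumption: Eq. (5.13) does not state how the sign of the torques is handled.

```python
def mean_total_torque(torques, dts):
    """Eq. (5.13): sum over time steps i and joints j of T_ij * dt_i, divided by t_T.

    torques[i][j] is the torque on joint j during time step i,
    dts[i] is the duration of time step i in seconds.
    """
    if len(torques) != len(dts):
        raise ValueError("one time step duration is needed per torque sample")
    t_total = sum(dts)
    if t_total <= 0.0:
        raise ValueError("the run duration must be positive")
    # Assumption: magnitudes are summed so that opposite torques do not cancel.
    weighted = sum(sum(abs(t) for t in step) * dt for step, dt in zip(torques, dts))
    return weighted / t_total

def stage5_fitness(torques, dts):
    """Eq. (5.14): the lower the torque estimate, the higher the fitness."""
    return 1.0 / mean_total_torque(torques, dts)

if __name__ == "__main__":
    # Two time steps of 0.5 s each and two joints.
    print(mean_total_torque([[1.0, 2.0], [3.0, 1.0]], [0.5, 0.5]))
```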

Stage Fitness Functions Summary

Stage   Variable   Criteria                Next Condition
1       v          exp(-(v - b)²/2c²)      v_t ± v_e
2       t          t                       t > 9.9 s
3       h_gc_t     h_gc_t                  h_gc_t > 0.01 m
4       α_t        1/‖α_t‖                 ‖α_t‖ < 15°
5       T_tot      1/T_tot                 -

Table 5.2: Summary of the fitness functions of the different stages and of the conditions to move to the next stage.
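Table 5.2 can be read as a small evaluation pipeline: a run is scored by the fitness function of the first stage whose next condition it does not fulfill. The sketch below illustrates this reading only. How Stage PSO actually assigns particles to stages and neighborhoods is not described here, so the function, the dictionary keys and the rule "first unfulfilled condition" are assumptions of this illustration.

```python
import math

def stage_and_fitness(run, v_t, v_e):
    """Return (stage number, fitness) for one evaluated run.

    run is a dict with hypothetical keys: 'v' (m/s), 't' (s), 'h_gc' (m),
    'alpha' (degrees, highest torso inclination) and 'T_tot' (torque estimate).
    """
    c = math.sqrt(v_e / 6.0)  # c parameter of the stage 1 Gaussian (table 5.1)
    stages = [
        # (fitness of the stage, condition to move on to the next stage)
        (lambda r: math.exp(-((r["v"] - v_t) ** 2) / (2.0 * c ** 2)),
         lambda r: abs(r["v"] - v_t) < v_e),
        (lambda r: r["t"], lambda r: r["t"] > 9.9),
        (lambda r: r["h_gc"], lambda r: r["h_gc"] > 0.01),
        (lambda r: 1.0 / r["alpha"], lambda r: r["alpha"] < 15.0),
        (lambda r: 1.0 / r["T_tot"], lambda r: False),  # last stage, no next condition
    ]
    for number, (fitness, can_move_on) in enumerate(stages, start=1):
        if not can_move_on(run):
            return number, fitness(run)

if __name__ == "__main__":
    # Hypothetical run that walks at the 0.28 m/s target speed but falls after 4 s.
    falling = {"v": 0.28, "t": 4.0, "h_gc": 0.02, "alpha": 10.0, "T_tot": 20.0}
    print(stage_and_fitness(falling, v_t=0.28, v_e=0.04))
```

The fitness functions are wrapped in lambdas so that only the function of the stage actually reached is evaluated.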

5.2.2 Additional Considerations

Once all the stages were implemented, the first batches of simulations could be started. The first results looked promising but for one point: there were internal collisions between both legs. A potential solution would have been to activate the internal collision detection featured by Webots, but this was problematic for two reasons. Firstly, it induces a huge performance penalty. Secondly, due to the way the bounding geometries are built into the HOAP-2 model, there are internal collisions by design, which is fine as long as the Webots internal collision detection is disabled. That means that internal collisions must be handled separately and by hand.

A reduced collision detection algorithm was thus implemented in the physics plug-in. The physics plug-in provides a callback function which is called whenever a collision occurs between two geometries¹. By checking that each geometry belongs to a different leg, it is possible to discard all the collisions which are not problematic. However, to prevent false positives the penetration depth also needs to be calculated. Only when the penetration depth is non-zero is the collision considered a problem. When this is the case, the run is stopped, inflicting an indirect penalty on the fitness score as if the robot had fallen down.

¹ Geometries are the bounding objects used by the Webots physics engine to compute the collisions between simulated objects.
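The filtering logic described above can be illustrated with the following sketch. It is written in Python for readability and does not use the Webots plug-in API: the `Geometry` class, its `leg` attribute and the function name are hypothetical and only stand in for whatever the callback receives.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Geometry:
    """Hypothetical stand-in for a bounding geometry handled by the physics engine."""
    name: str
    leg: str  # "left", "right", or "" for geometries which belong to no leg

def is_problematic_collision(geom_a, geom_b, penetration_depth):
    """True only for a contact between the two legs with a non-zero penetration depth."""
    both_on_legs = bool(geom_a.leg) and bool(geom_b.leg)
    different_legs = geom_a.leg != geom_b.leg
    return both_on_legs and different_legs and penetration_depth > 0.0

if __name__ == "__main__":
    left_knee = Geometry("left knee", "left")
    right_knee = Geometry("right knee", "right")
    # A run would be stopped when this returns True.
    print(is_problematic_collision(left_knee, right_knee, 0.002))
```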

5.2.3 Results

The results section is divided into three parts to reflect the different variants proposed. To analyze the controller behavior in different conditions, the experiment proposed in § 5.2.1 was repeated with three different stage one target speeds: 0.14 m/s, 0.28 m/s, and 0.42 m/s. The results of each variant are presented independently and analyzed together in the discussion. The effects of the different stages on the controller parameters are also discussed.

Target Speed: 0.14 m/s

As was the case with the first proposal using implicit fitness functions, the experiments where the target speed was set to 0.14 m/s did not produce meaningful results. In fact, it seems that the optimizer is not able to find any working solution at such (reduced) speeds. The usual solution attempted by the optimizer is to produce trajectories for the frontal hip joint which are very close to a step. Hence the swinging leg moves quickly to its extremal positions (figures 5.2(a) and 5.2(b)). Due to its mass distribution, the robot then swings as a whole until its foot touches the ground (figure 5.2(c)). Then the robot tries to bring its legs back together until it finally falls down (figures 5.2(d) to 5.2(i)). The problem with this walking “technique” is that the gait is extremely unstable and the robot falls down really quickly. Hence none of the ten repetitions was able to produce a controller which could make the robot walk for at least 10 s.

Attempts to increase the number of iterations to 1000 or the population size to 200 were made but did not produce better results, as the type of controller trajectories remained the same. All the robustness and stability testing presented in chapter 7 is therefore carried out only for target speeds of 0.28 m/s and 0.42 m/s.


Target Speed: 0.28 m/s

Finding a good controller can be a difficult task. To illustrate this difficulty some statistics are presented. Out of the ten repetitions carried out, only 30% of the fittest solutions² were able to reach the last stage, 20% reached stage four, 10% stage three and the remaining 40% got stuck at stage two. Two solutions stood out, either through their fitness scores or through qualitative visual inspection of the gait. In addition, it is interesting to note that the time needed by the optimizer to converge towards a (sub-)optimal solution can vary considerably. As shown in figure 5.3, repetition (a) was able to reach stage five after only 62 iterations whereas repetition (b) needed 211 iterations. As such, the final properties of both controllers are deeply influenced by the time that each solution spends in each stage of the optimization process.

² To be clear, the fittest solution is the solution whose fitness score is the highest in its repetition. The statistics presented at the beginning of each section consider only the fittest solution of each of the 10 repetitions.

[Figure 5.2: image sequence of the gait, panels (a) to (i).]

Figure 5.2: “Best” gait from the 0.14 m/s target speed experiment. The transition from a standing still posture into a full stride is broken down into steps. The gait is typical of the results obtained for low target speeds. A video of the gait can be seen on the Biorob laboratory website: http://biorob.epfl.ch/page-65543.html.


[Figure 5.3: two panels, (a) and (b), titled "Stages Repartition Between Solutions"; x-axis: Iterations (50 to 400); y-axis: Solutions (0 to 100); one colored area per stage, from stage 1 to stage 5.]

Figure 5.3: Distribution of the solutions between the stages at each iteration. The data comes from the two best repetitions of the 0.28 m/s target speed experiment. Each color represents a given stage. Although both repetitions reached stage 5, the time necessary to do so was not the same. As the iterations go by, the number of solutions in the lower stages tends to decrease while the number of solutions in the higher stages increases.

When figures 5.3(a) and 5.3(b) are compared, they show very different situations. In the first case the optimizer is able to converge quickly towards (sub-)optimal solutions, i.e. solutions present in stage five. From that moment on, the more iterations are made, the more solutions are in the highest stages. However, stage four contains only a minimal number of solutions. This can be explained by the fact that the crossing condition between stages four and five is rather loose, as a 15° trunk inclination is a relatively easy target to reach. In the second case the optimizer had more difficulty converging towards a (sub-)optimal solution. However, once a solution is found the optimizer converges successfully.

The gait produced by the fittest solution from repetition (a) can be seen in figure 5.4. This gait looks very natural and human-like. The movements are smooth and limited in amplitude. The foot transition on the ground goes from heel to toe. The trunk oscillations are limited and do not increase beyond 11.35° and, contrary to the reference controller, the robot moves by small steps. Figure 5.4(i) shows the moment where the knee starts to bend in order to provide enough ground clearance to the robot.

The gait from repetition (b), which can be seen in figure 5.5, is not as natural as the gait from repetition (a). This is due to several factors. The first factor is the bending of the knee, which moves the foot higher than previously. Also, the transition of the foot on the ground is a potential source of instabilities. Instead of performing a nice heel-to-toe transition there is no real transition. For instance, during the swing phase the “toe” part of the foot is closer to the ground than the heel (figures 5.5(a) to 5.5(d)). Hence at the very end of the swing phase the ankle rotates the foot in such a way that it becomes parallel to the ground just before contact. The contact, which is a bit harsh, makes the robot swing slightly forward (figures 5.5(e) to 5.5(f) compared to figure 5.5(g)), thus creating this particular transition.

Obviously the controller trajectories are directly linked to the gait of the robot. It is interesting to look in the trajectories for the particularities of the gaits described above. For instance, the repetition (a) controller presented a very nice heel-to-toe transition during the contact between a foot and the ground. The transition can be observed in the frontal ankle joint trajectory in figure 5.6. The stance phase of a leg occurs roughly between $t_{0.25}$ and $t_{0.75}$ (with the ground contact occurring at $t_{0.25}$ and the foot leaving the ground at $t_{0.75}$), hence the peak of the ankle joint is ideally situated in order to produce a nice transition. In the first half of the stance phase the angular position of the ankle is increasing, which corresponds to the damping of the contact between the heel and the ground. After the peak the ankle joint helps the rest of the body to move forward. The steps that the robot makes are relatively small, and this can be linked to the frontal hip amplitude of 29.14° against the almost 60° of the reference controller from chapter 4.

Repetition (b) on the other hand has a different gait. The contact between the ground and the feet is harsher and the toe is closer to the ground than the heel during the major part of the stride. The peak in the ankle trajectory is necessary to provide the transition motion during the stance. The motion of the feet is due to the particularly high maximum angular position of the knee joints, which is above 100°, hence explaining the height of the foot in figure 5.5(a).

Performance-wise the controller from repetition (a) produces a more efficient gait, as could be expected. The robot motion is faster (0.311 m/s vs 0.290 m/s) and the energy consumption is also lower (19.468 KNrad vs 21.851 KNrad), which can easily be explained by the more limited motion of the joints of repetition (a). The frequency of repetition (b) is 22.3% lower; however, the decrease in speed is smaller. Part of the difference is compensated by the higher frontal hip joint amplitude of repetition (b). Finally, table 5.3 summarizes the main performance values of both controllers.

[Figure 5.4: image sequence of the gait, panels (a) to (i).]

Figure 5.4: Gait of the 0.28 m/s (a) repetition. The transition from a standing still posture into half of the stride is broken down into steps. The gait looks very natural, every movement is smooth and the foot provides a transition from heel to toe when touching the ground. A video of the gait can be seen on the Biorob laboratory website: http://biorob.epfl.ch/page-65543.html.

Repetition   (a)      (b)
velocity     0.311    0.290    m/s
h_gc         0.0114   0.0103   m
α            13.751   14.884   °
T_tot        19.468   21.851   KNrad
frequency    0.856    0.665    Hz

Table 5.3: Best repetitions from the experiment with a 0.28 m/s target velocity. Repetition (a) converged quickly to stage 4, hence the controller was optimized for energy consumption, while repetition (b) spent more time reducing the trunk inclination. This can be observed directly in the numerical scores of the optimization process and in figure 5.3.
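The relation between frequency, speed and step size in table 5.3 can be checked with a few lines of arithmetic. The sketch below assumes that the frequency listed in the table is the stride frequency, so that velocity divided by frequency is the distance covered during one oscillator period; this interpretation and the function names are assumptions of this illustration.

```python
def relative_change(new, old):
    """Relative change from old to new, e.g. -0.1 for a 10% decrease."""
    return (new - old) / old

def distance_per_cycle(velocity, frequency):
    """Distance covered during one oscillator period, in metres."""
    return velocity / frequency

if __name__ == "__main__":
    # Values of table 5.3 for repetitions (a) and (b).
    v_a, f_a = 0.311, 0.856
    v_b, f_b = 0.290, 0.665
    print("frequency change (b) vs (a):", relative_change(f_b, f_a))
    print("velocity change (b) vs (a):", relative_change(v_b, v_a))
    print("distance per cycle (a), (b):",
          distance_per_cycle(v_a, f_a), distance_per_cycle(v_b, f_b))
```

Under this assumption, repetition (b) covers more distance per cycle than repetition (a), which is consistent with its higher frontal hip amplitude compensating part of the lower frequency.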

[Figure 5.5: image sequence of the gait, panels (a) to (i).]

Figure 5.5: Gait of the 0.28 m/s (b) repetition. The transition from a standing still posture into half of the stride is broken down into steps. The gait is a bit rough, resulting in a transition which is not smooth. A video of the gait can be seen on the Biorob laboratory website: http://biorob.epfl.ch/page-65543.html.


[Figure 5.6: four panels; the title of the first panel is not recoverable, the others are titled "frontal hip", "frontal knee" and "frontal ankle"; each plots the joint angle [deg] against an x-axis running from 0 to 1.]

Figure 5.6: Trajectories of the 0.28 m/s target speed experiment, repetition (a) solution.

[Figure 5.7: four panels; the title of the first panel is not recoverable, the others are titled "frontal hip", "frontal knee" and "frontal ankle"; each plots the joint angle [deg] against an x-axis running from 0 to 1.]

Figure 5.7: Trajectories of the 0.28 m/s target speed experiment, repetition (b) solution.


Target Speed: 0.42 m/s

When the target speed is increased the optimizer can find solutions more easily. While the target speed was 0.28 m/s, only 30% of the best solutions were able to reach stage five. With a 50% increase in the target speed, the fraction of the fittest solutions reaching the last stage increased to 60%, stage four represents 20%, and the remaining 20% are equally shared between stage two and stage three.


[Figure 5.8: two panels, (a) and (b); x-axis: Iterations (50 to 400); y-axis: Solutions (0 to 90); one colored area per stage.]

Figure 5.8: Distribution of the solutions between the stages at each iteration. The data comes from the two best repetitions of the 0.42 m/s target speed experiment. Each color represents a given stage.

Both repetitions were able to generate stage five solutions in around 100 iterations (133 for (a) and 113 for (b)). Two interesting remarks can be made on figure 5.8. Firstly, in both repetitions the number of solutions in stage three (i.e. optimizing for ground clearance) is almost null. Secondly, and especially for repetition (a), the number of stage four solutions increased significantly as the iterations went by. Depending on the type of gait, it can be more difficult to optimize either for ground clearance or for torso inclination.

The gait from repetition (a) is interesting in many ways. It looks reasonably natural in the sense that the transition of the foot on the ground is “almost” heel-to-toe-like (see figures 5.9(d) to 5.9(h)). Secondly, the way the transition between stance and swing phase is realized (figure 5.9(i)) gives the sensation that the robot is running even though it is not. The ankle movement is quite complex. On the one hand it helps the transition from stance to swing phase; on the other hand it provokes an oscillation of the foot when it is in contact with the ground (figure 5.9(g)). This oscillation makes the robot stand on a single heel when both legs are parallel, therefore the ground clearance is unusually high, as can be verified in figures 5.9(f) and 5.9(g). In fact this gait is a real motivation for the addition of a toe on the robot’s feet.

Repetition (b) gait is more elegant and also looks more natural. It shows a really nice heel-to-toe transition. It does look like a faster version of the gait from repetition (a) of the previous experiment. However the gait has some stability issues which cannot directly be seen in figure 5.10.

The trajectories from repetition (a) present two main differences compared to the other trajectory shapes discussed up to now. First, the frontal knee trajectory was interpolated on only three points. This sometimes happens when the distance along the x-axis between two data points is null. In that case the problematic data point is simply removed and the curve is interpolated from only three points. Nonetheless, this does not mean that the shape of the curve is completely different from one interpolated on four points. In this case the knee trajectory from the first repetition presents the same type of shape (figure 4.3) as the reference controller from chapter 4. The second unique feature of this controller is that the ankle joint trajectory has two peaks instead of one. The second peak corresponds to the propulsion furnished by the ankle that can be seen in figures 5.9(c) and 5.9(i). This movement occurs at the end of the stance and helps the robot to move forward. It also justifies the addition of an extra articulation to the feet of the robot.

Visually the gait from repetition (b) is comparable to the gait from repetition (a) obtained with the 0.28 m/s target speed. Hence it is logical to compare the trajectories of both controllers. Interestingly enough, they do look similar. The lateral hip and ankle trajectories of the two controllers resemble each other in shape and have the same amplitude, but the plateau of the faster controller is larger. On the other hand, the plateau of the frontal hip joint is so reduced in the faster version that the shape of the hip curve is almost a sine. Its amplitude is also increased to account for the higher velocity of the robot.

Finally the numerical scores of the two controllers are compared in table 5.4. The velocity from repetition (a) is slightly higher than for repetition (b). Both solutions present ground clearance values which are quite outstanding. This relates to the lack of solutions in stage 3 that could be seen in figure 5.8. From an energy point of view the second solution is more efficient, although the robot moves more slowly.

Repetition        (a)      (b)
velocity [m/s]    0.417    0.395
h_gc [m]          0.023    0.0172
α_t [°]           14.38    13.73
T_tot [KNrad]     24.69    22.07

Table 5.4: Performance scores of the two best repetitions of the experiment with a 0.42 m/s target velocity.
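The data-point removal described in this section for the frontal knee trajectory can be sketched as follows. This is only an illustration under stated assumptions: the function name, the tolerance `eps`, the sample coordinates, and the choice of dropping the second point of a coincident pair are all hypothetical and not taken from the thesis implementation.

```python
def drop_coincident_points(points, eps=1e-9):
    """Drop every data point whose x-distance to the previously kept point is null.

    `points` is a list of (x, y) tuples already ordered along the x-axis.
    The tolerance `eps` is a hypothetical choice, not a value from the thesis.
    """
    kept = []
    for x, y in points:
        if kept and abs(x - kept[-1][0]) <= eps:
            continue  # same x as the previous point: skip the problematic point
        kept.append((x, y))
    return kept


# Four hypothetical knee data points, two of which share the same x-coordinate.
knee_points = [(0.0, 10.0), (0.3, 60.0), (0.3, 55.0), (1.0, 10.0)]
usable_points = drop_coincident_points(knee_points)
print(len(usable_points), "points left for interpolation:", usable_points)
```

The remaining three points are then handed to the usual interpolation routine, which is why the resulting curve keeps the same general shape.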


[Snapshot panels (a) to (i).]

Figure 5.9: Gait of the 0.42 m/s (a) repetition. The transition from a standing-still posture into half of the stride is cut into steps. The gait motivates the addition of a toe to the feet. A video of the gait can be seen on the Biorob laboratory website: http://biorob.epfl.ch/page-65543.html.



[Snapshot panels (a) to (i).]

Figure 5.10: Gait of the 0.42 m/s (b) repetition. The transition from a standing-still posture into half of the stride is cut into steps. The gait has a natural look; however, it is a little unstable. The foot provides a heel-to-toe transition when touching the ground. A video of the gait can be seen on the Biorob laboratory website: http://biorob.epfl.ch/page-65543.html.



[Plot panels: one untitled panel, frontal hip, frontal knee, frontal ankle; vertical axes: angle [deg].]

Figure 5.11: Trajectories of the 0.42 m/s target speed experiment, repetition (a) solution.

[Plot panels: one untitled panel, frontal hip, frontal knee, frontal ankle; vertical axes: angle [deg].]

Figure 5.12: Trajectories of the 0.42 m/s target speed experiment, repetition (b) solution.


5.3 Influence of the Robustness Stages on the Controllers

To analyze the resulting controllers properly, it is necessary to understand the effects of the staged fitness functions. The objective of this section is to determine which parameters are the most affected during a single stage optimization. The analysis of a controller from stage i is made by comparing the first solution which reached stage i (i.e. the reference, non-optimized solution) with the first solution to reach stage i + 1 (i.e. the optimized solution).
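The selection rule above can be sketched in a few lines. The log format (a list of records holding an iteration number and the stage reached by a solution) and the function names are assumptions made for illustration; the sample history is hypothetical.

```python
def first_to_reach(history, stage):
    """Return the earliest record whose solution is in `stage` or beyond, else None."""
    for record in sorted(history, key=lambda r: r["iteration"]):
        if record["stage"] >= stage:
            return record
    return None


def comparison_pair(history, stage):
    """Reference = first solution in stage i, optimized = first solution in stage i + 1."""
    return first_to_reach(history, stage), first_to_reach(history, stage + 1)


# Hypothetical optimization log of one repetition.
history = [
    {"iteration": 30, "stage": 3},
    {"iteration": 37, "stage": 4},
    {"iteration": 40, "stage": 4},
    {"iteration": 45, "stage": 5},
    {"iteration": 50, "stage": 5},
]
reference, optimized = comparison_pair(history, stage=4)
print(reference["iteration"], optimized["iteration"])
```

If a repetition never reaches stage i + 1, the second element of the pair is `None` and no comparison can be made for that stage.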

5.3.1 Influence of the Ground Clearance Stage

The ground clearance is of major importance for the gait robustness, hence it is the robustness parameter which is optimized first. As explained in the stages descriptions (see § 5.2.1), a ground clearance of 1 cm is considered reasonable which, although small, is not negligible for a 50 cm tall robot such as the HOAP-2. Several questions then arise. Which parameters are the most sensitive to ground clearance? And does increasing the ground clearance affect the controller performance?

Intuitively, two main possibilities are offered to the optimizer to increase the ground clearance: contracting the knees, contracting the ankles, or a mix of the two. As an example, two controllers from the same repetition are used to illustrate the impact of the ground clearance improvements. The target speed used during stage one for those controllers is 0.42 m/s. The reference controller is the first solution able to reach stage three during the optimization process, hence it has not yet been tweaked to increase the ground clearance. The optimized controller, on the other hand, was the first to reach stage four. It therefore represents a ground clearance optimized version of the reference controller.

The trajectories of the reference and optimized controllers can be seen in figures 5.13 and 5.14, respectively. They present interesting similarities in both shape and amplitude of each joint trajectory. Nonetheless two important differences stand out: the reference lateral joints trajectory is closer to a step than to a sine, and the knee trajectory of the optimized controller presents the exact same shape as the reference except for the angular position at t0 and t1 (t0 being the beginning and t1 the end of the stride).

The transition of the lateral joints from a step-shaped trajectory to a more sine-like curve is not necessarily directly linked to the increase of the ground clearance. It may simply have been adjusted to provide more stability to the gait. However, the increase of the knee angular position at t0 and t1 is linked to the way the ground clearance is calculated. As previously described in table 2.1, the bias of the knee is identical to the frontal hip bias. Thus when the knee trajectory is at t0 and t1, both legs are parallel. It is then logical to see a link between the ground clearance and the angular positions of the knee at t0 and t1. After the optimization stage, the knee angle at these instants increased from 23.97° to 61.36° while the ground clearance increased from 0.0031 m to 0.0112 m.

The qualitative results of the ground clearance optimization can be seen in figure 5.15. Two factors are interesting to observe. On the one hand, a visual inspection shows that the ground clearance is improved; on the other hand, the foot inclination is higher, which is due to the greater bending of the knee in figure 5.15(b).


[Plot panels: one untitled panel, frontal hip, frontal knee, frontal ankle; vertical axes: angle [deg].]

Figure 5.13: Controller trajectories before ground clearance optimization.

[Plot panels: one untitled panel, frontal hip, frontal knee, frontal ankle; vertical axes: angle [deg].]

Figure 5.14: Controller trajectories after ground clearance optimization but before torso inclination optimization.


[Snapshot panels (a) and (b).]

Figure 5.15: Qualitative results of the ground clearance optimization in stage 3. The ground clearance increased from 0.0031 m to 0.0112 m.

From a performance point of view, the changes brought by the ground clearance optimization seem negligible. The decrease of the robot velocity from 0.470 m/s to 0.438 m/s is linked to the reduced frequency after optimization. The energy consumption was also slightly reduced due to the lower velocity.

Controller        Reference    After Ground Clearance Optimization
velocity [m/s]    0.470        0.438
h_gc [m]          0.0031       0.0112
α_t [°]           34.75        30.16
T_tot [KNrad]     27.54        27.50

Table 5.5: Performance results comparison between the reference controller before and after ground clearance optimization.

5.3.2 Influence of the Torso Inclination Stage

The torso inclination stage was added to reduce the leaning of the robot trunk which was present in the first batch of optimizations. The interesting part is now to analyze how the controller can improve the trunk inclination. The same procedure as for the ground clearance is applied, i.e. two solutions from the same repetition are compared. The first solution to reach stage four is taken as the reference, and the first solution to reach stage five is considered an optimized version of the reference controller.

Repetition (a) from the experiment of section 5.2.3 was selected because figure 5.8(a) indicates that a large population of solutions is in stage four and is therefore optimizing for torso inclination. Only a few iterations were required to reduce the torso inclination below 15°: the reference controller is from iteration 37 whereas the optimized controller is from iteration 45. By comparing figures 5.16 and 5.17 it is clear that both controllers are very similar. For instance, the trajectory shapes remained the same, although their respective amplitudes changed. Nearly all of them present some change. The lateral joints trajectories present a slight increase in amplitude (from 4.19° to 5.03°) and the plateau is also marginally larger. There was no real change in the frontal hip amplitude or offset. The knee trajectory presents two noteworthy features. First, its shape did not change, and second, the angular position of the knee at t0 and t1 increased substantially, from 78.52° to 108.35°. In figure 5.18 the real effect of this modification on the gait can be observed. Before optimization, the angular position of the knee when the foot enters in contact with the ground was 30°, while after optimization the angular position at the same moment increased to 45°. This change forces the robot to stand in a more upright position, hence reducing the trunk inclination. As for the frontal ankle joint, the differences that appeared after the optimization are there to match the gait imposed by the knees.

[Plot panels: one untitled panel, frontal hip, frontal knee, frontal ankle; vertical axes: angle [deg].]

Figure 5.16: Controller trajectories before torso inclination optimization.

When comparing the numerical performance of both controllers (table 5.6), it is interesting to note that, besides the torso inclination, the other parameter values remain close. There is however a significant increase in the ground clearance score, which is explained by the modifications of the knee trajectory, as previously discussed in section 5.3.1.

5.3.3 Influence of the Energy Consumption Stage

All the fitness functions used during the optimization process were designed for two main purposes: producing a walking gait on the one hand and improving its robustness on the other. The last stage, the energy consumption optimization, however, is


[Plot panels: one untitled panel, frontal hip, frontal knee, frontal ankle; vertical axes: angle [deg].]

Figure 5.17: Controller trajectories after torso inclination optimization.

[Snapshot panels (a) and (b).]

Figure 5.18: Comparison of the gaits from the controllers used for the torso inclination analysis. In (a) the reference controller presents a knee angle of 30° when the foot touches the ground. In (b) the optimized controller knee has an angle of 45° when the foot touches the ground. The change in the way the knee behaves forces the robot to stand in a more upright position, hence reducing the torso inclination.


Controller        Reference    After Torso Inclination Optimization
velocity [m/s]    0.438        0.454
h_gc [m]          0.010        0.025
α_t [°]           28.56        14.90
T_tot [KNrad]     29.08        27.58

Table 5.6: Performance results comparison between the reference controller before and after torso inclination optimization.

not present for either of those reasons, although it is an interesting property to optimize for, as it can improve the robot autonomy. It is therefore interesting to compare a controller from the same repetition before and after energy consumption optimization.

The first controller to reach stage five is considered not to have been optimized for energy consumption and is used as the reference. The best controller of the repetition, i.e. the controller whose stage five fitness is the highest, is considered to have been optimized for energy consumption. The points of interest are threefold. Firstly, is the energy consumption reduction really effective? Secondly, what controller parameters are changed in order to reduce energy consumption? And finally, how is the global performance of the controller affected within the allowed condition boundaries of the staged fitness?

The controller considered is from repetition (a). As an indication, the reference controller (i.e. without energy optimization) is from iteration number 62 while the optimized controller is from iteration number 378. The energy-savvy controller thus benefited from a large number of iterations to be improved by the optimization algorithm.

1. Is the energy consumption reduction really effective?
The easiest way to check the effectiveness of the energy consumption reduction is to look at the numbers. The reference controller consumption is 25.243 KNrad whereas the energy-savvy controller reduced its consumption to 19.468 KNrad. The energy consumption optimization was therefore quite effective.

2. What controller parameters are changed in order to reduce energy consumption?
The first parameters to check are the coordinates of the data points used to generate the trajectories that drive the joints of the robot. The main interest in comparing the joint trajectories of the reference and energy-savvy controllers is to ensure that both controllers are still similar. The comparison is then fairer, as it ensures that the optimized controller is really an improvement of the reference design and not a whole new controller, however efficient it may be. By inspecting figures 5.6 and 5.19, it can be seen that both controllers are really similar. The main difference lies in the plateau width of the lateral joints, whose change transformed the reference trajectory from a very sine-like shape into the more conventional flattened sine curve described in chapter 2.


[Plot panels: one untitled panel, frontal hip, frontal knee, frontal ankle; vertical axes: angle [deg].]

Figure 5.19: Reference controller joint trajectories coming from the same repetition as the trajectories from figure 5.6. Both solutions present almost identical trajectories except for the lateral hip joint, which presents a sine-like shape in the reference controller and a wider plateau in the optimized solution.


The remaining parameter is the frequency, which determines the robot motion speed. Common sense would indicate that a reduction in the robot velocity would reduce its energy needs. Surprisingly, in this example the contrary holds. The frequency increased from 0.779 Hz to 0.860 Hz, driving the robot velocity from 0.270 m/s to 0.311 m/s. This surprising result tends to indicate that the conjunction of the controller trajectories and the robot dynamics favors higher velocities. The procedure of making small adjustments to the controller itself while at the same time tweaking the frequency is quite interesting and elegant. On the one hand, to optimize the energy consumption the controller must be able to reach the last stage; if the controller is too different from the reference, it may not be able to do so. On the other hand, by adjusting the motion velocity of the robot through a single parameter, the frequency, the optimizer can make the controller operate the robot where its dynamic properties make it more efficient.

3. How is the global performance of the controller affected within the allowed condition boundaries of the staged fitness?

Controller        Reference    Energy-Savvy
velocity [m/s]    0.270        0.311
h_gc [m]          0.0111       0.0114
α_t [°]           13.7209      13.751
T_tot [KNrad]     25.243       19.468

Table 5.7: Performance results comparison between the reference controller before and after energy consumption optimization. Besides the measured torque, the remaining results are similar, indicating that the optimization of the controller is realized by small adjustments to the shapes of the controller trajectories.

By simply comparing the fitness values of each stage in table 5.7, it can be seen that the optimizer was able to successfully reduce the energy consumption while leaving the controller capabilities virtually unchanged. It would therefore be interesting to see if this type of optimization is still possible with a fixed frequency on the one hand and with differently shaped trajectories on the other.

The other repetitions which were able to reach stage five show the same sort of properties as the ones presented above, namely a pronounced similarity between the controller trajectories and an increase in the frequency and the robot velocity.
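The figures quoted in this section can be cross-checked with a few lines of arithmetic. The sketch below only reuses the velocities, frequencies, and torque scores reported above; the dictionary layout and the helper names are illustrative choices, and the distance per oscillator cycle is a derived quantity that is not reported in the thesis.

```python
# Values reported for the reference and energy-savvy controllers.
reference = {"velocity": 0.270, "frequency": 0.779, "torque": 25.243}
optimized = {"velocity": 0.311, "frequency": 0.860, "torque": 19.468}


def relative_change(before, after):
    """Signed relative change, e.g. -0.10 for a 10% reduction."""
    return (after - before) / before


def distance_per_cycle(controller):
    """Distance covered during one oscillator period, in meters."""
    return controller["velocity"] / controller["frequency"]


torque_change = relative_change(reference["torque"], optimized["torque"])
velocity_change = relative_change(reference["velocity"], optimized["velocity"])
frequency_change = relative_change(reference["frequency"], optimized["frequency"])

print(f"torque score change: {torque_change:+.1%}")
print(f"velocity change:     {velocity_change:+.1%}")
print(f"frequency change:    {frequency_change:+.1%}")
print(f"distance per cycle:  {distance_per_cycle(reference):.3f} m -> "
      f"{distance_per_cycle(optimized):.3f} m")
```

Under these numbers the velocity grows slightly faster than the frequency, so the distance covered per cycle also increases a little while the torque score drops.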

5.4 Discussion

After the optimization of a reference controller, several improvements to the fitness function were proposed, implemented, and tested. The replacement of a single fitness function with a sequential fitness evaluation through staged PSO proved able to generate suitable gaits which look more natural and are more efficient than the gaits produced by the reference controller, while ensuring that minimal robustness features are present.

The experiment was realized three times with three different target speeds: 0.14 m/s, 0.28 m/s, and 0.42 m/s. Each variant was repeated 10 times, leading to 30 complete optimizations. The first variant, where the target velocity is equal to 0.14 m/s, did not produce any meaningful results. The problem seems to lie in the shape of the sine-like function of the lateral and hip joints, which presents symmetrical stance and swing phases. A possible idea to test is to allow the optimizer to choose the duration of the swing and stance phases.

The second variant, with a target speed of 0.28 m/s, led to the optimization of a natural, very human-like gait. However the ratio of solutions reaching the last stage of the optimization process is quite low, only 30%.

The last variant, where the target velocity was set to 0.42 m/s, was the most prolific in the ratio of repetitions able to reach stage five. The robot dynamics coupled with the type of trajectories of the controllers tend to favor higher speeds. Also, the first repetition of this variant clearly illustrates why a foot built without articulation at the toes is problematic, even if the controller seems to accommodate this fact and still generates a reasonable gait.

The effects of each stage on the controller parameters were also studied. It was shown that the ground clearance is usually increased through a larger knee contraction.


6 In Need for a Toe

6.1 Motivations

The repetition (a) controller from section 5.2.3 illustrates perfectly why a toe could be useful. In figure 5.9(i) the robot's only contact with the ground is made through the very tip of the foot. The robot is therefore in a quite unstable position. Should an external perturbation be applied to the robot at this moment, it would most certainly fall. Adding a passive toe to the feet would also allow the robot to store energy at the beginning of the stance phase. It could then release it at the end of the stance, when the foot is used to move the robot forward.

6.2 Adding a Toe to the HOAP-2 Webots Model

The toe is designed to improve the ground contact of the foot during stance and to provide energy storage in the spring-damper system. Hence, it does not have any active components and is strictly realized as a passive joint. The toe was modeled in Webots as a servo node. It is made out of two distinct pieces: the servo and the toe. The servo is a cylinder which provides the rotation while the toe itself is the element which is used as the ground contact interface.

The toe is 62 mm wide (i.e., the same width as the foot), 12 mm thick, and 25 mm long, which represents 25% of the foot length. The mass of the toe was evaluated by considering that its structure is made of POM (polyoxymethylene) and that the joint itself is a leaf spring made of steel. Hence the whole toe mass is 0.0289 kg, which is quite lightweight. The toe maximum angular amplitude is limited to 60°. A picture of the toe model can be seen in figure 6.1.

Figure 6.1: HOAP-2 foot with the additional toe under the Webots simulator. The gray cylinder is the toe passive joint while the black block on the extremity is the toe itself, which provides the interface with the ground.


The Webots servo model is presented in figure 6.2. The servo is modeled through three distinct components: a motor, a spring, and a damper. They are all set in parallel and link the mass of the parent node to the mass of the servo itself. When a passive joint servo is considered, the motor force is set to zero.

Figure 6.2: Mechanical diagram of a servo as modeled by Webots. Picture taken from the Webots reference manual [7].

The torque applied on the servo is the sum of the spring and damping torques. The spring torque is calculated to move the servo back to its initial position. According to Hooke's law, the spring torque is proportional to the servo position: F_sp = −Kx, where K is the spring constant and x is the current servo position. The damping torque is proportional to the effective servo velocity: F_dp = −Bv, where B is the damping constant and v = dx/dt is the effective servo velocity.
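As a minimal sketch of the passive joint model just described, the following function sums the two contributions. The function and argument names are illustrative, and the stored-energy helper is the standard expression for a linear spring rather than something taken from the Webots manual; positions are assumed to be in radians.

```python
def passive_joint_torque(position, velocity, spring_constant, damping_constant):
    """Torque of a passive joint: F_sp + F_dp = -K*x - B*v (motor force set to zero)."""
    spring_torque = -spring_constant * position     # pulls the joint back to rest
    damping_torque = -damping_constant * velocity   # opposes the current motion
    return spring_torque + damping_torque


def stored_energy(position, spring_constant):
    """Elastic energy held by a linear spring deflected by `position` (radians)."""
    return 0.5 * spring_constant * position ** 2


# Example with the maximum constants allowed to the optimizer in section 6.3.
K_MAX = 2.4      # spring constant bound
B_MAX = 0.0003   # damping constant bound
print(passive_joint_torque(0.5, 2.0, K_MAX, B_MAX))
print(stored_energy(0.5, K_MAX))
```

With such a small damping constant the spring term dominates the torque, which is consistent with the toe being intended mainly as an energy storage element.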

6.3 Controller Optimization with Additional Toes

The optimization process of the controller used for the HOAP-2 robot model with additional passive toes was divided into two parts. The first part was realized with linear stiffness and damping profiles, i.e. the spring and damping constants really are constants. The second part was realized with non-linear stiffness and damping profiles, which means that the spring and damping constants are position and velocity dependent, respectively. The optimization setup used for both parts is rigorously identical to the setup used in chapter 5. Each experiment is run with 0.28 m/s and 0.42 m/s stage 1 target speeds. Each variant is repeated 10 times.

The maximum spring and damping constants were hand-tuned to ensure that the simulation would remain numerically stable. The maximum spring constant is 2.4 N/rad and the maximum damping constant is 0.0003 Ns/rad. Those values were used as boundaries for the optimizer.


6.3.1 Linear Stiffness/Damping Profiles

For the linear stiffness and damping profiles optimization, the spring and damping constants were added to the list of parameters to optimize. Each parameter value can be chosen from a range going from 0 to the maximum value that still ensures the numerical stability of the simulation. At the beginning of the run, the toe constants are read at the same time as the other parameters. They are then used to set the “springConstant” and “dampingConstant” fields of the toe servos. After this point, the simulation is launched as usual.


Generating stage 5 solutions for this experiment was difficult. Out of the 10 repetitions only 20% reached stage 5, 10% stage 4, 10% stage 3, and the remaining 60% were not able to pass stage 2. Moreover, and that is the unfortunate part, the two repetitions which reached stage five developed a gait where the robot was walking exclusively on the toes. Therefore the solution presented here only reached stage 4: it was not optimized for energy consumption and presents a torso inclination which is too pronounced.

The gait visible in figure 6.3 presents an interesting heel-to-toe transition. At the end of the stance, the foot is oriented in a specific way in order to bring the heel closer to the ground (figures 6.3(a) to 6.3(c)). The contact with the ground is harsh, making the robot swing forward. The weight applied on the back of the foot is transferred to the front (figures 6.3(d) and 6.3(e)). Then the robot leans backward and the point of contact with the ground becomes the heel again. This can be observed in figure 6.3(e) in particular, where the weight is applied on the toe. In figure 6.3(d) the weight was applied on the heel, and in figure 6.3(f) the weight is applied on the heel again. This back and forth weight transfer of the foot makes the gait unnatural.

When observing the trajectories of the controller without any other information, it is difficult to say whether the robot was using toes or not. Nonetheless the most interesting feature of the trajectories is the presence of the second peak, which corresponds to the “propulsion” part of the gait, as can be seen in figures 6.3(h) and 6.3(i).

Performance-wise the controller presents a minimal ground clearance (0.0103 m) and a high torso inclination of 26.30°. The energy consumption is also quite high (28.54 KNrad), but this can be explained by the fact that this solution did not reach stage 5. The frequency of the controller is set to 0.639 Hz, which is quite low, but the hip joint amplitude compensates and allows the robot to walk at a speed of 0.319 m/s.


As for the optimizations of the regular robot, the target speed increase is beneficial to the optimizer convergence to stage 5 solutions. In fact 40% of the repetitions were able to converge towards stage 5, 30% to stage 4, and the remaining 30% to stage 2.

In figure 6.5 the heel-to-toe transition of the gait can be seen step by step. The foot touches the ground on the heel first (figure 6.5(c)), although its inclination is reduced. Then,


[Snapshot panels (a) to (i).]

Figure 6.3: Gait of the 0.28 m/s target speed experiment with additional toe and linear stiffness and damping profiles. The transition from a standing-still posture into the full stride is cut into steps. A video of the gait can be seen on the Biorob laboratory website: http://biorob.epfl.ch/page-65543.html.



[Plot panels: one untitled panel, frontal hip, frontal knee, frontal ankle; vertical axes: angle [deg].]

Figure 6.4: Trajectories of the 0.28 m/s target speed experiment with additional toes and linear stiffness and damping profiles.

Parameter               Value
velocity                0.319 m/s
h_gc                    0.0103 m
α_t                     26.30°
T_tot                   28.54 KNrad
frequency               0.816 Hz
spring constant         0.8164 N/rad
damping constant        1.5356 · 10^-3 Ns/rad
toe angular position    44.38°

Table 6.1: Performance score of the toe-augmented controller of the 0.28 m/s target velocity experiment with linear stiffness and damping profiles.


once the robot’s weight has moved forward, the toe is used to provide a smoother and more natural transition when the swinging leg starts to leave the ground.

The controller trajectories (figure 6.6) present numerous similarities with those from figure 5.11. It can be said that the solution currently discussed is a sort of toed version of the solution from section 5.2.3. In the main, the shapes of their respective trajectories are similar. However, with the addition of toes, the second peak of the frontal ankle joint trajectory is no longer present. The variations in the amplitudes of the curves of both solutions are due to the adaptations made necessary by the toe.

Performance-wise it is interesting to note that this controller produces a very fast gait. The robot velocity is 0.453 m/s, slightly higher than the variant without toes. The ground clearance is minimal. The torso inclination is reduced. The controller is also quite efficient considering the speed at which the robot walks. With respect to the toe parameters, the spring constant value is clearly very low. For instance, setting it to 0 does not really influence the gait. This means that the robot does not make any use of the toe as an energy storage unit, which was one of the motivations for the addition of toes.

Parameter               Value
velocity                0.453 m/s
h_gc                    0.0103 m
α_t                     13.81°
T_tot                   23.75 KNrad
frequency               1.0005 Hz
spring constant         0.0077397 N/rad
damping constant        1.1694 · 10^-3 Ns/rad
toe angular position    46.07°

Table 6.2: Performance score of the toe-augmented controller of the 0.42 m/s target velocity experiment with linear stiffness and damping profiles.

6.3.2 Non-Linear Stiffness and Damping Profiles

The main goal of this experiment is to test in simulation the usefulness of non-linear stiffness and damping profiles. The profiles are generated in the same way as the controller trajectories. Namely, data points are given by the optimizer and then interpolated using piecewise monotonic cubic polynomials. A requirement of the non-linear profile is to be able to generate linear profiles. This requirement motivated the optimization of only the y-coordinates of the data points used for interpolation.

The interpolation is realized with three data points. In the case of the stiffness profile, the three x-coordinates are 0°, 30°, and 60°. They were chosen in this way to allow the utilization of the full range of rotation of the toes. The three y-coordinates of the data points are then added to the list of parameters to optimize. The boundaries remain the same as for the linear spring constant, i.e. between 0 and 2.4 N/rad. For instance, to generate a linear profile, all the data points would need to have the same y-coordinate.
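The construction of the stiffness profile can be illustrated as follows. This is a sketch under assumptions: the text only states that piecewise monotonic cubic polynomials are used, so the particular tangent rule below (a weighted harmonic mean of neighbouring secant slopes, as in common monotone cubic Hermite schemes) is a stand-in, and all function names and sample ordinates are hypothetical.

```python
from bisect import bisect_right

K_MAX = 2.4                      # upper bound of the spring constant (from the text)
ANGLES_DEG = (0.0, 30.0, 60.0)   # fixed x-coordinates of the three data points


def monotone_cubic(xs, ys):
    """Return f(x): a piecewise cubic Hermite curve that preserves monotonicity."""
    n = len(xs)
    h = [xs[i + 1] - xs[i] for i in range(n - 1)]
    d = [(ys[i + 1] - ys[i]) / h[i] for i in range(n - 1)]  # secant slopes
    m = [d[0]] + [0.0] * (n - 2) + [d[-1]]                  # node tangents
    for i in range(1, n - 1):
        if d[i - 1] * d[i] > 0:  # same sign: weighted harmonic mean of the secants
            w1 = 2 * h[i] + h[i - 1]
            w2 = h[i] + 2 * h[i - 1]
            m[i] = (w1 + w2) / (w1 / d[i - 1] + w2 / d[i])

    def f(x):
        x = min(max(x, xs[0]), xs[-1])                   # clamp to the data range
        i = min(max(bisect_right(xs, x) - 1, 0), n - 2)  # segment index
        t = (x - xs[i]) / h[i]
        return ((1 + 2 * t) * (1 - t) ** 2 * ys[i]
                + t * (1 - t) ** 2 * h[i] * m[i]
                + t * t * (3 - 2 * t) * ys[i + 1]
                + t * t * (t - 1) * h[i] * m[i + 1])

    return f


def stiffness_profile(y0, y30, y60):
    """Build K(angle) from the three optimized y-coordinates, clipped to [0, K_MAX]."""
    ys = [min(max(y, 0.0), K_MAX) for y in (y0, y30, y60)]
    return monotone_cubic(list(ANGLES_DEG), ys)


constant = stiffness_profile(1.2, 1.2, 1.2)     # equal ordinates: constant K
stiffening = stiffness_profile(0.2, 0.8, 2.4)   # hypothetical progressive spring
for angle in (0.0, 15.0, 30.0, 45.0, 60.0):
    print(angle, constant(angle), stiffening(angle))
```

With three equal ordinates every secant slope is zero, so the curve degenerates to a constant spring constant, which is the linear case of section 6.3.1.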


[Snapshot panels (a) to (i).]

Figure 6.5: Gait of the 0.42 m/s target speed experiment with additional toe and linear stiffness and damping profiles. The transition from a standing-still posture into the full stride is cut into steps. The gait presents a heel-to-toe transition. A video of the gait can be seen on the Biorob laboratory website: http://biorob.epfl.ch/page-65543.html.

Figure 6.6: Trajectories of the 0.42 m/s target speed experiment with additional toes and linear stiffness and damping profiles. [Plots: four panels, including frontal hip, frontal knee and frontal ankle; vertical axis: angle [deg]; horizontal axis from 0 to 1.]

The damping profile is constructed in a very similar way, but the x-axis represents the angular velocity of the toe instead of its angular position. Thus, a maximum angular velocity had to be chosen. The maximum velocity considered is $v_{\max} = \lVert dx \rVert / dt = 60° / 200\,\mathrm{ms}$, i.e. 300 °/s. If the angular velocity measured on the toe becomes higher than $v_{\max}$, the damping constant saturates: $B(v > v_{\max}) = B(v_{\max})$. The boundaries of the y-coordinates of the data points are the same as for the linear profiles, i.e. between 0 and 0.0003 Ns/rad. The position and velocity measurements are taken every 200 ms, after which the spring and damping constants are calculated. The “springConstant” and “dampingConstant” fields of the toe servos are then updated with the new values.

As usual, the experiment was realized with both the 0.28 m/s and 0.42 m/s target speed variants. Each variant was repeated 10 times. The number of iterations, however, was reduced to 350.
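The periodic update of the toe constants described above can be sketched as follows. This is only an illustration under stated assumptions: the function `toe_constants` and the two example profiles are hypothetical, the velocity is estimated by a finite difference over one 200 ms sample, and writing the results into the “springConstant” and “dampingConstant” fields of the servos is deliberately left out.

```python
SAMPLE_PERIOD_S = 0.2   # measurements are taken every 200 ms
V_MAX_DEG_S = 300.0     # 60 deg / 200 ms


def toe_constants(angle_deg, prev_angle_deg,
                  stiffness_profile, damping_profile,
                  dt=SAMPLE_PERIOD_S):
    """Return (spring constant, damping constant) for one update step.

    The velocity is a finite-difference estimate (an assumption of this
    sketch) and is saturated at V_MAX_DEG_S so that B(v > v_max) = B(v_max).
    """
    velocity = abs(angle_deg - prev_angle_deg) / dt
    velocity = min(velocity, V_MAX_DEG_S)
    return stiffness_profile(angle_deg), damping_profile(velocity)


# Hypothetical profiles, used only to exercise the function.
def example_stiffness(angle_deg):
    return 1.2 + 0.01 * angle_deg      # N/rad


def example_damping(velocity_deg_s):
    return 1.0e-4 + 1.0e-7 * velocity_deg_s   # Ns/rad


k, b = toe_constants(40.0, 10.0, example_stiffness, example_damping)
# A 60 deg change within 100 ms would be 600 deg/s and is saturated to v_max.
k_fast, b_fast = toe_constants(60.0, 0.0, example_stiffness,
                               example_damping, dt=0.1)
```

In the thesis the two profiles are the interpolated curves produced by the optimizer; any callable with the same signature can be passed in here.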


The same problems presented by the controllers from sections 5.2.3 and 6.3.1 are present. Due to the lower target speed, the optimizer has more difficulty finding solutions. In fact, in this experiment, the best repetition was only able to converge to stage 4. Moreover, when the gait of this controller was analyzed, the toe was not even used.



Increasing the target speed in stage 1 has a strong positive effect on the convergence of the optimizer. This time 60% of the solutions converged to stage 5, 20% to stage 4 and 20% to stage 2.

The heel-to-toe transition of figure 6.7 is interesting because the toe is closer to the ground than the heel (figures 6.7(a) to 6.7(e)). This is due to the contraction of the toe of the leg in stance in order to orient the swinging foot for the ground contact.

The shapes of the joint trajectories present similarities to those of the reference controller from chapter 4. The most interesting feature of these trajectories is the shape of the frontal ankle joint curve at t1. Because the third and fourth data points are very close, their interpolation produces a small artifact on the joint trajectory. However, due to the P-controller of the servos, the delay between the real trajectory of the servos and the desired position makes this type of artifact almost impossible to follow.

The stiffness and damping profiles can be seen in figure 6.9. In the linear profile experiment, the stiffness of the toe was close to zero. With the non-linear profile the stiffness is higher and increases steadily between toe angular positions of 0° and 30°.

The damping of the toe increases continuously with the toe angular velocity. Nonetheless, between 0 and 150 °/s the damping value is almost constant, while between 150 and 300 °/s the increase is more pronounced.

The maximum angular position reached by the toe is 28.21° and the maximum angular velocity is 141.027 °/s. Comparing those values with the profile curves (figure 6.9) shows that only the first half of each profile is used. Hence the additional degree of freedom provided by the third data point is not used.

A possible improvement in the selection of the data points would be either to add another point between the first two, or to move the second data point closer to the first. Both solutions would introduce supplementary degrees of freedom where the working points of the toes really are.
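The observation that only the first half of each profile is used can be checked with a few lines of Python. The measured maxima and the profile ranges are taken from the text above; the helper names and the alternative grid with an extra point at 15° are hypothetical and only illustrate the proposed improvement.

```python
def used_fraction(max_observed, profile_max):
    """Fraction of the profile range actually visited by the toe."""
    return max_observed / profile_max


def points_inside(xs, max_observed):
    """Data points whose x-coordinate lies within the visited range."""
    return [x for x in xs if x <= max_observed]


position_fraction = used_fraction(28.21, 60.0)      # toe position, degrees
velocity_fraction = used_fraction(141.027, 300.0)   # toe velocity, deg/s

# Current grid: only the first point lies inside the visited range.
current = points_inside([0.0, 30.0, 60.0], 28.21)
# Hypothetical grid with an additional point between the first two.
proposed = points_inside([0.0, 15.0, 30.0, 60.0], 28.21)
```

Both fractions come out at roughly 0.47, which is consistent with the statement that the region shaped by the third data point is never reached.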

Parameter    Value
velocity     0.458 m/s
h_gc         0.0115 m
α_t          14.15 °
T_tot        26.857 KNrad
frequency    0.835 Hz

Table 6.3: Performance score of the toe-augmented controller of the 0.42 m/s target velocity experiment with non-linear stiffness and damping profiles.

Due to the relatively high toe stiffness, it can be said that the robot is storing energy in the first part of the stance (toe contracting) and releasing it during the propulsion phase of stance. If the energy storage were efficient, a benefit should be visible in the total energy consumption of the controller. However, this is not the case, as the energy consumption indicated by table 6.3 is in the high range of all the solutions discussed up to now. This can, however, be explained by the lower number of iterations (350 instead of 400) of the repetition.

Figure 6.7: Gait of the 0.42 m/s target speed experiment with additional toe and non-linear stiffness and damping profiles, shown in panels (a) to (i). The transition from a standing posture into the full stride is cut into steps. The gait presents a heel-to-toe transition. A video of the gait can be seen on the Biorob laboratory website: http://biorob.epfl.ch/page-65543.html.

Figure 6.8: Trajectories of the 0.42 m/s target speed experiment with additional toes and non-linear stiffness and damping profiles. [Plots: four panels, including frontal hip, frontal knee and frontal ankle; vertical axis: angle [deg]; horizontal axis from 0 to 1.]

Figure 6.9: Non-linear stiffness and damping profiles of the toe. The black crosses correspond to the data points used to interpolate the curves. [Plots: stiffness profile versus toe position [deg], from 0 to 60; damping profile (scale ×10^-4) versus toe velocity [deg/s], from 0 to 300.]

6.4 Discussion

A passive toe was modeled and added to the HOAP-2 Webots model. Two experiments were realized: one with linear stiffness and damping profiles, the other with non-linear profiles. The optimization process was conducted in a similar way as for the regular robot (without toes). Two target velocities were used during the first stage of the optimization. The toe did not influence the ability of the optimizer to converge towards stage 5 solutions. As with the regular robot, when the target velocity is set to 0.28 m/s the number of stage 5 solutions decreases.

The ground clearance results indicate that with the toe, the values are closer to the condition threshold (i.e., 10 mm) than they were in the regular robot experiments. This is explained by the fact that adding the toe makes the feet longer. It is then more difficult to provide sufficient ground clearance.

The most natural gait obtained with the toe presented a spring constant close to 0, indicating that there was no energy storage. In fact, the toe was used as a kind of hinge. On the other hand, when the non-linear profiles are considered, only half of the profile region was used. It would be interesting to modify the profile generation in order to generate more complex shapes in the position and velocity regions where the toe is used.

The solutions will now be tested for robustness. It will be interesting to see how the controllers with toes behave compared to the controllers for the regular robot.


7 Stability

In chapters 4, 5, and 6 several controllers were optimized with and without toes. Their gaits and controller trajectories were described and analyzed. It is now time to see how those controllers behave in a more hostile environment.

7.1 Stability Testing Protocol

The stability testing is realized in three parts. The first part is the easiest one for the controllers: they are free to walk without any perturbation. The idea is to see if they can walk for a longer period of time than the 10 s they were optimized for, which allows their inherent stability to be assessed. Their response to a perturbation with a step profile and with a linear profile is tested in parts two and three.

The perturbation forces are applied on four different parts of the robot's body: on the left and right shoulders, and on the back and on the front of the torso. The direction of each force depends on the body part it is applied on. The force applied on the back is along the $\vec{e}_z$ vector of the Webots world coordinates. The force applied on the torso is along the $-\vec{e}_z$ vector. The force applied on the left shoulder is along the $-\vec{e}_x$ vector, while the force applied on the right shoulder is along the $\vec{e}_x$ vector. The testing simulation is started and the robot is free to move for 5 s in order to stabilize its gait, after which the perturbation is added. The applied forces range from 0.1 N to 10 N in steps of 0.1 N. If the force profile is a step, it is applied for 500 ms and the run duration is 15 s. If the profile is linear, the force is applied for 10 s and the run duration is increased to 25 s.

In this chapter, the controllers are named by their type, e.g. the controller without toe and optimized for a target speed of 0.42 m/s is named std042. When toes are added the same type of nomenclature applies, e.g. toe028lin or toe042prof, where the suffix indicates whether the stiffness and damping profiles of the toes are linear (lin) or non-linear (prof).

For each experiment a single controller was selected. The first is the reference controller from chapter 4. For the controllers optimized with staged PSO and without toes, repetition (a) was selected for the medium speed optimization and repetition (b) for the fastest speed optimization. The controllers with toes are the ones presented in chapter 6.
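The perturbation protocol of this section can be summarized in a short Python sketch. Several points are assumptions of the sketch rather than statements of the thesis: the linear profile is taken to be a ramp from zero up to the peak force over its 10 s duration, and `falls` is a hypothetical callback standing in for one simulation run that reports whether the robot fell.

```python
ONSET_S = 5.0  # the robot walks freely for 5 s before the perturbation

# Per profile: (force duration [s], run duration [s]).
PROFILES = {"step": (0.5, 15.0), "linear": (10.0, 25.0)}


def force_levels(start=0.1, stop=10.0, step=0.1):
    """Force amplitudes from 0.1 N to 10 N in steps of 0.1 N."""
    count = round((stop - start) / step) + 1
    return [round(start + i * step, 1) for i in range(count)]


def perturbation_force(t, peak, profile):
    """Force magnitude [N] at simulation time t [s]."""
    duration, _run_duration = PROFILES[profile]
    if t < ONSET_S or t > ONSET_S + duration:
        return 0.0
    if profile == "step":
        return peak
    # Assumed shape of the linear profile: ramp from 0 to peak.
    return peak * (t - ONSET_S) / duration


def minimal_fall_force(falls, profile):
    """Smallest tested amplitude for which falls(peak, profile) is True."""
    for peak in force_levels():
        if falls(peak, profile):
            return peak
    return None  # the controller withstood every tested force
```

For example, with a stand-in `falls = lambda peak, profile: peak >= 3.3`, `minimal_fall_force(falls, "step")` returns 3.3, which is the kind of value reported in the result tables of this chapter.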

7.2 Part One: Inherent Stability

In this section the inherent stability of the controllers is tested. For each controller the maximum run time is set to 2 min. No perturbations of any kind are applied. The robot can then move freely for as long as it can. The results are presented in table 7.1. The toe028lin controller presents stability issues, as it can barely walk for more than 10 s. The controllers making use of non-linear stiffness and damping profiles present intermediate performance. The remaining controllers were all able to make the robot walk for 2 min, or very nearly so in the case of std042 (118.822 s).

Controllers    Time Until Fall [s]
ref            120
std028         120
std042         118.822
toe028lin      11.018
toe042lin      120
toe028prof     45.946
toe042prof     41.912

Table 7.1: Results of the inherent stability testing of the robot controllers.

7.3 Part Two: Step Perturbation Force

The results presented in tables 7.2 to 7.5 correspond to the minimal force necessary to make the robot fall on the ground. The time between the beginning of the simulation and the moment the robot fell on the ground is also indicated. The largest force that any controller was able to withstand is no more than 5.7 N. The controllers which present the highest robustness to step perturbations are ref, std042 and toe042lin. All of them generate a fast robot motion. This correlates with the conclusions drawn from the analysis of the gaits produced by the different experiments. The slowest controllers, such as toe028lin and toe028prof, present a low resilience to perturbations. The former is so unstable that it falls down as soon as a perturbation is applied. The latter is slightly more resistant when the perturbation is applied frontally, i.e. on the torso or on the back.

The std028 controller, whose gait is the most natural and human-like, presents an intermediate resilience to perturbations. Two interesting solutions to compare are std042 and toe042lin, as the former is a nice illustration of the necessity to add a toe to the robot's feet. The latter was optimized in the same conditions but with additional toes. In the four experiments it was able to withstand larger forces than std042.

7.4 Part Three: Linear Perturbation Force

The third robustness testing phase consisted of replacing the step force profile with a linear one. In this experiment the same pattern of results found with the step profile can be observed. The toe028lin and toe028prof controllers still present the lowest resilience. The std042 and toe042lin controllers still present the highest performance.


Controllers    Max Force Applied on the Left Shoulder [N]    Time Until Fall [s]
ref            2.1                                           7.558
std028         5.7                                           7.436
std042         3.3                                           10.062
toe028lin      0.1                                           14.174
toe042lin      3.6                                           6.292
toe028prof     0.1                                           10.672
toe042prof     2.1                                           11.358

Table 7.2: Results of the step force perturbation applied on the left shoulder.

Controllers    Max Force Applied on the Right Shoulder [N]    Time Until Fall [s]
ref            4.4                                            6.334
std028         2                                              11.44
std042         0.9                                            7.65
toe028lin      0.1                                            13.32
toe042lin      4.1                                            9.83
toe028prof     0.1                                            12.412
toe042prof     1.2                                            7.956

Table 7.3: Results of the step force perturbation applied on the right shoulder.

Controllers    Max Force Applied on the Torso [N]    Time Until Fall [s]
ref            1.2                                   7.96
std028         0.6                                   7.59
std042         2.8                                   9.584
toe028lin      0.1                                   13.576
toe042lin      5.6                                   6.346
toe028prof     0.9                                   10.714
toe042prof     0.3                                   12.788

Table 7.4: Results of the step force perturbation applied on the torso.


Controllers    Max Force Applied on the Back [N]    Time Until Fall [s]
ref            2.4                                  8.996
std028         3                                    9.284
std042         3.8                                  10.304
toe028lin      0.1                                  8.8
toe042lin      4.2                                  8.48
toe028prof     0.6                                  13.98
toe042prof     0.8                                  7.966

Table 7.5: Results of the step force perturbation applied on the back.
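The comparison between std042 and toe042lin made in section 7.3 can be reproduced directly from the force values of tables 7.2 to 7.5. The following Python sketch only re-tabulates those published numbers; the dictionary layout and the helper names are choices of this sketch.

```python
# Step perturbation results [N], copied from tables 7.2 to 7.5.
STEP_FORCE_N = {
    "left shoulder": {"ref": 2.1, "std028": 5.7, "std042": 3.3,
                      "toe028lin": 0.1, "toe042lin": 3.6,
                      "toe028prof": 0.1, "toe042prof": 2.1},
    "right shoulder": {"ref": 4.4, "std028": 2.0, "std042": 0.9,
                       "toe028lin": 0.1, "toe042lin": 4.1,
                       "toe028prof": 0.1, "toe042prof": 1.2},
    "torso": {"ref": 1.2, "std028": 0.6, "std042": 2.8,
              "toe028lin": 0.1, "toe042lin": 5.6,
              "toe028prof": 0.9, "toe042prof": 0.3},
    "back": {"ref": 2.4, "std028": 3.0, "std042": 3.8,
             "toe028lin": 0.1, "toe042lin": 4.2,
             "toe028prof": 0.6, "toe042prof": 0.8},
}


def most_resilient(site):
    """Controller with the largest force value at a given application site."""
    forces = STEP_FORCE_N[site]
    return max(forces, key=forces.get)


def force_gain(controller, baseline):
    """Per-site difference in force [N] between two controllers."""
    return {site: round(forces[controller] - forces[baseline], 1)
            for site, forces in STEP_FORCE_N.items()}


toe_gain = force_gain("toe042lin", "std042")
best_per_site = {site: most_resilient(site) for site in STEP_FORCE_N}
```

The gain of toe042lin over std042 is positive at all four sites, as stated in the text, while the single largest value of the whole test (5.7 N on the left shoulder) belongs to std028.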

Controllers    Max Force Applied on the Left Shoulder [N]    Time Until Fall [s]
ref            1.5                                           15.07
std028         0.9                                           14.324
std042         0.8                                           8.67
toe028lin      0.1                                           9.046
toe042lin      2.2                                           8.566
toe028prof     0.1                                           17.684
toe042prof     0.2                                           11.094

Table 7.6: Results of a linear force perturbation applied on the left shoulder.

Controllers    Max Force Applied on the Right Shoulder [N]    Time Until Fall [s]
ref            1.6                                            13.908
std028         0.9                                            16.112
std042         0.8                                            8.116
toe028lin      0.1                                            9.688
toe042lin      1.9                                            15.254
toe028prof     0.1                                            13.144
toe042prof     0.3                                            21.766

Table 7.7: Results of a linear force perturbation applied on the right shoulder.


Controllers    Max Force Applied on the Torso [N]    Time Until Fall [s]
ref            1.6                                   10.432
std028         0.3                                   10.842
std042         2.3                                   9.444
toe028lin      0.1                                   9
toe042lin      1.8                                   13.532
toe028prof     0.2                                   16.144
toe042prof     0.2                                   20.402

Table 7.8: Results of a linear force perturbation applied on the torso.

Controllers    Max Force Applied on the Back [N]    Time Until Fall [s]
ref            3.6                                  10.906
std028         2.3                                  7.262
std042         4.3                                  11.476
toe028lin      0.1                                  10.274
toe042lin      3.3                                  13.91
toe028prof     0.1                                  16.142
toe042prof     0.4                                  10.016

Table 7.9: Results of a linear force perturbation applied on the back.


7.5 Discussion

The fittest solutions of all the experiment variants were tested to assess their resilience to external perturbations. Three tests were conducted. The first was a reference test where no perturbation was applied; it allowed the problematic controllers to be identified right from the beginning. The second consisted of the application of a step force on four body parts of the robot. Finally, the same was done with a perturbation force whose profile was linear.

It is not easy to draw definitive conclusions from the results presented here. However, some interesting trends are present. The slowest controllers generally performed worse. This tends to reflect the difficulty of the optimizer to find good solutions at lower speeds. Both the std042 and toe042lin controllers presented the highest resilience in most experiments. The controller with the toe also tends to resist higher forces than the controller without. This is most certainly due to the increased ground contact provided by the toe.

The controllers with the non-linear stiffness and damping profiles both performed badly. However, the fastest controller was also more resilient than the slowest one. In conclusion, the robot dynamics in conjunction with the controller trajectories really tend to favor robot motion at higher speeds.


8 Conclusion

8.1 Discussion and Conclusion

The project focused on the study of biped locomotion using a CPG-based controller. Sine-based trajectories have already demonstrated their capability to be used for biped locomotion. However, sine-shaped trajectories are not really adapted to a specific robot mechanical structure. Better adapted trajectories were generated by monotonic cubic interpolation. The points to interpolate were generated by the optimizer. As such, simultaneous influence on the amplitude and shape of the curves was possible.

A new fitness evaluation procedure (staged PSO) was used in order to integrate robustness features into the optimization process. It was then possible to obtain gaits which were capable of heel-to-toe transitions and which looked human-like. The experiments were realized for three different walking velocities: 0.14 m/s, 0.28 m/s, and 0.42 m/s. The conjunction of the robot's dynamics with the controllers tended to favor high velocities. As a consequence, it was not possible to optimize a working controller for a target speed of 0.14 m/s.

The degrees of freedom of the HOAP-2 robot's feet are present only in the ankle. There is no articulation at the top of the feet. This can lead to controllers which present one of two undesirable properties. Either the foot is kept as parallel to the ground as possible, which results in a slow and clumsy gait, or the robot presents a heel-to-toe transition and all the weight of the robot is applied on a very small part of the foot. The latter behavior can lead to instabilities in the gait such as spinning. The Webots model of the robot was modified to integrate an additional toe on each foot. The goal was to increase the duration of the foot-ground contact, and also to offer the possibility of energy storage. Two variants were realized: linear and non-linear stiffness and damping profiles. The experiments did not indicate the presence of energy storage. In fact, the solution which presented the most human-like gait used a spring constant close to 0.

In all the experiments, the frequency of the controller was left as an open parameter. For a given controller, the higher the CPG frequency, the higher the robot velocity. Therefore, it was expected that when the robot's motion was faster (slower), the frequency would go up (down). However, this was not the case, due to the large freedom given to the optimizer to generate trajectories for the controllers. Fast controllers were obtained even at low frequencies.

Finally, the best controllers of each experiment were evaluated for robustness by applying localized perturbations. The results indicate that the slower controllers are more unstable, which was expected given the difficulty of the optimizer to converge towards suitable solutions when the target speed is in the lower range. The use of non-linear stiffness and damping profiles did not prove worthwhile, as those controllers were outperformed by the other ones. The std042 and toe042lin controllers turned out to be the most resilient to the external perturbations. Both were optimized targeting the fastest robot speed (0.42 m/s). The former was optimized for the regular robot model whereas the latter used a toe with linear stiffness and damping profiles. The addition of the toe increased the force amplitude the robot could withstand in most cases.

8.2 Future Works

The possibilities for future work are numerous. To begin with, it would be interesting to fine-tune the calculation of the stages used in the optimization process. The change needed the most concerns the torso oscillation stage of the controller. At the moment the optimization is performed on the maximal value measured during the run. However, the most interesting property is the torso oscillation and not the inclination. The ground clearance offset used to determine if the feet are parallel should be extended to larger values when the toe is used. A check for heel-to-toe transitions could also be desirable to prevent the appearance of solutions where the robot walks on its toes or heels.

The difficulty of the optimizer to find solutions for slower controllers needs to be investigated more thoroughly. A good place to start would be to allow the sine-like trajectories to have variable stance and swing phase durations.

The combination of feedback into the CPG and staged PSO could allow the optimization of controllers in unpredictable environments with perturbation forces, varying ground friction coefficients and ground slopes. The transition of the controller to the physical robot could also require a hybrid approach, i.e. during optimization intermediate solutions are transferred to the physical robot and the feedback is integrated into the next simulation iteration. Some work has been realized in this direction in [10], although not for a biped robot.

Finally, due to its simple structure, the toe can easily be built and mounted on the real HOAP-2 robot. Several types of solutions are possible: leaf springs, torsion springs or compression springs could be used. The leaf spring would be the simplest solution mechanically. Its main drawbacks appear in the manufacturing phase, as spring steel is difficult to work with. Another solution would be to use a compliant system made of silicone or polyurethane, which would be molded around the robot's foot. However, a complete functional analysis is required.


Bibliography

[1] T. Sugihara, Y. Nakamura, and H. Inoue, “Real-time humanoid motion generation through ZMP manipulation based on inverted pendulum control,” in Robotics and Automation, 2002. Proceedings. ICRA ’02. IEEE International Conference on, vol. 2, pp. 1404–1409, 2002.

[2] A. J. Ijspeert, “Central pattern generators for locomotion control in animals and robots: A review,” Neural Networks, vol. 21, no. 4, pp. 642–653, 2008. Robotics and Neuroscience.

[3] J. van den Kieboom, “Biped locomotion and stability: a practical approach,” Master’s thesis, EPFL, 2009.

[4] Y. Ogura, K. Shimomura, A. Kondo, A. Morishima, T. Okubo, S. Momoki, H.-o. Lim, and A. Takanishi, “Human-like walking with knee stretched, heel-contact and toe-off motion by a humanoid robot,” in Intelligent Robots and Systems, 2006 IEEE/RSJ International Conference on, pp. 3976–3981, Oct. 2006.

[5] J. van den Kieboom, “Arbitrary wave-form oscillator.” Biorob Laboratory, EPFL.

[6] F. N. Fritsch and R. E. Carlson, “Monotone piecewise cubic interpolation,” SIAM Journal on Numerical Analysis, vol. 17, no. 2, pp. 238–246, 1980.

[7] O. Michel, “Webots: Professional mobile robot simulation,” Journal of Advanced Robotics Systems, vol. 1, no. 1, pp. 39–42, 2004.

[8] P. Cominoli, “Development of a physical simulation of a real humanoid robot,” Master’s thesis, EPFL, 2005.

[9] J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Neural Networks, 1995. Proceedings., IEEE International Conference on, vol. 4, pp. 1942–1948, Nov./Dec. 1995.

[10] B. Adams and R. A. Brooks, “Evolutionary, developmental neural networks for robust robotic control,” 2006.

[11] D. Floreano and C. Mattiussi, Bio-Inspired Artificial Intelligence: Theories, Methods, and Technologies. The MIT Press, 2008.
