The speed of walking robots is a critical factor in many applications, especially in autonomous robot soccer games. Our aim is to increase the walking speed of our robots through automated hardware-in-the-loop optimization.
Many different approaches have been investigated for improving the walking speed of bipedal and quadrupedal robots. For example, numerical optimal control techniques enable the computation of stable and fast walking motions based on three-dimensional computational models of the legged robot dynamics. However, all model-based optimization approaches have in common that their outcome depends critically on the quality and accuracy of the robot model. Deriving a robot model that is accurate enough to achieve the best possible walking speed may require too much effort, considering, e.g., the effects of gear backlash, elasticity, and temperature-dependent joint friction, or of different ground properties. An alternative approach is to start with a reasonable initial walking motion and then to apply online hardware-in-the-loop optimization on the physical robot. For example, reinforcement learning techniques have been used successfully to obtain fast and stable walking motions for four-legged and biped robots: a low-dimensional parametrization of the walking motion is used to optimize the gait, and the merit function is evaluated by performing walking experiments with the real robot.
In the following, our current work on quadrupedal (videos) and bipedal (videos) walking optimization is described briefly.
Walking optimization of four-legged Sony Aibo robots
Optimization for our Sony Aibo robots was performed in two scenarios: (1) straight forward walking without the ball, and (2) turning around the robot's vertical axis without losing the ball, where the ball is placed between the front legs of the robot and has to stay there during the turning motion. These two cases often arise during a robot soccer game: the first when the robot has to reach the ball, the second when it is aligning to pass the ball to another robot or when it tries to score a goal. The walking trajectories of the four legs are determined in space and time by a 31-dimensional parameter vector. This low-dimensional parametrization is achieved by approximating the trajectory by polygons and by taking advantage of the symmetry and redundancies of a four-legged walking robot. The walking optimization is performed on the standard RoboCup 4-legged league field, where the walking or turning speed of the robot is measured by a ceiling camera. The position and orientation of the robot are detected using circular markers attached to the robot's back. The speed of the robot is then approximated from two consecutive measurements over a fixed time interval. The average speed measured over a fixed number of runs is returned to the optimizer as the objective function value. When optimizing turning while holding the ball, a distance sensor in the chest of the robot is used to determine whether the ball was lost while walking. For such an unsuccessful run, an evaluation value of zero is returned to the optimizer, so that walking motions in which the ball is not held securely are penalized.
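The measurement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the units (cm, degrees, seconds), and the choice to pass the ball-lost flag as a single boolean are hypothetical and not taken from the actual robot or camera software.

```python
import math
from statistics import mean

def forward_speed(pos_a, pos_b, dt):
    """Speed in cm/s from two (x, y) marker positions in cm, taken dt seconds apart."""
    return math.hypot(pos_b[0] - pos_a[0], pos_b[1] - pos_a[1]) / dt

def turning_speed(heading_a, heading_b, dt):
    """Turning speed in deg/s from two headings in degrees, taken dt seconds apart.

    The difference is wrapped into [-180, 180), so the robot is assumed to turn
    less than half a revolution between the two measurements.
    """
    delta = (heading_b - heading_a + 180.0) % 360.0 - 180.0
    return abs(delta) / dt

def objective(run_speeds, ball_lost=False):
    """Value returned to the optimizer: mean speed over the runs, or zero if the ball was lost."""
    if ball_lost:
        return 0.0
    return mean(run_speeds)
```

Returning zero for a lost ball keeps the problem formulation simple: the optimizer needs no explicit "hold the ball" constraint, because parameter sets that drop the ball can never win.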
In this way a hardware-in-the-loop layout is applied, in which the robot control is coupled with the optimization tool, in our case APPSPACK (ver. 4.0.2). This method was originally designed for deterministic nonlinear optimization problems, but it performed well for the problem considered here, where the objective is an averaged walking or turning speed. Moreover, the limiting factor in this application is the number of walking experiments. Therefore none of the stopping criteria implemented in APPSPACK is used; these would in any case be disturbed by the stochastic character of the application.
The starting parameter set for the forward walking speed optimization was obtained from an optimization experiment with an evolutionary algorithm; for the turning speed optimization we start with a hand-tuned initial parameter set. Apart from bounds on the parameter space, which represent reachable leg positions, we use the default parameters of APPSPACK.
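To illustrate the kind of derivative-free, bound-constrained search involved, here is a minimal sketch of a serial compass (pattern) search with an evaluation budget. It is a simplified stand-in and not APPSPACK itself, which runs an asynchronous parallel pattern search; the function name, step sizes, and budget below are assumptions chosen for illustration.

```python
def compass_search(evaluate, x0, lower, upper,
                   step=0.5, min_step=1e-3, max_evals=100):
    """Maximize evaluate(x) inside box bounds by polling +/- step along each axis.

    The search stops when the step size has contracted below min_step or when
    the budget of max_evals experiments is used up.
    """
    best_x = list(x0)
    best_f = evaluate(best_x)
    evals = 1
    while step > min_step and evals < max_evals:
        improved = False
        for i in range(len(best_x)):
            for sign in (1.0, -1.0):
                if evals >= max_evals:
                    break
                trial = list(best_x)
                # Clip the trial point to the allowed range of parameter i.
                trial[i] = min(upper[i], max(lower[i], trial[i] + sign * step))
                if trial == best_x:
                    continue  # already on the bound, nothing new to test
                f = evaluate(trial)
                evals += 1
                if f > best_f:
                    best_x, best_f, improved = trial, f, True
        if not improved:
            step *= 0.5  # contract the pattern when no poll point improves
    return best_x, best_f, evals
```

In a hardware-in-the-loop setting, `evaluate` would run the walking experiments and return the averaged speed, so the evaluation budget rather than the step size would typically end the search.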
For walking forward, the measured speed with the initial parameter set was 40 cm/s. The parameters obtained by optimization with APPSPACK yielded a measured average speed of 43 cm/s, an improvement in walking speed of about 7.5%. This result was achieved after evaluating 83 parameter sets in about half an hour of optimization time. When optimizing the turning motion of the robot, the turning speed could be increased by about 50%, from 120 deg/s to 180 deg/s, without losing the ball while turning. Finding this result required 206 parameter set evaluations, which took about 45 minutes.
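As a quick check of these figures, the relative improvements and the rough time per evaluated parameter set follow directly from the reported numbers. The per-evaluation times are only approximate, since the total durations are themselves given as "about" values and include any overhead between experiments.

```latex
\[
\frac{43 - 40}{40} = 0.075 = 7.5\,\%, \qquad
\frac{180 - 120}{120} = 0.5 = 50\,\%
\]
\[
\frac{30 \cdot 60\ \mathrm{s}}{83} \approx 21.7\ \mathrm{s}, \qquad
\frac{45 \cdot 60\ \mathrm{s}}{206} \approx 13.1\ \mathrm{s}
\]
```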
The resulting walking motions can be observed in two short videos, one demonstrating forward walking, the other demonstrating turning with the grabbed ball. In each video, the robot on top performs the optimized motion, and the robot below uses the initial parameter set.
Walking optimization of humanoid robots
As the quality or objective function value of a walking motion, we measure the distance the robot covers when it starts walking with a small step length and increases it linearly during the experiment until the robot falls or reaches a final step length.
During the numerical online optimization, the real-valued parameters influencing the main characteristics of the robot's walking behavior are varied to maximize the defined objective function. The robot is included as hardware-in-the-loop: a walking experiment is performed to evaluate the objective function. In this context a non-deterministic black-box optimization problem arises, in which nothing but a noisy function value is provided; in particular, no objective gradients are available. With this definition of the objective function, it is not necessary to formulate additional constraints for maintaining walking stability and to incorporate them explicitly into the optimization problem. The only constraints to be considered are constant lower and upper bounds on the optimization parameters.
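The distance-based objective described above can be sketched as follows. The callback `stays_upright` is a hypothetical stand-in for the real robot (it reports whether the robot survived a given step), and the discretization into a fixed number of steps is an assumption made for illustration.

```python
def ramp_distance(step_start, step_final, n_steps, stays_upright):
    """Distance covered while the step length grows linearly over n_steps steps.

    stays_upright(k, step_length) is a hypothetical stand-in for the robot:
    it returns False if the robot falls on step k. Requires n_steps >= 2.
    """
    distance = 0.0
    for k in range(n_steps):
        # Linear ramp from step_start (k = 0) to step_final (k = n_steps - 1).
        step_length = step_start + (step_final - step_start) * k / (n_steps - 1)
        if not stays_upright(k, step_length):
            break  # the experiment ends when the robot falls
        distance += step_length
    return distance
```

Because a fall ends the run early, an unstable gait covers less distance and is penalized automatically, which is why no explicit stability constraint is required.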
We start the optimization process with one experiment in which the parameters are chosen by expert knowledge to provide a stable initial walking motion (video). An initial set of experiments is generated around the initial motion by varying each parameter on its own. This set forms the basis points for the design and analysis of computer experiments approach, which is applied to approximate the original objective function on the whole feasible parameter domain. A sequential quadratic programming method is then applied to compute the maximizer of the resulting surrogate function. The objective function value for this maximizer is determined by performing the corresponding walking experiment with the robot. If the distance of a found maximizer to a point already evaluated by experiments falls below a defined limit, then not the actual maximizer but the maximizer of the expected mean square error of the surrogate function is searched, evaluated, and added to the set of basis points for the approximation. This procedure improves the approximation quality of the surrogate function in unexplored regions of the parameter domain and helps to avoid getting stuck in a local maximum. After a new point is added, a new surrogate function is approximated, and the optimization starts again. In our experience this approach to online optimization of walking speed is much more efficient than genetic or evolutionary algorithms, which are usually applied to cope with the robust minimization of noisy functions.
The solutions found during the optimization were successively adjusted by setting a constant step length and step time depending on the floor covering.
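The surrogate loop described above can be sketched in plain Python. This is a rough illustration under several assumptions, not the implementation used on the robot: the surrogate is a simple kriging-style interpolator with a fixed Gaussian kernel instead of a fitted model, the maximizers are found by random candidate sampling instead of sequential quadratic programming, and the parameters are assumed to be scaled to the unit box. All function names and default values are hypothetical.

```python
import math
import random

def kernel(a, b, length=0.3):
    """Gaussian correlation between two parameter vectors (inputs scaled to [0, 1])."""
    return math.exp(-sum((p - q) ** 2 for p, q in zip(a, b)) / (2.0 * length ** 2))

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][k] * x[k] for k in range(r + 1, n))) / M[r][r]
    return x

def predict(X, y, x, nugget=1e-4):
    """Surrogate mean and (unit-variance) mean square error estimate at x."""
    K = [[kernel(a, b) + (nugget if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    k = [kernel(a, x) for a in X]
    w = solve(K, k)  # weights K^-1 k (K is symmetric); refit per query for brevity
    mu = sum(y) / len(y)
    mean = mu + sum(wi * (yi - mu) for wi, yi in zip(w, y))
    mse = max(0.0, 1.0 - sum(wi * ki for wi, ki in zip(w, k)))
    return mean, mse

def surrogate_optimize(experiment, x0, lower, upper,
                       n_iter=10, min_dist=0.05, n_cand=200, seed=0):
    """Maximize a noisy black-box experiment with a surrogate-guided search."""
    rng = random.Random(seed)
    # Initial design: the expert start point plus one variation per parameter.
    X = [list(x0)]
    for i in range(len(x0)):
        p = list(x0)
        p[i] = min(upper[i], p[i] + 0.25 * (upper[i] - lower[i]))
        X.append(p)
    y = [experiment(p) for p in X]
    for _ in range(n_iter):
        cands = [[rng.uniform(lo, hi) for lo, hi in zip(lower, upper)]
                 for _ in range(n_cand)]
        preds = [predict(X, y, c) for c in cands]
        # Candidate with the best predicted objective value.
        pick = max(range(n_cand), key=lambda j: preds[j][0])
        # Too close to an evaluated point: explore where the surrogate is least certain.
        if min(math.dist(cands[pick], p) for p in X) < min_dist:
            pick = max(range(n_cand), key=lambda j: preds[j][1])
        X.append(cands[pick])
        y.append(experiment(cands[pick]))
    best = max(range(len(y)), key=lambda j: y[j])
    return X[best], y[best]
```

Each call to `experiment` corresponds to one walking experiment on the robot, so the loop spends its budget either on the most promising predicted point or, when that point is already well explored, on the region where the surrogate is least trustworthy.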
The results of this approach are a stable walking motion with a speed of 30 cm/s for the HR18 robot prototype (video) and a stable walking motion with a speed of more than 40 cm/s for an improved hardware design of the HR18 robot ("Bruno") (video), which is so far the fastest motion reported for a humanoid robot in the kid size league of RoboCup.