Abstract
Robot swarms are decentralized collective systems of simple embodied agents that act autonomously and rely only on local information. Such large-scale multi-robot systems can be advantageous over single robots because of their higher potential for robustness and scalability. However, developing swarm robot controllers is challenging: implementing a desired swarm behavior requires taking into account the local interactions among robots and between robots and the environment. An alternative is the automatic design of swarm robot controllers using methods of evolutionary robotics. Because evolutionary algorithms maximize fitness by any means available, undesired side effects may occur if a goal-directed fitness function is not specified accurately enough. By contrast, task-independent fitness functions avoid the specific formulation of rewards but do not guarantee that desired behaviors emerge. Our minimize surprise approach relies on such a task-independent fitness function to evolve diverse collective behaviors for robot swarms. Surprise, in its simplest form here, is the difference between observed and predicted sensor values. We minimize surprise over generations by equipping each swarm member with an actor-predictor pair of artificial neural networks and putting direct selection pressure on the predictor. The actor is rewarded only indirectly, by being paired with the predictor, and thus swarm behaviors emerge as a desired by-product.
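As a minimal sketch of how surprise could be turned into a fitness value for the predictor, the following Python code assumes sensor values normalized to [0, 1] and scores a predictor by one minus its mean absolute prediction error over an evaluation. The function names, the normalization, and the averaging are illustrative assumptions; the exact fitness formulation used in the thesis may differ.

```python
from typing import Sequence


def surprise(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Mean absolute difference between predicted and observed sensor values.

    Hypothetical helper: assumes both sequences hold values in [0, 1].
    """
    if len(predicted) != len(observed) or len(observed) == 0:
        raise ValueError("need non-empty sequences of equal length")
    return sum(abs(p - o) for p, o in zip(predicted, observed)) / len(observed)


def predictor_fitness(
    predictions: Sequence[Sequence[float]],
    observations: Sequence[Sequence[float]],
) -> float:
    """Fitness in [0, 1]: one minus the surprise averaged over all time steps.

    Assumed formulation for illustration only. Selection pressure acts on
    this value; the actor network is rewarded only through its pairing
    with the predictor.
    """
    if len(predictions) != len(observations) or len(observations) == 0:
        raise ValueError("need one prediction per observed time step")
    per_step = [surprise(p, o) for p, o in zip(predictions, observations)]
    return 1.0 - sum(per_step) / len(per_step)


if __name__ == "__main__":
    # Two time steps with two binary sensors each (toy data).
    preds = [[1.0, 0.0], [0.0, 0.0]]
    obs = [[1.0, 0.0], [0.0, 1.0]]
    print(predictor_fitness(preds, obs))
```

In this toy run the first time step is predicted perfectly and the second has one of two sensors wrong, so the average surprise is 0.25. A predictor that is never surprised reaches the maximum fitness of 1.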
In the first part of this thesis, we study minimize surprise as a method. In a simple simulated self-assembly scenario, we show the effectiveness of our approach in comparison to random search, the scalability of the evolved behaviors with swarm density, and the robustness both of evolution against sensor noise and of the emergent behaviors against damage to the self-assembled structure. We also show that the behavioral diversity produced by our standard minimize surprise approach is competitive with that generated by task-independent novelty search and MAP-Elites variants. In addition, we demonstrate that self-organization in minimize surprise can be engineered towards desired behaviors by predefining some or all sensor predictions. In a more realistic simulation, we illustrate how modifications of the environment (e.g., dynamically changing obstacle positions), the agents (e.g., enabling battery level sharing), and the fitness function (e.g., adding a reward for homing) can influence the evolution of behaviors.
In the second part of this thesis, we study the evolution of collective behaviors with minimize surprise in different application scenarios. We evolve collective decision-making behaviors for a collective perception task in the realistic BeeGround simulator, and collective construction behaviors on a simple 2D torus grid. Furthermore, we take the step to real-world setups and evolve basic swarm behaviors and object manipulation behaviors in the realistic Webots simulator and on swarms of real Thymio II robots, using an online onboard evolutionary approach to minimize surprise.
Overall, we show that our minimize surprise approach allows the effective evolution of diverse, robust, and scalable swarm behaviors for a variety of application scenarios in simple simulations, realistic simulators, and real-world experiments. Moreover, evolution can be pushed towards desired behaviors by modifying the environment, the robot model, and the predictor outputs. Because it potentially allows open-ended adaptation to unanticipated situations, minimize surprise can help tackle the challenges of robotics.
Original language | German |
---|---|
Qualification | Doctorate / PhD |
Awarding Institution | |
Supervisors/Advisors | |
Award date | 29.08.2022 |
Publication status | Published - 2022 |
Research Areas and Centers
- Centers: Center for Artificial Intelligence Luebeck (ZKIL)
DFG Research Classification Scheme
- 407-01 Automation, Control Systems, Robotics, Mechatronics, Cyber Physical Systems
Prizes
- Bernd Fischer Award 2023
  Kaiser, Tanja Katharina (Award Recipient), 10.11.2023
  Prize: Awards of the University of Luebeck