Learning control policies in robotic tasks requires a large number of interactions because of small learning rates, bounds on the updates, or unknown constraints. In contrast, humans can infer protective and safe solutions after a single failure or unexpected observation. To reach similar performance, we developed a hierarchical Bayesian optimization algorithm that replicates the cognitive inference and memorization process for avoiding failures in motor control tasks. A Gaussian Process implements the modeling and the sampling of the acquisition function. This enables rapid learning with large learning rates, while a mental replay phase ensures that policy regions that led to failures are inhibited during the sampling process. The features of the hierarchical Bayesian optimization method are evaluated in a simulated and physiological humanoid postural balancing task. The method outperforms standard optimization techniques, such as Bayesian optimization, in the number of interactions needed to solve the task, in computational demands, and in the frequency of observed failures. Further, we show that our method performs similarly to humans when learning the postural balancing task by comparing our simulation results with real human data.
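The idea of inhibiting failed policy regions while sampling an acquisition function can be illustrated with a small sketch. This is not the hierarchical algorithm from the paper, whose details the abstract does not give. It is a plain one-dimensional Gaussian Process with an upper-confidence-bound acquisition, multiplied by a weight that falls to zero at every recorded failure. The kernel, the `radius` and `beta` parameters, the multiplicative form of the inhibition, and the `toy_trial` task are all assumptions made for illustration.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two 1-D arrays of points."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_query, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean, unit-variance GP."""
    k_tt = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train)
    mean = k_qt @ np.linalg.solve(k_tt, y_train)
    var = 1.0 - np.sum(k_qt * np.linalg.solve(k_tt, k_qt.T).T, axis=1)
    return mean, np.sqrt(np.clip(var, 0.0, None))

def inhibition(x_query, failures, radius=0.1):
    """Weight in [0, 1] that is exactly 0 at each recorded failure (assumed form)."""
    if len(failures) == 0:
        return np.ones_like(x_query)
    d = x_query[:, None] - np.asarray(failures)[None, :]
    return np.prod(1.0 - np.exp(-0.5 * (d / radius) ** 2), axis=1)

def next_candidate(x_train, y_train, failures, grid, beta=2.0):
    """Pick the grid point that maximizes the inhibited UCB acquisition."""
    mean, std = gp_posterior(x_train, y_train, grid)
    ucb = mean + beta * std
    # Shift to non-negative values so the multiplicative weight can only lower a score.
    score = (ucb - ucb.min()) * inhibition(grid, failures)
    return grid[np.argmax(score)]

def toy_trial(x):
    """Hypothetical task: reward peaks at x = 0.7; any x < 0.3 counts as a failure."""
    return -(x - 0.7) ** 2, x < 0.3

grid = np.linspace(0.0, 1.0, 101)
xs, ys, failures = [0.5], [toy_trial(0.5)[0]], []
for _ in range(15):
    x = float(next_candidate(np.array(xs), np.array(ys), failures, grid))
    reward, failed = toy_trial(x)
    if failed:
        failures.append(x)   # remember the failure; it is inhibited from now on
    else:
        xs.append(x)         # successful trials update the GP model
        ys.append(reward)
```

In this sketch a failed trial is only remembered, not fed into the Gaussian Process, so the model of the reward stays unchanged while the region around the failure is suppressed in later sampling. Whether the published method treats failures this way is not stated in the abstract.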
|Number of pages||7|
|Publication status||Published - 2020|
|Event||Proceedings of International Conference on Intelligent Robots and Systems 2020 - Las Vegas, United States|
Duration: 25.10.2020 → 29.10.2020
|Conference||Proceedings of International Conference on Intelligent Robots and Systems 2020|
|Abbreviated title||IROS 2020|
|Period||25.10.20 → 29.10.20|