Abstract
Gradient-free reinforcement learning algorithms often fail to scale to high-dimensional problems and require a large number of rollouts. In this paper, we propose learning a predictor model that enables simulated rollouts within a rank-based black-box optimizer, the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), to achieve higher sample efficiency. We validated our approach on several benchmark functions, where it converges faster than standard CMA-ES. As a next step, we will evaluate the algorithm on a robot cup-flipping task.
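The general idea of screening candidates with a learned predictor before spending real rollouts can be illustrated with a minimal sketch. This is not the paper's algorithm. It assumes a simplified rank-based evolution strategy with an isotropic Gaussian and a fixed step-size decay instead of full CMA-ES covariance adaptation. It also assumes a k-nearest-neighbour predictor and a sphere benchmark. All names (`sphere`, `knn_predict`, `surrogate_es`) and parameter values are hypothetical.

```python
import random


def sphere(x):
    """Assumed benchmark objective: sum of squares, minimum 0 at the origin."""
    return sum(v * v for v in x)


def knn_predict(archive, x, k=3):
    """Hypothetical predictor: mean true value of the k nearest evaluated points."""
    nearest = sorted(
        archive,
        key=lambda rec: sum((a - b) ** 2 for a, b in zip(rec[0], x)),
    )[:k]
    return sum(f for _, f in nearest) / len(nearest)


def surrogate_es(objective, dim=5, iters=40, lam=20, n_real=5, mu=3,
                 sigma=0.5, seed=0):
    """Simplified rank-based ES (an assumption, not full CMA-ES).

    Each iteration samples `lam` candidates, ranks them with the predictor
    (the "simulated rollouts"), and spends real evaluations only on the
    `n_real` candidates the predictor ranks best.
    """
    rng = random.Random(seed)
    mean = [rng.uniform(-2.0, 2.0) for _ in range(dim)]
    archive = []          # (point, true objective value) pairs seen so far
    real_evals = 0
    best = float("inf")
    for _ in range(iters):
        candidates = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean]
                      for _ in range(lam)]
        if len(archive) >= 3:
            # Rank by predicted value; no real rollouts are used here.
            candidates.sort(key=lambda c: knn_predict(archive, c))
        evaluated = []
        for c in candidates[:n_real]:
            f = objective(c)          # real (expensive) evaluation
            real_evals += 1
            archive.append((c, f))
            evaluated.append((f, c))
        evaluated.sort(key=lambda t: t[0])
        elite = [c for _, c in evaluated[:mu]]
        # Rank-based update: the new mean is the average of the mu best points.
        mean = [sum(c[i] for c in elite) / mu for i in range(dim)]
        best = min(best, evaluated[0][0])
        sigma *= 0.95                 # fixed decay, stands in for step-size adaptation
    return mean, best, real_evals


mean, best, real_evals = surrogate_es(sphere)
print("best value found:", best, "using", real_evals, "real evaluations")
```

In this sketch, each iteration uses `n_real` real evaluations rather than `lam`, which is where any saving in rollouts would come from. Whether that saving holds in practice depends on how accurate the predictor is.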
Original language | English |
---|---|
Number of pages | 6 |
Publication status | Published - 2020 |
Event | 2nd International Conference on Advances in Signal Processing and Artificial Intelligence - Berlin, Germany. Duration: 01.03.2020 → 03.03.2020 |
Conference
Conference | 2nd International Conference on Advances in Signal Processing and Artificial Intelligence |
---|---|
Abbreviated title | ASPAI' 2020 |
Country/Territory | Germany |
City | Berlin |
Period | 01.03.20 → 03.03.20 |