Supervisor: Mohammd Karam Daaboul
Research group: Angewandte Technisch-Kognitive Systeme (Applied Technical-Cognitive Systems)
Partner: FZI Forschungszentrum Informatik
Thesis status: In progress
Start: 30 March 2020
Safety and sample efficiency are among the most urgent challenges facing real-world applications of current reinforcement learning algorithms. Recent developments in model-based reinforcement learning have made significant progress in both sample efficiency and asymptotic performance, the latter of which has often been limited by the model bias problem. The overall goal of this thesis is to derive a sample-efficient, safe policy search algorithm by leveraging recent results in model-based reinforcement learning and safety-driven reinforcement learning algorithms. Combining model-based methods with safety-driven approaches is motivated by the idea that the agent can explore and improve safely within model rollouts, thereby reducing the total number of risky interactions required in the real domain.