Thema4652: Difference between revisions
Current revision as of 7 August 2020, 09:24
Supervisor: Mohammd Karam Daaboul
Research group: Applied Technical-Cognitive Systems
Start: 1 October 2020
Reinforcement learning (RL) has achieved remarkable results in areas such as simulated robotics and Atari games. Because RL agents learn by trial and error, training in the real world can lead to undesirable actions that damage the system or surrounding objects. Offline reinforcement learning (offline RL) is a variant of RL in which the agent must learn from a fixed data set without further exploration. Offline RL takes rewards into account and trains by minimizing the Bellman error. It resembles imitation learning (IL) in that both learn from a fixed data set. However, most IL methods require an optimal or high-performing demonstrator to provide the data, whereas offline RL may have to cope with highly suboptimal data.
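As a minimal illustration of learning from a fixed data set by reducing the Bellman error, the sketch below runs tabular Q-learning over a small batch of logged transitions and never queries an environment. The three-state chain, the transition tuples, and the function names are hypothetical and chosen only for illustration; this is not the method the thesis will use.

```python
import numpy as np

def offline_q_learning(dataset, n_states, n_actions, gamma=0.9, lr=0.5, sweeps=200):
    """Tabular Q-learning over a fixed batch of transitions (no environment access)."""
    q = np.zeros((n_states, n_actions))
    for _ in range(sweeps):
        for s, a, r, s_next, done in dataset:
            # Bellman target: reward plus discounted value of the best next action.
            target = r + (0.0 if done else gamma * q[s_next].max())
            # Move q[s, a] toward the target, which shrinks the Bellman error.
            q[s, a] += lr * (target - q[s, a])
    return q

def mean_squared_bellman_error(q, dataset, gamma=0.9):
    """Average squared Bellman error of q on the logged transitions."""
    errors = [
        r + (0.0 if done else gamma * q[s_next].max()) - q[s, a]
        for s, a, r, s_next, done in dataset
    ]
    return float(np.mean(np.square(errors)))

# Hypothetical logged data from a 3-state chain: action 1 moves right, action 0 stays.
# Reaching state 2 ends the episode with reward 1.
dataset = [
    (0, 0, 0.0, 0, False),
    (0, 1, 0.0, 1, False),
    (1, 0, 0.0, 1, False),
    (1, 1, 1.0, 2, True),
]

q_table = offline_q_learning(dataset, n_states=3, n_actions=2)
greedy_policy = q_table.argmax(axis=1)
msbe = mean_squared_bellman_error(q_table, dataset)
```

State-action pairs that never appear in the data keep their initial value of zero. In larger problems, such unsupported estimates are one reason why offline RL has to deal with distribution shift.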
The goal of this thesis is to train a driving policy offline using a data set generated in the CARLA simulator. After training, the policy should act reliably in situations where its predictions are accurate and confident. Because the world is complex, the trained policy will inevitably encounter new situations, and it should be able to recognize and cope with them.
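One common heuristic for recognizing unfamiliar situations is to train an ensemble of models and treat their disagreement as a sign of low confidence. The sketch below assumes a hypothetical one-dimensional stand-in for the driving problem (a scalar input mapped to a scalar action) and fits linear models on bootstrap resamples. The data, the function names, and the choice of a bootstrap ensemble are illustrative assumptions, not the approach prescribed for the thesis.

```python
import numpy as np

def fit_bootstrap_ensemble(x, y, n_members=10, seed=0):
    """Fit linear models y = w * x + b on bootstrap resamples of the logged data."""
    rng = np.random.default_rng(seed)
    features = np.stack([x, np.ones_like(x)], axis=1)
    members = []
    for _ in range(n_members):
        # Resample the data set with replacement so that each member sees different data.
        idx = rng.integers(0, len(x), size=len(x))
        weights, *_ = np.linalg.lstsq(features[idx], y[idx], rcond=None)
        members.append(weights)
    return np.array(members)  # shape (n_members, 2): slope and intercept per member

def ensemble_disagreement(members, x_query):
    """Standard deviation of the members' predictions at x_query."""
    preds = members[:, 0] * x_query + members[:, 1]
    return float(preds.std())

# Hypothetical training data: inputs are only observed in the range [-1, 1].
rng = np.random.default_rng(42)
x_train = rng.uniform(-1.0, 1.0, size=200)
y_train = 0.5 * x_train + rng.normal(0.0, 0.1, size=200)

ensemble = fit_bootstrap_ensemble(x_train, y_train)
in_dist = ensemble_disagreement(ensemble, 0.0)    # inside the training range
out_dist = ensemble_disagreement(ensemble, 10.0)  # far outside the training range
```

In this toy setting the disagreement grows as the query moves away from the training data, so a threshold on it could flag a situation as new. Whether such a signal remains reliable for a high-dimensional driving policy is an open question.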
- A. Filos et al., 2020: "Can Autonomous Vehicles Identify, Recover From, and Adapt to Distribution Shifts?"
- S. Levine et al., 2020: "Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems."
- J. Fu et al., 2020: "D4RL: Datasets for Deep Data-Driven Reinforcement Learning."
- An interdisciplinary research environment with partners from science and industry
- Constructive cooperation with bright, motivated colleagues
- A comfortable working atmosphere
- Knowledge of artificial intelligence and machine learning
- Ability to implement both state-of-the-art and experimental algorithms
- Good knowledge of Python
- High creativity and productivity
- Experience with reinforcement learning is an advantage
- Current transcript of records
Mohammd Karam Daaboul