Thesis defence T. Mannucci: UAVs

12 October 2017 15:00 - Location: Aula, TU Delft - By: Webredactie

Safe Online Robust Exploration for Reinforcement Learning Control of Unmanned Aerial Vehicles. Promotor: M. Mulder (LR).

This dissertation examines the problem of safe, online, robust reinforcement learning (RL) for unmanned aerial vehicles (UAVs). RL is an autonomous learning scheme that lets an agent learn optimal control by interacting independently with its environment. When applied to UAVs, however, RL presents three explicit challenges. If the UAV interacts improperly with the environment, it can damage itself or others (challenge of safety). If the UAV learns by interacting with a replica rather than with the actual environment, it can learn a policy that is not directly applicable to the real environment and that can be unsafe there (challenge of robustness). Finally, the UAV agent must be able to learn, modify, and apply the action indicated by its policy in real time, and to compute any predictions required by the challenge of safety (challenge of online efficiency). This dissertation introduces three main classes of methods to address these challenges: heuristic methods, graph methods, and hierarchical methods. It also introduces a class of hybrid methods, which combine the contributions of the three classes above and can thus address multiple challenges jointly. As a result, these methods make it possible, under given assumptions, to perform RL exploration in a safe, online, and robust setting.

More information?
Theses by PhD students can be accessed in the TU Delft Repository, the digital archive of TU Delft publications. Theses become available within a few weeks after the thesis defence.