Mechatronics: Drone Reinforcement Learning Arena

Background and Motivation

The Drone Reinforcement Learning Arena offers a safe physical environment specifically designed for UiA drones. Engineered to facilitate extended drone experiments, the arena features mechanisms to reset and initialize crashed drones, along with provisions for wireless charging and dedicated compartments for an AI computer and other essential hardware. Our UiA drone is built to be crash-resistant, requiring only minor modifications to withstand most crash scenarios within the arena.

This project aims to establish a complete solution for Reinforcement Learning (RL) flight controller development, including both a simulation and a physical environment. It also makes it possible to showcase drone flight performance at different stages of training, even when the training itself was not carried out in the RL arena.

The initiative builds upon years of development on our sub-250-gram UiA drone and the compact Drone Arena from 2021.

Scope and Adaptability

This project can be customized to fit the group's size and specific interests, making it suitable for both BSc and MSc students in Mechatronics. For MSc-level students, the focus will likely be more on modeling and simulating the setup. For BSc-level students, the emphasis should probably be on the mechanical system, electronics, and basic software components. The project owner can accommodate up to two groups for this project.

RL and AI will be applied on top of this platform, so development of the RL algorithm itself does not need to be within scope.

Project Components

Literature Review: Conduct research on dynamic models and reinforcement learning to inform the choice of RL agent. This will establish the theoretical basis for the project.
Concept Development: Explore various options for running RL—either on the drone's onboard flight computer or remotely from the arena's computer.
Concept Selection: Decide on the most viable concept based on research and initial tests.
Environment Development for RL: Develop a dynamic model to serve as the environment in RL, simulating physical properties, collision detection, and reference positions.
State Estimation: Utilize a top-mounted camera in the arena to estimate the drone's state (position, orientation, speed).
Choice of RL Agent: Based on the literature review, select an RL agent (e.g., DDPG, SAC, PPO).
Framework Selection for RL: Choose an RL framework that allows for real-time predictions on either the drone or the arena's computer.
System Enhancement: Further develop and adapt the drone's airframe and arena. Integrate wireless communication links between the drone and the arena.
Software Stack Selection: Contribute to the choice of software stack for the training computer, possibly considering ROS2.
RL Training in Simulator: If applicable, perform RL training using simulations to validate the conceptual choices.
Flight Testing and Evaluation: Conduct test flights to evaluate the chosen RL model and overall system performance, comparing real-world outcomes to simulated results.
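
To illustrate the "Environment Development for RL" component above, the following is a minimal sketch of a dynamic model wrapped as an RL environment. It is not the project's actual model: it assumes a one-dimensional (altitude-only) point mass, and the class name ArenaEnv, the arena height, time step, thrust limit, reference altitude, and reward shaping are all hypothetical values chosen for illustration. The reset/step interface loosely mirrors the convention used by Gym-style libraries, but the code depends only on the Python standard library.

```python
import random
from dataclasses import dataclass


@dataclass
class ArenaEnv:
    """Toy altitude-only environment: a point-mass drone between floor and ceiling.

    All parameter values are illustrative assumptions, not measured properties
    of the UiA drone or arena.
    """
    height: float = 2.0      # arena ceiling [m] (assumed)
    mass: float = 0.25       # drone mass [kg], upper bound of the sub-250-gram class
    max_thrust: float = 5.0  # total thrust at full throttle [N] (assumed)
    target: float = 1.0      # reference altitude [m] (assumed)
    dt: float = 0.02         # integration step [s] (assumed)
    g: float = 9.81          # gravitational acceleration [m/s^2]
    z: float = 0.0           # current altitude [m]
    vz: float = 0.0          # current vertical velocity [m/s]

    def reset(self, seed=None):
        """Place the drone at a random altitude at rest and return the state."""
        rng = random.Random(seed)
        self.z = rng.uniform(0.2, self.height - 0.2)
        self.vz = 0.0
        return (self.z, self.vz)

    def step(self, action):
        """Advance one time step; `action` is a throttle command in [0, 1]."""
        throttle = min(max(action, 0.0), 1.0)
        accel = throttle * self.max_thrust / self.mass - self.g
        # Semi-implicit Euler: update velocity first, then position.
        self.vz += accel * self.dt
        self.z += self.vz * self.dt
        # Collision detection against floor and ceiling.
        crashed = self.z <= 0.0 or self.z >= self.height
        if crashed:
            self.z = min(max(self.z, 0.0), self.height)
            self.vz = 0.0
        # Reward: stay close to the reference altitude, penalize crashes.
        reward = -abs(self.z - self.target) - (10.0 if crashed else 0.0)
        return (self.z, self.vz), reward, crashed
```

An agent would call reset() once per episode and then step() repeatedly until the third return value reports a crash. A full model for the arena would extend the state to three-dimensional position and orientation and replace the hypothetical parameters with identified ones.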

Possible Technologies, Software Frameworks, and Components

Flight Computer: Raspberry Pi, NVIDIA Jetson
RL Frameworks: TensorFlow, PyTorch, OpenAI Gym
Communication: MQTT, Websockets
Simulation Software: Gazebo, AirSim
Computer Vision: OpenCV
Software Stack: ROS2, MAVROS
Sensors: Depth cameras, IMU
Wireless Charging: Qi standard components

Each component can be tailored to fit the students' skill level.

Contact

For questions about scope and possibilities, contact Kristian Muri Knausgård <kristimk@uia.no>.