Student Projects


How to apply

To apply, please send your CV and your BSc and MSc transcripts by email to all the contacts indicated below the project description. Do not apply on SiROP. Since Prof. Davide Scaramuzza is affiliated with ETH, there is no organizational overhead for ETH students. Custom projects are occasionally available. If you would like to do a project with us but could not find an advertised project that suits you, please contact Prof. Davide Scaramuzza directly to ask for a tailored project (sdavide at ifi.uzh.ch).


Upon successful completion of a project in our lab, students may also have the opportunity to get an internship at one of our numerous industrial and academic partners worldwide (e.g., NASA/JPL, University of Pennsylvania, UCLA, MIT, Stanford, ...).



Foundation models for vision-based reinforcement learning - Available

Description: Vision-based reinforcement learning (RL) is less sample-efficient and more complex to train than state-based RL because the policy is learned directly from raw image pixels rather than from the robot state. In contrast to state-based RL, vision-based policies must learn some form of visual perception or image understanding from scratch, which makes them considerably harder to train and to generalize. Foundation models trained on vast datasets have shown promising potential in producing feature representations that are useful for a large variety of downstream tasks. In this project, we investigate the capability of such models to provide robust feature representations for learning control policies. We plan to study how different feature representations affect the exploration behavior of RL policies, the resulting sample complexity, and the generalization and robustness to out-of-distribution samples. This will include training RL policies on various robotics tasks using different intermediate feature representations.
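
For illustration, a minimal PyTorch sketch of the basic setup, assuming a frozen pretrained encoder; torchvision's resnet18 stands in here for a foundation model (e.g., DINO or CLIP), and the 4-d action head and input preprocessing are placeholders:

```python
# Minimal sketch: frozen pretrained encoder as a feature extractor for an RL policy.
# resnet18 is a stand-in for a foundation model; preprocessing/normalization omitted.
import torch
import torch.nn as nn
from torchvision.models import resnet18, ResNet18_Weights

encoder = resnet18(weights=ResNet18_Weights.DEFAULT)
encoder.fc = nn.Identity()                # expose 512-d features instead of class logits
for p in encoder.parameters():
    p.requires_grad = False               # keep the representation fixed; only the policy learns
encoder.eval()

policy = nn.Sequential(                   # small actor head on top of the frozen features
    nn.Linear(512, 256), nn.ReLU(),
    nn.Linear(256, 4),                    # e.g., 4-d action: collective thrust + body rates
)

obs = torch.rand(8, 3, 224, 224)          # batch of camera observations
with torch.no_grad():
    features = encoder(obs)
actions = policy(features)                # (8, 4), trained with the RL objective of choice
```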

Goal: Study the effect of feature representations from different foundation models on learning robotic control tasks with deep RL and imitation learning.

Contact Details: Elie Aljalbout [aljalbout (AT) ifi (DOT) uzh (DOT) ch], Jiaxu Xing [jixing (AT) ifi (DOT) uzh (DOT) ch], Ismail Geles [geles (AT) ifi (DOT) uzh (DOT) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Offline-to-Online (model-based) Reinforcement Learning Transfer and Finetuning for Vision-based Robot Control - Available

Description: Vision-based reinforcement learning (RL) is often sample-inefficient and computationally very expensive. One way to bootstrap the learning process is to leverage offline interaction data. However, this approach faces significant challenges, including out-of-distribution (OOD) generalization and neural network plasticity. The goal of this project is to explore methods for transferring offline policies to the online regime in a way that alleviates the OOD problem. By initially training the robot's policies offline, the project seeks to leverage existing robot interaction data to bootstrap the learning of new policies. The focus is on overcoming domain-shift problems and exploring innovative ways to fine-tune the model and policy using online interactions, effectively bridging the gap between offline and online learning. This advancement would enable us to efficiently leverage offline data (e.g., from human or expert-agent demonstrations or previous experiments) for training vision-based robotic policies. This could involve (but is not limited to) developing methods for uncertainty estimation and handling, domain adaptation for model-based RL, pessimism during offline training, and curiosity during online finetuning.
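
As one concrete example of pessimism during offline training, here is a minimal sketch of a CQL-style conservative penalty. It is illustrative only, not the project's prescribed method; `q_net`, its `(obs, action)` signature, and the action bounds are assumptions:

```python
# CQL-style pessimism term: penalize Q-values on (potentially OOD) sampled actions
# relative to Q-values on actions that actually appear in the offline dataset.
import torch

def conservative_penalty(q_net, obs, dataset_actions, num_samples=10, action_dim=4):
    B = obs.shape[0]
    # random actions uniform in [-1, 1], a crude stand-in for OOD actions
    rand_actions = torch.rand(B, num_samples, action_dim) * 2 - 1
    obs_rep = obs.unsqueeze(1).expand(-1, num_samples, -1)
    q_rand = q_net(obs_rep.reshape(B * num_samples, -1),
                   rand_actions.reshape(B * num_samples, -1)).view(B, num_samples)
    # push down Q on sampled actions, push up Q on dataset actions
    q_data = q_net(obs, dataset_actions)
    return (torch.logsumexp(q_rand, dim=1) - q_data.squeeze(-1)).mean()
```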

Goal: Develop methods for transferring control policies learned offline to the online inference/finetuning regime.

Contact Details: Elie Aljalbout [aljalbout (AT) ifi (DOT) uzh (DOT) ch]

Thesis Type: Master Thesis

See project on SiROP

Incorporating expert data into model-based reinforcement learning - Available

Description: Model-based reinforcement learning (MBRL) methods have greatly improved sample efficiency compared to model-free approaches. Nonetheless, the number of samples and the compute required to train these methods remain too large for real-world training of robot control policies. Ideally, we should be able to leverage expert data (collected by human or artificial agents) to bootstrap MBRL. How best to leverage such data is still unclear, and many options are available. For instance, such data could be used solely for training high-accuracy dynamics models (world models) that are useful for multiple tasks. Alternatively, expert data can (also) be used for training the policy. Additionally, pretraining MBRL components is itself challenging, as offline expert data is typically sampled from a very narrow distribution of behaviors, which makes finetuning non-trivial in out-of-distribution regions of the robot's state-action space. In this thesis, you will examine different ways of incorporating expert data into MBRL and ideally propose new approaches for doing so. You will test these methods both in simulation (drones, wheeled and legged robots) and in the real world on our quadrotor platform. You will gain insights into MBRL, sim-to-real transfer, and robot control. Requirements: Applicants should have a strong background in machine learning and computer vision, and proficiency in Python programming. Familiarity with deep learning frameworks such as PyTorch is desirable.

Goal: Study and propose methods for leveraging expert data in model-based reinforcement learning for quadrotor flight control.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Elie Aljalbout [aljalbout (at) ifi (dot) uzh (dot) ch], and Jiaxu Xing [jixing (at) ifi (dot) uzh (dot) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Model-based Reinforcement Learning and World Models for Autonomous Drone Racing - Available

Goal: The objective of this project is to integrate state-of-the-art model-based RL and world models for drones into a reinforcement learning pipeline. The aim is to investigate potential performance improvements of the reinforcement learning algorithm by incorporating a model of the drone's dynamics, allowing the algorithm to make more informed decisions. This is expected to yield faster learning and better generalization, leading to better performance in real-world scenarios. To accomplish this, the student will research and implement various model-based reinforcement learning algorithms and evaluate their performance in a simulation environment for drone navigation. The student will also fine-tune the parameters of the algorithms to achieve optimal performance. The final product will be a pipeline that can be used to train a drone to navigate a variety of environments with improved efficiency and accuracy. Applicants should have a strong background in both model-free and model-based reinforcement learning techniques, programming in C++ and Python, and a good understanding of nonlinear dynamic systems. Additional experience in signal processing and machine learning, as well as being comfortable in a hands-on environment, is highly desirable.

Contact Details: Please send your CV and transcripts (bachelor and master), and any projects you have worked on that you find interesting to Angel Romero (roagui AT ifi DOT uzh DOT ch), Ismail Geles (geles AT ifi DOT uzh DOT ch) and Elie Aljalbout (aljalbout AT ifi DOT uzh DOT ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Energy-Efficient Path Planning for Autonomous Quadrotors in Inspection Tasks - Available

Description: Autonomous quadrotors are increasingly used in inspection tasks, where flight time is often limited by battery capacity. In these operations, reducing energy consumption is essential, especially when quadrotors must navigate complex paths near inspection targets. Traditional path planning methods often overlook energy costs, which limits their effectiveness in real-world applications. This project aims to explore and evaluate state-of-the-art path planning approaches that incorporate energy efficiency into trajectory optimization. Various planning techniques will be tested to identify the most suitable methods for minimizing energy consumption, ensuring smooth navigation, and maximizing inspection coverage within a single battery charge. Strong programming skills in Python/C++ and a background in robotics or autonomous systems are required. Experience in motion planning, machine learning, or energy modeling is beneficial but not essential.
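
To make the idea concrete, a minimal sketch of a waypoint path cost with an added energy term; the turning-angle energy proxy and the weight are illustrative assumptions, not a validated quadrotor energy model:

```python
# Path cost trading off length against a crude energy proxy: sharp turns force the
# quadrotor to decelerate and re-accelerate, so accumulated turning angle is penalized.
import numpy as np

def path_cost(waypoints, w_energy=0.5):
    wp = np.asarray(waypoints, dtype=float)        # (N, 3) waypoints, N >= 3
    segs = np.diff(wp, axis=0)
    length = np.linalg.norm(segs, axis=1).sum()
    u = segs[:-1] / np.linalg.norm(segs[:-1], axis=1, keepdims=True)
    v = segs[1:] / np.linalg.norm(segs[1:], axis=1, keepdims=True)
    turning = np.arccos(np.clip((u * v).sum(axis=1), -1.0, 1.0)).sum()
    return length + w_energy * turning

# A planner would minimize this cost over candidate waypoint orderings/placements.
print(path_cost([[0, 0, 1], [1, 0, 1], [1, 1, 1], [2, 1, 1]]))
```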

Goal: The goal of this project is to develop, implement, and test an energy-efficient waypoint path planning method that improves quadrotor endurance in inspection tasks, maximizing inspection coverage within a single battery cycle.

Contact Details: Leonard Bauersfeld (bauersfeld AT ifi DOT uzh DOT ch), Elie Aljalbout (aljalbout AT ifi DOT uzh DOT ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Neural Architecture Knowledge Transfer for Event-based Vision - Available

Description: Processing the sparse and asynchronous data from event-based cameras presents significant challenges. Transformer-based models have achieved remarkable results in sequence modeling tasks, including event-based vision, due to their powerful representation capabilities. Despite their success, their high computational complexity and memory demands make them impractical for deployment on resource-constrained devices typical in real-world applications. Recent advancements in efficient sequence modeling architectures offer promising alternatives that provide competitive performance with significantly reduced computational overhead. Recognizing that Transformers already demonstrate strong performance on event-based vision tasks, we aim to leverage their strengths while addressing efficiency concerns.
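
One standard knowledge-transfer recipe that could serve as a starting point is logit distillation (Hinton et al.); a minimal PyTorch sketch, where the teacher/student models, temperature, and weighting are placeholders:

```python
# Logit distillation: a frozen Transformer teacher supervises a lightweight student
# via a temperature-softened KL term, blended with the usual hard-label loss.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                              # rescale gradients to O(1)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard
```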

Goal: Study knowledge-transfer techniques for distilling knowledge from complex Transformer models into simpler, more efficient models. Test the developed models on benchmark event-based vision tasks such as object recognition, optical flow estimation, and SLAM.

Contact Details: Nikola Zubic (zubic@ifi.uzh.ch), Giovanni Cioffi (cioffi@ifi.uzh.ch)

Thesis Type: Master Thesis

See project on SiROP

Improving Event-Based Vision with Energy-Efficient Neural Networks - Available

Description: Event-based cameras, also known as neuromorphic vision sensors, capture visual information through asynchronous pixel-level brightness changes, offering high temporal resolution, low latency, and a wide dynamic range. These characteristics make them ideal for applications requiring rapid response times and efficient data processing. However, deploying deep learning models on resource-constrained devices remains challenging due to computational overhead and energy consumption. This project explores novel approaches to developing energy-efficient neural networks tailored for event-based vision tasks. By designing models that significantly reduce computational demands and memory footprint while maintaining high performance, we can make real-time processing on embedded hardware feasible. The focus will be on balancing training efficiency and model accuracy, minimizing energy consumption without sacrificing the quality of results.

Goal: Investigate existing energy-efficient neural network architectures that can be applied to event-based vision. Design and implement energy-efficient neural networks specifically for event-based vision tasks. Explore techniques to optimize model architectures for efficiency without compromising accuracy. Test the developed models on benchmark event-based datasets, such as N-Caltech101, N-CARS, and Neuromorphic ImageNet.

Contact Details: Nikola Zubic (zubic@ifi.uzh.ch), Marco Cannici (cannici@ifi.uzh.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Leveraging Long Sequence Modeling for Drone Racing - Available

Description: Recent advancements in machine learning have highlighted the potential of Long Sequence Modeling as a powerful approach for handling complex temporal dependencies, positioning it as a compelling alternative to traditional Transformer-based models. In the context of drone racing, where split-second decision-making and precise control are paramount, Long Sequence Modeling can offer significant improvements. These models are adept at capturing intricate state dynamics and handling continuous-time parameters, providing the flexibility to adapt to the varying time steps essential for high-speed navigation and obstacle avoidance. This project investigates the application of Long Sequence Modeling techniques in RL to develop advanced autonomous drone racing systems. The ultimate goal is to improve autonomous drones' performance, reliability, and adaptability in competitive racing scenarios.

Goal: Develop a Reinforcement Learning framework based on Long Sequence Modeling tailored for drone racing. Simulate the framework to evaluate its performance in controlled environments. Conduct a comprehensive analysis of the framework’s effectiveness in handling long sequences and dynamic racing scenarios. Ideally, the optimized model should be deployed in real-world drone racing settings to validate its practical applicability and performance.

Contact Details: Nikola Zubic (zubic@ifi.uzh.ch), Angel Romero Aguilar (roagui@ifi.uzh.ch)

Thesis Type: Master Thesis

See project on SiROP

Autonomous Drone Navigation via Learning from YouTube Videos - Available

Description: Be it mastering a new recipe, performing complex mechanical repairs, or excelling at video games, humans have the ability to learn new skills by just watching an expert perform the task. They often do so by simply watching online video demonstrations, despite significant variations in body shapes, environments, or sensing modalities and, most importantly, without knowing the precise action commands of the expert demonstrator. Inspired by how humans learn, this project aims to explore the possibility of learning flight patterns, obstacle avoidance, and navigation strategies by simply watching drone flight videos available on YouTube. State-of-the-art Large Vision Models for processing and encoding videos, as well as unsupervised training techniques, will be evaluated and designed during the project. Applicants should have a strong background in machine learning, computer vision, and proficiency in Python programming. Familiarity with deep learning frameworks such as PyTorch is desirable.

Goal: Investigate the feasibility and effectiveness of using large vision models along with self-supervised learning techniques to teach drones to navigate autonomously from YouTube videos. Develop a prototype system capable of learning from online videos and demonstrate its effectiveness on a real drone platform.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Marco Cannici (cannici AT ifi DOT uzh DOT ch), Angel Romero (roagui AT ifi DOT uzh DOT ch) and Elie Aljalbout (aljalbout AT ifi DOT uzh DOT ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Electrical Flow-Based Graph Embeddings for Event-based Vision and other downstream tasks - Available

Description: Besides RPG, this project will be co-supervised by Simon Meierhans (Algorithms & Optimization group at ETH) and Prof. Siddhartha Mishra. This project explores a novel approach to graph embeddings using electrical flow computations. By leveraging the efficiency of solving systems of linear equations and certain properties of electrical flows, we aim to develop a new method for creating low-dimensional representations of graphs. These embeddings have the potential to capture unique structural and dynamic properties of networks. The project will investigate how these electrical flow-based embeddings can be utilized in various downstream tasks such as node classification, link prediction, graph classification, and event-based vision tasks.
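
For intuition, a minimal sketch of the core computation on a toy graph; this is illustrative only: it uses a dense pseudo-inverse where a real implementation would use fast (near-linear-time) Laplacian solvers, and the choice of source/sink pairs is an assumption:

```python
# Node potentials of an electrical flow: solve the Laplacian system L x = b, where b
# injects one unit of current at a source and removes it at a sink. Stacking the
# potentials for several source/sink pairs gives a low-dimensional node embedding.
import numpy as np
import networkx as nx

G = nx.karate_club_graph()
L = nx.laplacian_matrix(G).toarray().astype(float)

def electrical_potentials(L, source, sink):
    b = np.zeros(L.shape[0])
    b[source], b[sink] = 1.0, -1.0
    return np.linalg.pinv(L) @ b          # potentials; differences along edges give the flow

pairs = [(0, 33), (0, 16), (5, 30)]
embedding = np.stack([electrical_potentials(L, s, t) for s, t in pairs], axis=1)
print(embedding.shape)                    # (34, 3): one 3-d vector per node
```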

Goal: The primary goal of this project is to design, implement, and evaluate a graph embedding technique based on electrical flow computations. The student will develop algorithms to compute these embeddings efficiently, compare them with existing graph embedding methods, and apply them to real-world network datasets. The project will also explore the effectiveness of these embeddings in downstream machine learning tasks. Applicants should have a strong background in graph theory, linear algebra, and machine learning, as well as proficiency in Python and ideally experience with graph processing libraries like NetworkX or graph-tool.

Contact Details: Nikola Zubic (zubic@ifi.uzh.ch), Simon Meierhans (simon.meierhans@inf.ethz.ch)

Thesis Type: Master Thesis

See project on SiROP

What can Large Language Models offer to Event-based Vision? - Available

Description: Event-based vision algorithms process visual changes asynchronously, akin to how biological visual systems function, while large language models (LLMs) specialize in parsing and generating human-like text. This project explores the intersection of LLMs and event-based vision, leveraging the unique capabilities of each domain to create a symbiotic framework. By marrying the strengths of both technologies, we aim to develop a novel, more robust paradigm that excels in challenging conditions.

Goal: The primary objective is to devise methodologies that synergize the capabilities of LLMs with Event-Based Vision systems. We intend to address identified shortcomings in existing paradigms by leveraging the inferential strengths of LLMs. Rigorous evaluations will be conducted to validate the efficacy of the integrated system under various challenging conditions.

Contact Details: Nikola Zubic (zubic@ifi.uzh.ch), Nico Messikommer (nmessi@ifi.uzh.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Hybrid Spiking-Deep Neural Network System for Efficient Event-Based Vision Processing - Available

Description: Event cameras are innovative sensors that capture changes in a scene dynamically, unlike standard cameras that capture images at fixed intervals. They detect pixel-level brightness changes, providing high temporal resolution and low latency. This results in efficient data processing and reduced power consumption, typically just 1 mW. Spiking Neural Networks (SNNs) process information as discrete events or spikes, mimicking the brain's neural activity, and differ from standard Neural Networks (NNs), which process information continuously. SNNs are highly power-efficient and well suited for event-driven data from event cameras. In collaboration with SynSense, this project aims to integrate the rapid processing capabilities of SNNs with the advanced analytic power of deep neural networks. By distilling higher-level features from raw event data, we aim to significantly reduce the volume of events needing further processing by traditional NNs, improving data quality and transmission efficiency. The system will be tested on computer vision tasks such as object detection and tracking, gesture recognition, and high-speed motion estimation.

Goal: The primary goal is to develop a hybrid system that combines Spiking Neural Networks (SNNs) and deep neural networks to process event data efficiently at the sensor level. We will demonstrate its versatility and effectiveness in various computer vision tasks. Rigorous testing in simulation will assess the impact on data quality and processing efficiency, followed by deployment on real hardware to evaluate real-world performance.

Contact Details: Nikola Zubic (zubic AT ifi DOT uzh DOT ch), Marco Cannici (cannici AT ifi DOT uzh DOT ch)

Thesis Type: Master Thesis

See project on SiROP

Learned Event Generation from Images - Available

Description: Event cameras offer a unique approach to capturing scenes, detecting changes in light intensity rather than using fixed time intervals like traditional cameras. This project focuses on overcoming the scarcity of event-based datasets by generating synthetic event data from standard frame-based images. Using advanced deep learning techniques, the goal is to create high-quality synthetic events that closely resemble real-world data, helping to bridge the gap between simulated and actual event-based data.
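
For background, a minimal sketch of the standard event-generation model underlying simulators such as ESIM; this version is simplified (at most one event per pixel per frame) and the contrast threshold C is a placeholder:

```python
# An event fires whenever the log intensity at a pixel changes by more than a
# contrast threshold C since the last event at that pixel.
import numpy as np

def events_from_frames(frames, timestamps, C=0.2, eps=1e-6):
    logI_ref = np.log(frames[0] + eps)            # per-pixel reference log intensity
    events = []                                   # (t, x, y, polarity)
    for frame, t in zip(frames[1:], timestamps[1:]):
        diff = np.log(frame + eps) - logI_ref
        ys, xs = np.where(np.abs(diff) >= C)
        for x, y in zip(xs, ys):
            pol = 1 if diff[y, x] > 0 else -1
            events.append((t, x, y, pol))
            logI_ref[y, x] += pol * C             # advance reference by one threshold step
    return events
```

A learned generator would replace this fixed thresholding rule with a network trained to match the statistics of real event-camera data.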

Goal: In this project, you will apply cutting-edge deep learning models to generate artificial events from conventional image frames. You will gain a strong understanding of how event cameras work and how to produce realistic event data. Since the project involves exploring multiple state-of-the-art deep learning methods, a solid background in deep learning is essential. If you're interested, we would be happy to provide further details.

Contact Details: Nico Messikommer [nmessi (at) ifi (dot) uzh (dot) ch], Marco Cannici [cannici (at) ifi (dot) uzh (dot) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Enhancing Robotic Motor Policies with Event Cameras - Available

Description: In recent robotic advancements, motor policies trained in simulations have achieved impressive real-world performance. This project aims to leverage the high-temporal resolution of event cameras to improve the robustness of these motor policies by incorporating event data as an additional sensor input. Current simulation methods for generating event data are inefficient, as they rely on rendering many high-frame-rate images. This project focuses on creating a shared embedding space between events and frames, allowing motor policies to be trained on simulated frames and deployed using real-world event data. Depending on the project’s progress, the proposed methods can be tested on various robotic platforms, including quadrotors and miniature cars.

Goal: Building on previous student projects (ECCV22), participants will explore Unsupervised Domain Adaptation (UDA) techniques to transfer motor policies from frame-based to event-based data. The project includes testing the approach in simulations and potentially conducting real-world experiments in our drone arena. Special emphasis will be placed on demonstrating the benefits of event cameras in challenging scenarios, such as low-light conditions and fast-moving environments. A strong background in deep learning is required for participants due to the use of advanced techniques for task transfer. If you're interested, we'd be happy to provide more details.

Contact Details: Nico Messikommer [nmessi (at) ifi (dot) uzh (dot) ch], Jiaxu Xing [jixing (at) ifi (dot) uzh (dot) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Event Keypoint Extraction for Real-Time Pose Estimation - Available

Description: Neuromorphic cameras, known for their high dynamic range (HDR) capabilities, high-temporal resolution, and low power consumption, have opened up new possibilities in camera pose estimation, especially in fast-moving and challenging environments. This project aims to enhance camera pose estimation by developing a data-driven approach for keypoint extraction from event data, building on recent advancements in frame-based keypoint extraction. The project will also integrate a Visual Odometry (VO) pipeline to enable real-time feedback and tracking.

Goal: The goal of this project is to create a data-driven keypoint extractor that identifies key interest points within event data streams. Building on work from a previous student project (CVPR23), participants will apply neural network architectures to extract keypoints from event data. Additionally, the project will adapt existing Visual Odometry (VO) algorithms to work with the newly developed keypoint extractor and tracker. Students should have prior programming experience with deep learning frameworks and have completed at least one computer vision course. This is an exciting opportunity to work at the intersection of neuromorphic imaging and computer vision, contributing to cutting-edge research in camera pose estimation. If you're interested in this project, we'd be happy to provide more details.

Contact Details: Nico Messikommer [nmessi (at) ifi (dot) uzh (dot) ch], Nikola Zubic [zubic (at) ifi (dot) uzh (dot) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Leveraging Event Cameras for 3D Gaussian Splatting Reconstruction under fast motion - Available

Description: Building on the advancements of 3D Gaussian Splatting methods for scene reconstruction and synthesis, this project aims to push the field forward by improving the efficiency and robustness of current techniques in the context of event cameras. While image-based methods have demonstrated impressive results in ideal conditions, challenges arise when dealing with scenes involving fast motion, low light, and high dynamic range. This project will tackle these challenges by exploiting how events, with their high temporal resolution and robustness to motion blur, can enhance reconstruction quality under fast motion scenarios.

Goal: The primary objective of this project is to explore novel techniques for 3D scene reconstruction using event cameras, with a particular focus on improving performance under fast motion and challenging lighting conditions. In addition to algorithm development, the project will involve the construction of a hardware setup that integrates both traditional and event-based cameras. Experiments will be conducted both in simulation environments and on the physical hardware setup to demonstrate the efficacy of the proposed methods. Applicants with expertise in programming (Python/Matlab), computer vision, and experience with machine learning frameworks (e.g., PyTorch) are invited to apply.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Marco Cannici (cannici AT ifi DOT uzh DOT ch) and Manasi Muglikar (muglikar AT ifi DOT uzh DOT ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Enhancing Neural Scene Reconstruction through Multimodal Fusion - Available

Description: Recent advancements in neural radiance fields and 3D Gaussian splatting have shown remarkable success by fusing vision and semantic modalities for improved reconstruction quality. In this project, we build upon this recent trend and investigate how modalities such as depth, semantic classes, normals, and event data, coming from different sensors, can improve 3D reconstruction. The project aims to explore how prior 3D information can assist in reconstructing fine details and how high-temporal-resolution data can enhance modeling in the presence of scene and camera motion. By exploring the fusion of these modalities, we aim to achieve more accurate and detailed representations of complex environments.

Goal: The primary goal of this project is to evaluate the fusion of multiple sensor modalities, including RGB, depth, and event cameras, for enhanced scene reconstruction quality. We aim to leverage the unique strengths of each modality to achieve finer detail reconstruction and effectively handle complex scenes. Experiments will be conducted both in simulation, as well as on a prototype stereo system developed during the project. Applicants with a background in programming (Python/Matlab), experience in computer vision, and familiarity with machine learning frameworks (PyTorch) are encouraged to apply.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Marco Cannici (cannici AT ifi DOT uzh DOT ch) and Manasi Muglikar (muglikar AT ifi DOT uzh DOT ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Advancing Space Navigation and Landing with Event-Based Camera in collaboration with the European Space Agency - Available

Description: Event-based cameras offer significant benefits in difficult robotic scenarios characterized by high dynamic range and rapid motion. These are precisely the challenges faced by spacecraft during landings on celestial bodies like Mars or the Moon, where sudden light changes, fast dynamics relative to the surface, and the need for quick reaction times can overwhelm vision-based navigation systems relying on standard cameras. In this work, we aim to design novel spacecraft navigation methods for the descent and landing phases, exploiting the power efficiency and sparsity of event cameras. Particular effort will be dedicated to developing a lightweight frontend, utilizing asynchronous convolutional and graph neural networks to effectively harness the sparsity of event data, ensuring efficient and reliable processing during these critical phases. The project is in collaboration with the European Space Agency at the European Space Research and Technology Centre (ESTEC) in Noordwijk (NL).

Goal: Investigate the use of asynchronous neural networks (either conventional or spiking) for building an efficient frontend capable of processing event-based data in real time. Experiments will be conducted both on pre-recorded datasets and on data collected during the project. We look for students with strong programming (Python/Matlab) and computer vision backgrounds. Additionally, knowledge of machine learning frameworks (PyTorch, TensorFlow) is required.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Marco Cannici (cannici AT ifi DOT uzh DOT ch), Nikola Zubic (zubic AT ifi DOT uzh DOT ch)

Thesis Type: Master Thesis

See project on SiROP

Fine-tuning Policies in the Real World with Reinforcement Learning - Available

Description: Training sub-optimal policies is relatively straightforward and provides a solid foundation for reinforcement learning (RL) agents. However, such policies cannot currently improve online in the real world, for example when racing drones with RL; current methods fall short in enabling drones to adapt and optimize their performance during deployment. Imagine a drone equipped with an initial sub-optimal policy that can navigate a race course, but not with maximum efficiency. As the drone races, it learns to optimize its maneuvers in real time, becoming faster and more agile with each lap.

Goal: This project aims to explore online fine-tuning in the real world of sub-optimal policies using RL, allowing racing drones to improve continuously through real-world interactions.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to Ismail Geles [geles (at) ifi (dot) uzh (dot) ch], Elie Aljalbout [aljalbout (at) ifi (dot) uzh (dot) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Inverse Reinforcement Learning from Expert Pilots - Available

Description: Drone racing demands split-second decisions and precise maneuvers. However, training drones for such races relies heavily on crafted reward functions. These methods require significant human effort in design choices and limit the flexibility of learned behaviors. Inverse Reinforcement Learning (IRL) offers a promising alternative. IRL allows an AI agent to learn a reward function by observing expert demonstrations. Imagine an AI agent analyzing recordings of champion drone pilots navigating challenging race courses. Through IRL, the agent can infer the implicit factors that contribute to success in drone racing, such as speed and agility.
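
One practical instantiation of this idea is adversarial imitation, where a discriminator trained to tell expert transitions from policy transitions induces a reward. A minimal sketch follows; the observation/action dimensions and the network are illustrative assumptions, not the project's method:

```python
# Adversarial imitation-style learned reward: D classifies (obs, action) pairs as
# expert vs. policy; -log(1 - D) is high where the policy behaves "expert-like".
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, ACT_DIM = 16, 4                          # assumed dimensions
disc = nn.Sequential(nn.Linear(OBS_DIM + ACT_DIM, 128), nn.ReLU(), nn.Linear(128, 1))

def learned_reward(obs, act):
    logit = disc(torch.cat([obs, act], dim=-1))
    return -F.logsigmoid(-logit)                  # = -log(1 - sigmoid(logit))
```

The discriminator is trained with a standard binary classification loss on expert demonstrations versus rollouts of the current policy, while the RL agent maximizes the induced reward.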

Goal: We want to explore the application of Inverse Reinforcement Learning (IRL) for training RL agents performing drone races or FPV freestyle to develop methods that extract valuable knowledge from the actions and implicit understanding of expert pilots. This knowledge will then be translated into a robust reward function suitable for autonomous drone flights.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to: Ismail Geles [geles (at) ifi (dot) uzh (dot) ch], Elie Aljalbout [aljalbout (at) ifi (dot) uzh (dot) ch], Angel Romero [roagui (at) ifi (dot) uzh (dot) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Language-guided Drone Control - Available

Description: Imagine controlling a drone with simple, natural language instructions like "fly through the gap" or "follow that red car" – this is the vision behind language-guided drone control. However, translating natural language instructions into precise drone maneuvers presents a unique challenge. Drones operate in a dynamic environment, requiring real-time interpretation of user intent and the ability to adapt to unforeseen obstacles.

Goal: This project focuses on developing a novel system for language-guided drone control using recent advances in Vision Language Models (VLMs). Our goal is to bridge the gap between human language and drone actions. We aim to create a system that can understand natural language instructions, translate them into safe and efficient flight instructions, and control the drone accordingly, making it accessible to a wider range of users and enabling more intuitive human-drone interaction.

Contact Details: Interested candidates should send their CV, transcripts (bachelor and master), and descriptions of relevant projects to: Ismail Geles [geles (at) ifi (dot) uzh (dot) ch], Elie Aljalbout [aljalbout (at) ifi (dot) uzh (dot) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

From Floorplan to Flight - Available

Description: Drone racing is considered a proxy task for many real-world applications, including search-and-rescue missions. In such an application, doorframes, corridors, and other features of the environment could be used as "gates" the drone needs to pass through. Relevant information on the layout could be extracted from a floor plan of the environment in which the drone is tasked to operate autonomously. To be able to train such navigation policies, the first step is to simulate the environment.

Goal: This project aims to develop a simulation of environments that procedurally generates corridors and doors based on an input floor plan. We will compare model-based approaches (placing objects according to heuristics/rules) with learning-based approaches, which directly generate the model from the floor plan.
Requirements:
- Machine learning experience (PyTorch)
- Excellent programming skills in C++ and Python
- 3D modeling experience (CAD, Blender) is a plus

Contact Details: Leonard Bauersfeld (bauersfeld@ifi.uzh.ch), Marco Cannici (cannici@ifi.uzh.ch)

Thesis Type: Master Thesis

See project on SiROP

Vision-based End-to-End Flight with Obstacle Avoidance - Available

Description: Recent progress in drone racing enables end-to-end vision-based drone racing, directly from images to control commands without explicit state estimation. In this project, we address the challenge of unforeseen obstacles and changes to the racing environment. The goal is to develop a control policy that can race through a predefined track but is robust to minor changes in track layout and gate placement. Additionally, the policy should avoid obstacles placed on the racetrack, mimicking real-world applications where unforeseen obstacles can appear at any time.
Requirements:
- Machine learning experience (PyTorch)
- Excellent programming skills in C++ and Python

Contact Details: Leonard Bauersfeld (bauersfeld@ifi.uzh.ch), Ismail Geles (geles@ifi.uzh.ch)

Thesis Type: Master Thesis

See project on SiROP

Event-based Particle Image Velocimetry - Available

Description: When drones are operated in industrial environments, they are often flown in close proximity to large structures, such as bridges, buildings or ballast tanks. In those applications, the interactions of the induced flow produced by the drone’s propellers with the surrounding structures are significant and pose challenges to the stability and control of the vehicle. A common methodology to measure the airflow is particle image velocimetry (PIV). Here, smoke and small particles suspended in the surrounding air are tracked to estimate the flow field. In this project, we aim to leverage the high temporal resolution of event cameras to perform smoke-PIV, overcoming the main limitation of frame-based cameras in PIV setups. Applicants should have a strong background in machine learning and programming with Python/C++. Experience in fluid mechanics is beneficial but not a hard requirement.
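
For reference, a minimal sketch of the classical cross-correlation step at the heart of frame-based PIV; an event-based variant would correlate temporal slices of the event stream instead, and window pairing and sub-pixel peak refinement are omitted:

```python
# Displacement of an interrogation window between two frames: the offset of the
# cross-correlation peak from the window center gives the flow in pixels.
import numpy as np
from scipy.signal import fftconvolve

def window_displacement(win_a, win_b):
    a = win_a - win_a.mean()
    b = win_b - win_b.mean()
    corr = fftconvolve(b, a[::-1, ::-1], mode="same")   # cross-correlation via FFT
    dy, dx = np.unravel_index(np.argmax(corr), corr.shape)
    cy, cx = np.array(corr.shape) // 2
    return dy - cy, dx - cx                             # displacement of b relative to a
```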

Goal: The goal of the project is to develop and successfully demonstrate a PIV method in the real world.

Contact Details: Leonard Bauersfeld (bauersfeld@ifi.uzh.ch), Koen Muller (kmuller@ethz.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Sim-to-real transfer of event-camera-based RL policies - Available

Description: This project aims to develop and evaluate drone navigation policies using event-camera inputs, focusing on the challenges of transferring these policies from simulated environments to the real world. Event cameras, known for their high temporal resolution and dynamic range, offer unique advantages over traditional frame-based cameras, particularly in high-speed and low-light conditions. However, the sim-to-real gap, i.e., the differences between simulated environments and the real world, poses significant challenges for the direct application of learned policies. In this project, we will try to understand the sim-to-real gap for event cameras and how this gap influences downstream control tasks, such as flying in the dark, dynamic obstacle avoidance, and object catching. This includes learning representations for event data (ideally while reducing the sim-to-real domain gap) and training navigation policies using either reinforcement or imitation learning methods.

Goal: Train drone navigation policies on various tasks in simulation using event-based images and transfer them to the real world.

Contact Details: Elie Aljalbout [aljalbout (AT) ifi (DOT) uzh (DOT) ch], Marco Cannici [cannici (AT) ifi (DOT) uzh (DOT) ch], Ismail Geles [geles (AT) ifi (DOT) uzh (DOT) ch]

Thesis Type: Master Thesis

See project on SiROP

Hierarchical reinforcement learning for 3D object navigation tasks - Available

Description: This project aims to simplify the learning process for new drone control tasks by leveraging a pre-existing library of skills through reinforcement learning (RL). The primary objective is to define a skill library that includes both established drone controllers and new ones learned from offline data (skill discovery). Instead of teaching a drone to fly from scratch for each new task, the project focuses on bootstrapping the learning process with these pre-existing skills. For instance, if a drone needs to search for objects in a room, it can utilize its already-acquired flying skills. A high-level policy will be trained to determine which low-level skill to deploy and how to parameterize it, thus streamlining the adaptation to new tasks. This approach promises to enhance efficiency and effectiveness in training drones for a variety of complex control tasks by building on foundational skills. In addition, it facilitates training multi-task policies for drones.
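
A minimal sketch of the hierarchical structure described above; all interfaces, dimensions, and the fixed-horizon execution scheme are illustrative assumptions:

```python
# A high-level policy picks a skill index and its parameters; the chosen low-level
# skill (e.g., hover, go_to_waypoint, orbit_object) turns them into motor commands.
import torch
import torch.nn as nn

class HighLevelPolicy(nn.Module):
    def __init__(self, obs_dim, num_skills, param_dim):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU())
        self.skill_head = nn.Linear(128, num_skills)   # which skill to run
        self.param_head = nn.Linear(128, param_dim)    # how to parameterize it

    def forward(self, obs):
        h = self.trunk(obs)
        skill = torch.distributions.Categorical(logits=self.skill_head(h)).sample()
        return skill, self.param_head(h)

# The selected skill is typically executed for a fixed horizon before the
# high-level policy is queried again.
policy = HighLevelPolicy(obs_dim=32, num_skills=3, param_dim=4)
skill, params = policy(torch.rand(1, 32))
```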

Goal: Develop a hierarchical RL framework that leverages a library of low-level skills, as sketched above. These skills can be learned through interaction, discovered from offline data, or hand-designed.

Contact Details: Elie Aljalbout [aljalbout (AT) ifi (DOT) uzh (DOT) ch], Angel Romero [roagui (AT) ifi (DOT) uzh (DOT) ch]

Thesis Type: Master Thesis

See project on SiROP

Learning Robust Agile Flight via Adaptive Curriculum - Available

Description: Reinforcement learning-based controllers have demonstrated remarkable success in enabling fast and agile flight. Currently, the training process of these reinforcement learning controllers relies on a static, pre-defined curriculum. In this project, our objective is to develop a dynamic and adaptable curriculum to enhance the robustness of the learning-based controllers. This curriculum will continually adapt in an online fashion based on the controller's performance during the training process. By using the adaptive curriculum, we expect the reinforcement learning controllers to enable more diverse, generalizable, and robust performance in unforeseen scenarios. Applicants should have a solid understanding of reinforcement learning, machine learning experience (PyTorch), and programming experience in C++ and Python.
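
As a toy illustration of an online-adaptive curriculum, the task difficulty could be adjusted from the rolling success rate during training; the thresholds and step size below are placeholders, not the project's method:

```python
# Success-rate-driven curriculum: raise difficulty (e.g., faster gates, tighter
# tracks) when the policy succeeds often, lower it when it struggles.
def update_difficulty(difficulty, success_rate, upper=0.8, lower=0.4, step=0.05):
    if success_rate > upper:
        difficulty = min(1.0, difficulty + step)
    elif success_rate < lower:
        difficulty = max(0.0, difficulty - step)
    return difficulty

# Called once per evaluation window during training:
d = 0.2
for rate in [0.9, 0.85, 0.3, 0.7]:
    d = update_difficulty(d, rate)
print(d)
```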

Goal: Improve the robustness and generalizability of the training framework and validate the method in different navigation task settings. The approach will be demonstrated and validated both in simulated and real-world settings.

Contact Details: Jiaxu Xing (jixing@ifi.uzh.ch), Nico Messikommer (nmessi@ifi.uzh.ch)

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Gaussian Splatting Visual Odometry - Available

Description: Recent works have shown that Gaussian Splatting (GS) is a compact and accurate map representation. Thanks to these properties, GS maps are appealing for SLAM systems. However, recent works that include GS maps in SLAM struggle with map-to-frame tracking. In this project, we will investigate the potential of GS maps in VO. The goal is to achieve robust map-to-frame tracking. We will benchmark our solution against feature-based and direct tracking baselines. This project will be done in collaboration with Meta.

Goal: The goal is to investigate the use of Gaussian splatting maps in visual-inertial systems. We look for students with strong programming (C++ preferred), computer vision (ideally having taken Prof. Scaramuzza's class), and robotics backgrounds.

Contact Details: Giovanni Cioffi, cioffi (at) ifi (dot) uzh (dot) ch, Manasi Muglikar, muglikar (at) ifi (dot) uzh (dot) ch

Thesis Type: Semester Project / Master Thesis

See project on SiROP

IMU-centric Odometry for Drone Racing and Beyond - Available

Description: Our recent work has shown that it is possible to estimate the state of a racing drone using only a low-grade IMU. This project will build upon our previous work and extend its applicability to scenarios beyond racing. To achieve this goal, we will investigate an "unconventional" way of using camera images inside the odometry pipeline. The developed VIO pipeline will be compared to existing state-of-the-art model-based algorithms, with a focus on agile flight in the wild, and deployed on embedded platforms (Nvidia Jetson TX2 or Xavier).
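
For context, a minimal sketch of IMU dead reckoning, the backbone of an IMU-centric odometry pipeline; bias estimation, noise handling, and preintegration, which a real pipeline needs, are omitted:

```python
# Propagate orientation R, velocity v, and position p from body-frame gyroscope and
# accelerometer measurements over one timestep dt.
import numpy as np

def propagate(R, v, p, gyro, acc, dt, g=np.array([0.0, 0.0, -9.81])):
    # orientation update via Rodrigues' formula (exponential map of gyro * dt)
    theta = gyro * dt
    angle = np.linalg.norm(theta)
    if angle > 1e-9:
        k = theta / angle
        K = np.array([[0, -k[2], k[1]], [k[2], 0, -k[0]], [-k[1], k[0], 0]])
        dR = np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * (K @ K)
    else:
        dR = np.eye(3)
    R = R @ dR
    a_world = R @ acc + g                 # rotate specific force, re-add gravity
    p = p + v * dt + 0.5 * a_world * dt**2
    v = v + a_world * dt
    return R, v, p
```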

Goal: Development of an IMU-centric odometry algorithm and benchmarking against state-of-the-art VIO methods. A successful thesis will lead to the deployment of the proposed odometry algorithm on a real drone platform. We look for students with strong programming (C++ preferred), computer vision (ideally having taken Prof. Scaramuzza's class), and robotics backgrounds. Hardware experience (running code on robotic platforms) is preferred.

Contact Details: Giovanni Cioffi [cioffi (at) ifi (dot) uzh (dot) ch], Jiaxu Xing [jixing (at) ifi (dot) uzh (dot) ch]

Thesis Type: Master Thesis

See project on SiROP

Navigating on Mars - Available

Description: The first-ever Mars helicopter, Ingenuity, flew over texture-poor terrain, and RANSAC was unable to find inliers: https://spectrum.ieee.org/mars-helicopter-ingenuity-end-mission Navigating the Martian terrain poses significant challenges due to its unique and often featureless landscape, compounded by factors such as dust storms, a lack of distinct textures, and extreme environmental conditions. The absence of prominent landmarks and the homogeneity of the surface can severely disrupt optical navigation systems, leading to decreased accuracy in localization and path planning.

Goal: This project aims to address these challenges by developing a navigation system that is resilient to Mars' sparse features and dust interference, employing advanced computational techniques to enhance environmental perception and autonomy.

Contact Details: Manasi Muglikar muglikar (at) ifi (dot) uzh (dot) ch, Giovanni Cioffi cioffi (at) ifi (dot) uzh (dot) ch

Thesis Type: Semester Project / Master Thesis

See project on SiROP

HDR NeRF: Neural Scene Reconstruction in Low Light - Available

Description: Implicit scene representations, particularly Neural Radiance Fields (NeRF), have significantly advanced scene reconstruction and synthesis, surpassing traditional methods in creating photorealistic renderings from sparse images. However, the potential of integrating these methods with advanced sensor technologies that measure light at the granularity of a photon remains largely unexplored. These sensors, known for their exceptional low-light sensitivity and high dynamic range, could address the limitations of current NeRF implementations in challenging lighting conditions, offering a novel approach to neural-based scene reconstruction.

Goal: This project aims to pioneer the integration of SPAD sensors with neural scene-reconstruction frameworks, specifically focusing on enhancing Neural Radiance Fields. The primary objective is to investigate how photon-derived data can be utilized to improve scene-reconstruction fidelity, depth accuracy, and rendering quality under diverse lighting conditions. By extending NeRF to incorporate photon-level data from SPADs, we anticipate a significant leap in the performance of neural scene-synthesis methodologies, particularly in challenging environments where traditional sensors falter.

Contact Details: Manasi Muglikar muglikar (at) ifi (dot) uzh (dot) ch, Marco Cannici cannici (at) ifi (dot) uzh (dot) ch

Thesis Type: Master Thesis

See project on SiROP

Low Latency Occlusion-aware Object Tracking - Available

Description: In this project, we will develop a low-latency object tracker that is robust to occlusions. Three main paradigms exist in the literature for object tracking: tracking-by-detection, tracking-by-regression, and tracking-by-attention. We will start with a thorough literature review to evaluate current solutions against our end goal of being fast and robust to occlusion. Starting from the conclusions of this study, we will design a novel tracker that achieves this goal. In addition to RGB images, we will investigate other sensor modalities such as inertial measurement units and event cameras. This project is done in collaboration with Meta.

Goal: Develop a low-latency object tracker that is robust to occlusions. We look for students with a strong computer vision background who are familiar with common software tools used in deep learning (for example, PyTorch or TensorFlow).

Contact Details: Giovanni Cioffi [cioffi (at) ifi (dot) uzh (dot) ch], Nico Messikommer [nmessi (at) ifi (dot) uzh (dot) ch]

Thesis Type: Semester Project / Master Thesis

See project on SiROP

Event-based occlusion removal - Available

Description: Unwanted camera occlusions, such as debris, dust, raindrops, and snow, can severely degrade the performance of computer-vision systems. Dynamic occlusions are particularly challenging because of the continuously changing pattern. This project aims to leverage the unique capabilities of event-based vision sensors to address the challenge of dynamic occlusions. By improving the reliability and accuracy of vision systems, this work could benefit a wide range of applications, from autonomous driving and drone navigation to environmental monitoring and augmented reality.

Goal: The goal of this project is to develop an advanced computational framework capable of identifying and eliminating dynamic occlusions from visual data in real-time, utilizing the high temporal resolution of event-based vision sensors.

Contact Details: Manasi Muglikar, muglikar (at) ifi (dot) uzh (dot) ch, Nico Messikommer nmessi (at) ifi (dot) uzh (dot) ch

Thesis Type: Semester Project / Master Thesis

See project on SiROP