CRII: RI: Vision-Anchored Automation of Bird-Sized UAVs in Unknown Cluttered Indoor Environments
Rochester Institute of Technology, Rochester, NY
Investigators
Abstract
In many real-world applications, autonomous unmanned aerial vehicles (UAVs) can be used to explore unknown, cluttered indoor spaces where GPS access and communication are often denied. To operate in such confined spaces, however, the UAVs must be small, roughly the size of a bird. UAVs of this size require lightweight, power-efficient sensors. This research project therefore aims to develop full automation for bird-sized UAVs in unknown, cluttered indoor environments using only an RGB-D camera. Although vision-only UAVs offer advantages for system assembly, automating them becomes considerably more difficult when other sensors, such as radar and LiDAR, are not available. This project addresses two fundamental challenges for UAV automation: (1) how to construct visual perception that provides a holistic yet computationally efficient understanding of the surrounding environment using only a vision sensor, and (2) how to leverage the established perception system to employ visual navigation for target-driven, safety-critical operations without relying on maps or GPS. The research of this project is closely integrated with education and outreach activities at the Rochester Institute of Technology.
This research develops technologies for vision-anchored automation of bird-sized UAVs in unknown, cluttered indoor environments. The project introduces a multi-task fusion method that simultaneously detects, segments, and tracks regions of interest within video frames. The proposed perception system can achieve a holistic, sensor-fusion-like understanding of the scene using only measurements from the RGB-D camera. The computational tractability of the perception model will be investigated, leveraging a novel paradigm of recursive knowledge distillation, to ensure sufficient operational endurance. For navigation tasks, the project includes a novel policy-learning scheme, empowered by multimodal representations conditioned on observations, in conjunction with an online domain adaptation technique. Together, these could enable bird-sized UAVs to make appropriate decisions in various critical operations, such as target searching and collision avoidance, while navigating a cluttered and confined working space.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.