CRII: RI: Understanding Activities of Daily Living in Indoor Scenarios

$185,000 | FY2023 | CSE | NSF

University of North Carolina at Charlotte, Charlotte, NC

Abstract

It is projected that, by 2050, one in six people in the world will be over the age of 65, up from one in 11 in 2019. Older adults are therefore a growing demographic group in society. Because aging, like pandemics, is associated with increased healthcare utilization, this growing demographic translates into a need for a larger healthcare workforce. The rising demand for healthcare can be addressed by deploying activity monitoring systems, which could help monitor the health status of older patients and support the early detection of health issues. Building such monitoring systems requires an automated understanding of Activities of Daily Living (ADL) performed by humans. Most research on modeling human activities, driven by advances in computer vision, targets generic internet videos. Existing models are designed for recognition in web videos; they generally cannot handle the uncertainty introduced by the viewpoint variation and subtle motion that characterize ADLs, and they tend to underperform in real-world scenarios. Moreover, they have difficulty distinguishing similar-looking activities. The key objective of this project is therefore to build a framework for recognizing ADLs that can be deployed in monitoring systems. The project will also perform complementary educational and outreach activities that engage STEM students.

This project will develop a multi-modal framework based predominantly on RGB (color) and pose modalities, which are easily accessible in indoor scenarios. The framework aims to address two important challenges: the limited availability of labeled ADL videos, and how to combine heterogeneous modalities (RGB and poses) for classifying activities. To that end, the project will explore the integration of two interrelated research directions: (1) a study on learning from a limited training distribution; and (2) a study on combining modalities like RGB and depth. In the first study, the project will explore ways to mitigate the limited scale of available data in the ADL domain so that neural networks for video understanding can be trained effectively. In the second study, the team will aim to improve the effectiveness of RGB-based human activity recognition by leveraging the human-localized regions within a scene. Finally, the project will develop a multi-modal neural network model for ADL by integrating human-localized RGB and 3D poses of a human actor. This research will reveal several new dimensions of understanding ADLs that will also benefit the computer vision community.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
