CAREER: Learning and Optimization for Robust Multimodal Interpretation in Conversation Systems
Michigan State University, East Lansing MI
Investigators
Abstract
Multimodal systems allow users to interact with computers through multiple modalities such as speech, gesture, and gaze. One challenge for multimodal systems is multimodal interpretation, the process of understanding what a user intends to communicate. Despite recent progress in multimodal interpretation, most systems still have problems handling unexpected or unreliable user inputs. This project seeks to improve the robustness of multimodal interpretation through two objectives: (1) to adapt system interpretation capability over time through automated knowledge acquisition, and (2) to optimize interpretation through probabilistic reasoning. Specifically, this research will develop supervised and unsupervised learning approaches to automatically acquire knowledge, both offline from annotated data and online from real-time interactions. Furthermore, this project will develop mechanisms to account for uncertainties that occur at different stages of the interpretation process and to derive an optimal interpretation through probabilistic reasoning about context and user intent. To support these approaches, large corpora of multimodal data will be collected from human-machine conversation and annotated in terms of user intent and interaction context. This research will provide a benchmark for algorithmic advancement and evaluation. The enhanced robustness and reliability in multimodal interpretation will make multimodal systems more effective for real-world applications. Through a multimodal conversation system that helps students explore options for undergraduate and graduate study at various institutions and locations, the results of this research will be directly applied to an education plan that includes outreach activities, curriculum development, and student mentoring.
This tight integration of research and education will offer a unique multidisciplinary opportunity to build synergy with other programs within the university, such as speech processing, psycholinguistics, and cognitive science.