RI: Small: Texture2Text: Rich Language-Based Understanding of Textures for Recognition and Synthesis

$450,000 · FY2016 · CSE · NSF

University of Massachusetts Amherst, Amherst, MA

Investigators

Abstract

This project develops techniques at the interface of vision and natural language to understand and synthesize textures. For example, given a texture, the project develops techniques that provide a description of the pattern (e.g., "the surface is slippery", "red polka-dots on a white background"). Techniques for semantic understanding of textures benefit a large number of applications, ranging from robotics, where understanding material properties of surfaces is key to interaction, to analysis of various forms of imagery for meteorology, oceanography, conservation, geology, and forestry. In addition, the project develops techniques that allow modification and synthesis of textures based on natural language descriptions (e.g., "make the wallpaper more zig-zagged", "create a honeycombed pattern"), enabling new human-centric tools for creating textures. Beyond the numerous applications enabled by this project, the broader impacts of the work include the development of new benchmarks and software for the computer vision and language communities, undergraduate research and outreach, and collaboration with researchers and citizen scientists in areas of conservation.

This research maps visual textures to natural language descriptions and vice versa. The research advances computer vision by providing texture representations that are robust to realistic imaging conditions, clutter, and occlusions in natural scenes; content retrieval by providing new ways to search and retrieve textures using descriptions; and image manipulation by providing new ways to create and modify textures using descriptions. The main technical contributions of the project are: (1) principled architectures that combine aspects of texture models with deep learning to enable end-to-end learning of texture representations; (2) techniques for understanding the properties of these representations through visualizations; (3) a large-scale benchmark to evaluate techniques for language-based texture understanding; (4) new models for texture captioning; (5) applications of texture representations for fine-grained recognition and semantic segmentation; and (6) techniques for retrieving and creating textures using natural language descriptions.
