ITR: High Performance Imaging Using an Array of Low-Cost Cameras
Stanford University, Stanford CA
Investigators
Abstract
This project will explore the capabilities of an array of 128 cameras, which can be combined computationally in many different configurations for a wide range of scientific, commercial, and communication applications.
Over the past few years, the ability of image-based rendering (IBR) techniques to create photorealistic images of real scenes has generated great interest in building sensor systems that can capture environments from multiple viewpoints. At the same time, we have witnessed the advent of CMOS image sensors, which are inexpensive and easy to use because of their digital interface. Furthermore, because they are manufactured in a CMOS process, processing power can be placed on the sensors themselves. Finally, advances in semiconductor technology are making more computing power available at lower cost, power, and chip area. These trends raise several questions: What can we do with many inexpensive CMOS image sensors, equally inexpensive optics, and a lot of processing power? Can we use more cameras of lesser quality to enable more robust IBR algorithms? Can we use clusters of inexpensive imagers and processors to create virtual cameras that outperform real ones?
To investigate these questions, each camera in the 128-camera array contains a CMOS image sensor, an MPEG encoder, and a programmable processor. The device is designed to record 128 synchronized video datasets through three PCs to a disk array. This project will explore applications of the array to scientific imaging, computer vision, and graphics.
Multi-camera systems can function in many ways. If the cameras are packed close together, then the system effectively functions as a single-center-of-projection synthetic camera, which can be configured to provide high performance along one or more imaging dimensions, such as resolution, signal-to-noise ratio, dynamic range, depth of field, frame rate, or spectral sensitivity. For example, one configuration could produce high-resolution images of 10,240 x 3,830 pixels, and another could generate 7,680 frames per second. Such capabilities are unprecedented for a video system, and they will have many scientific, engineering, and military uses.
If the cameras are placed farther apart, then the system functions as a multiple-center-of-projection camera, and the data it captures is called a light field. Of particular interest are novel methods for estimating 3D scene geometry from the dense imagery captured by the array. This information can be used to improve compression of the light field and to interpolate between widely spaced cameras, allowing smooth virtual navigation through the scene. Potential applications include evaluation of design models for manufacturing, medical and forensic consultation, online shopping, and virtual museum displays.
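The 7,680 frames-per-second figure can be illustrated with a small sketch. It assumes, as one plausible reading not stated in the abstract, that all 128 cameras run at the same rate with evenly staggered triggers, which works out to 7,680 / 128 = 60 frames per second per camera. The function names `capture_time` and `interleave` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

NUM_CAMERAS = 128   # from the abstract
TARGET_FPS = 7680   # from the abstract

# Assumption: every camera runs at the same rate, so each contributes
# an equal share of the combined frame rate.
PER_CAMERA_FPS = Fraction(TARGET_FPS, NUM_CAMERAS)


def capture_time(camera_index, frame_index):
    """Time in seconds at which a camera captures a given frame.

    Assumption: camera c is triggered c / TARGET_FPS seconds after camera 0,
    so the 128 cameras sample time in an evenly staggered pattern.
    """
    return Fraction(frame_index) / PER_CAMERA_FPS + Fraction(camera_index, TARGET_FPS)


def interleave(per_camera_frames):
    """Merge per-camera frame lists into one time-ordered sequence.

    per_camera_frames[c][k] is frame k of camera c. With staggered triggers,
    time order is: frame 0 of every camera, then frame 1 of every camera, etc.
    """
    frames_available = min(len(frames) for frames in per_camera_frames)
    merged = []
    for k in range(frames_available):
        for frames in per_camera_frames:
            merged.append(frames[k])
    return merged
```

Under these assumptions, consecutive frames in the merged sequence are 1/7,680 of a second apart, including across the wrap from the last camera back to the first. A real system would also need to correct for the slightly different viewpoint of each camera, which this sketch ignores.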