SI2-SSI: LIMPID: Large-Scale IMage Processing Infrastructure Development
University of California-Santa Barbara, Santa Barbara, CA
Abstract
Scientific imaging is ubiquitous: from materials science, biology, neuroscience and brain connectomics, marine science and remote sensing, to medicine, much of big data science is image-centric. Currently, interpretation of images is usually performed within isolated research groups, either manually or as workflows over narrowly defined conditions with specific datasets. The LIMPID (Large-scale IMage Processing Infrastructure Development) project will have a transformative impact on such discipline-centric workflows through the creation of an extensive and unique resource for the curation, distribution and sharing of scientific image analysis methods. The project will create an image processing marketplace for use by a diverse community of researchers, enabling them to discover, test, verify and refine image analysis methods within a shared infrastructure. As a freely available, cloud-based resource, LIMPID will facilitate participation of underrepresented groups and minority-serving institutions, as well as international scientists, allowing them to address questions that would otherwise require expensive software. The potential impacts of the project are significant, ranging from wide dissemination of novel processing methods to development of automatic methods that can leverage data and human feedback from large datasets for software training and validation. For the broader scientific community, LIMPID immediately provides a resource for joint data and methods publication, with provenance control and security. This in turn will facilitate faster development and deployment of tools and foster new collaborations between computer scientists developing methods and scientific users. The project will prepare a diverse cadre of students and researchers, including women and members of underrepresented groups, to tackle complex problems in an interdisciplinary environment. Through workshops, participation at scientific meetings, and summer undergraduate research internships, a broad community of users will be engaged to actively contribute to all aspects of research, development, and training during the course of this project.

The primary goal is to create a large-scale distributed image processing infrastructure, LIMPID, through a broad, interdisciplinary collaboration of researchers in databases, image analysis, and the sciences. In order to create a resource of broad appeal, the focus will be on three types of image processing: simple detection and labeling of objects, based on detection of significant features and leveraging recent advances in deep learning; semi-custom pipelines and workflows based on popular image processing tools; and fully customizable analysis routines. Popular image processing pipeline tools will be leveraged to allow users to create or customize existing pipeline workflows and easily test these on large-scale cloud infrastructure from their desktop or mobile devices. In addition, a core cloud-based platform will be created where custom image processing routines can be created, shared, modified, and executed on large-scale datasets, applying novel methods to minimize data movement. Usage test cases will be created for three specific user communities: materials science, marine science and neuroscience. An industry-supported consortium will be established at the beginning of the project to work toward long-term sustainability of the LIMPID infrastructure.

This project is supported by the Office of Advanced Cyberinfrastructure in the Directorate for Computer & Information Science and Engineering and the Division of Materials Research in the Directorate for Mathematical and Physical Sciences.