Collaborative Research: Planning Grant: I/UCRC for Assured and SCAlable Data Engineering (CASCADE)
University Of Maryland, College Park, College Park MD
Investigators
Abstract
A data and information revolution is transforming all aspects of our lives, all disciplines, and all sectors of the economy. The "big data" market is predicted to reach $50 billion by 2017, with 40% and 22% market share for services and software, respectively. While products and services continue to mature, assured and scalable data services remain major challenges. This necessitates data architectures and tools that can match the scale of the data and support timely and assured decision making. The vision of the proposed NSF I/UCRC Center for Assured and SCAlable Data Engineering (CASCADE) is to enable a fundamental shift from current ad hoc approaches to the engineering of data systems towards a principled framework for engineering data systems that support reliable and timely data-driven decision making. The center will support innovation in such architectures and tools. Methods for information integration, analytics, and visualization will help non-data-experts (in governmental and commercial sectors) to make decisions and to generate value. The key audience for CASCADE includes (a) small, medium, and large companies that rely heavily on data services (especially in the finance and energy sectors), (b) small, medium, and large companies in data technologies, and (c) government agencies and regulators. CASCADE will play an important role in developing assured and scalable data technologies that in turn will enable applications and services with significant economic and environmental impact. These include financial fraud prevention, the monitoring of financial supply chains, and applications in the energy and sustainability sectors.
The broader impacts of the proposed project will include technology and knowledge transfer to the industrial sector, graduate and undergraduate education through mentoring of PhD students, and updates to the CSE curriculum through the incorporation of research into existing undergraduate and graduate classes. CASCADE activities will train computer science students in the methodologies that support scalable and secure data engineering and will familiarize them with real-world challenges in critical domains, including the financial sector and clean energy. CASCADE will also contribute to diversity in the workforce through recruitment of female and minority students. If we want to fundamentally alter the way data systems are designed and significantly change current practices, we need to ensure that data analysis, data assurance, and data management technology components are developed synergistically to achieve the following targets: (1) the design and development of each component is informed by the requirements and limitations of the others; (2) each takes full advantage of the services and capabilities provided by the others; (3) they continuously adapt as the analysis, assurance, and management contexts evolve with the needs of the deployed application systems that they all support. A key CASCADE goal is to empower domain experts and decision makers through assured and scalable data systems, and to provide reliable and timely decision making through a sense&integrate, simulate&predict, validate&interpret, and act&adapt feedback loop. The objective of this planning grant is to organize a meeting with industry partners and the universities to outline a research agenda for CASCADE.
The industrial/academic partnerships of CASCADE will enable new algorithms, tools, and systems that securely manage, share, access, and analyze heterogeneous sets of static or transient data to accommodate diverse security requirements, including trust, availability, confidentiality, and integrity. Through synergistic industry/academic partnerships, CASCADE will enable a strategic framework built around multi-disciplinary teams. These teams will translate technological insights obtained from fundamental research on (a) trusted and privacy-preserving data processing and analysis, (b) real-time data processing and analysis, (c) parallel and distributed data processing and analysis, and (d) high-dimensional and multi-modal data processing and analysis into new key technology elements, whose different instantiations are deployed for direct impact on various critical industries, including the energy and finance sectors.