III: EAGER: A Framework for Large Data Analysis
University of Florida, Gainesville, FL
Investigators
Abstract
Modern multicore architectures, which deliver raw performance measured in gigaflops and teraflops, have deep memory hierarchies and low-overhead threading capabilities. Lack of support for directly exploiting these capabilities leads to severe under-utilization, especially for data-intensive applications. This project aims to develop methods that use the available computational power efficiently to reduce the cost of large-scale data processing systems. It will develop a highly efficient computation framework called GLADE that supports a large class of data-intensive applications and is based on a novel computational model called Generalized Linear Aggregates. The commutative and associative properties of Generalized Linear Aggregates facilitate highly efficient parallel and distributed computation as well as exploitation of deep memory hierarchies, especially when multiple queries are executed simultaneously, as is typical in many data-processing tasks. The resulting improvement in computational efficiency, expected to be one to two orders of magnitude, should yield a corresponding reduction in the cost and energy requirements of data processing tasks, which in turn will make it feasible to analyze much larger data sets than is currently possible. The proposed work will make the synergistic combination of high-performance computing and large-scale data analysis widely available to researchers and other interested groups in government, industry, and education. Enabling a large number of data-intensive applications on inexpensive computers that cost in the low tens of thousands of dollars will broaden the use of data analysis, exploration, and mining for a wide variety of existing and emerging applications. Examples of such applications include network intrusion detection, social network analysis, climate data analysis, ecosystem analysis, and customer relationship management.
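The role of commutativity and associativity can be illustrated with a minimal Python sketch. It is not GLADE's actual interface: the class `AverageState`, its methods `accumulate`, `merge`, and `finalize`, and the helpers `aggregate_chunk` and `aggregate_partitioned` are all hypothetical names chosen for illustration. The sketch computes an average by building independent partial states over chunks of the input and then merging them; because merging is plain addition of sums and counts, the partial states can be combined in any order and grouping.

```python
from dataclasses import dataclass
from functools import reduce


@dataclass
class AverageState:
    """Hypothetical partial state of an average: a running sum and a count."""
    total: float = 0.0
    count: int = 0

    def accumulate(self, value):
        # Fold one input item into this partial state.
        self.total += value
        self.count += 1

    def merge(self, other):
        # Combine two partial states. Addition of sums and counts is
        # commutative and associative, so merge order does not matter.
        return AverageState(self.total + other.total, self.count + other.count)

    def finalize(self):
        # Turn the state into the final answer (None for empty input).
        return self.total / self.count if self.count else None


def aggregate_chunk(chunk):
    # Build a partial state from one chunk; in a parallel setting each
    # chunk could be handled by a separate thread or machine.
    state = AverageState()
    for value in chunk:
        state.accumulate(value)
    return state


def aggregate_partitioned(data, num_chunks):
    # Split the input, aggregate each chunk independently, then merge.
    size = max(1, -(-len(data) // num_chunks))  # ceiling division
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    partials = [aggregate_chunk(chunk) for chunk in chunks]
    return reduce(lambda a, b: a.merge(b), partials, AverageState())


data = list(range(1, 101))
sequential = aggregate_chunk(data).finalize()
partitioned = aggregate_partitioned(data, 4).finalize()
```

Here `sequential` and `partitioned` hold the same value, since the chunked computation only regroups the same additions. The sketch runs the chunks one after another; distributing them across threads or machines is left out for brevity.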
Additional information about the project can be found at: http://sites.google.com/site/sanjayranka/glade.