ITR: Distributed Analysis of Large Distributed Datasets with Java (BlueOx)

$119,000 | FY2002 | MPS | NSF

Princeton University, Princeton NJ

Abstract

The scale of data storage and analysis for the high-energy physics experiments at the Large Hadron Collider (LHC) at CERN, Switzerland, poses new challenges and calls for research into new approaches to CPU utilization. The quantity of data produced is too large to store at any single university; instead, the data will be archived in a central location. This project will develop techniques and software packages that enable researchers at remote sites to analyze these data efficiently in a distributed fashion. The proposal supports an investigation into how to send a user's analysis request, together with the code that implements it, to the locations where the data are stored and run the code there. The system then collects and assembles the results and presents them to the user. The framework is the BlueOx system, which will be implemented in Java to take advantage of the language's cross-platform compatibility. The most important challenge facing the framework is scalability. The project plans to study and improve scalability by testing the framework with large simulated datasets on a moderate-sized computing cluster. The aim is to preserve the generality that can make the framework useful to many large scientific projects.
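
The workflow described in the abstract (send the analysis code to where the data are stored, run it there, then collect and assemble the results) can be illustrated with a minimal Java sketch. This is not the BlueOx API, which the abstract does not describe. The names ScatterGatherSketch, DataSite, runLocally, analyze, and countAbove are hypothetical, the "sites" are in-memory objects rather than remote machines, and a thread pool stands in for the network.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BinaryOperator;
import java.util.function.Function;

public class ScatterGatherSketch {

    /** Hypothetical stand-in for a storage location holding one partition of the data. */
    static final class DataSite {
        final String name;
        private final double[] events; // simulated event values kept at this site

        DataSite(String name, double[] events) {
            this.name = name;
            this.events = events;
        }

        /** Runs the user's analysis code against the data stored at this site. */
        <R> R runLocally(Function<double[], R> analysis) {
            return analysis.apply(events);
        }
    }

    /** Example analysis: counts the events whose value exceeds a threshold. */
    static long countAbove(double[] events, double threshold) {
        long count = 0;
        for (double e : events) {
            if (e > threshold) {
                count++;
            }
        }
        return count;
    }

    /**
     * Sends the same analysis to every site, waits for each partial result,
     * and merges the partial results into one answer for the user.
     */
    static <R> R analyze(List<DataSite> sites,
                         Function<double[], R> analysis,
                         R identity,
                         BinaryOperator<R> merge) throws Exception {
        // Assumption: one worker thread per site models independent remote execution.
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, sites.size()));
        try {
            List<Future<R>> pending = new ArrayList<>();
            for (DataSite site : sites) {
                Callable<R> task = () -> site.runLocally(analysis);
                pending.add(pool.submit(task));
            }
            R result = identity;
            for (Future<R> partial : pending) {
                result = merge.apply(result, partial.get());
            }
            return result;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<DataSite> sites = List.of(
                new DataSite("site-a", new double[] {1.0, 5.0, 12.0}),
                new DataSite("site-b", new double[] {20.0, 3.0}),
                new DataSite("site-c", new double[] {11.0}));

        long selected = analyze(sites, events -> countAbove(events, 10.0), 0L, Long::sum);
        System.out.println("events passing cut: " + selected); // prints "events passing cut: 3"
    }
}
```

Only the small partial results travel back to the caller, which is the point of moving the code to the data. A real system would also have to handle code shipping, security, and site failures, none of which this sketch attempts.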
