Optimal Scheduling of Parallelizable Jobs in Cloud Computing Environments
Carnegie Mellon University, Pittsburgh PA
Investigators
Abstract
This award will contribute to the advancement of national prosperity and economic welfare by deriving methods to improve response times for processing jobs in cloud computing environments. Today, data centers of major companies and supercomputing centers are heavily occupied in processing machine learning jobs, where each job occupies multiple servers/cores in parallel. While there is a long history on scheduling for serial jobs, less is known about parallel job scheduling, and existing heuristics perform poorly in the cloud computing environment. This award will develop algorithms for efficiently allocating a finite number of servers across a set of parallelizable jobs. Optimal scheduling is difficult because an individual parallel job receives decreasing marginal benefit from each additional server that it is allocated. The PIs will develop new analytical methods to address this complex scheduling problem. The results of this research are highly relevant to industry and will inform state-of-the-art cloud scheduling systems. Algorithms, protocols, and experimental results will be disseminated via journal publications, online code and open access data repositories. As part of this award, the PIs will provide outreach to middle school girls to increase exposure and skills in mathematics and computing, as well as training of both undergraduates and PhD students in the areas of parallel computing, scheduling, and queueing. This award will support research on optimally allocating a finite number of servers across parallel jobs, so as to minimize mean flow time, mean slowdown, and related metrics. Motivated by real-world benchmarks and measurements, parallel jobs are modeled via a concave speedup function which specifies the speedup benefit to the job as a function of the number of servers which are allocated to it. The research plan addresses a wide variety of situations, including the case where jobs have different speedup functions, where jobs have different priorities, where job sizes are not known a priori, and where jobs arrive over time. Deriving optimal scheduling strategies will require the development of new analytic techniques to vastly reduce the search space of possible allocations and reveal the structure of the optimal solution. Towards that goal, the award will develop a series of dimensionality reduction techniques, including a scale-free property, a size-invariant property, an online completion order property, a technique for trading off job sizes and different parallelization levels, and a parallel counterpart to the Gittins Index scheduling policy. Results will be validated first via stochastic simulation and then via trace-driven simulation using traces from industry and supercomputing centers. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
View original record on NSF Award Search →