CSR: NeTS: Medium: Achieving High-Availability in the Face of Rapid Network Evolution in Large Content Providers
University of Southern California, Los Angeles, CA
Investigators
Abstract
Global Internet services, such as web search, social networking, video dissemination, and cloud computing, are built on data centers, which are large warehouses full of computers. The computers in a data center are connected to one another, and to the Internet, by a network. If that network goes down, the computers it connects and the Internet services they support also go down. This project aims to decrease, by one to two orders of magnitude, how long those networks are down each month. The project will explore the design of two components that can contribute significantly to decreasing network downtime. The first component will be a new highly available control plane. It will include a new replication protocol that provides strong consistency and resilience to the failure of an external failure detector, while retaining as many of the desirable properties of current, less-available designs (e.g., a small resource footprint) as possible. The second component will be an automated management plane that raises the level of abstraction at which a management operation (MOp) on the network is specified, thereby allowing automated software to synthesize and manage the sequence of steps necessary to carry out a complex network MOp, such as upgrading software on a large network. Global Internet services are an integral part of modern life, and increasing their availability will benefit society. The project plans include developing software prototypes and working closely with large Internet services to increase the likelihood of technology transfer. The research will be conducted by graduate and undergraduate students, who will learn how networks work, how to build and improve them, and how to build and improve the computer systems that control them. Each of these skills is likely to be increasingly important in the coming years.