CNS Core: Small: Designing Efficient Cloud Datacenter Network Fabrics
University of California, San Diego, La Jolla, CA
Abstract
Cloud datacenter networks are tasked with providing connectivity between an ever-increasing number of end hosts whose link rates improve by orders of magnitude every few years. What network operators would ideally like is a single, full-bandwidth switch that could connect every endpoint at full rate. Such an idealized network would enable them to place jobs and data wherever it is convenient, without worrying about bandwidth bottlenecks, hotspots, and other network-induced limitations. Unfortunately, preserving this "big-switch" illusion of a single network with full bandwidth is increasingly cost-prohibitive and will likely soon be infeasible. This project will explore an alternative method of constructing datacenter network fabrics based upon a provably optimal topological construct, an expander graph. If successful, the project will result in network fabrics that are more flexible, capable, and scalable than existing state-of-the-art approaches.
This project will develop a family of cloud datacenter network topologies based on expander graphs that eliminate the capacity bottlenecks inherent in hierarchical Clos-based topologies while minimizing the bandwidth tax incurred due to indirect routing. A single, large expander-graph network topology can be constructed out of multiple, disjoint expander graphs; this project will show how judicious tenant placement can then provide both isolation and dynamic capacity while minimizing the bandwidth tax. Moreover, by employing reconfigurable network components (i.e., circuit switches), it is even possible to evolve the set of constituent expander graphs over various time scales, allowing cloud datacenter operators to better suit the needs of their current tenants. Indeed, if the timescales are sufficiently small (e.g., hundreds of milliseconds), tenants may choose to buffer traffic until a particularly favorable path or set of paths is available, further decreasing the overall bandwidth inefficiency, or "tax". If the network topology evolves at a rapid rate, it is possible to choose, on a per-packet basis, whether to (1) immediately send a packet over whatever static expander is currently instantiated, incurring a modest tax on this small fraction of traffic, or (2) buffer the packet and wait until a direct link is established to the ultimate destination, eliminating the bandwidth tax on the vast majority of bytes.
This project will engage graduate and undergraduate students through structured courses, intense mentorship, and hands-on research activities through participation in the NSF-funded UC San Diego Early Research Scholars Program (ERSP).
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
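As a rough illustration of the per-packet choice described in the abstract, the following Python sketch shows one way such a decision and the resulting bandwidth tax might be expressed. Everything here is an assumption made for illustration and is not taken from the project: the `Packet` fields, the `latency_sensitive` flag, the `max_wait_slots` threshold, and the definition of the tax as the extra link transmissions per byte beyond a single direct hop.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    dst: int                 # destination node (hypothetical identifier)
    size_bytes: int
    latency_sensitive: bool  # hypothetical flag marking traffic that cannot wait


def choose_action(pkt, direct_neighbors, slots_until_direct, max_wait_slots):
    """Return 'direct', 'buffer', or 'expander' for one packet.

    direct_neighbors: nodes reachable over a circuit that is up right now.
    slots_until_direct: reconfiguration slots until a direct circuit to
        pkt.dst is expected (assumed known from a fixed schedule).
    """
    if pkt.dst in direct_neighbors:
        return "direct"      # a one-hop path already exists: no tax
    if not pkt.latency_sensitive and slots_until_direct <= max_wait_slots:
        return "buffer"      # option (2): wait for a direct link
    return "expander"        # option (1): multi-hop over the current expander


def bandwidth_tax(bytes_by_hops):
    """Extra link transmissions per byte delivered.

    bytes_by_hops maps hop count -> bytes sent over paths of that length.
    A byte sent over h hops consumes h link transmissions, so h - 1 of
    them count as tax under the definition assumed here.
    """
    total = sum(bytes_by_hops.values())
    extra = sum(b * (h - 1) for h, b in bytes_by_hops.items())
    return extra / total
```

Under these assumptions, if 900 bytes travel over direct links and 100 bytes take a three-hop expander path, `bandwidth_tax({1: 900, 3: 100})` evaluates to 0.2, showing how keeping indirect routing to a small fraction of traffic keeps the overall tax low.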