CAREER: A Scalable Multiplane Data Center Network
University of California-San Diego, La Jolla, CA
Abstract
Large Internet data center providers, both public and private, must support ever-increasing data rates between hundreds of thousands of servers to meet processing and storage demand. Operators have relied on similar scale-out network fabrics (typically folded-Clos topologies) to construct their networks. Since their deployment in the mid-2000s, these scale-out designs have leveraged the steadily increasing performance and decreasing cost of complementary metal-oxide semiconductor (CMOS)-based switching silicon to keep pace with demand. Unfortunately, these trends cannot continue: network switches face the same CMOS process-scaling limitations that currently hamper central processing unit (CPU) manufacturers. Just as CPUs have moved to multi-core designs to side-step their scaling limitations, data center operators will need to adopt alternative architectures to scale to next-generation link rates. This project will demonstrate a hybrid electrical/optical network topology, called SelectorNet, which scales to hundreds of thousands of servers at link rates reaching 1.6 terabits per second. Unlike recent proposals that use two-dimensional or three-dimensional microelectromechanical systems (2D- or 3D-MEMS) optical crossbar switches, SelectorNet relies on a novel optical device, the "selector" switch, that abandons the crossbar abstraction. Instead, it uses indirection to deliver packets between hosts that are not directly connected by the selector switches. The result is a network fabric that is not only cost-competitive with state-of-the-art Clos-based designs in 2020, but also continues to scale in terms of cost, energy, performance, and reliability as link rates surpass 400 gigabits per second.
Broader Impact: Ensuring that the benefits of this work have impact beyond the traditional metrics of research is integral to its design. The results of this research will make it easier to design and build scalable, efficient, and highly available cloud and data center services. By reducing the cost of deploying cloud infrastructure, the researchers hope to lower costs for the largest operators while reducing the barrier to entry to the cloud for smaller organizations. They will further expand the research skills of graduate and undergraduate students, who will address data center efficiency and cloud computing research challenges in a hands-on manner. Exposing undergraduate students to cloud computing technologies in their courses and through mentored research will enhance their marketability at graduation and has the potential to inspire their curiosity and encourage the pursuit of graduate studies. Teaching students how to build state-of-the-art networked systems that are grounded in rigorous analysis and practical constraints is essential in our increasingly networked world. An additional component of this research will be the creation and dissemination of videos that will broaden public awareness and appreciation of the science and engineering challenges facing large-scale computing, machine learning, and Internet systems.