CAREER: A Programmable Measurement Architecture for Network Operations
University of Southern California, Los Angeles, CA
Investigators
Abstract
Many companies (e.g., Google, Microsoft, Facebook, Amazon) make huge capital investments in building larger data center networks with higher link speeds. It is becoming increasingly important to manage such large data center networks carefully in order to support multiple tenants, meet the service level agreements of applications, and reduce operating costs. Most network management tasks, such as traffic accounting, traffic engineering, load balancing, and performance diagnosis, rely on accurate and timely measurement of time-varying traffic across the entire network. Unfortunately, there are three key problems with today's measurement support: (1) Network device vendors often treat measurement as a second-class citizen, devoting most resources to network control functions (forwarding, firewalls, load balancing, etc.) and leaving limited resources for measurement. (2) Operators have limited control over what to measure, what not to measure, and when to measure. As a result, the already limited measurement resources are sometimes wasted on flows that operators do not care about, leaving even fewer resources for measuring important flows. (3) Existing solutions are device-oriented rather than network-wide. Operators have to work with the limited measurement support at multiple devices, spend great effort interpreting the measurement results offline, and find it challenging to answer their network-wide queries. This project aims to design and build a programmable measurement architecture, MAP, which will transform today's network measurement practice in enterprise and data center networks. Inspired by software-defined networking, MAP allows operators to flexibly program their network-wide measurement queries in a controller. To answer these queries, MAP automatically configures and coordinates new measurement primitives at different places across the network stack.
MAP also allows operators to change their queries dynamically and automatically reconfigures the primitives accordingly to handle network dynamics. There are three key research directions. First, the project will bring novel measurement algorithms and designs to every layer of the network stack (VMs, hypervisors, switches, and packet sniffers). The researcher will redesign the measurement primitives at these devices to make them both generic enough to support diverse measurement requirements and efficient in packet processing given limited resources and capabilities. Second, the researcher will design new declarative measurement abstractions that let operators clearly express what to measure at the network level, along with their accuracy and timeliness requirements. The researcher will also design and implement a runtime system that automatically matches the measurement abstractions with primitives at devices, dynamically allocates resources across tasks, and handles network dynamics such as mobile hosts and routing changes. Finally, the researcher will study how to support several important and novel measurement tasks with MAP's primitives, abstractions, and runtime, and evaluate them on MAP. The proposed research, if successful, will fundamentally change network measurement and management practice in enterprise and data center networks, and lead to new designs of network devices. The research will facilitate interdisciplinary research spanning theory, programming languages, and networking. The project also has a significant education component, including the design and innovation of graduate courses and course material on software-defined networking and computer networking, as well as research experiences for undergraduate students and members of under-represented groups.
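As a purely illustrative sketch of the second research direction, the Python below shows one way a declarative measurement query and a runtime resource allocator might look. Nothing here is taken from MAP itself: the `MeasurementQuery` fields, the `allocate_counters` function, and the weight-proportional sharing policy are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MeasurementQuery:
    """Hypothetical declarative query: states what to measure, not where or how."""
    name: str
    flow_filter: str    # which traffic the operator cares about (assumed syntax)
    statistic: str      # e.g. "heavy_hitters" or "byte_count" (assumed names)
    max_delay_ms: int   # timeliness requirement
    weight: int         # relative importance, used when sharing resources


def allocate_counters(queries, total_counters):
    """Split one device's counter budget across queries in proportion to weight.

    This proportional policy is an assumption, not MAP's actual allocator.
    Integer division can leave a remainder; it is handed out one counter at a
    time to the highest-weight queries so that the whole budget is used.
    """
    if not queries:
        return {}
    if any(q.weight <= 0 for q in queries):
        raise ValueError("query weights must be positive")
    total_weight = sum(q.weight for q in queries)
    shares = {q.name: total_counters * q.weight // total_weight for q in queries}
    leftover = total_counters - sum(shares.values())
    by_weight = sorted(queries, key=lambda q: q.weight, reverse=True)
    for q in by_weight[:leftover]:
        shares[q.name] += 1
    return shares


queries = [
    MeasurementQuery("heavy_hitters_tenant_a", "src=10.0.0.0/8", "heavy_hitters", 100, 3),
    MeasurementQuery("bytes_all", "any", "byte_count", 1000, 1),
]
shares = allocate_counters(queries, total_counters=10)
```

When an operator changes a query or a routing change moves traffic to another device, a runtime along these lines would simply rerun the allocation with the updated query set and push the new shares to the affected devices.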