SHF: Small: A Distributed Scalable End-to-End Tail Latency SLO Guaranteed Resource Management Framework for Microservices

$599,953 | FY2022 | CSE | NSF

University Of Texas At Arlington, Arlington TX

Investigators

Abstract

Microservice architecture typically yields highly scalable, maintainable, testable and loosely coupled systems. For this reason, cloud-computing services are shifting from traditional monolithic architectures to microservice architectures to enable fast development, easy debugging and frequent software updates. A primary design goal of cloud computing is to provide user-centric services that meet diverse end-to-end application query tail-latency Service Level Objective (SLO) requirements while maximizing system-resource utilization. This proposal aims to develop such an approach, called the Distributed scalable End-to-end Tail lAtency SLO Guaranteed rEsource Management framework for microservices (DETAGEM). DETAGEM explores foundational principles to establish a solid theoretical basis for a comprehensive solution to resource allocation with guaranteed per-query tail-latency SLOs. The algorithms and tools developed in DETAGEM will be made available to benefit a broad set of researchers, engineers and educators, thereby advancing research across a range of other areas. Moreover, the results obtained from this project will be integrated into existing undergraduate and graduate curricula on a continuous basis. The involvement of underrepresented students in this proposed research at The University of Texas at Arlington will enhance their competitiveness in the future job market.

DETAGEM is a comprehensive, distributed, scalable solution rooted in fundamental principles. The approach taken by DETAGEM is to (a) derive a mathematical foundation that maps the SLO for queries with Directed Acyclic Graph (DAG) workflows to task-queuing budgets for the queries' tasks at their respective microservices, based on statistics of the unloaded task response time measured at individual microservices at runtime; and (b) develop task-queuing-budget-aware queue-management schemes and resource-auto-scaling schemes based on measured task-queuing budget-violation ratios.
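The two steps above can be illustrated with a minimal sketch. It is not DETAGEM's method: the abstract does not give the mathematical mapping or the scaling policy, so the sketch assumes a purely sequential path of services, splits the slack in the SLO in proportion to each service's unloaded response time, and scales on a fixed violation-ratio target. All function names, service names, thresholds and numbers are hypothetical. A real derivation would also have to handle DAG branches and the fact that tail percentiles do not simply add along a path.

```python
# Illustrative sketch only: the names, the proportional split, and the
# thresholds below are assumptions, not DETAGEM's actual algorithms.

def queuing_budgets(path, unloaded_ms, slo_ms):
    """Split the slack of a sequential path into per-service queuing budgets.

    Slack is the SLO minus the summed unloaded response times; each service
    receives a share of the slack proportional to its unloaded time.
    """
    total_unloaded = sum(unloaded_ms[s] for s in path)
    if total_unloaded <= 0:
        raise ValueError("unloaded response times must be positive")
    slack = slo_ms - total_unloaded
    if slack <= 0:
        raise ValueError("SLO leaves no room for queuing on this path")
    return {s: slack * unloaded_ms[s] / total_unloaded for s in path}


def violation_ratio(queuing_delays_ms, budget_ms):
    """Fraction of observed tasks whose queuing delay exceeded the budget."""
    if not queuing_delays_ms:
        return 0.0
    violations = sum(1 for d in queuing_delays_ms if d > budget_ms)
    return violations / len(queuing_delays_ms)


def next_replica_count(replicas, ratio, target=0.01):
    """Threshold autoscaler: add a replica when the violation ratio exceeds
    the target, remove one when it is below half the target."""
    if ratio > target:
        return replicas + 1
    if ratio < target / 2 and replicas > 1:
        return replicas - 1
    return replicas


# Hypothetical three-service path with a 100 ms end-to-end SLO.
budgets = queuing_budgets(
    ["frontend", "cart", "db"],
    {"frontend": 10, "cart": 20, "db": 20},
    slo_ms=100,
)
# Hypothetical queuing delays (ms) measured at the frontend service.
ratio = violation_ratio([5, 12, 8, 30], budgets["frontend"])
replicas = next_replica_count(2, ratio)
```

The proportional split is only one of many possible choices; it is used here because it needs no information beyond the unloaded response times that the abstract says are measured at runtime.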
The objective is to ensure that all the microservices meet their respective task-queuing budgets, which in turn guarantees that the end-to-end query tail-latency SLO for the queries is satisfied. The successful completion of the proposed project is expected to greatly advance the state of the art in terms of methodologies, theories, and practical solutions for the microservice architecture. This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
