SHF: Small: Learning-based Fast Analysis and Fixing for Electromigration Damage

$500,000 | FY2023 | CSE | NSF

University Of California-Riverside, Riverside CA

Abstract

Electromigration (EM) has become a significant reliability issue and a limiting factor for advanced very-large-scale integration (VLSI) designs, owing to the shrinking size and increasing current density of copper-based interconnects in sub-3nm technology. As a result, future chips are expected to age faster than previous generations. Despite recent advances in EM modeling and assessment techniques, fast and accurate EM analysis and automatic optimization for large-scale power grid networks remain challenging, because physics-based modeling requires solving partial differential equations (PDEs) for hydrostatic stress in large interconnects. The problem becomes even more difficult for full-chip EM management. Machine learning techniques, particularly deep learning based on deep neural networks (DNNs) such as convolutional neural networks (CNNs), together with scientific machine learning (SciML) approaches, have emerged as promising alternatives to traditional numerical techniques for solving PDEs. The unsupervised physics-informed/constrained neural network (PINN/PCNN) framework in the SciML field offers powerful capabilities such as mesh-free and parametrized numerical solutions. However, existing PINN/PCNN works can only solve small PDE problems with simple boundary conditions. For large engineering problems with millions of variables, which are common in design automation, PINN/PCNN approaches converge slowly, if they converge at all. Additionally, fixing EM-induced failure or damage to achieve the expected EM mean time to failure at both design time and run time remains challenging, due to the limitations of sensitivity-based optimization frameworks and the lack of sufficient on-chip temperature sensors. The tools to be developed in this project will help advance the understanding of this important problem and curtail the lifetime reliability issues of complementary metal-oxide-semiconductor (CMOS) chips.
This project will develop novel learning-based EM analysis based on the PINN/PCNN framework, along with efficient machine learning-accelerated full-chip EM fixing and run-time management methods for VLSI chips in the nanometer regime. First, the project will explore new SciML-based solutions, such as enhanced PINN/PCNN methods, for hydrostatic stress analysis of multi-segment interconnect trees. Building on those methods, the project will develop full-chip, multi-physics coupled analysis of EM-induced IR drop (i.e., reduction in voltage) and lifetime estimation for power grid networks, taking Joule heating effects into account. Second, the project will develop efficient full-chip EM-aware power grid optimization techniques, aided by DNNs, along with a dynamic run-time EM-aware management method to identify the real hotspots of processors. The project will investigate DNN-accelerated full-chip power grid optimization by exploiting the differentiability of the trained DNN models to allow for rapid sensitivity calculations based on a sequence of linear programming techniques. Finally, the project will devise a DNN-based method to estimate hotspots and develop a dynamic run-time EM lifetime management method that considers the actual hotspots of processors.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.