Modeling and Inference for Data with Network Dependency

$150,000 | FY2022 | MPS | NSF

University of Pittsburgh, Pittsburgh, PA

Abstract

Data with complex interpersonal dependency characterized by networks are increasingly encountered in many scientific areas. For example, some survey studies of school students collect a friendship network among the students in addition to traditional variables measured on each unit, such as drug use, smoking, and mental health status. A response of interest such as drug use is likely to be dependent across units through the friendship network. As another example, brain functional connectivity studies have consistently found functional linkage among brain regions, where dependency among regions arises through a network structure. The analysis of network-linked data calls for statistical inference tools and theories that account for network dependency. The developed methods will be applied to data analyses in studies of stress and suicide, helping to understand the pathological and biological mechanisms underlying suicidal behaviors. The principal investigator (PI) plans to develop open-source software packages to disseminate the results and to provide training opportunities for graduate students.

The first part of the research focuses on developing methods and theory for inference on regression coefficients and on measures of dependency between two variables of interest when there is network dependency across sample units. In the second part, the PI will focus on analyzing network-linked data with replicates. In some applications, multiple independent units exist, and the multivariate data observed within each unit may be dependent through a network structure. The analysis of this type of network-dependent data has its own features because independent realizations are available. The challenges of dealing with network dependency are at least twofold. The first is infinite dimensionality, which can be understood through the notion of network neighborhood growth. The second is node heterogeneity: a general network does not have the symmetric structure of a Euclidean lattice. This research will develop new statistical inference tools and theories that address these challenges.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.