Nonparametric Estimation and Inference Methods for the Analysis of Longitudinal Data
Johns Hopkins University, Baltimore MD
Investigators
Abstract
The aim of this project is to develop a series of statistical methods for the analysis of longitudinal data, that is, equally or unequally spaced repeated measurements over time from a collection of independent subjects. The investigator studies the theoretical and practical properties of these methods through a series of asymptotic and simulation studies. Because measurements taken on the same subject may be correlated, a typical longitudinal analysis involves two major tasks: (1) to model and estimate the mean time-varying covariate effects on the response variables of interest; and (2) to account for the possible correlations and individual effects in statistical estimation and inference. The investigator provides a range of nonparametric tools for accomplishing these tasks through the investigation of four research topics: (a) deriving and comparing the large-sample properties of several local smoothing methods for the estimation of coefficient curves in varying coefficient models; (b) developing a class of local and global inference and model diagnostic procedures to assess the validity of parametric and semi-parametric regression models; (c) evaluating the theoretical and practical properties of the "leave-one-subject-out" cross-validation and other procedures for the selection of smoothing parameters; and (d) investigating the theoretical and practical properties of global approximation through a class of nonparametric mixed-effects models. In addition to methodological publications, the results of this project include algorithms that allow for easy implementation of the developed methods. The theoretical results provide useful insights for guiding the development of new statistical procedures in longitudinal analysis. The investigator and his collaborators demonstrate the usefulness and potential impact of their methods by applying them to a number of biomedical and epidemiological studies.
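As an illustration of the "leave-one-subject-out" cross-validation named in topic (c), the following is a minimal sketch, not the project's own algorithm. It assumes a simple Gaussian-kernel (Nadaraya-Watson) smoother for a mean curve rather than a full varying coefficient model, and all function names, the simulated data, and the candidate bandwidths are hypothetical. The point it shows is that all of a subject's measurements are deleted together, so that intra-subject correlation does not leak into the prediction error used to choose the smoothing parameter.

```python
import numpy as np


def kernel_smooth(t_train, y_train, t_eval, bandwidth):
    """Nadaraya-Watson estimate of the mean curve with a Gaussian kernel."""
    diffs = (t_eval[:, None] - t_train[None, :]) / bandwidth
    weights = np.exp(-0.5 * diffs**2)
    return (weights @ y_train) / weights.sum(axis=1)


def loso_cv_score(subject, t, y, bandwidth):
    """Mean squared prediction error, deleting one whole subject at a time."""
    total = 0.0
    for s in np.unique(subject):
        held_out = subject == s
        # Fit on every other subject, predict at the held-out subject's times.
        pred = kernel_smooth(t[~held_out], y[~held_out], t[held_out], bandwidth)
        total += np.sum((y[held_out] - pred) ** 2)
    return total / len(y)


def select_bandwidth(subject, t, y, candidates):
    """Return the candidate bandwidth with the smallest cross-validation score."""
    scores = [loso_cv_score(subject, t, y, h) for h in candidates]
    return candidates[int(np.argmin(scores))], scores


# Simulated data (an assumption for illustration): 30 subjects, 5 visits each,
# with a subject-level random intercept that induces intra-subject correlation.
rng = np.random.default_rng(0)
n_subjects = 30
subject = np.repeat(np.arange(n_subjects), 5)
t = rng.uniform(0.0, 1.0, size=subject.size)
random_intercept = rng.normal(0.0, 0.5, size=n_subjects)[subject]
y = np.sin(2 * np.pi * t) + random_intercept + rng.normal(0.0, 0.2, size=subject.size)

candidates = [0.05, 0.1, 0.2, 0.4]
best_h, scores = select_bandwidth(subject, t, y, candidates)
print("selected bandwidth:", best_h)
```

Deleting one observation at a time instead would leave the held-out point's correlated neighbors from the same subject in the training set, which tends to favor undersmoothing; grouping the deletion by subject is the design choice this sketch is meant to make concrete.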
The rapid development of computing technology has given scientists in many fields of the social and natural sciences easy access to large datasets involving variables repeatedly observed over time. This type of data, known as longitudinal data, is common in biomedicine, epidemiology, economics, and sociology, among other fields. Statistical research plays the crucial role of providing theoretically sound and practically feasible tools for extracting useful information from the data. In biomedical and epidemiological studies, such information may include, for example, the effects of treatments on disease progression over time, the potential association between a mother's cigarette smoking and the fetal growth pattern during pregnancy, and other findings of biomedical and public health interest. Despite considerable progress made by many talented researchers, there is still substantial demand for more reliable and efficient modeling and diagnostic techniques capable of handling repeated measurements, particularly in the area of initial data exploration. Systematic theoretical development is also needed to build a solid foundation for judging the adequacy of existing methods and to provide insights that lead to future methodological development. In the current project, the investigator evaluates the theoretical properties of a class of flexible and useful statistical models known as varying coefficient models and, by extending his theoretical results, develops a class of new modeling approaches that are potentially superior to the existing ones in many longitudinal settings. Because the statistical theory, models, and algorithms can be applied to situations where no pre-specified parametric model exists, they provide valuable tools capable of deriving statistical inferences based entirely on the data.
These tools allow scientists, policy makers, and researchers to draw sound conclusions from their data without depending on pre-specified assumptions that may be too restrictive for their settings. In a collaborative effort with other statistical and biomedical researchers, the investigator and his colleagues demonstrate the practical potential of their methods by applying them to a number of biomedical and epidemiological studies, and they discuss the biological implications of their findings.