Improved Inference for Multistage Cluster Samples Using Bias Reduced Linearization Standard Error Estimates

$90,550 | FY2000 | SBE | NSF

Rand Corporation, Santa Monica CA

Abstract

This project addresses shortcomings in the current literature on linearization estimators of the variance of parameter estimates in regression models, an important methodological issue for statistical inference in multistage samples. When the number of primary (first-stage) sampling units is small or the data contain "high leverage" clusters, linearization standard errors can be severely biased, resulting in confidence intervals that are too narrow and tests with Type I error rates that greatly exceed the nominal level. This project will improve on current practice in three ways. First, the study will develop transformations of the residuals that account for the model-fitting process; used in the formula for the standard linearization estimator, these transformed residuals will reduce bias in standard error estimates from logistic regression models and from models fit to data from stratified multistage samples with unequal sampling weights. Second, the study will explore alternative approaches for choosing the reference distribution for both univariate and joint hypothesis tests, so that Type I error rates approximate the nominal level of the test. Third, the project will develop diagnostics based on decompositions of the matrix of regressors to allow users to determine when the number of primary sampling units is sufficiently small, or the distribution of the predictor variables is sufficiently imbalanced across clusters, to result in biased and highly variable standard error estimates.
Linearization methods are widely used for estimating the variability of estimates of the parameters of models fit to data from multistage samples that inform important public policy and clinical decisions in diverse areas such as education, health services, criminal justice, and drug abuse treatment and prevention. For some samples, however, the commonly used linearization methods can underestimate the statistical error in parameter estimates and result in undue confidence that parameters such as intervention effects are nonzero. The proposed methods will reduce the bias in estimates of error and improve inferences. Consequently, the proposed improvements should be widely applicable to applied research projects that serve society at large.
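
The idea of transforming residuals before they enter the linearization formula can be illustrated with a minimal sketch. It is not the project's implementation: it assumes ordinary least squares, no sampling weights or strata, and an identity working covariance, whereas the abstract targets logistic regression and weighted stratified designs. The function name `cluster_robust_se` and its parameters are hypothetical. The adjustment shown, premultiplying each cluster's residuals by the inverse square root of (I - H_gg), where H_gg is that cluster's block of the hat matrix, is one form commonly described in the literature on bias reduced linearization.

```python
import numpy as np


def cluster_robust_se(X, y, clusters, adjust=True, tol=1e-10):
    """OLS coefficients with cluster-robust (linearization) standard errors.

    Hypothetical helper for illustration only.
    adjust=False: raw residuals (the standard linearization estimator).
    adjust=True:  each cluster's residuals are premultiplied by
                  (I - H_gg)^(-1/2) before entering the sandwich formula.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)

    bread = np.linalg.inv(X.T @ X)   # (X'X)^(-1)
    beta = bread @ X.T @ y           # OLS coefficient estimates
    resid = y - X @ beta

    p = X.shape[1]
    meat = np.zeros((p, p))
    for g in np.unique(clusters):
        rows = clusters == g
        Xg, eg = X[rows], resid[rows]
        if adjust:
            # Cluster block of the hat matrix H = X (X'X)^(-1) X'
            Hgg = Xg @ bread @ Xg.T
            # Symmetric inverse square root of (I - H_gg) via eigendecomposition;
            # eigenvalues at or below tol are dropped (a pseudo-inverse).
            evals, evecs = np.linalg.eigh(np.eye(len(eg)) - Hgg)
            inv_sqrt = np.zeros_like(evals)
            keep = evals > tol
            inv_sqrt[keep] = evals[keep] ** -0.5
            eg = evecs @ (inv_sqrt * (evecs.T @ eg))
        score = Xg.T @ eg            # cluster-level score contribution
        meat += np.outer(score, score)

    cov = bread @ meat @ bread       # sandwich variance estimate
    return beta, np.sqrt(np.diag(cov))


if __name__ == "__main__":
    # Simulated data: 6 clusters of 4 observations (illustrative values only).
    rng = np.random.default_rng(0)
    n_clusters, m = 6, 4
    clusters = np.repeat(np.arange(n_clusters), m)
    x = rng.normal(size=n_clusters * m)
    X = np.column_stack([np.ones_like(x), x])
    y = 1.0 + 2.0 * x + rng.normal(size=x.size)

    _, se_raw = cluster_robust_se(X, y, clusters, adjust=False)
    _, se_adj = cluster_robust_se(X, y, clusters, adjust=True)
    print("unadjusted SE:", se_raw)
    print("adjusted SE:  ", se_adj)
```

The two calls differ only in the residuals passed to the sandwich formula, so comparing their output on a design with few clusters gives a rough sense of how much the adjustment matters; the sketch does not address the choice of reference distribution or the diagnostics described above.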
