Inference in Nonlinear Models with Endogeneity

$243,300 | FY2011 | SBE | NSF

Duke University, Durham NC

Investigators

Abstract

The proposed research develops new inference procedures for a variety of nonlinear models with cross sectional or panel data. The models discussed, such as the binary choice and Roy models, have seen widespread use in empirical work. The proposed activity can be divided into three parts.

The first part pertains to panel data versions of models with self selection. Self selection models enable the econometrician to control for the optimal decisions of the economic agent. For example, an observed wage should reflect the fact that the wage offered to an individual in the observed sector exceeds the wage offered in all other sectors. Panel data models, in which an agent's outcomes are observed over multiple time periods, have become increasingly popular in empirical research. The increased availability of longitudinal data sets has presented new opportunities for econometricians to control for unobserved heterogeneity across agents. Important work on nonlinear panel data models is surveyed in Arellano and Honore (2001). However, there is very little work on panel data models with self selection, and the proposed research aims to address this gap. Inference methods are proposed under both stationary and nonstationary conditions. The former refers to the assumption that the unobserved components of an individual have the same distribution over time. The latter relaxes this assumption but imposes that the unobserved components for different individuals in the cross section have the same distribution within a given time period. In both cases the new methods are able to estimate sharp sets for the parameters of interest, such as the slope of a labor supply curve. A sharp set refers to the smallest set that can be obtained when the data satisfy the assumptions of the econometric model.

The second part of this proposal pertains to cross sectional binary choice models with discrete endogenous covariates.
Such models arise frequently in the treatment effect literature, where the endogenous variable is often the treatment status and the outcome variable is binary, such as employment status. A parameter that is often of interest in these situations is the coefficient on treatment in a regression framework. Two approaches to identifying such a parameter have been considered in the literature: the control function method and the instrumental variable method. The proposed activity here is to establish a relation between the two methods. In particular, a theorem is established for a control function model which demonstrates how difficult it is to conduct inference on the treatment effect parameter of interest. This is analogous to the theorem in Khan and Tamer (2010) for the instrumental variable model. Consequently, inference becomes nonstandard, and so new inference methods are proposed.

The third part establishes optimality results for a wide class of cross sectional censored regression models with self selection, such as the Roy model. First, conditions that ensure point identification of the parameters of interest, such as independence or support conditions, are considered, and efficiency bounds are derived. Point identification refers to the sharp set reducing to a single value. Efficiency bounds refer to the smallest attainable variance for an estimation procedure under the assumptions of the econometric model. The usefulness of such bounds is twofold: first, they make it possible to measure the relative efficiency of methods that are adopted in practice; second, they suggest new estimation procedures that attain the bound.

References

Arellano, M., and B. Honore (2001): "Panel Data Models: Some Recent Developments," Handbook of Econometrics, Volume 5, pp. 3229-96.

Khan, S., and E. Tamer (2010): "Irregular Identification, Support Conditions and Inverse Weight Estimation," Econometrica, forthcoming.
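
The self selection mechanism described in the abstract, in which an observed wage is the larger of the wages offered across sectors, can be illustrated with a small simulation. The following is a minimal sketch under assumptions that are not part of the proposal: two sectors, independent normally distributed wage offers with mean 10 and standard deviation 2, and a hypothetical helper named simulate_roy. It is not one of the proposed inference procedures; it only shows why the average wage observed in a sector overstates the average wage offered in that sector.

```python
import random
import statistics


def simulate_roy(n=20000, seed=0):
    """Simulate a two-sector Roy model (illustrative assumptions only).

    Each agent draws a wage offer in sector A and in sector B and works
    in the sector with the higher offer. The sector A wage is observed
    only for agents who choose sector A.
    """
    rng = random.Random(seed)
    offered_a = []   # sector A offers for every agent (unobservable in practice)
    observed_a = []  # sector A wages for agents who selected sector A
    for _ in range(n):
        # Assumed wage offer distributions: independent N(10, 2^2) in each sector.
        wage_a = 10.0 + rng.gauss(0.0, 2.0)
        wage_b = 10.0 + rng.gauss(0.0, 2.0)
        offered_a.append(wage_a)
        if wage_a > wage_b:
            observed_a.append(wage_a)
    return statistics.mean(offered_a), statistics.mean(observed_a)


population_mean, selected_mean = simulate_roy()
print("mean sector A offer, all agents:     ", round(population_mean, 3))
print("mean sector A wage, selected agents: ", round(selected_mean, 3))
```

The gap between the two printed means is the selection effect: a comparison based only on observed wages ignores the fact that agents in sector A are there because their sector A offer was the better one.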
