TY - JOUR
T1 - Inference on regressions with interval data on a regressor or outcome
AU - Manski, Charles
AU - Tamer, Elie
PY - 2002/1/1
Y1 - 2002/1/1
N2 - This paper examines inference on regressions when interval data are available on one variable, the other variables being measured precisely. Let a population be characterized by a distribution P(y, x, v, v0, v1), where y ∈ R^1, x ∈ R^k, and the real variables (v, v0, v1) satisfy v0 ≤ v ≤ v1. Let a random sample be drawn from P and the realizations of (y, x, v0, v1) be observed, but not those of v. The problem of interest may be to infer E(y|x, v) or E(v|x). This analysis maintains Interval (I), Monotonicity (M), and Mean Independence (MI) assumptions: (I) P(v0 ≤ v ≤ v1) = 1; (M) E(y|x, v) is monotone in v; (MI) E(y|x, v, v0, v1) = E(y|x, v). No restrictions are imposed on the distribution of the unobserved values of v within the observed intervals [v0, v1]. It is found that the IMMI assumptions alone imply simple nonparametric bounds on E(y|x, v) and E(v|x). These assumptions invoked when y is binary and combined with a semiparametric binary regression model yield an identification region for the parameters that may be estimated consistently by a modified maximum score (MMS) method. The IMMI assumptions combined with a parametric model for E(y|x, v) or E(v|x) yield an identification region that may be estimated consistently by a modified minimum-distance (MMD) method. Monte Carlo methods are used to characterize the finite-sample performance of these estimators. Empirical case studies are performed using interval wealth data in the Health and Retirement Study and interval income data in the Current Population Survey.
AB - This paper examines inference on regressions when interval data are available on one variable, the other variables being measured precisely. Let a population be characterized by a distribution P(y, x, v, v0, v1), where y ∈ R^1, x ∈ R^k, and the real variables (v, v0, v1) satisfy v0 ≤ v ≤ v1. Let a random sample be drawn from P and the realizations of (y, x, v0, v1) be observed, but not those of v. The problem of interest may be to infer E(y|x, v) or E(v|x). This analysis maintains Interval (I), Monotonicity (M), and Mean Independence (MI) assumptions: (I) P(v0 ≤ v ≤ v1) = 1; (M) E(y|x, v) is monotone in v; (MI) E(y|x, v, v0, v1) = E(y|x, v). No restrictions are imposed on the distribution of the unobserved values of v within the observed intervals [v0, v1]. It is found that the IMMI assumptions alone imply simple nonparametric bounds on E(y|x, v) and E(v|x). These assumptions invoked when y is binary and combined with a semiparametric binary regression model yield an identification region for the parameters that may be estimated consistently by a modified maximum score (MMS) method. The IMMI assumptions combined with a parametric model for E(y|x, v) or E(v|x) yield an identification region that may be estimated consistently by a modified minimum-distance (MMD) method. Monte Carlo methods are used to characterize the finite-sample performance of these estimators. Empirical case studies are performed using interval wealth data in the Health and Retirement Study and interval income data in the Current Population Survey.
KW - Identification
KW - Interval data
KW - Regression
UR - http://www.scopus.com/inward/record.url?scp=0036215568&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0036215568&partnerID=8YFLogxK
U2 - 10.1111/1468-0262.00294
DO - 10.1111/1468-0262.00294
M3 - Article
AN - SCOPUS:0036215568
SN - 0012-9682
VL - 70
SP - 519
EP - 546
JO - Econometrica
JF - Econometrica
IS - 2
ER -