New Approaches in Testing Common Assumptions for Regressions with Missing Data

Date

2014-07-30

Abstract

We consider nonparametric regression models, both homoskedastic and heteroskedastic, with multivariate covariates and responses missing at random. The regression function is estimated using a local polynomial smoother and, where needed, the scale function is estimated using a combination of local polynomial smoothers. For both models we show that suitable residual-based empirical distribution functions using only the complete cases, i.e. residuals that can actually be constructed from the data, are efficient in the sense of Hájek and Le Cam. In our proofs we derive, more generally, the efficient influence function for estimating an arbitrary linear functional of the error distribution; this covers the distribution function as a special case. Our estimators are shown to admit functional central limit theorems. We obtain these results by applying the transfer principle for complete case statistics, which makes it possible to adapt known results for fully observed data to the case of missing data. We then use these residual-based empirical distribution functions to test for normal errors using a martingale transform approach. Small simulation studies are conducted to investigate the performance of these tests. For the homoskedastic model, the proposed approach is comparable to one based on imputation; for the heteroskedastic model, the results are sensitive to the estimate of the scale function. Finally, we construct a test for heteroskedasticity using residuals from a nonparametric regression. The approach uses a weighted empirical process and only the completely observed data, and is shown to perform well in certain scenarios. All of the tests considered here are asymptotically distribution free, which means inference based on them does not depend on unknown parameters.
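
As a rough illustration of the complete-case idea described in the abstract, the sketch below fits a local linear smoother to the pairs whose response was observed and forms the empirical distribution function of the resulting residuals. It is not the estimator studied in the work itself: it assumes a single covariate (the abstract treats multivariate covariates), a Gaussian kernel, a fixed user-chosen bandwidth, and the homoskedastic model, and the function names `local_linear_fit` and `complete_case_residual_edf` are hypothetical.

```python
import numpy as np


def local_linear_fit(x_train, y_train, x_eval, bandwidth):
    """Local linear (degree-1 local polynomial) fit with a Gaussian kernel.

    Assumption: one covariate and a fixed bandwidth, for illustration only.
    """
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_eval = np.asarray(x_eval, dtype=float)
    fitted = np.empty(len(x_eval))
    for j, x0 in enumerate(x_eval):
        u = x_train - x0
        weights = np.exp(-0.5 * (u / bandwidth) ** 2)
        design = np.column_stack([np.ones_like(u), u])
        weighted_design = design * weights[:, None]
        # Weighted least squares: solve (D' W D) beta = D' W y.
        beta = np.linalg.solve(design.T @ weighted_design,
                               weighted_design.T @ y_train)
        fitted[j] = beta[0]  # the intercept estimates the regression at x0
    return fitted


def complete_case_residual_edf(x, y, observed, bandwidth):
    """Residual-based empirical distribution function from complete cases.

    `observed` flags the rows whose response is available; rows with a
    missing response are dropped before any fitting takes place.
    """
    mask = np.asarray(observed, dtype=bool)
    x_cc = np.asarray(x, dtype=float)[mask]
    y_cc = np.asarray(y, dtype=float)[mask]
    residuals = y_cc - local_linear_fit(x_cc, y_cc, x_cc, bandwidth)

    def edf(t):
        # Proportion of complete-case residuals that are <= each value in t.
        t = np.atleast_1d(np.asarray(t, dtype=float))
        return (residuals[None, :] <= t[:, None]).mean(axis=1)

    return residuals, edf
```

Calling `complete_case_residual_edf(x, y, observed, bandwidth=0.2)` returns the complete-case residuals together with a function that evaluates their empirical distribution function on a grid of points. The bandwidth value here is arbitrary; in practice it would have to be chosen with care.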
