Efficient Estimation in a Regression Model with Missing Responses
This article examines methods for efficiently estimating the mean response in a linear model with an unknown error distribution, under the assumption that the responses are missing at random. We show how the asymptotic variance is affected by the estimator of the regression parameter and by the imputation method. For estimating the regression parameter, the Ordinary Least Squares method is efficient only if the error distribution happens to be normal. If the errors are not normal, we propose a One Step Improvement estimator or a Maximum Empirical Likelihood estimator to estimate the parameter efficiently. To investigate the impact of imputation on estimation of the mean response, we compare the Listwise Deletion method and the Propensity Score method (neither of which uses imputation) with two imputation methods. We show that Listwise Deletion and the Propensity Score method are inefficient. Partial Imputation, where only the missing responses are imputed, is compared to Full Imputation, where both missing and non-missing responses are imputed. Our results show that, in general, Full Imputation is better than Partial Imputation. However, when the regression parameter is estimated very poorly, Partial Imputation outperforms Full Imputation. The efficient estimator of the mean response is the Full Imputation estimator that uses an efficient estimator of the parameter.
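The four approaches compared above can be illustrated with a minimal Python sketch. It is not the article's implementation. It assumes a single covariate, uses Ordinary Least Squares fitted on the complete cases as the parameter estimator (the One Step Improvement and Maximum Empirical Likelihood estimators are not shown), and treats the response probabilities as known for the Propensity Score estimate. The function names `fit_ols` and `mean_response_estimates` are hypothetical.

```python
import numpy as np


def fit_ols(x, y, observed):
    """Fit intercept and slope by OLS using only the complete cases."""
    design = np.column_stack([np.ones(int(observed.sum())), x[observed]])
    coef, *_ = np.linalg.lstsq(design, y[observed], rcond=None)
    return coef


def mean_response_estimates(x, y, observed, propensity=None):
    """Return several estimates of the mean response E[Y].

    x          : covariate values, always observed
    y          : responses; entries where `observed` is False are ignored
    observed   : boolean array, True where the response was recorded
    propensity : optional response probabilities P(observed | x),
                 assumed known in this sketch
    """
    coef = fit_ols(x, y, observed)
    fitted = coef[0] + coef[1] * x  # regression prediction for every unit

    estimates = {
        # Average of the recorded responses only.
        "listwise_deletion": y[observed].mean(),
        # Keep recorded responses, impute only the missing ones.
        "partial_imputation": np.where(observed, y, fitted).mean(),
        # Replace every response, recorded or not, by its fitted value.
        "full_imputation": fitted.mean(),
    }
    if propensity is not None:
        # Inverse-probability weighting of the recorded responses.
        weighted = np.where(observed, y, 0.0) / propensity
        estimates["propensity_score"] = weighted.mean()
    return estimates


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 2000
    x = rng.normal(size=n)
    # Heavy-tailed (non-normal) errors; the true mean response is 1.0.
    y = 1.0 + 2.0 * x + rng.standard_t(df=3, size=n)
    # Missing at random: the response probability depends on x only.
    propensity = 1.0 / (1.0 + np.exp(-(0.5 + x)))
    observed = rng.random(n) < propensity
    y_seen = np.where(observed, y, np.nan)
    results = mean_response_estimates(x, y_seen, observed, propensity)
    for name, value in results.items():
        print(f"{name}: {value:.3f}")
```

Repeating the simulation over many seeds and comparing the spread of each estimate would give a rough empirical view of the variance comparison described in the abstract; a single run like the one above only shows how each estimate is computed.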