Examining the invariance of item and person parameters estimated from multilevel measurement models when the distribution of person abilities is non-normal
Abstract
Multilevel measurement models (MMM), an application of hierarchical generalized linear models (HGLM), model the relationship between ability level estimates and item difficulty parameters, based on examinee responses to items. A benefit of MMM is that the multilevel framework allows additional levels to be included in the model to represent a nested data structure, which is common in educational contexts. Previous research has demonstrated that the one-parameter MMM (1P-MMM) accurately recovers both item difficulty parameters and examinee ability levels in both 2- and 3-level models under various sample size and test length conditions (Kamata, 1999; Brune, 2011). Parameter invariance of measurement models, meaning that parameter estimates are equivalent regardless of the distribution of ability levels, is important when the typical assumption of a normal distribution of ability levels in the population may not be correct. An assumption of MMM is that the distribution of examinee abilities, which is represented by the level-2 residuals in the HGLM, is normal. If the distribution of abilities in the population is not normal, as suggested by Micceri (1989), this assumption of MMM is violated, which has been shown to affect the estimation of the level-2 residuals. The current study investigated the parameter invariance of the 2-level 1P-MMM by examining the accuracy of item difficulty parameter estimates and examinee ability level estimates. Study conditions included the standard normal distribution as a baseline and three non-normal distributions with varying degrees of skew, along with various test lengths and sample sizes, to simulate a range of testing conditions. The study's results provide evidence for overall parameter invariance of the 2-level 1P-MMM under the study conditions included, after accounting for scale indeterminacy from the estimation process.
Although the error in the item difficulty parameter and examinee ability level estimates in the study was not of practical importance, there was some evidence that ability distributions may affect the accuracy of parameter estimates for items with difficulties greater than those represented in this study. Also, the accuracy of ability estimates for non-normal distributions appeared lower in conditions with greater test lengths and sample sizes, indicating possible increased difficulty in estimating abilities from non-normal distributions.
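As a rough illustration of the kind of data-generating step such a simulation involves, the sketch below draws examinee abilities from either a standard normal distribution or a skewed distribution and then generates dichotomous responses under the one-parameter (Rasch) model. It is a minimal sketch under stated assumptions, not the study's actual procedure: the standardized chi-square distribution with 4 degrees of freedom, the five item difficulties, the sample size of 500, and the function names simulate_abilities and simulate_responses are all hypothetical choices made for illustration.

```python
import math
import random


def simulate_abilities(n_persons, skewed, rng):
    """Draw abilities with mean 0 and variance 1.

    skewed=False: standard normal (the baseline condition).
    skewed=True: standardized chi-square with 4 df, an assumed right-skewed
    distribution chosen for illustration only.
    """
    if not skewed:
        return [rng.gauss(0.0, 1.0) for _ in range(n_persons)]
    df = 4
    # Chi-square(df) is Gamma(shape=df/2, scale=2): mean df, variance 2*df.
    return [(rng.gammavariate(df / 2, 2.0) - df) / math.sqrt(2 * df)
            for _ in range(n_persons)]


def simulate_responses(abilities, difficulties, rng):
    """Generate 0/1 item responses under the one-parameter (Rasch) model."""
    data = []
    for theta in abilities:
        row = []
        for b in difficulties:
            # Probability of a correct response given ability and difficulty.
            p_correct = 1.0 / (1.0 + math.exp(-(theta - b)))
            row.append(1 if rng.random() < p_correct else 0)
        data.append(row)
    return data


rng = random.Random(2024)
difficulties = [-1.5, -0.75, 0.0, 0.75, 1.5]  # hypothetical 5-item test
for skewed in (False, True):
    thetas = simulate_abilities(500, skewed, rng)
    responses = simulate_responses(thetas, difficulties, rng)
    # Proportion correct per item, a quick check on the generated data.
    prop_correct = [sum(col) / len(col) for col in zip(*responses)]
    print("skewed" if skewed else "normal", [round(p, 2) for p in prop_correct])
```

Both ability distributions are scaled to mean 0 and variance 1, so any difference in parameter recovery between the two conditions would reflect distribution shape rather than location or spread. Fitting the 2-level model to the generated responses would require HGLM software and is outside the scope of this sketch.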