Missing data is a common problem in datasets obtained by administering educational and psychological tests. Missing observations are widely known to cause serious problems such as biased parameter estimates and inflated standard errors. Most missing data imputation methods focus on datasets containing continuous variables. However, researchers very often work with datasets made up of individuals' dichotomous responses to a set of test items, to which item response theory (IRT) models are fitted. This study compared the performance of four missing data imputation methods: IRT model-based imputation (MBI), Expectation-Maximization (EM), Multiple Imputation (MI), and Regression Imputation (RI). Parameter recovery was evaluated through repeated analyses of samples drawn from a large-scale empirical dataset. Results showed that MBI outperformed the other imputation methods in recovering item difficulty and the mean of the ability parameters, especially with larger sample sizes. However, MI produced the best recovery of item discrimination parameters.
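The general shape of such a comparison can be illustrated with a minimal Python sketch. It is not the study's procedure, and several parts are assumptions made only for illustration: the responses are simulated from a two-parameter logistic (2PL) model rather than drawn from an empirical dataset, the item parameters and abilities are hypothetical and treated as known (in practice they would be estimated from the observed responses), missingness is completely at random, and the item proportion-correct stands in for an IRT difficulty estimate.

```python
import math
import random

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def simulate(thetas, items, rng):
    """Draw a complete 0/1 response matrix from the 2PL model."""
    return [[1 if rng.random() < p_correct(t, a, b) else 0 for a, b in items]
            for t in thetas]

def delete_mcar(data, rate, rng):
    """Set each response to None with probability `rate` (MCAR assumption)."""
    return [[None if rng.random() < rate else x for x in row] for row in data]

def impute_model_based(data, thetas, items, rng):
    """Fill each None with a Bernoulli draw from the 2PL probability.

    Assumption: thetas and item parameters are known here; a real analysis
    would first estimate them from the observed responses.
    """
    out = []
    for row, t in zip(data, thetas):
        out.append([x if x is not None
                    else (1 if rng.random() < p_correct(t, a, b) else 0)
                    for x, (a, b) in zip(row, items)])
    return out

def item_means(data):
    """Proportion correct per item (a crude stand-in for item difficulty)."""
    n = len(data)
    return [sum(row[j] for row in data) / n for j in range(len(data[0]))]

def rmse(est, ref):
    """Root mean squared error between two equal-length lists."""
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(est, ref)) / len(ref))

rng = random.Random(2018)
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0)]  # hypothetical (a, b) pairs
thetas = [rng.gauss(0.0, 1.0) for _ in range(500)]  # hypothetical abilities

complete = simulate(thetas, items, rng)
observed = delete_mcar(complete, 0.2, rng)
imputed = impute_model_based(observed, thetas, items, rng)

# Recovery: how close are the imputed-data statistics to the complete-data ones?
recovery_rmse = rmse(item_means(imputed), item_means(complete))
print(f"RMSE of item proportions correct after imputation: {recovery_rmse:.4f}")
```

Repeating the delete, impute, and compare steps over many samples and several sample sizes, and swapping in other imputation functions, would give a recovery comparison in the spirit of the one summarized above.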
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
International Journal of Assessment Tools in Education