Investigating the Impact of Missing Data Handling Methods on the Detection of Differential Item Functioning

Main Article Content

Hüseyin Selvi, Devrim Ozdemir Alıcı


This study investigates how different missing data handling methods affect the detection of Differential Item Functioning (DIF) by three methods: the Mantel-Haenszel and Standardization methods, which are based on Classical Test Theory, and the Likelihood Ratio Test method, which is based on Item Response Theory. DIF analyses with these three methods were performed, under different missing data handling methods, on data from 1046 candidates who took the Foreign National Student Exam (FNSE) held by Mersin University (MEU) in 2016 and answered the Basic Skills subtest. The Basic Skills test consists of 80 multiple-choice items, all scored dichotomously (1-0). Of the participants, 523 are female and 523 are male. The findings showed that the number of items flagged as DIF changed with the missing data handling method used. The DIF detection methods based on Classical Test Theory were more consistent among themselves than the DIF detection method based on Item Response Theory. Nevertheless, the missing data handling method used changed which items were flagged as DIF, and this difference reached a significant level for the Mantel-Haenszel method.
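To illustrate why the choice of missing data handling method can change DIF results, the following is a minimal sketch of the Mantel-Haenszel common odds ratio computed under two treatments of missing responses. It is not the authors' implementation. The abstract does not name the missing data handling methods that were compared, so the two shown here (scoring a missing response as incorrect, and listwise deletion) are assumptions chosen for illustration, as are the function names and the toy data.

```python
import math
from collections import defaultdict


def mh_odds_ratio(responses, groups, item, missing="zero"):
    """Mantel-Haenszel common odds ratio for one dichotomous item.

    responses: list of per-examinee lists holding 1, 0, or None (missing).
    groups:    list of "R" (reference) or "F" (focal), one per examinee.
    missing:   "zero" scores a missing response as incorrect (assumed method);
               "listwise" drops examinees with any missing response (assumed method).
    """
    rows = list(zip(responses, groups))
    if missing == "listwise":
        rows = [(r, g) for r, g in rows if None not in r]
    elif missing == "zero":
        rows = [([0 if x is None else x for x in r], g) for r, g in rows]
    else:
        raise ValueError("missing must be 'zero' or 'listwise'")

    # Match examinees on total score; each stratum stores the 2x2 counts
    # [ref correct, ref incorrect, focal correct, focal incorrect].
    strata = defaultdict(lambda: [0, 0, 0, 0])
    for r, g in rows:
        cell = strata[sum(r)]
        correct = r[item] == 1
        if g == "R":
            cell[0 if correct else 1] += 1
        else:
            cell[2 if correct else 3] += 1

    num = den = 0.0
    for a, b, c, d in strata.values():
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    if den == 0:
        raise ValueError("odds ratio is undefined for these data")
    return num / den


def mh_delta(alpha):
    """ETS delta scale: MH D-DIF = -2.35 * ln(alpha)."""
    return -2.35 * math.log(alpha)


# Toy data (hypothetical): two items, identical response patterns in both
# groups, plus one focal examinee who omitted item 0.
responses = [[1, 0], [0, 1], [1, 1], [0, 0],
             [1, 0], [0, 1], [1, 1], [0, 0], [None, 1]]
groups = ["R"] * 4 + ["F"] * 5

alpha_listwise = mh_odds_ratio(responses, groups, item=0, missing="listwise")
alpha_zero = mh_odds_ratio(responses, groups, item=0, missing="zero")
print(alpha_listwise, alpha_zero)
```

In this toy example, a single omitted response moves the odds ratio for item 0 away from 1 when it is scored as incorrect, while listwise deletion leaves the two groups identical. With real data the size and direction of the change depend on the pattern of missingness.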

Article Details

How to Cite
Selvi, H., & Alıcı, D. (2017). Investigating the Impact of Missing Data Handling Methods on the Detection of Differential Item Functioning. International Journal of Assessment Tools in Education, 5(1). Retrieved from
Author Biographies

Hüseyin Selvi, Mersin University, Medical Faculty, Medical Education Department

Medical Education Department, Assist. Prof.

Devrim Ozdemir Alıcı, (Assoc Prof) Mersin University, Faculty of Education, Department of Measurement and Evaluation in Education


