The DIF Identification in Constructed Response Items Using Partial Credit Model

Heri Retnawati


This study aimed to identify the presence, the type, and the significance of differential item functioning (DIF) in constructed-response items using the partial credit model (PCM). The data were the test instruments and the students' responses to PISA-like test items completed by 386 ninth-grade students and 460 tenth-grade students, aged about 15, in the Province of Yogyakarta Special Region, Indonesia. Item characteristics were estimated under the PCM with the CONQUEST software, with students grouped by their class (grade level). Using these item characteristics, the researcher drew category response function (CRF) graphs to identify whether the DIF was uniform or non-uniform. The significance of DIF was determined by comparing the difference between the groups' difficulty parameters with the error reported in the CONQUEST output. Of the 18 items analyzed, 4 items were not identified as containing DIF, 5 items contained DIF that was not statistically significant, and 9 items contained statistically significant DIF. The causes of DIF in these items are discussed.
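The steps named in the abstract can be illustrated with a minimal sketch. It is not the CONQUEST procedure itself. It assumes the standard PCM form, in which the probability of category k is proportional to the exponential of the cumulative sum of (theta - delta_j). It also assumes a conventional cutoff of 1.96 for the ratio of the difficulty difference to its error, which the abstract does not state. All function names and example values are hypothetical.

```python
import math


def pcm_category_probs(theta, deltas):
    """Category response function (CRF) values under the partial credit model.

    theta  : ability of the examinee
    deltas : step difficulties delta_1..delta_m of one item
    Returns the probabilities of scoring 0..m.
    """
    # Cumulative sums of (theta - delta_j); the sum for category 0 is 0.
    cum = [0.0]
    for d in deltas:
        cum.append(cum[-1] + (theta - d))
    mx = max(cum)  # subtract the maximum for numerical stability
    expo = [math.exp(c - mx) for c in cum]
    total = sum(expo)
    return [e / total for e in expo]


def expected_score(theta, deltas):
    """Expected item score at a given ability."""
    return sum(k * p for k, p in enumerate(pcm_category_probs(theta, deltas)))


def dif_type(deltas_ref, deltas_focal, thetas, tol=1e-6):
    """Classify DIF from the expected-score gap between two groups.

    'uniform' if the gap keeps one sign over the ability grid,
    'non-uniform' if the curves cross, 'none' if they coincide.
    """
    gaps = [expected_score(t, deltas_ref) - expected_score(t, deltas_focal)
            for t in thetas]
    positive = any(g > tol for g in gaps)
    negative = any(g < -tol for g in gaps)
    if positive and negative:
        return "non-uniform"
    if positive or negative:
        return "uniform"
    return "none"


def dif_is_significant(difficulty_diff, error, z_crit=1.96):
    """Compare a difficulty difference with its error (assumed cutoff)."""
    return abs(difficulty_diff / error) > z_crit


# Hypothetical step difficulties for a two-step item in two groups.
grid = [-4 + 0.1 * i for i in range(81)]
print(dif_type([-0.5, 0.5], [0.0, 1.0], grid))   # all steps shifted equally
print(dif_type([-1.0, 1.0], [1.0, -1.0], grid))  # steps reordered
print(dif_is_significant(0.5, 0.1))
```

In this sketch, shifting every step by the same amount moves the whole expected-score curve to one side, while reordering the steps makes the two curves cross.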


DIF, polytomous data, partial credit model







This work is licensed under a Creative Commons Attribution 3.0 License.