Comparison of G and Phi Coefficients Estimated in Generalizability Theory with Real Cases

Authors

  • Kaan Zülfikar Deniz, Ankara University
  • Emel Ilıcan, PhD student, Educational Measurement and Evaluation Department, Ankara University (Turkey)

Keywords:

reliability, generalizability theory, decision study, item difficulty index

Abstract

This study compared the G and Phi coefficients predicted by D studies for a measurement tool with the G and Phi coefficients obtained from real cases in which items of differing difficulty levels were added, and sought to determine the conditions under which the D studies predicted reliability coefficients closer to those observed. The study group consisted of 80 seventh-grade students from various public and private secondary schools in Ankara, Istanbul and Adana in Turkey. Four raters, all of whom were Turkish teachers in public secondary schools in Ankara, took part in the study. A data collection tool consisting of 12 tasks was prepared to measure the participating students' written expression skills in Turkish. The G and Phi coefficients predicted in the D study equalled those obtained in the real cases only when six tasks with item difficulty indexes close to the mean difficulty of the test were added, so that the mean difficulty of the test did not change. In the other cases, where adding very easy or very difficult tasks changed the mean difficulty of the test, the reliability coefficients estimated in the D study and those obtained in the real cases were similar but not equal.
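As a rough illustration of how a D study predicts G and Phi for a different number of tasks, the sketch below uses the standard formulas for a fully crossed persons × tasks × raters random-effects design. The abstract does not state the design, so the crossed design is an assumption, and the variance components, the `VarianceComponents` class and the `d_study` function are hypothetical placeholders, not values or code from the study.

```python
from dataclasses import dataclass


@dataclass
class VarianceComponents:
    """Estimated variance components from a G study (hypothetical values below).

    Assumes a fully crossed persons (p) x tasks (t) x raters (r) design.
    """
    p: float      # persons (object of measurement)
    t: float      # tasks
    r: float      # raters
    pt: float     # person x task interaction
    pr: float     # person x rater interaction
    tr: float     # task x rater interaction
    ptr_e: float  # three-way interaction confounded with residual error


def d_study(vc: VarianceComponents, n_tasks: int, n_raters: int) -> tuple[float, float]:
    """Return (G, Phi) predicted for a design with n_tasks tasks and n_raters raters."""
    # Relative error variance: only interactions involving persons contribute.
    rel_error = (vc.pt / n_tasks
                 + vc.pr / n_raters
                 + vc.ptr_e / (n_tasks * n_raters))
    # Absolute error variance: task and rater main effects and their
    # interaction also contribute.
    abs_error = (rel_error
                 + vc.t / n_tasks
                 + vc.r / n_raters
                 + vc.tr / (n_tasks * n_raters))
    g = vc.p / (vc.p + rel_error)
    phi = vc.p / (vc.p + abs_error)
    return g, phi


# Hypothetical variance components, chosen only for illustration.
vc = VarianceComponents(p=0.50, t=0.10, r=0.02, pt=0.30, pr=0.03, tr=0.01, ptr_e=0.40)

# Predicted coefficients for 6 tasks versus 12 tasks, with 4 raters in both cases.
g6, phi6 = d_study(vc, n_tasks=6, n_raters=4)
g12, phi12 = d_study(vc, n_tasks=12, n_raters=4)
print(f"6 tasks:  G = {g6:.3f}, Phi = {phi6:.3f}")
print(f"12 tasks: G = {g12:.3f}, Phi = {phi12:.3f}")
```

The prediction treats added tasks as exchangeable with the existing ones, that is, as draws from the same universe of tasks. That is consistent with the finding that predicted and observed coefficients matched only when the added tasks left the mean difficulty of the test unchanged.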

Published

2021-08-25

How to Cite

Deniz, K. Z., & Ilıcan, E. (2021). Comparison of G and Phi Coefficients Estimated in Generalizability Theory with Real Cases. International Journal of Assessment Tools in Education, 8(3), 583-595. Retrieved from https://ijate.net/index.php/ijate/article/view/30