What You might not be Assessing through a Multiple Choice Test Task

Burcu Kayarkaya, Aylin Ünaldı https://orcid.org/0000-0003-4119-6700


Comprehending a text involves constructing a coherent mental representation of it, and deep comprehension of a text in its entirety is a critical skill in academic contexts. Interpretations of test takers’ ability to comprehend texts are based on their performance in test tasks, but the extent to which such tasks direct test takers towards reading a text in order to understand it as a whole is questionable. The current study investigates tests based on multiple choice items in terms of their potential to facilitate or preclude the cognitive processes that lead to the higher level reading processes necessary for text level macrostructure formation. Participants’ performance in macrostructure formation after completing a multiple choice test and after completing a summarization task was analyzed quantitatively and qualitatively. Task performances were compared, and retrospective verbal protocol data were analyzed to categorize the reading processes the participants went through in each task. The analyses showed that participants’ performance in macrostructure formation differed significantly between reading for multiple choice test completion and reading for the summarization task: participants were less successful in comprehending the text in its entirety when they read in order to answer the multiple choice questions that followed it. The findings provide substantial evidence of the inefficacy of the multiple choice test technique in facilitating test takers’ macrostructure formation and thus point to yet another threat to the validity of this test technique.

How to Cite
Kayarkaya, B., & Ünaldı, A. (2020). What You might not be Assessing through a Multiple Choice Test Task. International Journal of Assessment Tools in Education, 7(1), 98-113. Retrieved from http://ijate.net/index.php/ijate/article/view/788

