Learning progressions are used to describe how students’ understanding of a topic progresses over time. This study evaluates the effectiveness of different item formats for placing students into levels along a learning progression for carbon cycling. The item formats investigated were Constructed Response (CR) items and two types of two-tier items: (1) Ordered Multiple-Choice (OMC) followed by CR items and (2) Multiple True or False (MTF) followed by CR items. Our results suggest that estimates of students’ learning progression level based on OMC and MTF responses are moderately predictive of their level based on CR responses. With few exceptions, CR items were effective for differentiating students among learning progression levels. Based on the results, we discuss how to design and best use items in each format to more accurately measure students’ level along learning progressions in science.
International Journal of Assessment Tools in Education