This study examined the efficiency of several random sampling methods for reducing the number of items rated by judges in an Angoff standard-setting study and compared the methods with one another. First, a full-length test was formed by combining the Placement Test 2012 and 2013 mathematics subsets. Then, simple random sampling (SRS), content-stratified (C-SRS), item-difficulty-stratified (D-SRS), and content-by-difficulty random sampling (CD-SRS) methods were used to draw subsets of different lengths (30%, 40%, 50%, and 70% of the items) from the full test. In total, 16 study conditions (4 methods x 4 subset lengths) were investigated. In the data analysis, ANOVA was conducted to examine whether the minimum passing scores (MPSs) for the subsets differed significantly from the MPS of the full-length test. As a follow-up analysis, root mean square error (RMSE) and standard error of estimation (SEE) values were calculated for each study condition. The results indicated that the estimated Angoff MPSs differed significantly from the full-test Angoff MPS (45.12) only in the 30% C-SRS, 40% C-SRS, 30% D-SRS, and 30% CD-SRS conditions. According to the RMSE values, the C-SRS method had the smallest error and the SRS method the largest. Moreover, the SEE examination revealed that rating 50% of the items selected with the C-SRS method is sufficient to obtain estimates within one SEE of the full-test Angoff MPS. Overall, the C-SRS method was more effective than the other methods in reducing the number of items rated by judges in MPS-setting studies conducted with the Angoff method.
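The sampling and error computations summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. All function names, the synthetic ratings, the four content labels, and the choice to express the MPS as the mean item rating on a percent scale are assumptions made for illustration only. The same stratified routine would cover D-SRS or CD-SRS if the stratum label were a difficulty band or a (content, difficulty) pair.

```python
import random
from collections import defaultdict
from math import sqrt


def simple_random_sample(item_ids, fraction, rng):
    """SRS: draw a fraction of the items without regard to any stratum."""
    k = round(len(item_ids) * fraction)
    return sorted(rng.sample(item_ids, k))


def stratified_sample(strata, fraction, rng):
    """Stratified sampling: draw the same fraction within each stratum.

    strata maps item id -> stratum label (e.g. a content area for C-SRS).
    """
    groups = defaultdict(list)
    for item, label in strata.items():
        groups[label].append(item)
    chosen = []
    for label in sorted(groups):
        members = groups[label]
        k = max(1, round(len(members) * fraction))  # keep every stratum represented
        chosen.extend(rng.sample(members, k))
    return sorted(chosen)


def angoff_mps(ratings, items):
    """Mean over judges of the mean item rating, on a percent scale (assumed scale).

    ratings is a list with one dict per judge, mapping item id -> rating in [0, 1].
    """
    per_judge = [100 * sum(judge[i] for i in items) / len(items) for judge in ratings]
    return sum(per_judge) / len(per_judge)


def rmse(estimates, target):
    """Root mean square error of subset MPS estimates around the full-test MPS."""
    return sqrt(sum((e - target) ** 2 for e in estimates) / len(estimates))


# Synthetic example data (hypothetical): 40 items, 4 content areas, 5 judges.
rng = random.Random(0)
items = list(range(40))
content = {i: f"area_{i % 4}" for i in items}
ratings = [{i: rng.uniform(0.2, 0.8) for i in items} for _ in range(5)]

full_mps = angoff_mps(ratings, items)

# Replicate the 50% C-SRS condition 100 times and summarize the error.
subset_mps = [
    angoff_mps(ratings, stratified_sample(content, 0.5, rng)) for _ in range(100)
]
error_50_csrs = rmse(subset_mps, full_mps)
```

In this sketch, each replication draws a new subset, recomputes the MPS from the judges' ratings on that subset only, and the RMSE measures how far those estimates fall from the full-test value. Repeating the loop for each method and subset length would give one RMSE per study condition.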
International Journal of Assessment Tools in Education