Purpose The percent Angoff (PA) method has been recommended as a reliable way to set the cutoff score of the Korean Medical Licensing Examination (KMLE) instead of the fixed cut point of 60%. The yes/no Angoff (YNA) method, which is easier for panelists to judge, can be considered an alternative because the KMLE contains many items to evaluate. This study aimed to compare the cutoff scores and reliability obtained when the PA and YNA standard-setting methods were applied to the KMLE.
Methods The materials were the open-access PA data of the KMLE. The PA data were converted to YNA data in 5 categories, in which the probabilities for a “yes” decision by panelists were 50%, 60%, 70%, 80%, and 90%. SPSS was used for descriptive analysis and G-String for the generalizability analysis.
Results The PA method and the YNA method with a 60% probability for a “yes” decision estimated similar cutoff scores, and those cutoff scores were deemed acceptable based on the results of the Hofstee method. The generalizability analysis yielded the highest reliability coefficients for the PA method, followed by the YNA methods with “yes” probabilities of 70%, 80%, 60%, and 50%, in descending order. The panelists’ specialty was the main source of error variance, and the error size was similar regardless of the standard-setting method.
Conclusion These results show that the PA method was more reliable than the YNA method in estimating the cutoff score of the KMLE. However, the YNA method with a 60% probability for a “yes” decision can also be used as a substitute for the PA method.
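The PA-to-YNA conversion described above can be illustrated with a short sketch. The helper names, the hypothetical ratings, and the reading of a “yes” decision as a PA estimate at or above the threshold are all assumptions for illustration, not the study’s actual procedure:

```python
import statistics

def pa_cutoff(ratings):
    """Percent Angoff: the cutoff is the mean of all panelist-by-item
    probability estimates (0-100) for a borderline examinee."""
    return statistics.mean(p for panelist in ratings for p in panelist)

def yna_cutoff(ratings, threshold):
    """Yes/no Angoff derived from PA data: an item counts as 'yes'
    when the PA estimate is at least `threshold`; the cutoff is the
    percentage of 'yes' decisions."""
    decisions = [p >= threshold for panelist in ratings for p in panelist]
    return 100 * sum(decisions) / len(decisions)

# Hypothetical ratings: 3 panelists rating 4 items (percent estimates).
ratings = [
    [70, 55, 80, 60],
    [65, 50, 75, 70],
    [60, 45, 85, 65],
]
pa = pa_cutoff(ratings)
yna_by_threshold = {t: yna_cutoff(ratings, t) for t in (50, 60, 70, 80, 90)}
```

With data like this, the derived YNA cutoff falls as the “yes” threshold rises, which is why the choice of probability category matters for how closely the YNA cutoff tracks the PA cutoff.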
Citations to this article as recorded by
Issues in the 3rd year of the COVID-19 pandemic, including computer-based testing, study design, ChatGPT, journal metrics, and appreciation to reviewers Sun Huh Journal of Educational Evaluation for Health Professions.2023; 20: 5. CrossRef
Possibility of independent use of the yes/no Angoff and Hofstee methods for the standard setting of the Korean Medical Licensing Examination written test: a descriptive study Do-Hwan Kim, Ye Ji Kang, Hoon-Ki Park Journal of Educational Evaluation for Health Professions.2022; 19: 33. CrossRef
Computerized adaptive testing (CAT) greatly improves measurement efficiency in high-stakes testing operations through the selection and administration of test items at the difficulty level most relevant to each individual test taker. This paper explains the 3 components of a conventional CAT item selection algorithm: test content balancing, the item selection criterion, and item exposure control. Several noteworthy methodologies underlie each component. The test script method and constrained CAT method are used for test content balancing. Item selection criteria include the maximized Fisher information criterion, the b-matching method, the a-stratification method, the weighted likelihood information criterion, the efficiency balanced information criterion, and the Kullback-Leibler information criterion. The randomesque method, the Sympson-Hetter method, the unconditional and conditional multinomial methods, and the fade-away method are used for item exposure control. Several holistic approaches to CAT use automated test assembly methods, such as the shadow test approach and the weighted deviation model. Item usage and exposure counts vary depending on the item selection criterion and exposure control method. Finally, other important factors to consider when determining an appropriate CAT design are the computing resource requirements, the size of the item pool, and the test length. The logic of CAT is now being adopted in the field of adaptive learning, which integrates the learning and (formative) assessment aspects of education into a continuous, individualized learning experience. Therefore, the algorithms and technologies described in this review may help medical health educators and high-stakes test developers adopt CAT more actively and efficiently.
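Two of the components named above, maximum Fisher information item selection and randomesque exposure control, can be sketched together for a 2PL item response model. This is a minimal illustration, not the paper’s implementation; the item pool, function names, and the choice of top-k pool size are assumptions:

```python
import math
import random

def fisher_info_2pl(theta, a, b):
    """Fisher information of a 2PL item (discrimination a, difficulty b)
    at ability level theta: I(theta) = a^2 * p * (1 - p)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_item(theta, pool, administered, k=5, rng=random):
    """Maximum Fisher information selection with randomesque exposure
    control: rank unused items by information at the current theta
    estimate, then pick at random among the top k candidates."""
    candidates = [i for i in range(len(pool)) if i not in administered]
    candidates.sort(key=lambda i: fisher_info_2pl(theta, *pool[i]),
                    reverse=True)
    return rng.choice(candidates[:k])

# Hypothetical item pool of (a, b) parameter pairs.
pool = [(1.2, -0.5), (0.8, 0.0), (1.5, 0.3), (1.0, 1.2),
        (0.9, -1.0), (1.4, 0.1), (1.1, 0.6)]
next_item = select_item(theta=0.2, pool=pool, administered={2})
```

Setting `k=1` reduces this to pure maximum-information selection; larger `k` trades a little measurement precision for lower exposure of the most informative items, which is the core tension the exposure-control methods listed above address.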
Citations to this article as recorded by
A shortened test is feasible: Evaluating a large-scale multistage adaptive English language assessment Shangchao Min, Kyoungwon Bishop Language Testing.2024;[Epub] CrossRef
Efficiency of PROMIS MCAT Assessments for Orthopaedic Care Michael Bass, Scott Morris, Sheng Zhang Measurement: Interdisciplinary Research and Perspectives.2024; : 1. CrossRef
The Effects of Different Item Selection Methods on Test Information and Test Efficiency in Computer Adaptive Testing Merve ŞAHİN KÜRŞAD Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi.2023; 14(1): 33. CrossRef
Presidential address: improving item validity and adopting computer-based testing, clinical skills assessments, artificial intelligence, and virtual reality in health professions licensing examinations in Korea Hyunjoo Pai Journal of Educational Evaluation for Health Professions.2023; 20: 8. CrossRef
Remote Symptom Monitoring With Ecological Momentary Computerized Adaptive Testing: Pilot Cohort Study of a Platform for Frequent, Low-Burden, and Personalized Patient-Reported Outcome Measures Conrad Harrison, Ryan Trickett, Justin Wormald, Thomas Dobbs, Przemysław Lis, Vesselin Popov, David J Beard, Jeremy Rodrigues Journal of Medical Internet Research.2023; 25: e47179. CrossRef
Utilizing Real-Time Test Data to Solve Attenuation Paradox in Computerized Adaptive Testing to Enhance Optimal Design Jyun-Hong Chen, Hsiu-Yi Chao Journal of Educational and Behavioral Statistics.2023;[Epub] CrossRef
A Context-based Question Selection Model to Support the Adaptive Assessment of Learning: A study of online learning assessment in elementary schools in Indonesia Umi Laili Yuhana, Eko Mulyanto Yuniarno, Wenny Rahayu, Eric Pardede Education and Information Technologies.2023;[Epub] CrossRef
Evaluating a Computerized Adaptive Testing Version of a Cognitive Ability Test Using a Simulation Study Ioannis Tsaousis, Georgios D. Sideridis, Hannan M. AlGhamdi Journal of Psychoeducational Assessment.2021; 39(8): 954. CrossRef
Developing Multistage Tests Using D-Scoring Method Kyung (Chris) T. Han, Dimiter M. Dimitrov, Faisal Al-Mashary Educational and Psychological Measurement.2019; 79(5): 988. CrossRef
Conducting simulation studies for computerized adaptive testing using SimulCAT: an instructional piece Kyung (Chris) Tyek Han Journal of Educational Evaluation for Health Professions.2018; 15: 20. CrossRef
Updates from 2018: Being indexed in Embase, becoming an affiliated journal of the World Federation for Medical Education, implementing an optional open data policy, adopting principles of transparency and best practice in scholarly publishing, and appreciation Sun Huh Journal of Educational Evaluation for Health Professions.2018; 15: 36. CrossRef