Purpose This study investigates the efficacy of new features introduced into the selection process for medical school at the University of New South Wales, Australia: (1) considering the relative ranks, rather than the scores, of the Undergraduate Medicine and Health Sciences Admission Test and the Australian Tertiary Admission Rank; (2) a structured interview focusing on interpersonal interaction and on concerns should the applicants become students; and (3) embracing interviewers’ diverse perspectives.
Methods Data from 5 cohorts of students were analyzed, comparing second-year outcomes in the medicine program between 4 cohorts admitted under the old selection process and 1 cohort admitted under the new process. The main analysis comprised multiple linear regression models predicting academic, clinical, and professional outcomes from selection tools and demographic variables.
Results Selection interview marks from the new interview (512 applicants, 2 interviewers each) were analyzed for inter-rater reliability, which showed substantial agreement (kappa=0.639). No such analysis was possible for the old interview, since it required interviewers to reach a consensus. Multiple linear regression models using outcomes for 5 cohorts (N=905) revealed that the new selection process was considerably more effective in predicting academic and clinical achievement in the program (R2=9.4%–17.8% vs. R2=1.5%–8.4%).
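The inter-rater reliability statistic reported above is Cohen's kappa, which corrects raw agreement between two raters for agreement expected by chance. A minimal sketch follows, with hypothetical paired ratings (the study's actual interview marks are not reproduced here):

```python
from collections import Counter

# Hypothetical paired ratings from two interviewers (e.g., a band of
# 1-3 per applicant); illustrative data only.
rater_a = [1, 2, 2, 3, 1, 2, 3, 3, 2, 1]
rater_b = [1, 2, 3, 3, 1, 2, 3, 2, 2, 1]

n = len(rater_a)
# Observed agreement: proportion of applicants rated identically.
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement: from each rater's marginal category frequencies.
ca, cb = Counter(rater_a), Counter(rater_b)
expected = sum(ca[k] * cb[k] for k in ca) / n ** 2

# Cohen's kappa: agreement beyond chance, scaled to a maximum of 1.
kappa = (observed - expected) / (1 - expected)
print(f"kappa = {kappa:.3f}")
```

Values in the 0.61–0.80 range, like the study's kappa of 0.639, are conventionally interpreted as substantial agreement.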
Conclusion The results suggest that the medical student selection process can be significantly enhanced by employing a non-compensatory selection algorithm, using a structured interview that focuses on interpersonal interaction and on concerns should the applicants become students, and embracing interviewers’ diverse perspectives.
Purpose Computerized adaptive testing (CAT) has been adopted in licensing examinations because it improves the efficiency and accuracy of the tests, as shown in many studies. This simulation study investigated CAT scoring and item selection methods for the Korean Medical Licensing Examination (KMLE).
Methods This study used a post-hoc (real data) simulation design. The item bank used in this study included all items from the January 2017 KMLE. All CAT algorithms for this study were implemented using the ‘catR’ package in the R program.
Results In terms of accuracy, the Rasch and 2-parameter logistic (2PL) models performed better than the 3PL model. The modal a posteriori and expected a posteriori methods provided more accurate estimates than maximum likelihood estimation or weighted likelihood estimation. Furthermore, maximum posterior weighted information and minimum expected posterior variance performed better than the other item selection methods. In terms of efficiency, the Rasch model is recommended to reduce test length.
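The Bayesian ability estimators compared above can be sketched in a few lines. This is a minimal grid-based expected a posteriori (EAP) estimate under the Rasch model with a standard-normal prior, not the 'catR' implementation used in the study; the item difficulties and response pattern are hypothetical.

```python
import math

def rasch_p(theta, b):
    """Probability of a correct response at ability theta, difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def eap_estimate(responses, difficulties, grid_n=81, bound=4.0):
    """EAP ability estimate: posterior mean over an evenly spaced grid."""
    thetas = [-bound + 2 * bound * i / (grid_n - 1) for i in range(grid_n)]
    post = []
    for t in thetas:
        like = 1.0
        for u, b in zip(responses, difficulties):
            p = rasch_p(t, b)
            like *= p if u == 1 else 1 - p
        prior = math.exp(-t * t / 2)  # standard-normal kernel
        post.append(like * prior)
    z = sum(post)
    return sum(t * w for t, w in zip(thetas, post)) / z

# Hypothetical item difficulties and a response pattern (1 = correct).
b = [-1.0, -0.5, 0.0, 0.5, 1.0]
u = [1, 1, 1, 0, 0]
theta_hat = eap_estimate(u, b)
print(f"EAP theta = {theta_hat:.2f}")
```

Unlike maximum likelihood, the EAP estimate remains finite for all-correct or all-incorrect patterns because the prior keeps the posterior proper, which is one reason Bayesian estimators are attractive early in a CAT session.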
Conclusion Before implementing live CAT, a simulation study should be performed under varied test conditions, and specific scoring and item selection methods should be predetermined based on its results.
Citations to this article, as recorded by CrossRef:
Presidential address: improving item validity and adopting computer-based testing, clinical skills assessments, artificial intelligence, and virtual reality in health professions licensing examinations in Korea. Hyunjoo Pai. Journal of Educational Evaluation for Health Professions. 2023;20:8.
Developing Computerized Adaptive Testing for a National Health Professionals Exam: An Attempt from Psychometric Simulations. Lingling Xu, Zhehan Jiang, Yuting Han, Haiying Liang, Jinying Ouyang. Perspectives on Medical Education. 2023;[Epub].
Optimizing Computer Adaptive Test Performance: A Hybrid Simulation Study to Customize the Administration Rules of the CAT-EyeQ in Macular Edema Patients. T. Petra Rausch-Koster, Michiel A. J. Luijten, Frank D. Verbraak, Ger H. M. B. van Rens, Ruth M. A. van Nispen. Translational Vision Science & Technology. 2022;11(11):14.
The accuracy and consistency of mastery for each content domain using the Rasch and deterministic inputs, noisy “and” gate diagnostic classification models: a simulation study and a real-world analysis using data from the Korean Medical Licensing Examination. Dong Gi Seo, Jae Kum Kim. Journal of Educational Evaluation for Health Professions. 2021;18:15.
Linear programming method to construct equated item sets for the implementation of periodical computer-based testing for the Korean Medical Licensing Examination. Dong Gi Seo, Myeong Gi Kim, Na Hui Kim, Hye Sook Shin, Hyun Jung Kim. Journal of Educational Evaluation for Health Professions. 2018;15:26.
Funding information of the article entitled “Post-hoc simulation study of computerized adaptive testing for the Korean Medical Licensing Examination”. Dong Gi Seo, Jeongwook Choi. Journal of Educational Evaluation for Health Professions. 2018;15:27.
Updates from 2018: Being indexed in Embase, becoming an affiliated journal of the World Federation for Medical Education, implementing an optional open data policy, adopting principles of transparency and best practice in scholarly publishing, and appreci. Sun Huh. Journal of Educational Evaluation for Health Professions. 2018;15:36.
Computerized adaptive testing (CAT) greatly improves measurement efficiency in high-stakes testing operations through the selection and administration of test items with the difficulty level that is most relevant to each individual test taker. This paper explains the 3 components of a conventional CAT item selection algorithm: test content balancing, the item selection criterion, and item exposure control. Several noteworthy methodologies underlie each component. The test script method and constrained CAT method are used for test content balancing. Item selection criteria include the maximized Fisher information criterion, the b-matching method, the a-stratification method, the weighted likelihood information criterion, the efficiency balanced information criterion, and the Kullback-Leibler information criterion. The randomesque method, the Sympson-Hetter method, the unconditional and conditional multinomial methods, and the fade-away method are used for item exposure control. Several holistic approaches to CAT use automated test assembly methods, such as the shadow test approach and the weighted deviation model. Item usage and exposure count vary depending on the item selection criterion and exposure control method. Finally, other important factors to consider when determining an appropriate CAT design are the computer resource requirements, the size of the item pool, and the test length. The logic of CAT is now being adopted in the field of adaptive learning, which integrates the learning aspect and the (formative) assessment aspect of education into a continuous, individualized learning experience. Therefore, the algorithms and technologies described in this review may help medical health educators and high-stakes test developers to adopt CAT more actively and efficiently.
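Two of the components named above, the maximized Fisher information (MFI) criterion and randomesque exposure control, compose naturally and can be sketched together. This is a simplified illustration under assumed 2PL items, not any specific operational implementation; the item pool, parameters, and function names are all hypothetical.

```python
import math
import random

def info_2pl(theta, a, b):
    """Fisher information of a 2PL item at ability theta
    (discrimination a, difficulty b)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_item(theta, pool, administered, k=3, rng=random):
    """MFI selection with randomesque exposure control: rank the
    not-yet-administered items by information at the current ability
    estimate, then pick at random among the k most informative."""
    candidates = [(info_2pl(theta, a, b), i)
                  for i, (a, b) in enumerate(pool) if i not in administered]
    candidates.sort(reverse=True)
    top = candidates[:k]
    return rng.choice(top)[1]

# Hypothetical pool of (a, b) item parameters; item 2 already given.
pool = [(1.2, -1.0), (0.8, 0.0), (1.5, 0.2), (1.0, 1.0), (1.3, -0.3)]
rng = random.Random(0)
next_item = select_item(0.0, pool, administered={2}, k=2, rng=rng)
print("next item:", next_item)
```

With k=1 this reduces to pure MFI, which maximizes efficiency but concentrates exposure on the most discriminating items; raising k trades a little information for a flatter exposure distribution, which is the motivation for the exposure control methods listed above.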
Citations to this article, as recorded by CrossRef:
A shortened test is feasible: Evaluating a large-scale multistage adaptive English language assessment. Shangchao Min, Kyoungwon Bishop. Language Testing. 2024;[Epub].
Efficiency of PROMIS MCAT Assessments for Orthopaedic Care. Michael Bass, Scott Morris, Sheng Zhang. Measurement: Interdisciplinary Research and Perspectives. 2024;:1.
The Effects of Different Item Selection Methods on Test Information and Test Efficiency in Computer Adaptive Testing. Merve Şahin Kürşad. Eğitimde ve Psikolojide Ölçme ve Değerlendirme Dergisi. 2023;14(1):33.
Presidential address: improving item validity and adopting computer-based testing, clinical skills assessments, artificial intelligence, and virtual reality in health professions licensing examinations in Korea. Hyunjoo Pai. Journal of Educational Evaluation for Health Professions. 2023;20:8.
Remote Symptom Monitoring With Ecological Momentary Computerized Adaptive Testing: Pilot Cohort Study of a Platform for Frequent, Low-Burden, and Personalized Patient-Reported Outcome Measures. Conrad Harrison, Ryan Trickett, Justin Wormald, Thomas Dobbs, Przemysław Lis, Vesselin Popov, David J Beard, Jeremy Rodrigues. Journal of Medical Internet Research. 2023;25:e47179.
Utilizing Real-Time Test Data to Solve Attenuation Paradox in Computerized Adaptive Testing to Enhance Optimal Design. Jyun-Hong Chen, Hsiu-Yi Chao. Journal of Educational and Behavioral Statistics. 2023;[Epub].
A Context-based Question Selection Model to Support the Adaptive Assessment of Learning: A study of online learning assessment in elementary schools in Indonesia. Umi Laili Yuhana, Eko Mulyanto Yuniarno, Wenny Rahayu, Eric Pardede. Education and Information Technologies. 2023;[Epub].
Evaluating a Computerized Adaptive Testing Version of a Cognitive Ability Test Using a Simulation Study. Ioannis Tsaousis, Georgios D. Sideridis, Hannan M. AlGhamdi. Journal of Psychoeducational Assessment. 2021;39(8):954.
Developing Multistage Tests Using D-Scoring Method. Kyung (Chris) T. Han, Dimiter M. Dimitrov, Faisal Al-Mashary. Educational and Psychological Measurement. 2019;79(5):988.
Conducting simulation studies for computerized adaptive testing using SimulCAT: an instructional piece. Kyung (Chris) Tyek Han. Journal of Educational Evaluation for Health Professions. 2018;15:20.
Updates from 2018: Being indexed in Embase, becoming an affiliated journal of the World Federation for Medical Education, implementing an optional open data policy, adopting principles of transparency and best practice in scholarly publishing, and appreci. Sun Huh. Journal of Educational Evaluation for Health Professions. 2018;15:36.