High-fidelity patient simulation (HFPS) has been hypothesized as a modality for assessing competency of knowledge and skill in patient simulation, but uniform methods for HFPS performance assessment (PA) have not yet been fully achieved. Anesthesiology founded the HFPS discipline and also leads in its PA. This project reviews the types, quality, and designated purpose of HFPS PA tools in anesthesiology. We systematically reviewed the anesthesiology literature referenced in PubMed to assess the quality and reliability of available PA tools in HFPS. Of 412 articles identified, 50 met our inclusion criteria. Seventy-seven percent of the studies were published since 2000, and more recent studies demonstrated higher quality. Investigators reported a variety of test construction and validation methods. The most commonly reported test construction methods included "modified Delphi techniques" for item selection, reliability measurement using inter-rater agreement, and intra-class correlations between test items or subtests. Modern test theory, in particular generalizability theory, was used in nine (18%) of the studies. Test score validity was addressed in multiple investigations and showed significant improvement in reporting accuracy; however, the assessment of predictive validity was low across the majority of studies. The usability and practicality of testing occasions and tools were reported only anecdotally. To comply more completely with the gold standards for PA design, both the shared experience of experts and adherence to test construction standards are required, including reliability and validity measurement, instrument piloting, rater training, and explicit identification of the purpose and proposed use of the assessment tool.
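As a concrete illustration of the inter-rater reliability measures named above, the following Python sketch computes a single-rater, absolute-agreement intraclass correlation, ICC(2,1), from a performances-by-raters score matrix using the standard Shrout and Fleiss mean-square formulation; the rating data are invented for illustration and are not drawn from any of the reviewed studies.

```python
import numpy as np

def icc_2_1(scores: np.ndarray) -> float:
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `scores` is an (n_subjects, k_raters) matrix of ratings,
    following the Shrout & Fleiss (1979) mean-square formulation.
    """
    n, k = scores.shape
    grand = scores.mean()
    # Two-way sum-of-squares decomposition.
    ss_rows = k * ((scores.mean(axis=1) - grand) ** 2).sum()  # subjects
    ss_cols = n * ((scores.mean(axis=0) - grand) ** 2).sum()  # raters
    ss_error = ((scores - grand) ** 2).sum() - ss_rows - ss_cols
    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_error / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Hypothetical example: 3 raters scoring 6 simulated-scenario
# performances on a 1-10 global rating scale (numbers invented).
ratings = np.array([
    [7, 8, 7],
    [5, 5, 6],
    [9, 9, 8],
    [4, 5, 4],
    [8, 7, 8],
    [6, 6, 7],
], dtype=float)
print(f"ICC(2,1) = {icc_2_1(ratings):.3f}")
```

Values near 1 indicate that raters agree in absolute terms; generalizability theory, noted above, extends this same ANOVA-style variance decomposition to several facets (raters, scenarios, occasions) simultaneously.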
To test the applicability of item response theory (IRT) to the Korean Nurses' Licensing Examination (KNLE), item analysis was performed after testing unidimensionality and goodness-of-fit, and the results were compared with those based on classical test theory. The results of the 330-item KNLE administered to 12,024 examinees in January 2004 were analyzed. Unidimensionality was tested using DETECT, and goodness-of-fit was tested using WINSTEPS for the Rasch model and Bilog-MG for the two-parameter logistic model. Item analysis and ability estimation were done using WINSTEPS. Using DETECT, Dmax ranged from 0.1 to 0.23 for each subject. The infit and outfit mean square values of all items, computed using WINSTEPS, ranged from 0.1 to 1.5, except for one item in pediatric nursing, which scored 1.53. Of the 330 items, 218 (42.7%) were flagged as misfitting under the two-parameter logistic model of Bilog-MG. The correlation coefficients between the difficulty parameter from the Rasch model and the difficulty index from classical test theory ranged from 0.9039 to 0.9699, and those between the ability parameter from the Rasch model and the total score from classical test theory ranged from 0.9776 to 0.9984. Therefore, the KNLE results satisfied unidimensionality and showed adequate goodness-of-fit for the Rasch model. The KNLE appears to be a suitable dataset for analysis under the Rasch model, making further research using IRT feasible.
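To make the reported CTT-IRT comparison concrete, the Python sketch below estimates Rasch item difficulties with the simple normal-approximation (PROX) algorithm, rather than the joint maximum likelihood estimation WINSTEPS uses, and correlates them with the classical difficulty index (proportion correct); the response data are simulated, so the sizes and parameters are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated dichotomous responses: 2,000 examinees x 50 items,
# generated from a Rasch model (sizes and parameters are assumed).
n_persons, n_items = 2000, 50
theta = rng.normal(0.0, 1.0, n_persons)            # abilities
b_true = rng.normal(0.0, 1.0, n_items)             # difficulties
p = 1.0 / (1.0 + np.exp(-(theta[:, None] - b_true[None, :])))
X = (rng.random((n_persons, n_items)) < p).astype(int)

# Drop zero and perfect total scores, whose logits are undefined.
r = X.sum(axis=1)
X = X[(r > 0) & (r < n_items)]

# Classical difficulty index: proportion answering each item correctly.
p_classical = X.mean(axis=0)

# Rasch difficulties via the PROX normal approximation:
# centered item logits scaled by a variance-based expansion factor.
item_logit = np.log((1.0 - p_classical) / p_classical)
item_logit -= item_logit.mean()                    # center at 0 logits
r = X.sum(axis=1)
person_logit = np.log(r / (n_items - r))
U = item_logit.var()                               # item logit variance
V = person_logit.var()                             # person logit variance
b_rasch = item_logit * np.sqrt((1.0 + V / 2.89) / (1.0 - U * V / 8.35))

print(np.corrcoef(p_classical, b_rasch)[0, 1])     # strong negative r
print(np.corrcoef(b_true, b_rasch)[0, 1])          # recovery of truth
```

Because the Rasch difficulty grows as the proportion correct falls, the raw correlation between the two difficulty measures here is strongly negative; it is the near-unity magnitude of the association that corresponds to the close agreement the abstract reports.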
The results of the 64th and 65th Korean Medical Licensing Examination were analyzed according to classical test theory and item response theory in order to assess the feasibility of applying item response theory to item analysis and to suggest its applicability to computerized adaptive testing. The correlation coefficients of the difficulty index, discrimination index, and ability parameter between the two kinds of analysis were obtained using the computer programs Analyst 4.0, Bilog, and Xcalibre. The correlation coefficients for the difficulty index were 0.75 or higher; those for the discrimination index ranged from -0.023 to 0.753; and those for the ability parameter were 0.90 or higher. These results suggest that item analysis according to item response theory yields results comparable to those of classical test theory, except for the discrimination index. Since the ability parameter is the measure most widely used in criterion-referenced testing, the high correlation between the ability parameter and the total score supports the validity of a computerized adaptive test based on item response theory.
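Since the closing claim is that a high ability-total score correlation supports computerized adaptive testing, the Python sketch below shows the core CAT loop under a two-parameter logistic (2PL) model: administer the unused item with maximum Fisher information at the current ability estimate, record the response, and re-estimate ability by maximum likelihood. The item bank and examinee are simulated, so every parameter value is an assumption for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical calibrated 2PL item bank (all parameters invented).
n_bank = 200
a = rng.uniform(0.7, 2.0, n_bank)          # discrimination parameters
b = rng.normal(0.0, 1.0, n_bank)           # difficulty parameters

def p_correct(theta, a, b):
    """2PL probability of a correct response."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def estimate_theta(a_seen, b_seen, u_seen, theta0=0.0, iters=20):
    """Ability MLE via Newton-Raphson on the 2PL log-likelihood."""
    theta = theta0
    for _ in range(iters):
        p = p_correct(theta, a_seen, b_seen)
        grad = np.sum(a_seen * (u_seen - p))
        hess = -np.sum(a_seen ** 2 * p * (1.0 - p))
        # Clip to keep the estimate finite on all-correct/all-wrong runs.
        theta = np.clip(theta - grad / hess, -4.0, 4.0)
    return theta

true_theta = 0.8                            # simulated examinee ability
administered, responses = [], []
theta_hat = 0.0

for _ in range(20):                         # a 20-item adaptive test
    # Fisher information of every unused item at the current estimate.
    p = p_correct(theta_hat, a, b)
    info = a ** 2 * p * (1.0 - p)
    info[administered] = -np.inf            # never reuse an item
    item = int(np.argmax(info))
    administered.append(item)
    # Simulate the examinee's response to the chosen item.
    responses.append(int(rng.random() < p_correct(true_theta, a[item], b[item])))
    theta_hat = estimate_theta(a[administered], b[administered],
                               np.array(responses), theta0=theta_hat)

print(f"true theta = {true_theta:.2f}, CAT estimate = {theta_hat:.2f}")
```

Within a handful of items the estimate typically closes in on the true ability, which is the mechanism that lets an adaptive test reach a criterion-referenced decision with far fewer items than a fixed-form examination.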