Educational/Faculty development material
-
The role of large language models in the peer-review process: opportunities and challenges for medical journal reviewers and editors
-
Jisoo Lee, Jieun Lee, Jeong-Ju Yoo
-
J Educ Eval Health Prof. 2025;22:4. Published online January 16, 2025
-
DOI: https://doi.org/10.3352/jeehp.2025.22.4
[Epub ahead of print]
-
-
Abstract
- The peer review process ensures the integrity of scientific research. This is particularly important in the medical field, where research findings directly impact patient care. However, the rapid growth of publications has strained reviewers, causing delays and potential declines in quality. Generative artificial intelligence, especially large language models (LLMs) such as ChatGPT, may assist researchers with efficient, high-quality reviews. This review explores the integration of LLMs into peer review, highlighting their strengths in linguistic tasks and challenges in assessing scientific validity, particularly in clinical medicine. Key points for integration include initial screening, reviewer matching, feedback support, and language review. However, implementing LLMs for these purposes will necessitate addressing biases, privacy concerns, and data confidentiality. We recommend using LLMs as complementary tools under clear guidelines to support, not replace, human expertise in maintaining rigorous peer review standards.
Research article
-
Inter-rater reliability and content validity of the measurement tool for portfolio assessments used in the Introduction to Clinical Medicine course at Ewha Womans University College of Medicine: a methodological study
-
Dong-Mi Yoo, Jae Jin Han
-
J Educ Eval Health Prof. 2024;21:39. Published online December 10, 2024
-
DOI: https://doi.org/10.3352/jeehp.2024.21.39
-
-
Abstract
- Purpose
This study aimed to examine the reliability and validity of a measurement tool for portfolio assessments in medical education. Specifically, it investigated scoring consistency among raters and assessment criteria appropriateness according to an expert panel.
Methods
A cross-sectional observational study was conducted from September to December 2018 for the Introduction to Clinical Medicine course at the Ewha Womans University College of Medicine. Data were collected for 5 randomly selected portfolios scored by a gold-standard rater and 6 trained raters. An expert panel assessed the validity of 12 assessment items using the content validity index (CVI). Statistical analysis included Pearson correlation coefficients for rater alignment, the intraclass correlation coefficient (ICC) for inter-rater reliability, and the CVI for item-level validity.
Results
Rater 1 had the highest Pearson correlation (0.8916) with the gold-standard rater, while Rater 5 had the lowest (0.4203). The ICC for all raters was 0.3821, improving to 0.4415 after excluding Raters 1 and 5, indicating a 15.6% reliability increase. All assessment items met the CVI threshold of ≥0.75, with some achieving a perfect score (CVI=1.0). However, items like “sources” and “level and degree of performance” showed lower validity (CVI=0.72).
Conclusion
The present measurement tool for portfolio assessments demonstrated moderate reliability and strong validity, supporting its use as a credible tool. For a more reliable portfolio assessment, more faculty training is needed.
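As a rough illustration of the item-level CVI mentioned above, the sketch below uses one common definition: the share of panelists who rate an item as relevant (3 or 4 on a 4-point scale). The abstract does not state the rating scale or the panel size, so the scale, the cut-off for "relevant", and all ratings here are illustrative assumptions rather than the study's data.

```python
from typing import Dict, List


def item_cvi(ratings: List[int], relevant_min: int = 3) -> float:
    """Share of experts whose rating is at or above relevant_min (assumed 4-point scale)."""
    if not ratings:
        raise ValueError("at least one expert rating is required")
    relevant = sum(1 for r in ratings if r >= relevant_min)
    return relevant / len(ratings)


def flag_items(panel: Dict[str, List[int]], threshold: float = 0.75) -> Dict[str, bool]:
    """Map each item to True when its CVI meets the threshold."""
    return {item: item_cvi(r) >= threshold for item, r in panel.items()}


# Hypothetical ratings from an 8-member panel (not the study's data).
panel = {
    "item_a": [4, 4, 3, 4, 3, 4, 4, 3],  # 8 of 8 rate it relevant
    "item_b": [4, 3, 2, 4, 3, 3, 2, 4],  # 6 of 8 rate it relevant
    "item_c": [2, 3, 2, 4, 1, 3, 2, 3],  # 4 of 8 rate it relevant
}
cvi_by_item = {item: item_cvi(r) for item, r in panel.items()}
meets_threshold = flag_items(panel)
```

Under this definition an item passes the 0.75 threshold only when at least three quarters of the panel rate it as relevant.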
History article
-
History of the medical licensure system in Korea from the late 1800s to 1992
-
Sang-Ik Hwang
-
J Educ Eval Health Prof. 2024;21:36. Published online December 9, 2024
-
DOI: https://doi.org/10.3352/jeehp.2024.21.36
-
-
Abstract
- The introduction of modern Western medicine in the late 19th century, notably through vaccination initiatives, marked the beginning of governmental involvement in medical licensure, with the licensing of doctors who performed vaccinations. The establishment of the national medical school “Euihakkyo” in 1899 further formalized medical education and licensure, granting graduates the privilege to practice medicine without additional examinations. The enactment of the Regulations on Doctors in 1900 by the Joseon government aimed to comprehensively define doctor qualifications, covering both modern and traditional practitioners. However, resistance from the traditional medical community hindered its full implementation. During the Japanese colonial occupation of the Korean Peninsula from 1910 to 1945, the medical licensure system was controlled by colonial authorities, leading to the marginalization of traditional Korean medicine and the imposition of imperial hierarchical structures. Following liberation from Japanese colonial rule in 1945, the Korean government undertook significant reforms, culminating in the National Medical Law, which was enacted in 1951. This law redefined doctor qualifications and reinstated the status of traditional Korean medicine. The introduction of national examinations for physicians increased state involvement in ensuring medical competence. The privatization of the Korean Medical Licensing Examination led to the establishment of the Korea Health Personnel Licensing Examination Institute in 1992, which assumed responsibility for administering licensing examinations for all healthcare workers. This shift reflected a move towards specialized management of professional standards. The evolution of the medical licensure system in Korea illustrates a dynamic process shaped by the historical context, balancing the protection of public health with the rights of medical practitioners.
Review
-
The legality and appropriateness of keeping Korean Medical Licensing Examination items confidential: a comparative analysis and review of court rulings
-
Jae Sun Kim, Dae Un Hong, Ju Yoen Lee
-
J Educ Eval Health Prof. 2024;21:28. Published online October 15, 2024
-
DOI: https://doi.org/10.3352/jeehp.2024.21.28
-
-
Abstract
- This study examines the legality and appropriateness of keeping the multiple-choice question items of the Korean Medical Licensing Examination (KMLE) confidential. Through an analysis of cases from the United States, Canada, and Australia, where medical licensing exams are conducted using item banks and computer-based testing, we found that exam items are kept confidential to ensure fairness and prevent cheating. In Korea, the Korea Health Personnel Licensing Examination Institute (KHPLEI) has been disclosing KMLE questions despite concerns over exam integrity. Korean courts have consistently ruled that multiple-choice question items prepared by public institutions are non-public information under Article 9(1)(v) of the Korea Official Information Disclosure Act (KOIDA), which exempts disclosure if it significantly hinders the fairness of exams or research and development. The Constitutional Court of Korea has upheld this provision. Given the time and cost involved in developing high-quality items and the need to accurately assess examinees’ abilities, there are compelling reasons to keep KMLE items confidential. As a public institution responsible for selecting qualified medical practitioners, KHPLEI should establish its disclosure policy based on a balanced assessment of public interest, without influence from specific groups. We conclude that KMLE questions qualify as non-public information under KOIDA, and KHPLEI may choose to maintain their confidentiality to ensure exam fairness and efficiency.
Research articles
-
A new performance evaluation indicator for the LEE Jong-wook Fellowship Program of Korea Foundation for International Healthcare to better assess its long-term educational impacts: a Delphi study
-
Minkyung Oh, Bo Young Yoon
-
J Educ Eval Health Prof. 2024;21:27. Published online October 2, 2024
-
DOI: https://doi.org/10.3352/jeehp.2024.21.27
-
-
Abstract
- Purpose
The Dr. LEE Jong-wook Fellowship Program, established by the Korea Foundation for International Healthcare (KOFIH), aims to strengthen healthcare capacity in partner countries. The aim of the study was to develop new performance evaluation indicators for the program to better assess long-term educational impact across various courses and professional roles.
Methods
A 3-stage process was employed. First, a literature review of established evaluation models (Kirkpatrick’s 4 levels, context/input/process/product evaluation model, Organization for Economic Cooperation and Development Assistance Committee criteria) was conducted to devise evaluation criteria. Second, these criteria were validated via a 2-round Delphi survey with 18 experts in training projects from May 2021 to June 2021. Third, the relative importance of the evaluation criteria was determined using the analytic hierarchy process (AHP), calculating weights and ensuring consistency through the consistency index and consistency ratio (CR), with CR values below 0.1 indicating acceptable consistency.
Results
The literature review led to a combined evaluation model, resulting in 4 evaluation areas, 20 items, and 92 indicators. The Delphi surveys confirmed the validity of these indicators, with content validity ratio values exceeding 0.444. The AHP analysis assigned weights to each indicator, and CR values below 0.1 indicated consistency. The final set of evaluation indicators was confirmed through a workshop with KOFIH and adopted as the new evaluation tool.
Conclusion
The developed evaluation framework provides a comprehensive tool for assessing the long-term outcomes of the Dr. LEE Jong-wook Fellowship Program. It enhances evaluation capabilities and supports improvements in the training program’s effectiveness and international healthcare collaboration.
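The two quantities named in the Methods can be sketched as follows, assuming the content validity ratio follows Lawshe's formula, CVR = (n_e - N/2) / (N/2), and the consistency ratio follows Saaty's definition, CR = CI / RI with CI = (lambda_max - n) / (n - 1). With 18 panelists, 13 "essential" ratings give CVR = 4/9, or about 0.444, which is consistent with the figure quoted in the Results. The pairwise comparison matrix below is hypothetical and is not taken from the study.

```python
from typing import List

# Saaty's random consistency index (RI), keyed by matrix size n.
RANDOM_INDEX = {3: 0.58, 4: 0.90, 5: 1.12, 6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45}


def content_validity_ratio(n_essential: int, n_experts: int) -> float:
    """Lawshe's CVR: (n_e - N/2) / (N/2)."""
    half = n_experts / 2
    return (n_essential - half) / half


def principal_eigenvalue(matrix: List[List[float]], iterations: int = 200) -> float:
    """Estimate the largest eigenvalue of a positive matrix by power iteration."""
    n = len(matrix)
    vector = [1.0 / n] * n
    eigenvalue = float(n)
    for _ in range(iterations):
        product = [sum(row[j] * vector[j] for j in range(n)) for row in matrix]
        eigenvalue = sum(product)  # vector sums to 1, so this estimates lambda_max
        vector = [p / eigenvalue for p in product]
    return eigenvalue


def consistency_ratio(matrix: List[List[float]]) -> float:
    """AHP consistency ratio for a reciprocal pairwise comparison matrix (n from 1 to 9)."""
    n = len(matrix)
    if n < 3:
        return 0.0  # 1x1 and 2x2 reciprocal matrices are always consistent
    ci = (principal_eigenvalue(matrix) - n) / (n - 1)
    return ci / RANDOM_INDEX[n]


# Hypothetical pairwise comparisons of 3 evaluation areas (not the study's data).
judgments = [
    [1.0, 2.0, 4.0],
    [1 / 2, 1.0, 3.0],
    [1 / 4, 1 / 3, 1.0],
]
cr = consistency_ratio(judgments)
acceptable = cr < 0.1  # the acceptance rule stated in the Methods
cvr_13_of_18 = content_validity_ratio(13, 18)
```

Power iteration is used here only to keep the sketch free of third-party dependencies; any eigenvalue routine would serve the same purpose.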
-
Impact of a change from A–F grading to honors/pass/fail grading on academic performance at Yonsei University College of Medicine in Korea: a cross-sectional serial mediation analysis
-
Min-Kyeong Kim, Hae Won Kim
-
J Educ Eval Health Prof. 2024;21:20. Published online August 16, 2024
-
DOI: https://doi.org/10.3352/jeehp.2024.21.20
-
Correction in: J Educ Eval Health Prof 2024;21(0):35
-
-
Abstract
- Purpose
This study aimed to explore how the grading system affected medical students’ academic performance based on their perceptions of the learning environment and intrinsic motivation in the context of changing from norm-referenced A–F grading to criterion-referenced honors/pass/fail grading.
Methods
The study involved 238 second-year medical students from 2014 (n=127, A–F grading) and 2015 (n=111, honors/pass/fail grading) at Yonsei University College of Medicine in Korea. Scores on the Dundee Ready Education Environment Measure, the Academic Motivation Scale, and the Basic Medical Science Examination were used to measure overall learning environment perceptions, intrinsic motivation, and academic performance, respectively. Serial mediation analysis was conducted to examine the pathways between the grading system and academic performance, focusing on the mediating roles of student perceptions and intrinsic motivation.
Results
The honors/pass/fail grading class students reported more positive perceptions of the learning environment, higher intrinsic motivation, and better academic performance than the A–F grading class students. Mediation analysis demonstrated a serial mediation effect between the grading system and academic performance through learning environment perceptions and intrinsic motivation. Student perceptions and intrinsic motivation did not independently mediate the relationship between the grading system and performance.
Conclusion
Reducing the number of grades and eliminating rank-based grading might have created an affirming learning environment that fulfills basic psychological needs and reinforces the intrinsic motivation linked to academic performance. The cumulative effect of these 2 mediators suggests that a comprehensive approach should be used to understand student performance.
Special article on the 20th anniversary of the journal
-
Comparison of real data and simulated data analysis of a stopping rule based on the standard error of measurement in computerized adaptive testing for medical examinations in Korea: a psychometric study
-
Dong Gi Seo, Jeongwook Choi, Jinha Kim
-
J Educ Eval Health Prof. 2024;21:18. Published online July 9, 2024
-
DOI: https://doi.org/10.3352/jeehp.2024.21.18
-
-
Abstract
- Purpose
This study aimed to compare and evaluate the efficiency and accuracy of computerized adaptive testing (CAT) under 2 stopping rules (standard error of measurement [SEM]=0.3 and 0.25) using both real and simulated data in medical examinations in Korea.
Methods
This study employed post-hoc simulation and real data analysis to explore the optimal stopping rule for CAT in medical examinations. The real data were obtained from the responses of 3rd-year medical students during examinations in 2020 at Hallym University College of Medicine. Simulated data were generated in R using estimated parameters from a real item bank. Outcome variables included the number of examinees passing or failing with SEM values of 0.25 and 0.30, the number of items administered, and the correlation between ability estimates. The consistency of the real CAT results was evaluated by examining the consistency of pass/fail decisions based on a cut score of 0.0. The efficiency of all CAT designs was assessed by comparing the average number of items administered under both stopping rules.
Results
Both SEM 0.25 and SEM 0.30 provided a good balance between accuracy and efficiency in CAT. The real data showed minimal differences in pass/fail outcomes between the 2 SEM conditions, with a high correlation (r=0.99) between ability estimates. The simulation results confirmed these findings, indicating similar average item numbers between real and simulated data.
Conclusion
The findings suggest that both SEM 0.25 and 0.30 are effective termination criteria in the context of the Rasch model, balancing accuracy and efficiency in CAT.
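To make the stopping rule concrete, the following is a minimal simulation sketch of a Rasch-model CAT that stops once the standard error falls to a chosen threshold. It is not the authors' R code: the item bank, the true ability, the seed, the closest-difficulty item selection, and the use of the EAP posterior standard deviation as the SEM are all assumptions made for illustration.

```python
import math
import random
from typing import List, Tuple

GRID = [g / 10.0 for g in range(-40, 41)]  # ability grid from -4.0 to 4.0


def rasch_p(theta: float, b: float) -> float:
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def eap(responses: List[Tuple[float, int]]) -> Tuple[float, float]:
    """Posterior mean and SD of ability under a standard normal prior."""
    weights = []
    for t in GRID:
        w = math.exp(-0.5 * t * t)  # unnormalized N(0, 1) prior
        for b, x in responses:
            p = rasch_p(t, b)
            w *= p if x == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(GRID, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(GRID, weights)) / total
    return mean, math.sqrt(var)


def run_cat(bank: List[float], true_theta: float, sem_stop: float, seed: int) -> Tuple[float, float, int]:
    """Administer items until SEM <= sem_stop or the bank is exhausted."""
    rng = random.Random(seed)
    remaining = list(bank)
    responses: List[Tuple[float, int]] = []
    theta, sem = 0.0, 1.0  # start at the prior mean and prior SD
    while remaining and sem > sem_stop:
        # Pick the unused item whose difficulty is closest to the current estimate.
        b = min(remaining, key=lambda d: abs(d - theta))
        remaining.remove(b)
        x = 1 if rng.random() < rasch_p(true_theta, b) else 0
        responses.append((b, x))
        theta, sem = eap(responses)
    return theta, sem, len(responses)


# Hypothetical bank of 301 item difficulties spread evenly from -3 to 3.
bank = [-3.0 + 0.02 * i for i in range(301)]
theta_030, sem_030, n_030 = run_cat(bank, true_theta=0.5, sem_stop=0.30, seed=1)
theta_025, sem_025, n_025 = run_cat(bank, true_theta=0.5, sem_stop=0.25, seed=1)
```

Because both runs share the same seed and selection logic, the SEM 0.25 run retraces the SEM 0.30 run and then continues, so it can never administer fewer items. That is the accuracy versus efficiency trade-off the abstract describes.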
-
Development of examination objectives for the Korean paramedic and emergency medical technician examination: a survey study
-
Tai-hwan Uhm, Heakyung Choi, Seok Hwan Hong, Hyungsub Kim, Minju Kang, Keunyoung Kim, Hyejin Seo, Eunyoung Ki, Hyeryeong Lee, Heejeong Ahn, Uk-jin Choi, Sang Woong Park
-
J Educ Eval Health Prof. 2024;21:13. Published online June 12, 2024
-
DOI: https://doi.org/10.3352/jeehp.2024.21.13
-
-
Abstract
- Purpose
The duties of paramedics and emergency medical technicians (P&EMTs) are continuously changing due to developments in medical systems. This study presents evaluation goals for P&EMTs by analyzing their work, especially the tasks that new P&EMTs (with less than 3 years’ experience) find difficult, to foster the training of P&EMTs who could adapt to emergency situations after graduation.
Methods
A questionnaire was created based on prior job analyses of P&EMTs. The survey questions were reviewed through focus group interviews, from which 253 task elements were derived. A survey was conducted from July 10, 2023 to October 13, 2023 on the frequency, importance, and difficulty of these task elements across the 6 occupations in which P&EMTs were employed.
Results
The P&EMTs’ most common tasks involved obtaining patients’ medical histories and measuring vital signs, whereas the most important task was cardiopulmonary resuscitation (CPR). The task elements that the P&EMTs found most difficult were newborn delivery and infant CPR. New paramedics reported that treating patients with fractures, poisoning, and childhood fever was difficult, while new EMTs reported that they had difficulty keeping diaries, managing ambulances, and controlling infection.
Conclusion
Communication was the most important item for P&EMTs, whereas CPR was the most important skill. It is important for P&EMTs to have knowledge of all tasks; however, they also need to master frequently performed tasks and those that pose difficulties in the field. By deriving goals for evaluating P&EMTs, changes could be made to their education, thereby making it possible to train more capable P&EMTs.
-
Revised evaluation objectives of the Korean Dentist Clinical Skill Test: a survey study and focus group interviews
-
Jae-Hoon Kim, Young J Kim, Deuk-Sang Ma, Se-Hee Park, Ahran Pae, June-Sung Shim, Il-Hyung Yang, Ui-Won Jung, Byung-Joon Choi, Yang-Hyun Chun
-
J Educ Eval Health Prof. 2024;21:11. Published online May 30, 2024
-
DOI: https://doi.org/10.3352/jeehp.2024.21.11
-
-
Abstract
- Purpose
This study aimed to propose a revision of the evaluation objectives of the Korean Dentist Clinical Skill Test by analyzing the opinions of those involved in the examination after a review of those objectives.
Methods
The clinical skill test objectives were reviewed based on the national-level dental practitioner competencies, dental school educational competencies, and the third dental practitioner job analysis. Current and former examinees were surveyed about their perceptions of the evaluation objectives. The validity of 22 evaluation objectives and overlapping perceptions based on area of specialty were surveyed on a 5-point Likert scale by professors who participated in the clinical skill test and dental school faculty members. Additionally, focus group interviews were conducted with experts on the examination.
Results
It was necessary to consider including competency assessments for “emergency rescue skills” and “planning and performing prosthetic treatment.” There were no significant differences between current and former examinees in their perceptions of the clinical skill test’s objectives. The professors who participated in the examination and dental school faculty members recognized that most of the objectives were valid. However, some responses stated that “oromaxillofacial cranial nerve examination,” “temporomandibular disorder palpation test,” and “space management for primary and mixed dentition” were unfeasible evaluation objectives and overlapped with dental specialty areas.
Conclusion
When revising the Korean Dentist Clinical Skill Test’s objectives, it is advisable to consider incorporating competency assessments related to “emergency rescue skills” and “planning and performing prosthetic treatment.”
-
Importance, performance frequency, and predicted future importance of dietitians’ jobs by practicing dietitians in Korea: a survey study
-
Cheongmin Sohn, Sooyoun Kwon, Won Gyoung Kim, Kyung-Eun Lee, Sun-Young Lee, Seungmin Lee
-
J Educ Eval Health Prof. 2024;21:1. Published online January 2, 2024
-
DOI: https://doi.org/10.3352/jeehp.2024.21.1
-
-
Abstract
- Purpose
This study aimed to explore the perceptions held by practicing dietitians of the importance of their tasks performed in current work environments, the frequency at which those tasks are performed, and predictions about the importance of those tasks in future work environments.
Methods
This was a cross-sectional survey study. An online survey was administered to 350 practicing dietitians. They were asked to assess the importance, performance frequency, and predicted changes in the importance of 27 tasks using a 5-point scale. Descriptive statistics were calculated, and the means of the variables were compared across categorized work environments using analysis of variance.
Results
The importance scores of all surveyed tasks were higher than 3.0, except for the marketing management task. Self-development, nutrition education/counseling, menu planning, food safety management, and documentation/data management were all rated higher than 4.0. The highest performance frequency score was related to documentation/data management. The importance scores of all duties, except for professional development, differed significantly by workplace. As for predictions about the future importance of the tasks surveyed, dietitians responded that the importance of all 27 tasks would either remain at current levels or increase in the future.
Conclusion
Twenty-seven tasks were confirmed to represent dietitians’ job functions in various workplaces. These tasks can be used to improve the test specifications of the Korean Dietitian Licensing Examination and the curriculum of dietetic education programs.
-
Information amount, accuracy, and relevance of generative artificial intelligence platforms’ answers regarding learning objectives of medical arthropodology evaluated in English and Korean queries in December 2023: a descriptive study
-
Hyunju Lee, Soobin Park
-
J Educ Eval Health Prof. 2023;20:39. Published online December 28, 2023
-
DOI: https://doi.org/10.3352/jeehp.2023.20.39
-
-
Abstract
- Purpose
This study assessed the performance of 6 generative artificial intelligence (AI) platforms on the learning objectives of medical arthropodology in a parasitology class in Korea. We examined the AI platforms’ performance by querying in Korean and English to determine their information amount, accuracy, and relevance in prompts in both languages.
Methods
From December 15 to 17, 2023, 6 generative AI platforms—Bard, Bing, Claude, Clova X, GPT-4, and Wrtn—were tested on 7 medical arthropodology learning objectives in English and Korean. Clova X and Wrtn are platforms from Korean companies. Responses were evaluated using specific criteria for the English and Korean queries.
Results
Bard had abundant information but was 4th in accuracy and relevance. GPT-4, with high information content, ranked 1st in accuracy and relevance. Clova X was 4th in information amount but 2nd in accuracy and relevance. Bing provided less information, with moderate accuracy and relevance. Wrtn’s answers were short, with average accuracy and relevance. Claude had reasonable information, but lower accuracy and relevance. The responses in English were superior in all aspects. Clova X was notably optimized for Korean, leading in relevance.
Conclusion
In a study of 6 generative AI platforms applied to medical arthropodology, GPT-4 excelled overall, while Clova X, a Korea-based AI product, achieved 100% relevance in Korean queries, the highest among its peers. Utilizing these AI platforms in classrooms improved the authors’ self-efficacy and interest in the subject, offering a positive experience of interacting with generative AI platforms to question and receive information.
-
Effect of a transcultural nursing course on improving the cultural competency of nursing graduate students in Korea: a before-and-after study
-
Kyung Eui Bae, Geum Hee Jeong
-
J Educ Eval Health Prof. 2023;20:35. Published online December 4, 2023
-
DOI: https://doi.org/10.3352/jeehp.2023.20.35
-
-
Abstract
- Purpose
This study aimed to evaluate the impact of a transcultural nursing course on enhancing the cultural competency of graduate nursing students in Korea. We hypothesized that participants’ cultural competency would significantly improve in areas such as communication, biocultural ecology and family, dietary habits, death rituals, spirituality, equity, and empowerment and intermediation after completing the course. Furthermore, we assessed the participants’ overall satisfaction with the course.
Methods
A before-and-after study was conducted with graduate nursing students at Hallym University, Chuncheon, Korea, from March to June 2023. A transcultural nursing course was developed based on Giger & Haddad’s transcultural nursing model and Purnell’s theoretical model of cultural competence. Data were collected using a cultural competence scale for registered nurses developed by Kim and his colleagues. A total of 18 students participated, and the paired t-test was employed to compare pre- and post-intervention scores.
Results
The study revealed significant improvements in all 7 categories of cultural nursing competence (P<0.01). Specifically, the mean differences in scores (pre–post) ranged from 0.74 to 1.09 across the categories. Additionally, participants expressed high satisfaction with the course, with an average score of 4.72 out of a maximum of 5.0.
Conclusion
The transcultural nursing course effectively enhanced the cultural competency of graduate nursing students. Such courses are imperative to ensure quality care for the increasing multicultural population in Korea.
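For readers unfamiliar with the paired t-test named in the Methods, a minimal sketch follows. The scores are hypothetical (6 students rather than the study's 18), and differences are taken as post minus pre so that an improvement is positive; that sign convention is an assumption of this sketch.

```python
import math
from statistics import mean, stdev
from typing import List, Tuple


def paired_t(pre: List[float], post: List[float]) -> Tuple[float, float, int]:
    """Return (mean difference, t statistic, degrees of freedom) for paired scores."""
    if len(pre) != len(post) or len(pre) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    diffs = [b - a for a, b in zip(pre, post)]
    d_mean = mean(diffs)
    std_error = stdev(diffs) / math.sqrt(len(diffs))  # sample SD of the differences
    return d_mean, d_mean / std_error, len(diffs) - 1


# Hypothetical pre- and post-course scores on a 5-point scale (not the study's data).
pre_scores = [3.0, 3.2, 2.8, 3.5, 3.1, 2.9]
post_scores = [4.0, 4.1, 3.9, 4.2, 4.3, 3.8]
mean_diff, t_stat, df = paired_t(pre_scores, post_scores)
```

The t statistic is then compared with the t distribution on n - 1 degrees of freedom to obtain the P-value.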
Technical report
-
Item difficulty index, discrimination index, and reliability of the 26 health professions licensing examinations in 2022, Korea: a psychometric study
-
Yoon Hee Kim, Bo Hyun Kim, Joonki Kim, Bokyoung Jung, Sangyoung Bae
-
J Educ Eval Health Prof. 2023;20:31. Published online November 22, 2023
-
DOI: https://doi.org/10.3352/jeehp.2023.20.31
-
-
Abstract
- Purpose
This study presents item analysis results of the 26 health personnel licensing examinations managed by the Korea Health Personnel Licensing Examination Institute (KHPLEI) in 2022.
Methods
The item difficulty index, item discrimination index, and reliability were calculated. The item discrimination index was calculated using a discrimination index based on the upper and lower 27% rule and the item-total correlation.
Results
Out of 468,352 total examinees, 418,887 (89.4%) passed. The pass rates ranged from 27.3% for health educators level 1 to 97.1% for oriental medical doctors. Most examinations had a high average difficulty index, albeit to varying degrees, ranging from 61.3% for prosthetists and orthotists to 83.9% for care workers. The average discrimination index based on the upper and lower 27% rule ranged from 0.17 for oriental medical doctors to 0.38 for radiological technologists. The average item-total correlation ranged from 0.20 for oriental medical doctors to 0.38 for radiological technologists. The Cronbach α, as a measure of reliability, ranged from 0.872 for health educators level 3 to 0.978 for medical technologists. The correlation coefficient between the average difficulty index and average discrimination index was -0.2452 (P=0.1557), that between the average difficulty index and the average item-total correlation was 0.3502 (P=0.0392), and that between the average discrimination index and the average item-total correlation was 0.7944 (P<0.0001).
Conclusion
This technical report presents the item analysis results and reliability of the recent examinations by the KHPLEI, demonstrating an acceptable range of difficulty index and discrimination index values, as well as good reliability.
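The indices reported above can be sketched on a small scored response matrix. This is an illustration under common textbook definitions, not KHPLEI's procedure: the difficulty index is the proportion answering correctly, the discrimination index is the difference in that proportion between the top and bottom 27% of examinees by total score, the item-total correlation is an uncorrected Pearson correlation, and the data are hypothetical.

```python
import math
from typing import List

Matrix = List[List[int]]  # rows = examinees, columns = items, 1 = correct


def difficulty(matrix: Matrix, item: int) -> float:
    """Proportion of examinees answering the item correctly."""
    return sum(row[item] for row in matrix) / len(matrix)


def discrimination_27(matrix: Matrix, item: int) -> float:
    """Difference in proportion correct between the top and bottom 27% by total score."""
    order = sorted(range(len(matrix)), key=lambda i: sum(matrix[i]), reverse=True)
    k = max(1, round(0.27 * len(matrix)))
    upper = sum(matrix[i][item] for i in order[:k]) / k
    lower = sum(matrix[i][item] for i in order[-k:]) / k
    return upper - lower


def pvar(values: List[float]) -> float:
    """Population variance."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def item_total_correlation(matrix: Matrix, item: int) -> float:
    """Uncorrected Pearson correlation between item score and total score."""
    x = [float(row[item]) for row in matrix]
    y = [float(sum(row)) for row in matrix]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)
    return cov / math.sqrt(pvar(x) * pvar(y))


def cronbach_alpha(matrix: Matrix) -> float:
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(matrix[0])
    item_vars = sum(pvar([float(row[j]) for row in matrix]) for j in range(k))
    total_var = pvar([float(sum(row)) for row in matrix])
    return k / (k - 1) * (1 - item_vars / total_var)


# Hypothetical responses of 10 examinees to 4 items (not examination data).
scores = [
    [1, 1, 1, 1], [1, 1, 1, 0], [1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 0, 0],
    [1, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 0], [1, 0, 0, 0],
]
p_values = [difficulty(scores, j) for j in range(4)]
d_item3 = discrimination_27(scores, 2)
r_item3 = item_total_correlation(scores, 2)
alpha = cronbach_alpha(scores)
```

Tied total scores at the 27% boundary are broken here by row order, which is one of several possible conventions.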
Research articles
-
Medical students’ patterns of using ChatGPT as a feedback tool and perceptions of ChatGPT in a Leadership and Communication course in Korea: a cross-sectional study
-
Janghee Park
-
J Educ Eval Health Prof. 2023;20:29. Published online November 10, 2023
-
DOI: https://doi.org/10.3352/jeehp.2023.20.29
-
-
Abstract
- Purpose
This study aimed to analyze patterns of using ChatGPT before and after group activities and to explore medical students’ perceptions of ChatGPT as a feedback tool in the classroom.
Methods
The study included 99 2nd-year pre-medical students who participated in a “Leadership and Communication” course from March to June 2023. Students engaged in both individual and group activities related to negotiation strategies. ChatGPT was used to provide feedback on their solutions. From May 17 to 19, 2023, a survey was administered to assess students’ perceptions of ChatGPT’s feedback, its use in the classroom, and the strengths and challenges of ChatGPT.
Results
The students indicated that ChatGPT’s feedback was helpful, and they revised and resubmitted their group answers in various ways after receiving feedback. The majority of respondents agreed with the use of ChatGPT during class. The most common response concerning the appropriate context of using ChatGPT’s feedback was “after the first round of discussion, for revisions.” There was a significant difference in satisfaction with ChatGPT’s feedback, including correctness, usefulness, and ethics, depending on whether or not ChatGPT was used during class, but there was no significant difference according to gender or whether students had previous experience with ChatGPT. The strongest advantages were “providing answers to questions” and “summarizing information,” and the worst disadvantage was “producing information without supporting evidence.”
Conclusion
The students were aware of the advantages and disadvantages of ChatGPT, and they had a positive attitude toward using ChatGPT in the classroom.
-
Suggestion for item allocation to 8 nursing activity categories of the Korean Nursing Licensing Examination: a survey-based descriptive study
-
Kyunghee Kim, So Young Kang, Younhee Kang, Youngran Kweon, Hyunjung Kim, Youngshin Song, Juyeon Cho, Mi-Young Choi, Hyun Su Lee
-
J Educ Eval Health Prof. 2023;20:18. Published online June 12, 2023
-
DOI: https://doi.org/10.3352/jeehp.2023.20.18
-
-
Abstract
- Purpose
This study aims to suggest the number of test items in each of 8 nursing activity categories of the Korean Nursing Licensing Examination, which comprises 134 activity statements including 275 items. The examination will be able to evaluate the minimum ability that nursing graduates must have to perform their duties. Methods: Two opinion surveys involving the members of 7 academic societies were conducted from March 19 to May 14, 2021. The survey results were reviewed by members of 4 expert associations from May 21 to June 4, 2021. The results for revised numbers of items in each category were compared with those reported by Tak and his colleagues and the National Council License Examination for Registered Nurses of the United States. Results: Based on 2 opinion surveys and previous studies, the suggestions for item allocation to 8 nursing activity categories of the Korean Nursing Licensing Examination in this study are as follows: 50 items for management of care and improvement of professionalism, 33 items for safety and infection control, 40 items for management of potential risk, 28 items for basic care, 47 items for physiological integrity and maintenance, 33 items for pharmacological and parenteral therapies, 24 items for psychosocial integrity and maintenance, and 20 items for health promotion and maintenance. Twenty other items related to health and medical laws were not included due to their mandatory status. Conclusion: These suggestions for the number of test items for each activity category will be helpful in developing new items for the Korean Nursing Licensing Examination.