Item difficulty index, discrimination index, and reliability of the 26 health professions licensing examinations in 2022, Korea: a psychometric study
Article information
Abstract
Purpose
This study presents item analysis results of the 26 health personnel licensing examinations managed by the Korea Health Personnel Licensing Examination Institute (KHPLEI) in 2022.
Methods
The item difficulty index, item discrimination index, and reliability were calculated. The item discrimination index was calculated using a discrimination index based on the upper and lower 27% rule and the item-total correlation.
Results
Out of 468,352 total examinees, 418,887 (89.4%) passed. The pass rates ranged from 27.3% for health educators level 1 to 97.1% for oriental medical doctors. Most examinations had a high average difficulty index, albeit to varying degrees, ranging from 61.3% for prosthetists and orthotists to 83.9% for care workers. The average discrimination index based on the upper and lower 27% rule ranged from 0.17 for oriental medical doctors to 0.38 for radiological technologists. The average item-total correlation ranged from 0.20 for oriental medical doctors to 0.38 for radiological technologists. The Cronbach α, as a measure of reliability, ranged from 0.872 for health educators level 3 to 0.978 for medical technologists. The correlation coefficient between the average difficulty index and the average discrimination index was -0.2452 (P=0.1557), that between the average difficulty index and the average item-total correlation was 0.3502 (P=0.0392), and that between the average discrimination index and the average item-total correlation was 0.7944 (P<0.0001).
Conclusion
This technical report presents the item analysis results and reliability of the recent examinations by the KHPLEI, demonstrating an acceptable range of difficulty index and discrimination index values, as well as good reliability.
Introduction
Background
The Korea Health Personnel Licensing Examination Institute (KHPLEI) conducts national health personnel licensing examinations every year to assess whether candidates for health professions have the minimal competency to practice medical and health care in the field. These national licensing examinations should consist of appropriate items that assess candidates' competencies. After the examinations, it is essential to analyze the items' difficulty, discrimination, and reliability and to evaluate their appropriateness. If certain items show an excessively high or low difficulty or discrimination index, their content should be rechecked. The results of these analyses can then be reflected in subsequent examinations.
Objectives
The examination results were analyzed for item difficulty, discrimination, and reliability to evaluate the appropriateness of the items in 26 health personnel licensing examinations in Korea. The correlation between the average item difficulty index and the average item discrimination index was also assessed.
Methods
Ethics statement
This was not a study of human subjects, but an analysis of test results. Therefore, neither institutional review board approval nor informed consent was required.
Study design
This was a descriptive psychometric study based on examinees’ responses to the licensing examinations.
Setting
Licensing examinations for 26 health professions from January to February 2022 were included for item analysis based on the classical test theory.
Variables
The variables were the items' difficulty index, discrimination index, and reliability.
Data source/measurement
Based on classical test theory, item difficulty, item discrimination, and test reliability were calculated from the data. The difficulty index of each item was calculated as follows: P=(number of examinees who answered the item correctly)/(number of all examinees); it ranges from 0 to 1. The discrimination index was calculated using 2 commonly used methods. The first was the difference in difficulty between the top 27% group and the bottom 27% group of examinees ranked by total score; this is called the upper and lower 27% rule method. The second was the correlation coefficient between each item and the total score (the item-total correlation). The discrimination index and test reliability based on classical test theory were calculated only for tests with more than 100 examinees.
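The statistics described above can be sketched in code. The following is an illustrative implementation, not the KHPLEI's actual analysis code; `responses` is a hypothetical matrix of examinee response vectors scored 1 (correct) or 0 (incorrect).

```python
# Classical test theory statistics for a 0/1-scored response matrix:
# each row is one examinee, each column is one item.

def difficulty_index(responses):
    """P = (examinees answering correctly) / (all examinees), per item."""
    n = len(responses)
    return [sum(row[j] for row in responses) / n
            for j in range(len(responses[0]))]

def discrimination_27(responses):
    """Upper and lower 27% rule: P(top 27% by total score) minus P(bottom 27%)."""
    ranked = sorted(responses, key=sum)           # ascending by total score
    k = max(1, round(0.27 * len(responses)))      # size of each 27% group
    lower, upper = ranked[:k], ranked[-k:]
    return [p_up - p_lo for p_up, p_lo in
            zip(difficulty_index(upper), difficulty_index(lower))]

def item_total_correlation(responses):
    """Pearson correlation between each item score and the total score."""
    totals = [sum(row) for row in responses]
    def pearson(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        vx = sum((a - mx) ** 2 for a in x)
        vy = sum((b - my) ** 2 for b in y)
        return cov / (vx * vy) ** 0.5
    return [pearson([row[j] for row in responses], totals)
            for j in range(len(responses[0]))]

def cronbach_alpha(responses):
    """alpha = k/(k-1) * (1 - sum(item variances) / variance(total score))."""
    def var(xs):  # sample variance
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)
    k = len(responses[0])
    item_vars = sum(var([row[j] for row in responses]) for j in range(k))
    total_var = var([sum(row) for row in responses])
    return k / (k - 1) * (1 - item_vars / total_var)
```

Note that with real examination data, ties in total scores at the 27% cut point would need a tie-breaking rule, which this sketch does not address.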
Bias
There was no bias in selecting the data, as all data were included.
Study size
Sample size estimation was not necessary because all data were included.
Statistical methods
The item analysis based on classical test theory was done using IBM SPSS ver. 21.0 (IBM Corp.). The correlation analysis was done using DBSTAT ver. 5.0 (DBSTAT Co.).
Results
Pass rates
Table 1 shows the results of the 26 national licensure examinations for healthcare professionals administered in 2022 (Supplement 1). The analysis covered 36 examinations, including those administered more than once a year. The number of examinees ranged from as few as 11 to as many as 96,541. Among them, the examinations for midwives, health educators level 1, and rehabilitation counselors level 2 had fewer than 100 examinees, so only the difficulty index based on classical test theory was analyzed, while the other analyses, such as discrimination and reliability according to classical test theory, were omitted. The pass rate varied from 27.3% to 97.1%. The pass rate of the health educators level 1 examination was less than 50%, markedly lower than those of the other professions. The pass rates of physicians, dentists, midwives, nurses, oriental medical doctors, pharmacists, rehabilitation counselors, and the 38th, 39th, and 41st examinations for care workers were over 90%.
Item analysis
The results of the item analysis of the examinations conducted in 2022 are presented in Table 2. Reliability was very high for all professions, with the lowest reliability shown by a Cronbach α value of 0.872 for the health educators level 3 examination. The correlation coefficient between the average difficulty index and the average discrimination index was -0.2452 (P=0.1557). The correlation coefficient between the average difficulty index and the average item-total correlation was 0.3502 (P=0.0392). The correlation coefficient between the average discrimination index and average item-total correlation was 0.7944 (P<0.0001).
Discussion
Key results
In 2022, 26 health professions licensing examinations displayed pass rates ranging from 27.3% to 97.1%. The examinee numbers varied widely, from 11 to 96,541. Average difficulty indexes ranged from 61.3% to 83.9%. The average item-total correlation ranged from 0.20 to 0.38. Overall reliability was high, with the lowest Cronbach α value being 0.872.
Interpretation
For analysis results based on classical test theory, an average difficulty index of 50% to 60% is interpreted as moderate, 60% to 70% as somewhat easy, 70% to 80% as easy, and 80% or more as very easy [1]. According to these criteria, the difficulty indexes of the above examinations fell into the categories of easy (9), somewhat easy (17), and moderate (12). For medical personnel, including physicians, dentists, nurses, and oriental medical doctors, the difficulty indexes were all in the somewhat easy category, and their pass rates were all over 90%. The Korean government controls the school admission capacity for these professions; the minimum requirement was not excessively high, and these examinees performed at a high level.
For discriminant power, an average discrimination index of less than 0.25 is interpreted as indicating low discriminant power, 0.25 to 0.30 as average, and 0.30 or more as good [2]. According to this interpretation, 8 examinations had low discriminant power, 14 had average discriminant power, and 13 had good discriminant power. Regarding the item-total correlations, 7 examinations had low values, 11 had average values, and 17 had good values. For reliability, all tests had Cronbach α values of 0.872 or greater, indicating high reliability across all examinations.
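The interpretation bands above can be expressed as a small helper, sketched here for illustration. The band below a 50% difficulty index is not covered by the cited criteria, so it is left unclassified in this hypothetical code.

```python
# Hypothetical classifiers encoding the interpretation bands cited in
# references [1] (difficulty) and [2] (discrimination).

def difficulty_category(p):
    """Classify an average difficulty index given as a percentage."""
    if p >= 80:
        return "very easy"
    if p >= 70:
        return "easy"
    if p >= 60:
        return "somewhat easy"
    if p >= 50:
        return "moderate"
    return "not classified"  # below 50% is not covered by the cited bands

def discrimination_category(d):
    """Classify an average discrimination index (0 to 1 scale)."""
    if d >= 0.30:
        return "good"
    if d >= 0.25:
        return "average"
    return "low"
```

For example, the 2022 extremes reported above classify as `difficulty_category(83.9)` giving "very easy" and `discrimination_category(0.17)` giving "low".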
Comparison with previous studies
Some studies have published item analyses of Korea's health personnel licensing examinations. Investigations of the 64th (2000) and 65th (2001) Korean Medical Licensing Examinations (KMLE), based on classical test theory, revealed difficulty indexes of 71.9±21.7 and 68.3±23.5 and discrimination indexes of 0.22±0.11 and 0.18±0.13, respectively [3].
In addition to item analysis, the proportions of question items, according to their cognitive domain levels and types of multiple choice questions (MCQs), and the contents of medical knowledge of the KMLE conducted in 1992 and 1993 were explored. In 1992 and 1993, recall-level question items constituted 68.0% of all MCQ question items. The proportions of problem-solving level question items were only 7.7% in 1992 and 11.1% in 1993. The predominant types of MCQs were “best answer type” and “one correct answer type,” comprising 40.7% and 30.9%, respectively, in 1992, and 35.0% and 32.0%, respectively, in 1993 [4]. However, in 2022, problem-solving level question items constituted 55.3% of all MCQ question items, while recall-level question items accounted for only 6.0% [5].
For the nursing licensing examination, the outcomes of the 330-item examination, administered to 12,024 examinees in January 2004, were analyzed. According to classical test theory, the analysis revealed a prevalence of easy items with a difficulty index of 0.7 or higher, and the item-total correlation coefficients ranged from 0.2 to 0.3, indicating moderate discrimination [6]. Notably, few item analyses have been published for the 26 health personnel licensing examinations, likely due to challenges in data accessibility.
Limitation
Item-level data were not included in the item analysis. A further analysis of the information for each item would help in understanding item quality.
Generalizability
The data only reflected an item analysis of health personnel licensing examinations in Korea.
Suggestion for further study
Item analysis based on item response theory is needed to find more precise and stable item characteristics [7]. Item parameters based on item response theory are invariant and independent of the examinees' characteristics. With such items, tailored testing, including computerized adaptive testing, can be achieved, in which item difficulty adapts to the examinee's ability. This method can enhance test efficiency and precision.
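The idea behind adapting item difficulty to examinee ability can be sketched with the one-parameter logistic (Rasch) model, a common starting point in item response theory. This is a minimal illustration using standard IRT notation (ability θ, item difficulty b), not an implementation drawn from the KHPLEI's examinations.

```python
import math

def rasch_probability(theta, b):
    """Rasch model: P(correct) = 1 / (1 + exp(-(theta - b)))."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of a Rasch item, P * (1 - P); adaptive tests
    often select the next item to maximize this quantity."""
    p = rasch_probability(theta, b)
    return p * (1 - p)
```

Information is maximized when an item's difficulty b matches the examinee's ability θ, which is why an adaptive test converges on items near the examinee's level.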
Conclusion
The above results of the national health personnel licensing examinations conducted in 2022 showed an acceptable range of difficulty index values, discrimination index values, and reliability, although 8 examinations showed low discriminant power according to the discrimination index. This suggests that the examinations administered by the KHPLEI fulfill their purpose: assessing the minimum competency of health professionals to perform in their fields.
Notes
Authors’ contributions
Conceptualization: YHK. Data curation: YHK. Methodology/formal analysis/validation: YHK, BHK, JK, BJ, SB. Project administration: YHK. Funding acquisition: none. Writing–original draft: YHK. Writing–review & editing: YHK, BHK, JK, BJ, SB.
Conflict of interest
All authors are employees of the Korea Health Personnel Licensing Examination Institute. However, they were not involved in the peer review or decision process. Otherwise, no potential conflict of interest relevant to this article was reported.
Funding
None.
Data availability
The raw data are unavailable due to the Korea Health Personnel Licensing Examination Institute's policies. For a more detailed analysis, please contact the corresponding author. Some data are available at http://www.kuksiwon.or.kr/notice/brd/m_51/list.do.
Acknowledgements
None.
Supplementary materials
Notes
Editor’s note
This is the first attempt to publish an annual report presenting item analyses of the 26 health personnel licensing examinations administered by the Korea Health Personnel Licensing Examination Institute, the publisher of this journal. One of the primary purposes of the journal is to publish this kind of annual report. Although there are no item parameter data for each item, these analyses will help in understanding the current status of the examination results. Negotiations with the publisher are underway on sharing each item's characteristics according to item response theory and the raw responses of examinees. If these data are shared, they will provide excellent research material for item analyses and further psychometric studies. I anticipate that this journal can become the best resource for psychometric research on licensing examination data, which are rarely reported worldwide.