What Are You Measuring? Dimensionality and Reliability Analysis of Ability and Speed in Medical School Didactic Examinations.

Thompson JJ

2016

Summative didactic evaluation often involves multiple-choice questions, which are then aggregated into exam scores, course scores, and cumulative grade point averages. To be valid, each of these levels should have some relationship to the topic tested (dimensionality) and be sufficiently reproducible between persons (reliability) to justify student ranking. Evaluation of dimensionality is difficult and is complicated by the classic observation that didactic performance involves a generalized component (g) in addition to subtest-specific factors. In this work, 183 students were analyzed over two academic years in 13 courses with 44 exams and 3352 questions for both accuracy and speed. Reliability at all levels was good (>0.95). Assessed by bifactor analysis, g effects dominated most levels, resulting in essential unidimensionality. Effect sizes on predicted accuracy and speed due to nesting in exams and courses were small. There was little relationship between person ability and person speed. Thus, the hierarchical grading system appears warranted because of its g-dependence.
