Purpose: To investigate the measurement characteristics of standardized clinical evaluation forms (CEFs) used to assign grades for clerkship performance.
Method: In 1996-97, the authors reviewed 5,168 CEFs completed for 175 students in eight clerkships. Limiting their analysis to the three clerkships that produced the most CEFs, the authors conducted a generalizability study to determine the five variance components for each clerkship. A decision study then calculated the generalizability coefficients and standard errors of measurement in each clerkship for varied numbers of raters and CEF items.
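Illustration: The decision-study calculation described above can be sketched as follows. This is a minimal sketch, not the authors' analysis. It assumes a design in which raters are nested within students and crossed with CEF items, which is one design that yields five variance components; the abstract does not state the design. The function name `d_study`, the component labels, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def d_study(var, n_raters, n_items):
    """Return (generalizability coefficient, relative SEM).

    Assumes raters nested in students (p) and crossed with items (i).
    `var` maps component labels to estimated variance components.
    """
    # Relative error: every component that involves the student,
    # divided by the number of conditions averaged over.
    rel_error = (
        var["r:p"] / n_raters
        + var["pi"] / n_items
        + var["ri:p,e"] / (n_raters * n_items)
    )
    g_coef = var["p"] / (var["p"] + rel_error)
    return g_coef, math.sqrt(rel_error)


# Hypothetical variance components, NOT taken from the study.
# The item component "i" does not enter relative error; it would
# matter only for absolute (criterion-referenced) decisions.
components = {"p": 0.20, "i": 0.02, "r:p": 0.30, "pi": 0.01, "ri:p,e": 0.25}

if __name__ == "__main__":
    for n_raters in (1, 2, 3, 5):
        for n_items in (5, 10):
            g, sem = d_study(components, n_raters, n_items)
            print(f"raters={n_raters} items={n_items} G={g:.2f} SEM={sem:.2f}")
```

When the rater component dominates, as in these made-up values, adding raters shrinks the error term far more than adding items does.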
Results: The generalizability study found large variance components attributable to rater and rating context. The decision study found that, when three or more raters completed CEFs for a student, the generalizability coefficient and standard error of measurement reached levels acceptable for grading. Increasing the number of items on the CEF had no significant effect on either measure.
Conclusion: The reliability of clerkship grades assigned on the basis of a single CEF is unacceptably low. However, CEFs can accurately measure students' clerkship performance if completed by three or more raters.