Effects of item and rater characteristics on checklist recording: what should we look for?

Med Educ. 2005 Aug;39(8):852-8. doi: 10.1111/j.1365-2929.2005.02226.x.

Abstract

Objective: Examinations based on standardised patients (SPs) commonly use checklist recordings to evaluate students' clinical performance. This paper examines whether, and to what extent, item and rater characteristics affect the reliability of history checklist recording in an SP-based assessment.

Methods: Checklist items were reviewed for the presence or absence of 5 item characteristics and for the use of a 2-point versus a 3-point scoring scale. Agreement between checklist recordings obtained from SPs and clinician-examiners (CEs) was compared by item characteristics, scoring scale and CEs' level of involvement in the assessment.

Results: Based on 3179 pairs of recordings, the overall percentage of agreement between SPs and CEs was 83% (kappa = 0.64). Agreement was significantly higher for items scored on a 2-point than on a 3-point scale, and when the CE was also the author and the trainer of the station. After controlling for other factors, item characteristics were only marginally associated with level of interrater agreement.
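
The two agreement statistics reported above, percentage of agreement and kappa, can be illustrated with a minimal sketch. It assumes the reported kappa is Cohen's kappa for two raters, which the abstract does not state. The function names and the toy recordings are hypothetical and are not the study's data.

```python
from collections import Counter


def percent_agreement(rater_a, rater_b):
    """Proportion of paired recordings on which the two raters agree."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(a == b for a, b in zip(rater_a, rater_b)) / len(rater_a)


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(rater_a)
    p_observed = percent_agreement(rater_a, rater_b)
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    # Chance agreement: sum over categories of the product of each
    # rater's marginal proportion for that category.
    categories = set(counts_a) | set(counts_b)
    p_expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    if p_expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_observed - p_expected) / (1.0 - p_expected)


# Hypothetical toy data on a 2-point scale (1 = item done, 0 = not done).
# These are NOT the study's recordings.
sp_recordings = [1, 1, 0, 1, 0, 0, 1, 1, 0, 1]
ce_recordings = [1, 0, 0, 1, 0, 1, 1, 1, 0, 1]

agreement = percent_agreement(sp_recordings, ce_recordings)
kappa = cohen_kappa(sp_recordings, ce_recordings)
print(f"agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```

On these toy recordings the raters agree on 8 of 10 items (80%), while kappa is about 0.58, because both raters mark most items as done and so would agree fairly often by chance alone. The same correction explains how a percentage of agreement can sit well above its accompanying kappa.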

Conclusions: This study suggests that attention should be paid to specific aspects of checklist development and checklist recording training when an SP or CE is used as recorder.

MeSH terms

  • Clinical Competence / standards*
  • Education, Medical, Undergraduate*
  • Educational Measurement / methods*
  • Humans
  • Medical History Taking / standards
  • Physical Examination / standards
  • Students, Medical*
  • Teaching / methods