The reported validity and reliability of methods for evaluating continuing medical education: a systematic review

Neda Ratanawongsa; Patricia A Thomas; Spyridon S Marinopoulos; Todd Dorman; Lisa M Wilson; Bimal H Ashar; Jeffrey L Magaziner; Redonda G Miller; Gregory P Prokopowicz; Rehan Qayyum; Eric B Bass

doi:10.1097/ACM.0b013e3181637925

Acad Med. 2008 Mar;83(3):274-83. doi: 10.1097/ACM.0b013e3181637925.

Authors

Neda Ratanawongsa¹, Patricia A Thomas, Spyridon S Marinopoulos, Todd Dorman, Lisa M Wilson, Bimal H Ashar, Jeffrey L Magaziner, Redonda G Miller, Gregory P Prokopowicz, Rehan Qayyum, Eric B Bass

Affiliation

¹ Department of Medicine, Johns Hopkins University School of Medicine, Johns Hopkins Bayview Medical Center, 5200 Eastern Avenue, Suite 2300, Baltimore, MD 21224, USA. neda@jhmi.edu

PMID: 18316877

Abstract

Purpose: To appraise the reported validity and reliability of evaluation methods used in high-quality trials of continuing medical education (CME).

Method: The authors conducted a systematic review (1981 to February 2006) by hand-searching key journals and searching electronic databases. Eligible articles studied CME effectiveness using randomized controlled trials or historic/concurrent comparison designs, were conducted in the United States or Canada, were written in English, and involved at least 15 physicians. Sequential double review was conducted for data abstraction, using a traditional approach to validity and reliability.

Results: Of 136 eligible articles, 47 (34.6%) reported the validity or reliability of at least one evaluation method, for a total of 62 methods; 31 methods were drawn from previous sources. The most common targeted outcome was practice behavior (21 methods). Validity was reported for 31 evaluation methods, including content (16), concurrent criterion (8), predictive criterion (1), and construct (5) validity. Reliability was reported for 44 evaluation methods, including internal consistency (20), interrater (16), intrarater (2), equivalence (4), and test-retest (5) reliability. When reported, statistical tests yielded modest evidence of validity and reliability. Translated to the contemporary classification approach, our data indicate that reporting about internal structure validity exceeded reporting about other categories of validity evidence.
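The counts in the Results paragraph can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the authors' analysis; the variable names are illustrative, and the numbers are copied from the paragraph above.

```python
# Counts copied from the Results paragraph; all names here are illustrative.
eligible_articles = 136
articles_reporting = 47

# Share of eligible articles reporting validity or reliability, in percent.
share_reporting = round(100 * articles_reporting / eligible_articles, 1)

# Reports by type of validity and reliability, as listed in the abstract.
validity_counts = {
    "content": 16,
    "concurrent criterion": 8,
    "predictive criterion": 1,
    "construct": 5,
}
reliability_counts = {
    "internal consistency": 20,
    "interrater": 16,
    "intrarater": 2,
    "equivalence": 4,
    "test-retest": 5,
}

validity_reports = sum(validity_counts.values())
reliability_reports = sum(reliability_counts.values())

print(f"Articles reporting: {share_reporting}%")
print(f"Validity reports by type: {validity_reports} (31 methods)")
print(f"Reliability reports by type: {reliability_reports} (44 methods)")
```

The percentage matches the reported 34.6%. The per-type totals do not equal the method counts (30 against 31 methods for validity, 47 against 44 for reliability). The abstract does not say how types map onto methods; one possible reading is that a single method could report more than one type, or a type not listed.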

Conclusions: The evidence for CME effectiveness is limited by weaknesses in the reported validity and reliability of evaluation methods. Educators should devote more attention to the development and reporting of high-quality CME evaluation methods and to emerging guidelines for establishing the validity of CME evaluation methods.

Publication types

Research Support, Non-U.S. Gov't
Research Support, U.S. Gov't, Non-P.H.S.
Review
Systematic Review

MeSH terms

Cognition
Cost-Benefit Analysis
Curriculum
Education, Medical, Continuing / economics
Education, Medical, Continuing / methods*
Educational Measurement
Educational Status
Health Knowledge, Attitudes, Practice*
Humans
Models, Educational
Reproducibility of Results*