Evaluating hippocampal internal architecture on MRI: inter-rater reliability of a proposed scoring system

Epilepsy Res. 2013 Sep;106(1-2):146-54. doi: 10.1016/j.eplepsyres.2013.05.009. Epub 2013 Aug 3.

Abstract

Background: Asymmetry of hippocampal internal architecture (HIA) has been reported to be a frequent imaging finding in patients with temporal lobe epilepsy (TLE) who exhibit other signs of hippocampal sclerosis. HIA asymmetry may also be an independent predictor of the side of seizure onset in patients with otherwise normal MRI scans. The study of HIA asymmetry and its relationship to the laterality of TLE would benefit from a reliable method of assessing the clarity of HIA in MRI scans. We propose a visual scoring system that rates HIA clarity from 1 (imperceptible) to 4 (excellent) and report the inter-rater reliability (IRR) of this system.

Methods: In the preliminary phase of this study we examined IRR with a kappa statistic (κ) among a mixed group of expert and non-expert reviewers, who were given only a brief description of the scoring system and scored single images from a series of patients. In the second phase we explored the effect of training on the use of our HIA scoring system by assessing IRR among neuroimaging experts before and after a brief interactive training session. In this phase, multiple slices from each patient were scored. Separate κ values and intraclass correlation coefficients (ICC) were calculated from the scores given to each hippocampal image and from the asymmetry of scores between left and right for each slice. In the third phase the effect of training on non-expert reviewers was explored using an approach similar to that used with the expert reviewers.
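The agreement calculation described in the Methods can be sketched in a few lines. The abstract does not say which multi-rater κ variant was used, so this sketch assumes Fleiss' kappa over the four HIA categories; the function names and the left-minus-right sign convention for asymmetry are illustrative choices, not taken from the study, and the ICC calculation is not shown.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for a list of items, each rated by the same number of raters.

    ratings: list of per-item lists, one score per rater.
    categories: every value a score may take (assumed here, e.g. 1..4 for HIA).
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    if any(len(item) != n_raters for item in ratings):
        raise ValueError("every item needs the same number of ratings")

    # counts[i][j] = number of raters who gave item i the j-th category
    counts = [[Counter(item)[c] for c in categories] for item in ratings]

    # Observed agreement: mean proportion of agreeing rater pairs per item.
    per_item = [
        (sum(n * n for n in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of each category.
    shares = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(len(categories))
    ]
    p_expected = sum(s * s for s in shares)

    if p_expected == 1.0:
        return float("nan")  # undefined when only one category is ever used
    return (p_observed - p_expected) / (1.0 - p_expected)


def asymmetry_scores(left, right):
    """Per-slice asymmetry as left minus right (sign convention is an assumption)."""
    return [l - r for l, r in zip(left, right)]


# Hypothetical scores: four hippocampal images, three raters each.
hia = [[4, 4, 3], [2, 2, 2], [1, 2, 1], [3, 3, 3]]
kappa_hia = fleiss_kappa(hia, categories=[1, 2, 3, 4])
```

Asymmetry agreement can be computed the same way by passing each rater's per-slice asymmetry values as the ratings, with `categories=list(range(-3, 4))` to cover every possible difference between two 1-to-4 scores.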

Results: In the preliminary phase of the study, HIA scoring of single images showed substantial agreement among expert reviewers (κHIA=0.65), fair agreement among non-expert reviewers (κHIA=0.27), and a fair to moderate degree of agreement among all the reviewers as a whole (κHIA=0.40). In the second phase, prior to training there was substantial agreement among expert reviewers on the individual HIA scores (κHIA=0.62; ICCHIA=0.81) but only moderate agreement on the degree of asymmetry (κAsym=0.47; ICCAsym=0.71). Training improved agreement on the individual HIA scores (κHIA=0.58-0.72; ICCHIA=0.76-0.84) and on the degree of asymmetry (κAsym=0.61-0.67; ICCAsym=0.81-0.85). Among non-expert reviewers, agreement improved from only a fair degree pre-training (κHIA=0.25, κAsym=0.25; ICCHIA=0.68, ICCAsym=0.66) to a moderate level after training (κHIA=0.54, κAsym=0.52; ICCHIA=0.78, ICCAsym=0.81).
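The verbal labels attached to the κ values above (fair, moderate, substantial) are consistent with the commonly used Landis and Koch (1977) bands, although the abstract does not name the scale it applied. A minimal sketch of that mapping, with a hypothetical function name, is:

```python
def agreement_label(kappa):
    """Map a kappa value to a Landis and Koch (1977) agreement label.

    Assumption: the abstract's wording follows this scale; it is not stated there.
    """
    if kappa < 0:
        return "poor"
    bands = [
        (0.20, "slight"),
        (0.40, "fair"),
        (0.60, "moderate"),
        (0.80, "substantial"),
        (1.00, "almost perfect"),
    ]
    for upper, label in bands:
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")


# Values reported in the Results, checked against the labels used there.
for value in (0.65, 0.27, 0.47, 0.54):
    print(value, agreement_label(value))
```

A value that sits exactly on a band edge, such as κHIA=0.40, falls in the lower band under this sketch, which matches the abstract's "fair to moderate" wording.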

Conclusions: The proposed HIA scoring system has a substantial degree of inter-rater reliability among experienced neuroimaging reviewers. Training improves the detection of asymmetries in HIA score in particular. Non-expert reviewers can employ the system with a moderate degree of reliability, and for them training has an even greater effect on scoring reliability.

Keywords: Ammon's horn; HIA; HS; Hippocampus; IRR; Inter-rater reliability; Internal architecture; MRI; TLE; Temporal lobe epilepsy; Hippocampal internal architecture; Hippocampal sclerosis.

Publication types

  • Research Support, N.I.H., Extramural

MeSH terms

  • Adolescent
  • Adult
  • Age of Onset
  • Artifacts
  • Data Interpretation, Statistical
  • Epilepsy, Temporal Lobe / pathology*
  • Female
  • Hippocampus / pathology*
  • Humans
  • Image Interpretation, Computer-Assisted
  • Magnetic Resonance Imaging / methods*
  • Male
  • Middle Aged
  • Observer Variation
  • Reproducibility of Results
  • Young Adult