AI performance by mammographic density in a retrospective cohort study of 99,489 participants in BreastScreen Norway

Eur Radiol. 2024 Mar 25. doi: 10.1007/s00330-024-10681-z. Online ahead of print.

Abstract

Objective: To explore the ability of artificial intelligence (AI) to classify breast cancer by mammographic density in an organized screening program.

Materials and methods: We included information about 99,489 examinations from 74,941 women who participated in BreastScreen Norway, 2013-2019. All examinations were analyzed with an AI system that assigned a malignancy risk score (AI score) from 1 (lowest) to 10 (highest) to each examination. Mammographic density was classified by Volpara density grade (VDG) into VDG1-4, where VDG1 indicated fatty and VDG4 extremely dense breasts. Screen-detected and interval cancers with an AI score of 1-10 were stratified by VDG.

Results: An AI score of 10 was assigned to 10,406 examinations (10.5% of the total), of which 6.7% (704/10,406) were breast cancers. These cancers represented 89.7% (617/688) of the screen-detected and 44.6% (87/195) of the interval cancers. Of all examinations, 20.3% (20,178/99,489) were classified as VDG1 and 6.1% (6,047/99,489) as VDG4. Among screen-detected cancers, the percentage with an AI score of 10 was 84.0% (68/81, 95% CI, 74.1-91.2) for VDG1, 88.9% (328/369, 95% CI, 85.2-91.9) for VDG2, 92.5% (185/200, 95% CI, 87.9-95.7) for VDG3, and 94.7% (36/38, 95% CI, 82.3-99.4) for VDG4. Among interval cancers, the percentage with an AI score of 10 was 33.3% (3/9, 95% CI, 7.5-70.1) for VDG1 and 48.0% (12/25, 95% CI, 27.8-68.7) for VDG4.
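
The abstract does not say how the 95% confidence intervals were computed. As an illustration only, the sketch below assumes exact (Clopper-Pearson) binomial intervals and recomputes the percentages and intervals from the counts reported in the Results. The function names (binom_cdf, solve_increasing, clopper_pearson) are hypothetical helpers written for this example, not part of the study's analysis code.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve_increasing(g, target, iters=100):
    """Bisection on [0, 1] for g(p) = target, where g is increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if g(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(x, n, alpha=0.05):
    """Exact two-sided CI for a binomial proportion x/n (assumed method)."""
    # Lower bound: the p at which P(X >= x) equals alpha/2.
    lower = 0.0 if x == 0 else solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: the p at which P(X <= x) equals alpha/2.
    upper = 1.0 if x == n else solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


# Cancers with an AI score of 10 / all cancers in the stratum,
# copied from the Results paragraph of the abstract.
counts = {
    "screen-detected, VDG1": (68, 81),
    "screen-detected, VDG2": (328, 369),
    "screen-detected, VDG3": (185, 200),
    "screen-detected, VDG4": (36, 38),
    "interval, VDG1": (3, 9),
    "interval, VDG4": (12, 25),
}

for label, (x, n) in counts.items():
    low, high = clopper_pearson(x, n)
    print(f"{label}: {100 * x / n:.1f}% "
          f"(95% CI {100 * low:.1f}-{100 * high:.1f})")
```

The printed values can be compared with the intervals quoted in the Results. Agreement would be consistent with an exact method, but it would not confirm which method the authors used. The wide intervals for the small strata (for example 3/9 interval cancers in VDG1) show how little the VDG1 and VDG4 interval-cancer percentages constrain the underlying proportion.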

Conclusion: The tested AI system performed well in cancer detection across all density categories, especially for extremely dense breasts. The highest proportion of screen-detected cancers with an AI score of 10 was observed for women classified as VDG4.

Clinical relevance statement: Our study demonstrates that AI can correctly classify the majority of screen-detected and about half of the interval breast cancers, regardless of breast density.

Key points:
• Mammographic density is important to consider in the evaluation of artificial intelligence in mammographic screening.
• Given a threshold representing about 10% of those with the highest malignancy risk score by an AI system, we found an increasing percentage of cancers with increasing mammographic density.
• Artificial intelligence risk score and mammographic density combined may help triage examinations to reduce workload for radiologists.

Keywords: Artificial intelligence; Breast cancer; Breast density; Mammography; Screening.

Grants and funding