medwireNews: Preliminary MASAI trial results indicate that artificial intelligence (AI)-supported mammography has a similar rate of breast cancer detection to standard double reading while reducing the screen-reading workload.
“The clinical safety analysis concludes that the AI-supported screen-reading procedure can be considered safe,” report Kristina Lång, from Lund University in Malmö, Sweden, and co-authors in The Lancet Oncology.
However, the team emphasizes that the trial is ongoing and that the primary endpoint, the interval cancer rate among 100,000 participants across the two arms, will only be determined after 2 years of follow-up.
Nested within the Swedish national breast screening program, the trial enrolled women aged 40–80 years (median 54 years) attending either general screening at 1.5–2.0-year intervals or annual high-risk screening. Consenting participants were randomly assigned to AI-supported screening (n=39,996) or standard double reading (n=40,024).
The AI system triaged each screening examination using a 10-point malignancy risk score: examinations scored 1–9 were given a single reading and those scored 10 a double reading. For examinations scored 8–10, both the risk score and the computer-aided detection mark were available to the screening radiologists, the team explains.
Overall, 46,345 AI-supported screening readings resulted in 244 screen-detected cancers and 861 recalls, while 83,231 standard screening readings identified 203 screen-detected cancers and led to 817 recalls.
The cancer detection rate in the AI-supported group was 6.1 cases per 1000 screened participants; this was above the lowest acceptable limit for safety and did not differ significantly from the rate of 5.1 cases per 1000 screened participants in the control arm, giving a ratio of 1.2, say Lång et al.
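These headline rates can be reproduced from the reported cancer counts and randomized group sizes; the sketch below assumes, as a simplification, that all randomized participants were screened:

```python
# Screen-detected cancers and randomized participants per arm (MASAI trial)
cancers_ai, n_ai = 244, 39_996    # AI-supported arm
cancers_std, n_std = 203, 40_024  # standard double-reading arm

rate_ai = 1000 * cancers_ai / n_ai    # cases per 1000 screened participants
rate_std = 1000 * cancers_std / n_std

print(f"{rate_ai:.1f} vs {rate_std:.1f} per 1000")  # → 6.1 vs 5.1
print(f"detection rate ratio: {rate_ai / rate_std:.1f}")  # → 1.2
```

The ratio of 1.2 reported by the authors falls out directly from the two rounded per-1000 rates.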
The recall rates in the AI-supported and standard double-reading arms were also comparable (2.2 vs 2.0%), with corresponding positive predictive values of recall of 28.3% and 24.8%. The false-positive rate was 1.5% in both groups, but AI-supported screening reduced the screen-reading workload by 44.3% relative to conventional screening.
“The actual time saved was not measured, but, if we assume that a radiologist reads on average 50 screening examinations per hour, it would have taken one radiologist 4.6 months less to read the 46,345 screening examinations in the intervention group compared with the 83,231 in the control group,” the researchers remark.
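The workload figures above can be checked with back-of-the-envelope arithmetic. The sketch below uses the article's assumption of 50 examinations read per hour; the figure of roughly 160 working hours per month is our own assumption, not stated in the source:

```python
# Screen-reading counts reported for the two MASAI arms
readings_ai = 46_345   # AI-supported arm
readings_std = 83_231  # standard double-reading arm

# Relative workload reduction: matches the reported 44.3%
reduction = 1 - readings_ai / readings_std
print(f"workload reduction: {reduction:.1%}")  # → 44.3%

# Time saved, assuming 50 examinations per hour (per the article)
# and ~160 working hours per month (our assumption, not from the source)
hours_saved = (readings_std - readings_ai) / 50
months_saved = hours_saved / 160
print(f"months saved: {months_saved:.1f}")  # → 4.6
```

With those assumptions, the 36,886 fewer readings translate to about 738 radiologist-hours, matching the authors' estimate of 4.6 months of reading time.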
Of the 244 cancers in the AI-supported arm, 75% were invasive and 83% were stage T1, while in the control arm 81% of the 203 cancers were invasive and 78% were stage T1. In situ carcinomas therefore made up 25% and 19% of detected cases, respectively, although this difference was not statistically significant.
Among the 208 patients in the AI-supported arm who had a cancer risk score of 10 and whose images underwent double reading, the cancer detection rate was 72.3 cases per 1000 participants, translating to one case per 14 screenings. Of the 490 screenings defined by AI as having the 1% highest risk, 38.6% were recalled, forming 22.0% of all recalls in the intervention arm.
Of the 189 participants who were recalled from this extra-high-risk group, 136 had cancer, giving a positive predictive value of recall of 72.0% and a cancer detection rate of 277.6 cases per 1000 flagged examinations.
“Thus, the 1.2% of screening examinations flagged as extra high risk contained 55.7% of all screen-detected cancers in this group,” emphasize Lång and co-authors.
They note that there was also a “considerable difference” in the cancer detection rate for the AI-supported group of patients who had a risk score of 1–7 compared with a score of 8–9, at 0.2 and 4.7 cases per 1000 participants, respectively. This translated to radiologists reading 5000 versus 212 mammograms to detect one cancer, the researchers say.
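The "mammograms read per cancer detected" figures follow directly from the reported per-1000 rates, as a minimal check shows. Note that 1000/4.7 rounds to 213 rather than the reported 212, which is consistent with the authors computing from the exact underlying counts before rounding the rate:

```python
# Cancer detection rates per 1000 participants, by AI risk score band
rate_low = 0.2    # scores 1-7
rate_high = 4.7   # scores 8-9

# Mammograms read per cancer detected = 1000 / rate
print(f"{1000 / rate_low:.0f}")   # → 5000, as reported
print(f"{1000 / rate_high:.0f}")  # → 213; the article reports 212, likely
                                  # reflecting the unrounded underlying rate
```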
Nereo Segnan and Antonio Ponti, both from CPO Piemonte in Torino, Italy, describe the trial findings in a linked comment as being “of great interest” and note that the risk score algorithm was “very accurate,” showing “remarkable” potential to help improve the criteria for recall in patients at both low and high risk.
Nevertheless, the commentators say it is important to determine whether the higher rate of in situ carcinomas in the AI-supported arm is from overdiagnosis of benign changes or overdetection of indolent lesions.
They conclude that, although the final MASAI findings will determine the characteristics of the detected cancers and the rate of interval cancer detection, an “important research question thus remains: is AI, when appropriately trained, able to capture relevant biological features – or, in other words, the natural history of the disease – such as the capacity of tumours to grow and disseminate?”
medwireNews is an independent medical news service provided by Springer Healthcare Ltd. © 2023 Springer Healthcare Ltd, part of the Springer Nature Group