AI Identifies Patients with Increased Lung Cancer Risk Up To 4 Months Earlier
Posted on 24 Apr 2025
Earlier diagnosis plays a crucial role in improving cancer prognosis, as delays in starting therapy are associated with lower survival rates. In most cases, cancer is first identified when symptoms become apparent in general practice (GP), and timely detection in this setting could significantly improve outcomes. Lung cancer, in particular, has one of the poorest prognoses. Several emerging technologies, including new biomarkers, electronic nose technology, and free-circulating tumor DNA, have shown potential for earlier diagnosis in general practice, but none has yet resulted in a widely applicable diagnostic test.
For about 80% of patients diagnosed with lung cancer, the journey begins in general practice. However, approximately 75% of patients are diagnosed at an advanced stage (3 or 4), leading to an 80% mortality rate within one year. The long-term data held in GP records may contain crucial information for identifying patients at risk of cancer at an earlier stage, and the text data in those records is one promising source. Prior attempts to leverage this information have not outperformed existing clinical prediction tools, however, likely because they relied on predefined predictors. Now, an algorithm used during consultations could soon enable general practitioners to identify patients at higher risk of lung cancer up to four months earlier than under current practice.
Developed by researchers at Amsterdam UMC (Amsterdam, The Netherlands), the new algorithm analyzes all medical information from general practice, with a particular focus on free text data. This focus on free text is the novel aspect of the study. Previous studies relied on predefined, coded variables such as “smoking” or “coughing up blood”; this algorithm instead taps into the free text portion of patient records and finds predictive signals in patients' medical histories, allowing it to detect a substantial number of cases up to four months earlier than current methods. Before the method can be applied in routine clinical practice, further research is needed to understand which specific text fragments the algorithm picks up on.
For their study, the researchers analyzed data from 525,526 patients across four academic GP networks in Amsterdam, Utrecht, and Groningen. Among this group, 2,386 patients were diagnosed with lung cancer, with diagnoses confirmed using the Dutch cancer registry. Both structured and free text data were used to predict a lung cancer diagnosis five months before it was made (four months before referral). The study, published in the British Journal of General Practice, showed that one in every 34 patients flagged by the algorithm had lung cancer, and that the algorithm would have enabled 62% of patients with lung cancer to be referred four months earlier. Compared to traditional screening methods, this approach results in fewer false positives and allows for earlier selection during consultations. The method could potentially improve early detection of other cancers that often go undetected until advanced stages, such as pancreatic, stomach, or ovarian cancer. However, the approach still needs to be validated in other countries and healthcare systems.
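To put the reported figures in perspective, the arithmetic behind them can be sketched in a few lines of Python. This is only an illustrative back-of-the-envelope calculation, not part of the study: the variable and function names are made up for this example, and the last step assumes that the 62% figure applies to all 2,386 confirmed cases, which the article does not state.

```python
# Figures quoted in the article.
TOTAL_PATIENTS = 525_526
LUNG_CANCER_CASES = 2_386
FLAGGED_PER_CONFIRMED_CASE = 34   # one confirmed case per 34 flagged patients
SHARE_REFERRED_EARLIER = 0.62     # 62% of patients with lung cancer


def cohort_prevalence(cases: int, total: int) -> float:
    """Share of the whole cohort that was diagnosed with lung cancer."""
    return cases / total


def positive_predictive_value(flagged_per_case: int) -> float:
    """Share of flagged patients who turned out to have lung cancer."""
    return 1 / flagged_per_case


prevalence = cohort_prevalence(LUNG_CANCER_CASES, TOTAL_PATIENTS)
ppv = positive_predictive_value(FLAGGED_PER_CONFIRMED_CASE)

# Assumption (not stated in the article): the 62% applies to all 2,386 cases.
earlier_referrals = SHARE_REFERRED_EARLIER * LUNG_CANCER_CASES

print(f"Cohort prevalence: {prevalence:.2%}")
print(f"Positive predictive value among flagged patients: {ppv:.2%}")
print(f"Flagged patients without lung cancer per confirmed case: "
      f"{FLAGGED_PER_CONFIRMED_CASE - 1}")
print(f"Patients potentially referred earlier: {earlier_referrals:.0f}")
```

Under these assumptions, roughly 0.45% of the cohort had lung cancer, against roughly 2.9% of flagged patients. The two rates are not strictly comparable, since the cohort figure is not tied to the same prediction window, so the comparison is indicative only.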
“Now many patients are diagnosed with lung cancer at an advanced stage, 3 or 4, so 80 percent of patients often die within a year,” said Henk van Weert, emeritus professor of family medicine. “Previous research made it likely that a four-week gain already has a noticeable effect on prognosis. Four months is thus a likely highly relevant gain.”