Abstract
BACKGROUND: Cancer recurrence and progression, once seen as markers of poor prognosis, are now considered manageable aspects of long-term care. Advances in treatment have extended survival, which increases the need for representative epidemiological information. Population-based cancer registries are essential in this respect. However, tracking treatment outcomes and accurately distinguishing recurrences from progressions remain challenging because follow-up data are incomplete. To address this gap and enable meaningful analyses of cancer registry data, we used machine learning (ML) to classify these events precisely, going beyond traditional clinical assumptions.
METHODS: We developed an ML model to identify and classify cancer recurrence and progression using lung cancer (ICD-10: C34) data from the Hamburg Cancer Registry. To ensure interoperability, we created a standardized indicator dataset. The model's predictive performance was validated using data from five additional German cancer registries. After extensive evaluation, a histogram-based gradient-boosted decision tree ensemble was chosen for its high accuracy and adaptability.
RESULTS: The model demonstrated strong predictive performance, with areas under the curve (AUC) ranging from 0.74 to 0.99 across test datasets, highlighting its robustness and generalizability. Its classification accuracy was comparable to that of experienced human annotators, ensuring reliability for large-scale analysis.
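To make the per-dataset AUC figures concrete, the sketch below computes the AUC separately for each registry from (registry, label, score) triples. It uses the pairwise definition of AUC: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The function names, registry names, and toy scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the share of positive/negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_by_registry(records):
    """Group (registry, label, score) triples by registry and compute one AUC each."""
    grouped = {}
    for registry, label, score in records:
        labels, scores = grouped.setdefault(registry, ([], []))
        labels.append(label)
        scores.append(score)
    return {registry: roc_auc(labels, scores) for registry, (labels, scores) in grouped.items()}


# Toy data with hypothetical registry names (label 1 = event, 0 = no event).
toy_records = [
    ("registry_A", 1, 0.9), ("registry_A", 0, 0.2),
    ("registry_A", 1, 0.6), ("registry_A", 0, 0.7),
    ("registry_B", 1, 0.8), ("registry_B", 0, 0.1),
    ("registry_B", 1, 0.7), ("registry_B", 0, 0.3),
]
result = auc_by_registry(toy_records)
print(result)  # {'registry_A': 0.75, 'registry_B': 1.0}
```

Reporting one AUC per registry rather than a single pooled value shows how much performance varies between data sources, which is what a range such as 0.74 to 0.99 conveys.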
CONCLUSION: This study highlights the potential of ML in enhancing cancer registry data interpretation. By reliably identifying recurrences and progressions, our algorithm addresses gaps caused by incomplete reporting. The established framework provides a scalable approach for integrating AI-driven insights into cancer research, improving registry-based outcome analyses, and supporting advancements in cancer epidemiology.
| Original language | English |
|---|---|
| Article number | 115604 |
| Journal | European Journal of Cancer |
| Volume | 227 |
| Pages (from-to) | 115604 |
| ISSN | 0959-8049 |
| DOIs | |
| Publication status | Published - 9 September 2025 |
Funding
This work was part of the AI-CARE project, funded by the German Ministry of Health (BMG), Grant ZMI5-2522DAT13J.
| Funder | Funder number |
|---|---|
| Bundesministerium für Gesundheit | ZMI5-2522DAT13J |
UN SDGs
This output contributes to the following Sustainable Development Goal(s):
- SDG 3 – Good Health and Well-being
- SDG 5 – Gender Equality
- SDG 10 – Reduced Inequalities
- SDG 12 – Responsible Consumption and Production
Strategic research areas and centres
- Profile area: Zentrum für Bevölkerungsmedizin und Versorgungsforschung (ZBV)
Fingerprint
Explore the research topics of "Leveraging machine-learning techniques to detect recurrences in cancer registry data: A multi-registry validation study using German lung cancer data". Together they form a unique fingerprint.