
Abstract

BACKGROUND: Cancer recurrence and progression, once seen as markers of poor prognosis, are now considered manageable aspects of long-term care. Advances in treatment have extended survival, emphasizing the need for representative epidemiological information. Population-based cancer registries are essential in this respect. However, tracking treatment outcomes and accurately distinguishing recurrences from progressions remain challenging due to incomplete follow-up data. To address this and enable meaningful analyses of cancer registry data, we employed machine learning (ML) for precise classification, surpassing traditional clinical assumptions.

METHODS: We developed an ML model to identify and classify cancer recurrence and progression using lung cancer (ICD-10: C34) data from the Hamburg Cancer Registry. To ensure interoperability, we created a standardized indicator dataset. The model's predictive performance was validated using data from five additional German cancer registries. After extensive evaluation, a histogram-based gradient-boosted decision tree ensemble was chosen for its high accuracy and adaptability.

RESULTS: The model demonstrated strong predictive performance, with areas under the curve (AUC) ranging from 0.74 to 0.99 across test datasets, highlighting its robustness and generalizability. Its classification accuracy was comparable to that of experienced human annotators, ensuring reliability for large-scale analysis.

CONCLUSION: This study highlights the potential of ML in enhancing cancer registry data interpretation. By reliably identifying recurrences and progressions, our algorithm addresses gaps caused by incomplete reporting. The established framework provides a scalable approach for integrating AI-driven insights into cancer research, improving registry-based outcome analyses, and supporting advancements in cancer epidemiology.
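
As an aid to reading the AUC values reported in the results, the following is a minimal sketch of how an area under the curve can be computed for a binary classifier. It assumes the reported AUC is the area under the ROC curve, which the abstract does not specify. The function name `roc_auc` and the toy labels and scores are hypothetical and are not taken from the study or its registries.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve for binary labels (1 = event, 0 = no event).

    Computed as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score, counting ties as one half.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical toy data: 1 = annotated recurrence or progression, 0 = neither.
y_true = [1, 0, 1, 0, 1, 0]
y_score = [0.9, 0.2, 0.7, 0.6, 0.4, 0.1]
auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, so the reported range of 0.74 to 0.99 can be read on that scale. The pairwise loop is quadratic in the number of cases and is meant for illustration only.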

Original language: English
Article number: 115604
Journal: European Journal of Cancer
Volume: 227
Pages (from - to): 115604
ISSN: 0959-8049
Publication status: Published - 09.09.2025

Funding

This work was part of the AI-CARE project, funded by the German Ministry of Health (BMG), Grant ZMI5-2522DAT13J.

Funder: Bundesministerium für Gesundheit
Funder number: ZMI5-2522DAT13J

    UN SDGs

    This output contributes to the following Sustainable Development Goal(s)

    1. SDG 3 – Good Health and Well-being
    2. SDG 5 – Gender Equality
    3. SDG 10 – Reduced Inequalities
    4. SDG 12 – Responsible Consumption and Production

    Strategic research areas and centers

    • Profile area: Zentrum für Bevölkerungsmedizin und Versorgungsforschung (ZBV)

    Fingerprint

    Explore the research topics of "Leveraging machine-learning techniques to detect recurrences in cancer registry data: A multi-registry validation study using German lung cancer data". Together they form a unique fingerprint.
