Overcoming data scarcity in biomedical imaging with a foundational multi-task model

Raphael Schäfer, Till Nicke, Henning Höfener, Annkristin Lange, Dorit Merhof, Friedrich Feuerhake, Volkmar Schulz, Johannes Lotz*, Fabian Kiessling*

*Corresponding author for this work

Abstract

Foundational models, pretrained on a large scale, have demonstrated substantial success across non-medical domains. However, training these models typically requires large, comprehensive datasets, which contrasts with the smaller and more specialized datasets common in biomedical imaging. Here we propose a multi-task learning strategy that decouples the number of training tasks from memory requirements. We trained a universal biomedical pretrained model (UMedPT) on a multi-task database including tomographic, microscopic and X-ray images, with various labeling strategies such as classification, segmentation and object detection. The UMedPT foundational model outperformed ImageNet pretraining and previous state-of-the-art models. For classification tasks related to the pretraining database, it maintained its performance with only 1% of the original training data and without fine-tuning. For out-of-domain tasks it required only 50% of the original training data. In an external independent validation, imaging features extracted using UMedPT proved to set a new standard for cross-center transferability.
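The abstract's central technical claim is that the multi-task training strategy decouples the number of training tasks from memory requirements. One common way to achieve this property (a hedged sketch only, not the authors' implementation; all names and the linear "model" here are illustrative) is to visit each task's batch sequentially and accumulate gradients into the shared weights, so that peak memory holds a single batch regardless of how many tasks are registered:

```python
# Sketch of memory-decoupled multi-task training: gradients are
# accumulated one task at a time, so memory does not grow with the
# number of tasks. The linear model and tasks are toy placeholders.

def task_gradient(shared_w, task_batch):
    """Mean-squared-error gradient of a linear model on one task's batch."""
    grad = [0.0] * len(shared_w)
    for x, y in task_batch:
        pred = sum(w * xi for w, xi in zip(shared_w, x))
        err = pred - y
        for i, xi in enumerate(x):
            grad[i] += err * xi / len(task_batch)
    return grad

def multitask_step(shared_w, tasks, lr=0.05):
    """One update over all tasks. Only one task's batch is processed at
    a time; its gradient is folded into the accumulator before the next
    task is loaded, keeping peak memory constant in the task count."""
    accum = [0.0] * len(shared_w)
    for batch in tasks.values():
        g = task_gradient(shared_w, batch)
        accum = [a + gi for a, gi in zip(accum, g)]
    return [w - lr * a for w, a in zip(shared_w, accum)]

# Toy usage: two "tasks" sharing the same weight vector.
w = [0.0, 0.0]
tasks = {
    "classify": [([1.0, 0.0], 1.0), ([0.0, 1.0], 0.0)],
    "detect": [([1.0, 1.0], 1.0)],
}
w = multitask_step(w, tasks)
```

In a deep-learning setting the same pattern applies with a shared encoder and lightweight per-task heads (classification, segmentation, detection, as in the pretraining database described above), calling backward per task so each task's computation graph is freed before the next one is built.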

Original language: English
Journal: Nature Computational Science
Volume: 4
Issue number: 7
Pages (from-to): 495-509
Number of pages: 15
DOIs
Publication status: Published - 07.2024

Funding

Funders | Funder number
Bundesministerium für Bildung und Forschung | 01IS21067C
Deutsche Forschungsgemeinschaft | CRC 1382, 403224013
