The distributed nature of our digital healthcare and the rapid emergence of new data sources prevent a compelling overview and the joint use of new data. Data integration, e.g., with metadata and semantic annotations, is expected to overcome this challenge. In this paper, we present an approach to predict UMLS codes for given German metadata using recurrent neural networks. Augmenting the training dataset with the Medical Subject Headings (MeSH), particularly the German translations, improved the model accuracy. The model reaches 75% accuracy, and we aim to show that increasingly sophisticated machine learning tools can already play a significant role in data integration.
Research Areas and Centers
- Centers: Center for Artificial Intelligence Luebeck (ZKIL)