GasHisSDB: A New Gastric Histopathology Image Dataset for Computer Aided Diagnosis of Gastric Cancer

Weiming Hu, Chen Li, Xiaoyan Li, Md Mamunur Rahaman, Jiquan Ma, Yong Zhang, Haoyuan Chen, Wanli Liu, Changhao Sun, Yudong Yao, Hongzan Sun, Marcin Grzegorzek

Abstract

Background and objective: Gastric cancer is the fifth most common cancer globally, and early detection of gastric cancer is essential to save lives. Histopathological examination of gastric cancer is the gold standard for the diagnosis of gastric cancer. However, computer-aided diagnostic techniques are challenging to evaluate due to the scarcity of publicly available gastric histopathology image datasets. Methods: In this paper, a noble publicly available Gastric Histopathology Sub-size Image Database (GasHisSDB) is published to identify classifiers’ performance. Specifically, two types of data are included: normal and abnormal, with a total of 245,196 tissue case images. In order to prove that the methods of different periods in the field of image classification have discrepancies on GasHisSDB, we select a variety of classifiers for evaluation. Seven classical machine learning classifiers, three Convolutional Neural Network classifiers, and a novel transformer-based classifier are selected for testing on image classification tasks. Results: This study performed extensive experiments using traditional machine learning and deep learning methods to prove that the methods of different periods have discrepancies on GasHisSDB. Traditional machine learning achieved the best accuracy rate of 86.08% and a minimum of just 41.12%. The best accuracy of deep learning reached 96.47% and the lowest was 86.21%. Accuracy rates vary significantly across classifiers. Conclusions: To the best of our knowledge, it is the first publicly available gastric cancer histopathology dataset containing a large number of images for weakly supervised learning. We believe that GasHisSDB can attract researchers to explore new algorithms for the automated diagnosis of gastric cancer, which can help physicians and patients in the clinical setting.

Original languageEnglish
Article number105207
JournalComputers in Biology and Medicine
Volume142
Pages (from-to)105207
Number of pages1
ISSN0010-4825
DOIs
Publication statusPublished - 03.2022

Fingerprint

Dive into the research topics of 'GasHisSDB: A New Gastric Histopathology Image Dataset for Computer Aided Diagnosis of Gastric Cancer'. Together they form a unique fingerprint.

Cite this