TY - JOUR
T1 - GasHisSDB: A New Gastric Histopathology Image Dataset for Computer Aided Diagnosis of Gastric Cancer
AU - Hu, Weiming
AU - Li, Chen
AU - Li, Xiaoyan
AU - Rahaman, Md Mamunur
AU - Ma, Jiquan
AU - Zhang, Yong
AU - Chen, Haoyuan
AU - Liu, Wanli
AU - Sun, Changhao
AU - Yao, Yudong
AU - Sun, Hongzan
AU - Grzegorzek, Marcin
N1 - DOI: https://doi.org/10.1016/j.compbiomed.2021.105207
Publisher Copyright:
© 2022 Elsevier Ltd
PY - 2022/3
Y1 - 2022/3
N2 - Background and objective: Gastric cancer is the fifth most common cancer globally, and early detection is essential to save lives. Histopathological examination is the gold standard for diagnosing gastric cancer. However, computer-aided diagnostic techniques are difficult to evaluate because publicly available gastric histopathology image datasets are scarce. Methods: This paper publishes a novel, publicly available Gastric Histopathology Sub-size Image Database (GasHisSDB) for evaluating classifier performance. The database contains two classes of data, normal and abnormal, with a total of 245,196 tissue case images. To show that image classification methods from different periods perform differently on GasHisSDB, a variety of classifiers are evaluated: seven classical machine learning classifiers, three Convolutional Neural Network classifiers, and a novel transformer-based classifier. Results: Extensive experiments with traditional machine learning and deep learning methods show that methods from different periods perform differently on GasHisSDB. Among the traditional machine learning classifiers, the best accuracy was 86.08% and the lowest was 41.12%. Among the deep learning classifiers, the best accuracy was 96.47% and the lowest was 86.21%. Accuracy varies significantly across classifiers. Conclusions: To the best of our knowledge, GasHisSDB is the first publicly available gastric cancer histopathology dataset containing a large number of images for weakly supervised learning. We believe that GasHisSDB can attract researchers to explore new algorithms for the automated diagnosis of gastric cancer, which can help physicians and patients in the clinical setting.
UR - http://www.scopus.com/inward/record.url?scp=85122480310&partnerID=8YFLogxK
UR - https://www.mendeley.com/catalogue/f1041841-0379-3622-af02-785f992ffdb0/
U2 - 10.1016/j.compbiomed.2021.105207
DO - 10.1016/j.compbiomed.2021.105207
M3 - Journal articles
SN - 0010-4825
VL - 142
SP - 105207
JO - Computers in Biology and Medicine
JF - Computers in Biology and Medicine
M1 - 105207
ER -