Research and Application of a Kidney Disease Binary Classification Framework Based on Convolutional and Vision Transformer Architectures | International Journal of Innovative Science and Research Technology

Authors : Sajid Ali; Zhang Yihong; Sajad Ul Haq; Ameer Hamza; Md. Saifur Rahman; Nabeel Hussain; Manzar Hussain; Amjad Ali; Irfan Ali

Volume/Issue : Volume 11 - 2026, Issue 5 - May

Google Scholar : https://tinyurl.com/yjfr5dvy

Scribd : https://tinyurl.com/ypr9pcah

DOI : https://doi.org/10.38124/ijisrt/26May1422

Abstract : Early and reliable identification of kidney abnormalities from computed tomography (CT) images is important for supporting clinical decision-making and reducing radiological workload. This study presents a comparative evaluation of transfer-learning-based convolutional neural networks (CNNs) and custom deep learning architectures for CT-based kidney abnormality classification. The publicly accessible CT-Kidney dataset reported by Islam et al. was used; it consists of 12,446 CT images collected from multiple hospitals in Dhaka, Bangladesh. The original dataset contains 5,077 normal kidney images, 3,709 cyst images, 2,283 tumor images, and 1,377 stone images. For this work, the multiclass labels were reorganized into a binary classification task (normal versus abnormal), in which cyst, tumor, and stone samples were treated as abnormal, giving 5,077 normal and 7,369 abnormal images. Data were divided into training, validation, and testing subsets using a stratified 60:25:15 split to preserve the class distribution. Six models were evaluated: VGG16, ResNet50, InceptionV3, InceptionResNetV2, a custom CNN based on ResNet152, and a custom Vision Transformer (ViT). Standard performance metrics, including accuracy, precision, recall, F1-score, root mean square error (RMSE), AUC-ROC, and AUC-PR, were used for assessment.
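
The relabeling and the stratified 60:25:15 split described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the helper names `to_binary` and `stratified_split`, the random seed, the rounding rule, and the choice to stratify on the binary label rather than on the original four classes are all assumptions made for illustration. Only the class counts and the split ratios come from the abstract.

```python
import random
from collections import Counter

# Class counts as stated in the abstract (12,446 images in total).
CLASS_COUNTS = {"normal": 5077, "cyst": 3709, "tumor": 2283, "stone": 1377}


def to_binary(label):
    """Map a four-class label to 0 (normal) or 1 (abnormal)."""
    return 0 if label == "normal" else 1


def stratified_split(labels, fractions=(0.60, 0.25, 0.15), seed=0):
    """Split indices into train/val/test, one class at a time.

    Hypothetical helper: the rounding rule and seed are assumptions.
    The test subset receives whatever remains after train and val.
    """
    rng = random.Random(seed)
    by_class = {}
    for idx, y in enumerate(labels):
        by_class.setdefault(y, []).append(idx)

    train, val, test = [], [], []
    for y in sorted(by_class):
        idxs = by_class[y]
        rng.shuffle(idxs)
        n_train = round(fractions[0] * len(idxs))
        n_val = round(fractions[1] * len(idxs))
        train += idxs[:n_train]
        val += idxs[n_train:n_train + n_val]
        test += idxs[n_train + n_val:]
    return train, val, test


# Stand-in label list: one binary label per image, no pixel data involved.
labels = [to_binary(name) for name, n in CLASS_COUNTS.items() for _ in range(n)]
train, val, test = stratified_split(labels)

for name, part in (("train", train), ("val", val), ("test", test)):
    counts = Counter(labels[i] for i in part)
    share = counts[1] / len(part)
    print(f"{name}: n={len(part)} normal={counts[0]} abnormal={counts[1]} "
          f"abnormal share={share:.3f}")
```

Under these assumptions each subset keeps roughly the same abnormal share as the full dataset (7,369 of 12,446, about 59%), which is the property a stratified split is meant to preserve.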

Keywords : Computed Tomography, Convolutional Neural Network, Deep Learning, Kidney Disease Classification, Medical Image Analysis, Transfer Learning, Vision Transformer.
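
The evaluation metrics named in the abstract can be computed from true labels and predicted probabilities, as in the sketch below. This is an illustration rather than the paper's evaluation code. It assumes a 0.5 decision threshold, takes RMSE between the predicted probability and the 0/1 label (the abstract does not say how RMSE was defined), and uses average precision as one common estimate of AUC-PR. The function names and the toy data are hypothetical.

```python
import math


def binary_metrics(y_true, y_prob, threshold=0.5):
    """Threshold-based metrics plus RMSE on the raw probabilities."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    # Assumption: RMSE is taken between probability and 0/1 label.
    rmse = math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)
    )
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
        "rmse": rmse,
    }


def auc_roc(y_true, y_prob):
    """Probability that a random positive outscores a random negative."""
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def average_precision(y_true, y_prob):
    """Mean of precision at each positive, ranked by descending score."""
    ranked = sorted(zip(y_prob, y_true), key=lambda pair: -pair[0])
    hits, total = 0, 0.0
    for rank, (_, t) in enumerate(ranked, start=1):
        if t == 1:
            hits += 1
            total += hits / rank
    return total / hits


# Toy data for illustration only: 1 = abnormal, 0 = normal.
y_true = [0, 0, 1, 1, 1, 0, 1, 0]
y_prob = [0.1, 0.4, 0.35, 0.8, 0.9, 0.6, 0.7, 0.2]

metrics = binary_metrics(y_true, y_prob)
roc = auc_roc(y_true, y_prob)
ap = average_precision(y_true, y_prob)
print(metrics, roc, ap)
```

Accuracy, precision, recall, and F1 depend on the chosen threshold, whereas AUC-ROC and the average-precision estimate of AUC-PR depend only on how the scores rank the samples.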

References :

M. N. Islam, M. Hasan, M. K. Hossain, M. G. R. Alam, M. Z. Uddin, and A. Soylu, "Vision transformer and explainable transfer learning models for auto detection of kidney cyst, stone and tumor from CT-radiography," Scientific Reports, vol. 12, no. 1, art. 11440, Jul. 2022.
R. Kaur, M. Juneja, and A. K. Mandal, "Computer-aided diagnosis of renal lesions in CT images: A comprehensive survey and future prospects," Computers & Electrical Engineering, vol. 77, pp. 423–434, Jul. 2019.
M. Zhang, Z. Ye, E. Yuan, X. Lv, Y. Zhang, Y. Tan, C. Xia, J. Tang, J. Huang, and Z. Li, "Imaging-based deep learning in kidney diseases: Recent progress and future prospects," Insights into Imaging, vol. 15, no. 1, art. 50, Feb. 2024.
K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," in Proc. Int. Conf. Learn. Representations (ICLR), San Diego, CA, USA, May 2015.
K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Las Vegas, NV, USA, Jun. 2016, pp. 770–778.
C. Szegedy, V. Vanhoucke, S. Ioffe, J. Shlens, and Z. Wojna, "Rethinking the inception architecture for computer vision," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Las Vegas, NV, USA, Jun. 2016, pp. 2818–2826.
C. Szegedy, S. Ioffe, V. Vanhoucke, and A. Alemi, "Inception-v4, Inception-ResNet and the impact of residual connections on learning," in Proc. AAAI Conf. Artif. Intell., San Francisco, CA, USA, Feb. 2017, vol. 31, no. 1, pp. 4278–4284.
A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly, J. Uszkoreit, and N. Houlsby, "An image is worth 16×16 words: Transformers for image recognition at scale," in Proc. Int. Conf. Learn. Representations (ICLR), virtual, May 2021.
D. P. Kingma and J. Ba, "Adam: A method for stochastic optimization," in Proc. Int. Conf. Learn. Representations (ICLR), San Diego, CA, USA, May 2015.
C. Shorten and T. M. Khoshghoftaar, "A survey on image data augmentation for deep learning," Journal of Big Data, vol. 6, no. 1, art. 60, Jul. 2019.
D. M. W. Powers, "Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation," Journal of Machine Learning Technologies, vol. 2, no. 1, pp. 37–63, 2011.
O. Ronneberger, P. Fischer, and T. Brox, "U-Net: Convolutional networks for biomedical image segmentation," in Proc. Med. Image Comput. Comput.-Assist. Intervent. (MICCAI), Munich, Germany, Oct. 2015, pp. 234–241.
G. Litjens, T. Kooi, B. E. Bejnordi, A. A. A. Setio, F. Ciompi, M. Ghafoorian, J. A. W. M. van der Laak, B. van Ginneken, and C. I. Sánchez, "A survey on deep learning in medical image analysis," Medical Image Analysis, vol. 42, pp. 60–88, Dec. 2017.
J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei, "ImageNet: A large-scale hierarchical image database," in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), Miami, FL, USA, Jun. 2009, pp. 248–255.
R. R. Selvaraju, M. Cogswell, A. Das, R. Vedantam, D. Parikh, and D. Batra, "Grad-CAM: Visual explanations from deep networks via gradient-based localization," in Proc. IEEE Int. Conf. Comput. Vis. (ICCV), Venice, Italy, Oct. 2017, pp. 618–626.