Software-Engineering Perspectives on Machine for Skin-Disease Classification
Abstract
Skin‑disease classification has evolved from simple image recognizers into software‑driven pipelines that demand reliability, reproducibility, and ethical governance. While most AI reviews focus on algorithmic accuracy, few examine these systems through a software‑engineering (SE) lens—essential for assessing pipeline modularity, version control, deployment readiness, and long‑term maintainability, all critical for clinical integration. This review surveys literature from 2015 to early 2025, curating about 180 papers that link skin‑disease classification with SE practices. It traces the shift from handcrafted feature‑based classifiers to end‑to‑end convolutional, ensemble, and transformer architectures, alongside the engineering processes that support versioning, deployment, and monitoring. Benchmark datasets (PH$^2$, HAM10000, ISIC, etc.) have established reproducible evaluation protocols that underpin software verification. Emerging directions—self‑supervised pretraining, multimodal fusion, human‑AI collaboration—signal a move from model‑centric to system‑level integration. The analysis highlights not only accuracy and generalization but also SE quality attributes: scalability, maintainability, explainability, and fairness, which are indispensable for trustworthy adoption in diverse clinical workflows.
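Keywords
software engineering, machine learning, deep learning, dermatology, computer-aided diagnosis, MLOps, fairness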
Cite This Article
TY  - JOUR
AU  - Nazir, Moomna
AU  - Ahsan, Azka
AU  - Khadim, Rabia
AU  - Abbas, Shakeel
AU  - Muhammad, Aown
AU  - Sohail, Zain
PY  - 2026
DA  - 2026/02/11
TI  - Software-Engineering Perspectives on Machine for Skin-Disease Classification
JO  - ICCK Journal of Software Engineering
T2  - ICCK Journal of Software Engineering
JF  - ICCK Journal of Software Engineering
VL  - 2
IS  - 1
SP  - 52
EP  - 70
DO  - 10.62762/JSE.2025.913699
UR  - https://www.icck.org/article/abs/JSE.2025.913699
KW  - software engineering
KW  - machine learning
KW  - deep learning
KW  - dermatology
KW  - computer-aided diagnosis
KW  - MLOps
KW  - fairness
SN  - 3069-1834
PB  - Institute of Central Computation and Knowledge
LA  - English
ER  -
@article{Nazir2026SoftwareEn,
author = {Moomna Nazir and Azka Ahsan and Rabia Khadim and Shakeel Abbas and Aown Muhammad and Zain Sohail},
title = {Software-Engineering Perspectives on Machine for Skin-Disease Classification},
journal = {ICCK Journal of Software Engineering},
year = {2026},
volume = {2},
number = {1},
pages = {52-70},
doi = {10.62762/JSE.2025.913699},
url = {https://www.icck.org/article/abs/JSE.2025.913699},
keywords = {software engineering, machine learning, deep learning, dermatology, computer-aided diagnosis, MLOps, fairness},
issn = {3069-1834},
publisher = {Institute of Central Computation and Knowledge}
}
Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and Permissions
Copyright © 2026 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.