Road Crack Segmentation Algorithm Based on Multiple Attention Fusion and Defect Correction
Abstract
This paper proposes an enhanced U-Net-based segmentation framework for road crack detection that addresses incomplete segmentation, loss of detail, complex environments, and the imbalance between crack and background pixels. The model integrates several functional modules to improve segmentation across crack types and scales. Specifically, an atrous residual convolution (ARC) module is embedded in the encoder to expand the receptive field and capture large-scale features. A multiple attention fusion module (MAFM), combined with an efficient channel attention mechanism, is introduced at the bridge stage to emphasize crack-relevant features. In the decoder, a defect correction module (DCM) with deep supervision and adaptive refinement restores fine-grained crack boundaries, especially for small or subtle defects. The proposed model achieves F1-scores of 78.11%, 90.02%, and 81.60% on the CFD, CRACK500, and HYCrack datasets, respectively. Compared with existing state-of-the-art segmentation models, the proposed approach achieves higher accuracy and better preserves crack detail. These results indicate its practical value and potential for application in road crack detection and infrastructure maintenance.
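The F1-scores quoted above are pixel-level metrics. As a minimal sketch of how such a score can be computed from a predicted binary mask and a ground-truth mask, the snippet below uses plain Python. The function name `pixel_f1` and the toy masks are hypothetical, and the strict pixel-by-pixel matching is an assumption: the abstract does not state the evaluation protocol, and some crack benchmarks allow a small spatial tolerance instead.

```python
from typing import Sequence

Mask = Sequence[Sequence[int]]  # 1 = crack pixel, 0 = background


def pixel_f1(pred: Mask, truth: Mask) -> float:
    """Pixel-level F1 for two binary masks of the same shape."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1  # crack pixel correctly predicted
            elif p == 1 and t == 0:
                fp += 1  # background predicted as crack
            elif p == 0 and t == 1:
                fn += 1  # crack pixel missed
    # Convention assumed here: two empty masks count as a perfect match.
    if tp + fp + fn == 0:
        return 1.0
    # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
    return 2 * tp / (2 * tp + fp + fn)


# Toy 4x4 example (hypothetical data): 4 true crack pixels, 3 of them found.
truth = [
    [0, 0, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
pred = [
    [0, 0, 1, 0],
    [0, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 1, 0, 0],
]

print(pixel_f1(pred, truth))
```

Because true negatives do not enter the formula, the score is not inflated by the large background area, which is why F1 is a common choice when crack pixels are a small fraction of the image.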
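Keywords
atrous residual convolution, defect correction, multiple attention fusion, road crack, semantic segmentation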
Cite This Article
TY  - JOUR
AU  - Ji, Rendong
AU  - Zhang, Xiaojun
AU  - Tang, Xiu
AU  - Wang, Xiaoyan
AU  - Xu, Yunlong
AU  - Shi, Jiaxin
PY  - 2026
DA  - 2026/06/29
TI  - Road Crack Segmentation Algorithm Based on Multiple Attention Fusion and Defect Correction
JO  - ICCK Transactions on Intelligent Systematics
T2  - ICCK Transactions on Intelligent Systematics
JF  - ICCK Transactions on Intelligent Systematics
VL  - 3
IS  - 2
SP  - 126
EP  - 144
DO  - 10.62762/TIS.2025.325163
UR  - https://www.icck.org/article/abs/TIS.2025.325163
KW  - atrous residual convolution
KW  - defect correction
KW  - multiple attention fusion
KW  - road crack
KW  - semantic segmentation
AB  - This paper proposes an enhanced U-Net-based segmentation framework for road crack detection that effectively addresses issues such as incomplete segmentation, detail loss, environmental complexity, and crack-pixel imbalance. The model integrates multiple functional modules to improve segmentation performance across varying crack types and scales. Specifically, an atrous residual convolution (ARC) module is embedded in the encoder to expand the receptive field and capture large-scale features. A multiple attention fusion module (MAFM), combined with an efficient channel attention mechanism, is introduced at the bridge stage to emphasize crack-relevant features. In the decoder, a defect correction module (DCM) with deep supervision and adaptive refinement is designed to restore fine-grained crack boundaries, especially for small or subtle defects. The proposed model achieves F1-scores of 78.11%, 90.02%, and 81.60% on CFD dataset, CRACK500 dataset, and HYCrack dataset, respectively. Compared to existing state-of-the-art segmentation models, the proposed approach achieves superior accuracy and better preservation of crack detail. These results demonstrate its practical value and strong potential for widespread application in road crack detection and infrastructure maintenance.
SN  - 3068-5079
PB  - Institute of Central Computation and Knowledge
LA  - English
ER  - 
@article{Ji2026Road,
author = {Rendong Ji and Xiaojun Zhang and Xiu Tang and Xiaoyan Wang and Yunlong Xu and Jiaxin Shi},
title = {Road Crack Segmentation Algorithm Based on Multiple Attention Fusion and Defect Correction},
journal = {ICCK Transactions on Intelligent Systematics},
year = {2026},
volume = {3},
number = {2},
pages = {126-144},
doi = {10.62762/TIS.2025.325163},
url = {https://www.icck.org/article/abs/TIS.2025.325163},
abstract = {This paper proposes an enhanced U-Net-based segmentation framework for road crack detection that effectively addresses issues such as incomplete segmentation, detail loss, environmental complexity, and crack-pixel imbalance. The model integrates multiple functional modules to improve segmentation performance across varying crack types and scales. Specifically, an atrous residual convolution (ARC) module is embedded in the encoder to expand the receptive field and capture large-scale features. A multiple attention fusion module (MAFM), combined with an efficient channel attention mechanism, is introduced at the bridge stage to emphasize crack-relevant features. In the decoder, a defect correction module (DCM) with deep supervision and adaptive refinement is designed to restore fine-grained crack boundaries, especially for small or subtle defects. The proposed model achieves F1-scores of 78.11\%, 90.02\%, and 81.60\% on CFD dataset, CRACK500 dataset, and HYCrack dataset, respectively. Compared to existing state-of-the-art segmentation models, the proposed approach achieves superior accuracy and better preservation of crack detail. These results demonstrate its practical value and strong potential for widespread application in road crack detection and infrastructure maintenance.},
keywords = {atrous residual convolution, defect correction, multiple attention fusion, road crack, semantic segmentation},
issn = {3068-5079},
publisher = {Institute of Central Computation and Knowledge}
}
Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.