Road Crack Segmentation Algorithm Based on Multiple Attention Fusion and Defect Correction

Rendong Ji; Xiaojun Zhang; Xiu Tang; Xiaoyan Wang; Yunlong Xu; Jiaxin Shi

doi:10.62762/TIS.2025.325163

Article Information

Published in: ICCK Transactions on Intelligent Systematics

Volume/Issue: Volume 3, Issue 2, 2026

Pages: 126-144

Abstract

This paper proposes an enhanced U-Net-based segmentation framework for road crack detection that targets four recurring problems: incomplete segmentation, loss of detail, complex environments, and the imbalance between crack and background pixels. The model integrates several functional modules to improve segmentation across crack types and scales. An atrous residual convolution (ARC) module is embedded in the encoder to expand the receptive field and capture large-scale features. A multiple attention fusion module (MAFM), combined with an efficient channel attention mechanism, is introduced at the bridge stage to emphasize crack-relevant features. In the decoder, a defect correction module (DCM) with deep supervision and adaptive refinement restores fine-grained crack boundaries, especially for small or subtle defects. The proposed model achieves F1-scores of 78.11%, 90.02%, and 81.60% on the CFD, CRACK500, and HYCrack datasets, respectively. Compared with existing state-of-the-art segmentation models, the proposed approach achieves higher accuracy and better preserves crack detail. These results indicate its practical value and potential for application in road crack detection and infrastructure maintenance.
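The abstract reports results as F1-scores and names crack-pixel imbalance as a core difficulty. The sketch below shows why F1 is the more informative metric when crack pixels are rare. It uses the standard pixel-level definition of precision, recall, and F1; the paper's exact evaluation protocol (thresholding, any tolerance margin) is not stated on this page, so that is an assumption. The function names and the toy 4x4 masks are hypothetical and chosen only for illustration.

```python
def pixel_metrics(pred, truth):
    """Return (precision, recall, f1) for binary masks where 1 = crack pixel."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1          # crack pixel correctly detected
            elif p == 1 and t == 0:
                fp += 1          # background wrongly marked as crack
            elif p == 0 and t == 1:
                fn += 1          # crack pixel missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def pixel_accuracy(pred, truth):
    """Fraction of pixels (crack or background) labelled correctly."""
    total = correct = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            total += 1
            correct += int(p == t)
    return correct / total


# Toy ground truth: 3 crack pixels out of 16 (illustrative data only).
TRUTH = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
]
# A prediction that finds two crack pixels, misses one, and adds one false alarm.
PRED = [
    [0, 0, 0, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
# A degenerate prediction that labels every pixel as background.
EMPTY = [[0] * 4 for _ in range(4)]

if __name__ == "__main__":
    for name, mask in (("partial", PRED), ("all-background", EMPTY)):
        precision, recall, f1 = pixel_metrics(mask, TRUTH)
        acc = pixel_accuracy(mask, TRUTH)
        print(f"{name}: accuracy={acc:.4f} precision={precision:.4f} "
              f"recall={recall:.4f} f1={f1:.4f}")
```

On this toy input the all-background mask still reaches a pixel accuracy of 0.8125 while its F1 is 0, whereas the partial detection scores an F1 of about 0.67 at an accuracy of 0.875. This gap is the reason crack segmentation work typically reports F1 rather than raw accuracy.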

Keywords

atrous residual convolution, defect correction, multiple attention fusion, road crack, semantic segmentation

Data Availability Statement

Data will be made available on request.

Funding

This work was supported by the Postgraduate Research & Practice Innovation Program of Jiangsu Province, China under Grant SJCX25_2194.

Conflicts of Interest

The authors declare no conflicts of interest.

AI Use Statement

The authors declare that no generative AI was used in the preparation of this manuscript.

Ethical Approval and Consent to Participate

This study involved image collection on campus roads using a handheld device. No human subjects were recruited and no personal data were collected. All images were captured at ground level targeting pavement surfaces; any incidentally captured pedestrians or vehicles were anonymized through automatic blurring prior to dataset construction. Formal ethical approval was not required for this type of infrastructure monitoring study under the institutional guidelines of Huaiyin Institute of Technology.

Cite This Article

APA Style

Ji, R., Zhang, X., Tang, X., Wang, X., Xu, Y., & Shi, J. (2026). Road Crack Segmentation Algorithm Based on Multiple Attention Fusion and Defect Correction. ICCK Transactions on Intelligent Systematics, 3(2), 126-144. https://doi.org/10.62762/TIS.2025.325163

Export Citation

RIS Format

Compatible with EndNote, Zotero, Mendeley, and other reference managers

TY  - JOUR
AU  - Ji, Rendong
AU  - Zhang, Xiaojun
AU  - Tang, Xiu
AU  - Wang, Xiaoyan
AU  - Xu, Yunlong
AU  - Shi, Jiaxin
PY  - 2026
DA  - 2026/06/29
TI  - Road Crack Segmentation Algorithm Based on Multiple Attention Fusion and Defect Correction
JO  - ICCK Transactions on Intelligent Systematics
T2  - ICCK Transactions on Intelligent Systematics
JF  - ICCK Transactions on Intelligent Systematics
VL  - 3
IS  - 2
SP  - 126
EP  - 144
DO  - 10.62762/TIS.2025.325163
UR  - https://www.icck.org/article/abs/TIS.2025.325163
KW  - atrous residual convolution
KW  - defect correction
KW  - multiple attention fusion
KW  - road crack
KW  - semantic segmentation
SN  - 3068-5079
PB  - Institute of Central Computation and Knowledge
LA  - English
ER  -

BibTeX Format

Compatible with LaTeX, BibTeX, and other reference managers

@article{Ji2026Road,
  author = {Rendong Ji and Xiaojun Zhang and Xiu Tang and Xiaoyan Wang and Yunlong Xu and Jiaxin Shi},
  title = {Road Crack Segmentation Algorithm Based on Multiple Attention Fusion and Defect Correction},
  journal = {ICCK Transactions on Intelligent Systematics},
  year = {2026},
  volume = {3},
  number = {2},
  pages = {126--144},
  doi = {10.62762/TIS.2025.325163},
  url = {https://www.icck.org/article/abs/TIS.2025.325163},
  keywords = {atrous residual convolution, defect correction, multiple attention fusion, road crack, semantic segmentation},
  issn = {3068-5079},
  publisher = {Institute of Central Computation and Knowledge}
}

Publisher's Note

ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions

Institute of Central Computation and Knowledge (ICCK) or its licensor holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

ICCK Transactions on Intelligent Systematics

ISSN: 3068-5079 (Online) | ISSN: 3069-003X (Print)

Preserved at
Portico