YOLOv7-Bw: A Dense Small Object Efficient Detector Based on Remote Sensing Image

Xuebo Jin; Anshuo Tong; Xudong Ge; Huijun Ma; Jiaxi Li; Heran Fu; Longfei Gao

doi:10.62762/TIS.2024.137321

Volume 1, Issue 1, ICCK Transactions on Intelligent Systematics

Academic Editor

Jing Na

Kunming University of Science and Technology, China

ICCK Transactions on Intelligent Systematics, Volume 1, Issue 1, 2024: 30-39

Free to Read | Research Article | 27 May 2024

YOLOv7-Bw: A Dense Small Object Efficient Detector Based on Remote Sensing Image

Xuebo Jin 1

Anshuo Tong 1

Xudong Ge 1

Huijun Ma 2 *

Jiaxi Li 2

Heran Fu 2

Longfei Gao 2

1 School of Computer Science and Artificial Intelligence, Beijing Technology and Business University, Beijing 100048, China

2 National Engineering Laboratory for Agri-product Quality Traceability, BTBU, Beijing, China

* Corresponding Author: Huijun Ma

Received: 16 February 2024, Accepted: 19 May 2024, Published: 27 May 2024

Cited by: 10 (Source: Web of Science), 14 (Source: Google Scholar)

Abstract

In recent years, deep learning techniques have been increasingly applied to object detection in remote sensing images. However, the substantial size variation and dense distribution of objects in these images present significant challenges to detection algorithms. Current methods often suffer from low efficiency, missed detections, and inaccurate bounding boxes. To address these issues, this paper presents an improved YOLO algorithm, YOLOv7-bw, designed for efficient detection in remote sensing images, with the aim of advancing object detection applications in the remote sensing industry. YOLOv7-bw enhances the original SPPCSPC pooling pyramid network by incorporating a Bi-level Routing Attention module, which focuses on densely populated target areas to improve the network's feature extraction capabilities. Additionally, it replaces the original CIoU loss function with WIoUv3, a loss function with a dynamic non-monotonic focusing mechanism. This substitution aligns the loss function's gradient allocation strategy more closely with the detection scenario and strengthens the network's focus on the objects to be detected. In comparative experiments on the DIOR remote sensing image dataset, YOLOv7-bw achieved an mAP@0.5 of 85.63% and an mAP@0.5:0.95 of 65.93%, surpassing the previous results of 83.7% and 63.9% by 1.93 and 2.03 percentage points, respectively. Moreover, YOLOv7-bw outperformed commonly used algorithms, supporting the feasibility and applicability of the proposed algorithm for remote sensing image detection.
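
To make the loss substitution described in the abstract more concrete, the following is a minimal sketch of a WIoUv3-style loss for a single box pair. It follows the formulation in the Wise-IoU preprint cited in the references (Tong et al., 2023) as we understand it, and is not the authors' implementation. The function names, the (x1, y1, x2, y2) box format, and the default values of `alpha` and `delta` are assumptions made for illustration. In real training the enclosing-box term and the outlier degree are detached from the gradient, and `mean_iou_loss` is a running average over batches; here it is simply passed in as a number.

```python
import math


def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def focusing_gain(beta, alpha=1.9, delta=3.0):
    """Non-monotonic gain r = beta / (delta * alpha ** (beta - delta)).

    beta is the outlier degree of a box; alpha and delta are
    hyperparameters (the defaults here are assumed, not from this article).
    """
    return beta / (delta * alpha ** (beta - delta))


def wiou_v3(pred, target, mean_iou_loss, alpha=1.9, delta=3.0):
    """WIoUv3-style loss for one predicted box and one ground-truth box."""
    l_iou = 1.0 - iou(pred, target)

    # Centre points of both boxes.
    px, py = (pred[0] + pred[2]) / 2, (pred[1] + pred[3]) / 2
    tx, ty = (target[0] + target[2]) / 2, (target[1] + target[3]) / 2

    # Width and height of the smallest box enclosing both boxes.
    wg = max(pred[2], target[2]) - min(pred[0], target[0])
    hg = max(pred[3], target[3]) - min(pred[1], target[1])

    # Distance-based attention term (WIoUv1 part).
    r_wiou = math.exp(((px - tx) ** 2 + (py - ty) ** 2) / (wg ** 2 + hg ** 2))

    # Outlier degree: this box's IoU loss relative to the average IoU loss.
    beta = l_iou / mean_iou_loss

    return focusing_gain(beta, alpha, delta) * r_wiou * l_iou


if __name__ == "__main__":
    gt = (10.0, 10.0, 50.0, 50.0)
    for pred in [(12.0, 11.0, 51.0, 49.0), (20.0, 25.0, 70.0, 80.0)]:
        print(pred, round(wiou_v3(pred, gt, mean_iou_loss=0.4), 4))
```

Under these assumptions the gain peaks at a moderate outlier degree and falls off on both sides, so neither very well-fitted boxes nor very poorly fitted ones dominate the gradient. This is the "dynamic non-monotonic" behaviour the abstract refers to.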

Keywords

remote sensing image

YOLO

object detection

mAP

Data Availability Statement

Data will be made available on request.

Funding

This work received no funding.

Conflicts of Interest

The authors declare no conflicts of interest.

Ethical Approval and Consent to Participate

Not applicable.

Cite This Article

APA Style

Jin, X., Tong, A., Ge, X., Ma, H., Li, J., Fu, H., & Gao, L. (2024). YOLOv7-Bw: A Dense Small Object Efficient Detector Based on Remote Sensing Image. ICCK Transactions on Intelligent Systematics, 1(1), 30-39. https://doi.org/10.62762/TIS.2024.137321

Publisher's Note

ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions

Institute of Central Computation and Knowledge (ICCK) or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

ICCK Transactions on Intelligent Systematics

ISSN: 3068-5079 (Online) | ISSN: 3069-003X (Print)
