Volume 1, Issue 3, ICCK Journal of Image Analysis and Processing
ICCK Journal of Image Analysis and Processing, Volume 1, Issue 3, 2025: 107-124

Open Access | Research Article | 17 September 2025
Relaxed Bounding Boxes for Object Detection
1 Independent Scientist, 8400 Winterthur, Switzerland
* Corresponding Author: Daniel Aioanei, [email protected]
Received: 08 August 2025, Accepted: 07 September 2025, Published: 17 September 2025  
Abstract
The Generalized Intersection over Union (GIoU) and the Manhattan distance between axis-aligned boxes, represented either by their corner coordinates or by their center and size, are extended to accept a range of bounding boxes as ground truth, producing the metrics RIoU, $R_1$ and $R^t_1$, respectively. In the context of Table Detection, it is shown that this box relaxation procedure allows object detection models to be trained with partial or inexact annotations. For the Table Structure Recognition task, several code improvements to Microsoft's open-source Table Transformer increase all $\mathrm{GriTS}$ metrics on PubTables-1M, with the overall accuracy rising from 0.8326 to 0.8433. Box relaxation is then applied so that the object detection loss function takes advantage of the discretizing nature of the post-inference procedure that extracts the table cell matrix. This further reduces the error of the $\mathrm{GriTS}$ metrics $Acc_{Con}$, $GriTS_{Con}$, $GriTS_{Loc}$ and $GriTS_{Top}$ on the PubTables-1M tables without spanning cells by 1.8%, 13.2%, 10.6% and 14.9%, respectively.
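To make the idea of a "range of bounding boxes as ground truth" concrete, the following is a minimal sketch of one plausible reading of a relaxed Manhattan distance for corner-format boxes. It assumes that the range is described by an inner and an outer box, so that each corner coordinate may lie anywhere in an interval, and that the distance is measured to the nearest box in that range. The function names `range_from_inner_outer` and `relaxed_l1` are hypothetical, and the exact definitions of $R_1$ and $R^t_1$ in the article may differ.

```python
from typing import List, Sequence, Tuple

Box = Sequence[float]  # corner format: (x_min, y_min, x_max, y_max)


def range_from_inner_outer(inner: Box, outer: Box) -> Tuple[List[float], List[float]]:
    """Turn an inner and an outer box into per-coordinate intervals [lo, hi].

    Assumption for illustration: the inner box lies inside the outer box, so
    x_min and y_min range from the outer to the inner value, while x_max and
    y_max range from the inner to the outer value.
    """
    lo = [outer[0], outer[1], inner[2], inner[3]]
    hi = [inner[0], inner[1], outer[2], outer[3]]
    return lo, hi


def relaxed_l1(pred: Box, lo: Box, hi: Box) -> float:
    """Manhattan distance from `pred` to the nearest box in the range.

    Each coordinate is clamped into its interval; a coordinate that already
    lies inside its interval contributes zero to the distance.
    """
    total = 0.0
    for p, l, h in zip(pred, lo, hi):
        nearest = min(max(p, l), h)  # clamp p into [l, h]
        total += abs(p - nearest)
    return total


lo, hi = range_from_inner_outer(
    inner=(2.0, 2.0, 8.0, 8.0), outer=(0.0, 0.0, 10.0, 10.0)
)
inside = relaxed_l1((1.0, 1.0, 9.0, 9.0), lo, hi)    # every coordinate in range
outside = relaxed_l1((3.0, 1.0, 11.0, 9.0), lo, hi)  # x_min and x_max off by 1 each
```

Under these assumptions, a prediction anywhere between the inner and the outer box incurs no penalty, which is how such a loss could tolerate partial or inexact annotations. When the inner and outer boxes coincide, the intervals collapse to single points and the function reduces to the ordinary Manhattan distance.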

Keywords
object detection
table detection
table structure recognition
bounding box regression
loss function

Data Availability Statement
The source code used to generate the results reported here is available under an open-source license at https://github.com/aioaneid/table-transformer. No new training data were collected for this study.

Funding
This work received no funding.

Conflicts of Interest
The author declares no conflicts of interest.

Ethical Approval and Consent to Participate
Not applicable.

Cite This Article
APA Style
Aioanei, D. (2025). Relaxed Bounding Boxes for Object Detection. ICCK Journal of Image Analysis and Processing, 1(3), 107–124. https://doi.org/10.62762/JIAP.2025.507329

Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions
CC BY Copyright © 2025 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
ICCK Journal of Image Analysis and Processing

ISSN: 3068-6679 (Online)

Email: [email protected]

Portico

All published articles are preserved here permanently:
https://www.portico.org/publishers/icck/