ICCK Transactions on Machine Intelligence, Volume 2, Issue 2, 2026: 77-87

Free to Read | Research Article | 09 February 2026
Detection of Newspaper Layouts Using YOLO12
1 Department of Computer Science, R.G.M. Government College, Joginder Nagar 175015, India
2 Department of Computer Science, Punjabi University, Patiala 147002, India
* Corresponding Author: Atul Kumar
ARK: ark:/57805/tmi.2025.846033
Received: 10 November 2025, Accepted: 31 December 2025, Published: 09 February 2026  
Abstract
This study presents a robust and scalable method for automatic layout detection in digitized newspapers to facilitate efficient knowledge extraction and information retrieval. A custom dataset comprising annotated newspaper images in English, Hindi, and other languages was developed, with layout regions categorized into five primary classes. An enhanced YOLOv12 object detection model was trained on this dataset and evaluated using the mean Average Precision (mAP) metric across various Intersection over Union (IoU) thresholds. The model achieved a mAP@50 of 0.88, demonstrating strong detection performance and outperforming several state-of-the-art object detection models in the same task. The findings validate the effectiveness of the proposed approach in handling multilingual, structurally diverse newspaper formats. This research provides a practical framework for integrating automated layout analysis into digital archiving systems, OCR pipelines, and media monitoring applications. It also supports broader efforts to digitize historical print media and improve accessibility to regional content, thereby enabling enhanced research, journalism, and public engagement.
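
The IoU criterion behind mAP@50 can be illustrated with a minimal sketch. This is not the authors' evaluation code: the box format (x_min, y_min, x_max, y_max), the integer class ids, and the function names `iou` and `is_true_positive` are assumptions made for illustration only. A predicted layout region counts as a correct detection at a given threshold when its class matches the ground truth and its IoU with the ground-truth box reaches that threshold.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Boxes are (x_min, y_min, x_max, y_max); this format is an assumption.
    """
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    # Clamp to zero when the boxes do not overlap.
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(pred_box, pred_cls, gt_box, gt_cls, threshold=0.5):
    """A prediction matches when the class agrees and IoU >= threshold."""
    return pred_cls == gt_cls and iou(pred_box, gt_box) >= threshold


# Hypothetical boxes: a ground-truth region and a slightly shifted prediction.
gt_box = (0, 0, 100, 100)
pred_box = (10, 10, 110, 110)
overlap = iou(pred_box, gt_box)

print(round(overlap, 3))
print(is_true_positive(pred_box, 0, gt_box, 0, threshold=0.5))
print(is_true_positive(pred_box, 0, gt_box, 0, threshold=0.75))
```

In this example the same prediction is accepted at the 0.5 threshold and rejected at 0.75, which is why mAP values are reported together with the IoU threshold at which they were computed.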

Keywords
newspapers
YOLO
segmentation
layout analysis

Data Availability Statement
Data will be made available on request.

Funding
This work received no funding.

Conflicts of Interest
The authors declare no conflicts of interest.

AI Use Statement
The authors declare that no generative AI was used in the preparation of this manuscript.

Ethical Approval and Consent to Participate
Not applicable.

Cite This Article
APA Style
Kumar, A., & Lehal, G. S. (2026). Detection of Newspaper Layouts Using YOLO12. ICCK Transactions on Machine Intelligence, 2(2), 77–87. https://doi.org/10.62762/TMI.2025.846033
Export Citation
RIS Format
TY  - JOUR
AU  - Kumar, Atul
AU  - Lehal, Gurpreet Singh
PY  - 2026
DA  - 2026/02/09
TI  - Detection of Newspaper Layouts Using YOLO12
JO  - ICCK Transactions on Machine Intelligence
T2  - ICCK Transactions on Machine Intelligence
JF  - ICCK Transactions on Machine Intelligence
VL  - 2
IS  - 2
SP  - 77
EP  - 87
DO  - 10.62762/TMI.2025.846033
UR  - https://www.icck.org/article/abs/TMI.2025.846033
KW  - newspapers
KW  - YOLO
KW  - segmentation
KW  - layout analysis
AB  - This study presents a robust and scalable method for automatic layout detection in digitized newspapers to facilitate efficient knowledge extraction and information retrieval. A custom dataset comprising annotated newspaper images in English, Hindi, and other languages was developed, with layout regions categorized into five primary classes. An enhanced YOLOv12 object detection model was trained on this dataset and evaluated using the mean Average Precision (mAP) metric across various Intersection over Union (IoU) thresholds. The model achieved a mAP@50 of 0.88, demonstrating strong detection performance and outperforming several state-of-the-art object detection models in the same task. The findings validate the effectiveness of the proposed approach in handling multilingual, structurally diverse newspaper formats. This research provides a practical framework for integrating automated layout analysis into digital archiving systems, OCR pipelines, and media monitoring applications. It also supports broader efforts to digitize historical print media and improve accessibility to regional content, thereby enabling enhanced research, journalism, and public engagement.
SN  - 3068-7403
PB  - Institute of Central Computation and Knowledge
LA  - English
ER  - 
BibTeX Format
@article{Kumar2026Detection,
  author = {Atul Kumar and Gurpreet Singh Lehal},
  title = {Detection of Newspaper Layouts Using YOLO12},
  journal = {ICCK Transactions on Machine Intelligence},
  year = {2026},
  volume = {2},
  number = {2},
  pages = {77--87},
  doi = {10.62762/TMI.2025.846033},
  url = {https://www.icck.org/article/abs/TMI.2025.846033},
  abstract = {This study presents a robust and scalable method for automatic layout detection in digitized newspapers to facilitate efficient knowledge extraction and information retrieval. A custom dataset comprising annotated newspaper images in English, Hindi, and other languages was developed, with layout regions categorized into five primary classes. An enhanced YOLOv12 object detection model was trained on this dataset and evaluated using the mean Average Precision (mAP) metric across various Intersection over Union (IoU) thresholds. The model achieved a mAP@50 of 0.88, demonstrating strong detection performance and outperforming several state-of-the-art object detection models in the same task. The findings validate the effectiveness of the proposed approach in handling multilingual, structurally diverse newspaper formats. This research provides a practical framework for integrating automated layout analysis into digital archiving systems, OCR pipelines, and media monitoring applications. It also supports broader efforts to digitize historical print media and improve accessibility to regional content, thereby enabling enhanced research, journalism, and public engagement.},
  keywords = {newspapers, YOLO, segmentation, layout analysis},
  issn = {3068-7403},
  publisher = {Institute of Central Computation and Knowledge}
}

Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions
Institute of Central Computation and Knowledge (ICCK) or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

ISSN: 3068-7403 (Online)

Portico

All published articles are preserved here permanently:
https://www.portico.org/publishers/icck/