ICCK Transactions on Large Language Models, Volume 1, Issue 1, 2025: 4-8

Free to Read | Research Article | 17 November 2025
Towards Economical Long-Form Summarization: A Chunk-Based Approach Using LLMs
Avishto Banerjee 1,*
1 SAP Labs India Pvt. Ltd., Bengaluru 560066, India
* Corresponding Author: Avishto Banerjee, [email protected]
Received: 27 April 2025, Accepted: 25 July 2025, Published: 17 November 2025  
Abstract
Today, almost any literature-related task can be performed by LLMs: summarization, abstraction, translation, transformation, and more. However, it is not always possible to apply these operations to extremely large content. Even with the large token output limits of newly launched advanced LLMs, such operations are not always economically or technically feasible. To address this problem, this paper explores the summarization of extensive content through a chunk-based approach that is both efficient and economical. The approach also recognizes the risk of information loss during chunking and resolves that issue effectively. Such a framework is in high demand across enterprise software, healthcare, and financial industries for storing, summarizing, and querying large bodies of content that are otherwise challenging to maintain and query. To keep the framework generic, the approach relies mainly on zero-shot summarization.
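The chunk-based, zero-shot pipeline the abstract describes can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the `llm` argument is a placeholder for any prompt-to-completion callable, the character-based chunk size and overlap are assumed defaults chosen for illustration, and the overlap between consecutive chunks is one simple way to mitigate the boundary information loss the abstract mentions.

```python
def chunk_text(text, chunk_size=2000, overlap=200):
    """Split `text` into chunks of at most `chunk_size` characters.

    Consecutive chunks share `overlap` characters so that a sentence or
    fact straddling a chunk boundary appears whole in at least one chunk.
    """
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    chunks, start = [], 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks


def summarize(text, llm, chunk_size=2000, overlap=200):
    """Two-level, zero-shot summarization.

    Each chunk is summarized independently with a plain instruction
    prompt (no examples), then the partial summaries are merged and
    summarized once more into a single coherent summary.
    """
    chunks = chunk_text(text, chunk_size, overlap)
    partials = [llm(f"Summarize the following text:\n\n{c}") for c in chunks]
    if len(partials) == 1:
        return partials[0]
    merged = "\n".join(partials)
    return llm(
        "Combine these partial summaries into one coherent summary:\n\n"
        + merged
    )
```

Because only one chunk (plus, at the end, the much shorter set of partial summaries) is ever sent to the model at a time, the prompt stays within modest context limits regardless of input length, which is the source of the economy the abstract claims.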

Graphical Abstract
Towards Economical Long-Form Summarization: A Chunk-Based Approach Using LLMs

Keywords
LLMs
summarization
chunking
generative AI
NLP

Data Availability Statement
Data will be made available on request.

Funding
This work received no funding.

Conflicts of Interest
Avishto Banerjee is an employee of SAP Labs India Pvt. Ltd., Bengaluru 560066, India. The author declares no conflicts of interest. 

Ethical Approval and Consent to Participate
Not applicable.

References
  1. Soares, E. R., & Barrére, E. (2018, October). Automatic topic segmentation for video lectures using low and high-level audio features. In Proceedings of the 24th Brazilian Symposium on Multimedia and the Web (pp. 189-196).
  2. Alesh, Y., Aoudia, M., Abdulghani, O., Al Ali, O., & Abu Talib, M. (2024, July). Abstractive Summarization of Lectures and Lecture Segments Transcripts with BART. In International Conference on Artificial Intelligence in Education Technology (pp. 43-55). Singapore: Springer Nature Singapore.
  3. Parmar, M., Deilamsalehy, H., Dernoncourt, F., Yoon, S., Rossi, R. A., & Bui, T. (2024). Towards enhancing coherence in extractive summarization: Dataset and experiments with LLMs. arXiv preprint arXiv:2407.04855.
  4. Kotkar, A. D., Mahadik, R. S., More, P. G., & Thorat, S. A. (2024, August). Comparative analysis of transformer-based large language models (LLMs) for text summarization. In 2024 1st International Conference on Advanced Computing and Emerging Technologies (ACET) (pp. 1-7). IEEE.
  5. Wilson, E., Saxena, A., Mahajan, J., Panikulangara, L., Kulkarni, S., & Jain, P. (2024, March). FIN2SUM: advancing AI-driven financial text summarization with LLMs. In 2024 International Conference on Trends in Quantum Computing and Emerging Business Technologies (pp. 1-5). IEEE.
  6. Sojitra, D., Jain, R., Saha, S., Jatowt, A., & Gupta, M. (2024, July). Timeline summarization in the era of LLMs. In Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (pp. 2657-2661).
  7. Provakar, M. M. (2024, October). Evaluating the Text Summarization Efficiency of Large Language Models. In 2024 2nd International Conference on Information and Communication Technology (ICICT) (pp. 6-10). IEEE.
  8. Jiang, Z., Yang, J., & Rao, D. (2024, November). An Empirical Study of Leveraging PLMs and LLMs for Long-Text Summarization. In Pacific Rim International Conference on Artificial Intelligence (pp. 424-435). Singapore: Springer Nature Singapore.
  9. Sultana, F., Fuad, M. T. H., Fahim, M., Rahman, R. R., Hossain, M., Amin, M. A., ... & Ali, A. A. (2024, December). How Good are LM and LLMs in Bangla Newspaper Article Summarization? In International Conference on Pattern Recognition (pp. 72-86). Cham: Springer Nature Switzerland.
  10. VarastehNezhad, A., Tavasoli, R., Masumi, M., Majd, S. S., & Shamsfard, M. (2024, December). Evaluating LLMs in Persian News Summarization. In 2024 15th International Conference on Information and Knowledge Technology (IKT) (pp. 195-201). IEEE.
  11. Aljohani, A., Alharbi, R., Alkhaldi, A., & Aljedaani, W. (2025, February). Evaluating LLMs for Arabic Code Summarization: Challenges and Insights from GPT-4. In 2025 8th International Conference on Data Science and Machine Learning Applications (CDMA) (pp. 67-72). IEEE.
  12. Xiao, W., Liu, Y., Li, X., Gao, F., & Gu, J. (2024, December). TKG-RAG: A Retrieval-Augmented Generation Framework with Text-chunk Knowledge Graph. In 2024 25th International Arab Conference on Information Technology (ACIT) (pp. 1-9). IEEE.
  13. Thulke, D., Gao, Y., Jalota, R., Dugast, C., & Ney, H. (2024, November). Prompting and Fine-Tuning of Small LLMs for Length-Controllable Telephone Call Summarization. In 2024 2nd International Conference on Foundation and Large Language Models (FLLM) (pp. 305-312). IEEE.

Cite This Article
APA Style
Banerjee, A. (2025). Towards Economical Long-Form Summarization: A Chunk-Based Approach Using LLMs. ICCK Transactions on Large Language Models, 1(1), 4–8. https://doi.org/10.62762/TLLM.2025.674475

Article Metrics
Citations: Crossref: 0 | Scopus: 0 | Web of Science: 0
Article Access Statistics:
Views: 33
PDF Downloads: 14

Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions
Institute of Central Computation and Knowledge (ICCK) or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
ICCK Transactions on Large Language Models

ISSN: request pending (Online) | ISSN: request pending (Print)

Email: [email protected]

Portico

All published articles are preserved here permanently:
https://www.portico.org/publishers/icck/