ICCK Transactions on Intelligent Systematics | Volume 2, Issue 3: 160-168, 2025 | DOI: 10.62762/TIS.2025.405393
Abstract
Multimodal intelligent systems that integrate natural language processing with generative visual synthesis represent a frontier in intelligent information processing. This work addresses the design and evaluation of such a pipeline, using poetic content as a stress-test domain due to its high density of figurative language and abstract semantics. Building upon the PoemSum dataset, we construct a two-stage multimodal pipeline: first employing transformer-based models (BART and T5) for abstractive summarization, then leveraging Stable Diffusion for visual synthesis from the generated summaries. The summarization stage focuses on figurative interpretation that captures metaphorical and symbolic...
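
The two-stage structure described above (summarize the poem, then synthesize an image from the summary) can be sketched as a composition of two interchangeable stages. This is a minimal illustration, not the authors' implementation: `run_pipeline`, `PipelineResult`, `toy_summarizer`, and `toy_synthesizer` are hypothetical names, and the toy stages are placeholders standing in for a BART or T5 summarizer and a Stable Diffusion generator, which are not loaded here.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PipelineResult:
    """Keeps every intermediate artifact so each stage can be evaluated separately."""
    poem: str
    summary: str
    image: Any


def run_pipeline(
    poem: str,
    summarize: Callable[[str], str],
    synthesize: Callable[[str], Any],
) -> PipelineResult:
    """Stage 1 turns the poem into a summary; stage 2 uses that summary as the image prompt."""
    summary = summarize(poem)
    image = synthesize(summary)
    return PipelineResult(poem=poem, summary=summary, image=image)


def toy_summarizer(poem: str) -> str:
    """Placeholder for a BART/T5 summarizer (assumption): returns the first non-empty line."""
    return next((line.strip() for line in poem.splitlines() if line.strip()), "")


def toy_synthesizer(prompt: str) -> dict:
    """Placeholder for a Stable Diffusion call (assumption): records the prompt, renders nothing."""
    return {"prompt": prompt, "pixels": None}


if __name__ == "__main__":
    # Illustrative input only; not taken from the PoemSum dataset.
    poem = "\n  The moon is a silver coin\nspent by the night\n"
    result = run_pipeline(poem, toy_summarizer, toy_synthesizer)
    print(result.summary)
    print(result.image["prompt"])
```

Passing the stages in as callables keeps the summarizer and the image generator swappable, so a comparison such as BART versus T5 would only change the `summarize` argument.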