A Recent Survey on Multi-modal Medical Image Fusion
Abstract
Fusion of multi-modal medical images has transformed healthcare by overcoming the limitations of single-modality imaging, where modalities such as CT, MRI, PET, and SPECT provide complementary information. This review systematically traces the evolution of multi-modal medical image fusion from conventional mathematical models to state-of-the-art artificial intelligence (AI) techniques. We examine the transition from classical approaches, such as multiscale transformations, wavelet decompositions, and sparse representation, to modern deep learning methods, including convolutional neural networks, generative adversarial networks, and transformer architectures. Key limitations of existing methods are highlighted, including limited interpretability, insufficient preservation of modality-specific details, poor cross-dataset generalization, and inadequate post-fusion refinement. The review categorizes fusion strategies into three hierarchical levels (pixel-level, feature-level, and decision-level) and analyzes their benefits and computational complexities. We provide a comparative evaluation of hybrid methods combining classical transform-based approaches with deep learning models, which show improved preservation of anatomical structures and functional information. Major clinical applications are discussed in oncology (tumor detection, staging), neurology (brain tumor localization, surgical planning), ophthalmology (disease characterization), and orthopedics (fracture detection). Challenges to clinical adoption, such as modality misalignment, information loss, high computational requirements, and lack of universal benchmarks, are examined alongside emerging concerns over data privacy and security in telehealth. We also analyze interpretability frameworks, modality-preservation techniques, cross-dataset generalization strategies, and post-fusion refinement methods. The review concludes with strategic recommendations to develop interpretable, real-time AI fusion models with robust clinical validation and standardized benchmarks to bridge the gap between theoretical advances and practical clinical integration, ultimately enhancing diagnostic accuracy and patient outcomes.
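To make the distinction between fusion levels concrete, the following is a minimal illustrative sketch and not a method taken from the review. It assumes two already co-registered grayscale images stored as nested lists of floats, and contrasts two simple pixel-level rules (weighted averaging and maximum-magnitude selection) with a toy decision-level rule that averages per-modality probabilities. All function names, the toy "CT" and "MRI" values, and the 0.5 threshold are hypothetical choices made for illustration; feature-level fusion is omitted because it requires a feature extractor.

```python
from typing import List

Image = List[List[float]]


def _check_same_shape(a: Image, b: Image) -> None:
    # Pixel-level fusion assumes the inputs are co-registered (same grid).
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same shape")


def fuse_pixel_average(a: Image, b: Image, weight_a: float = 0.5) -> Image:
    """Pixel-level fusion: weighted average of corresponding intensities."""
    _check_same_shape(a, b)
    return [
        [weight_a * pa + (1.0 - weight_a) * pb for pa, pb in zip(ra, rb)]
        for ra, rb in zip(a, b)
    ]


def fuse_pixel_max(a: Image, b: Image) -> Image:
    """Pixel-level fusion: keep the larger-magnitude intensity at each pixel."""
    _check_same_shape(a, b)
    return [
        [pa if abs(pa) >= abs(pb) else pb for pa, pb in zip(ra, rb)]
        for ra, rb in zip(a, b)
    ]


def fuse_decisions(prob_a: float, prob_b: float, threshold: float = 0.5) -> bool:
    """Decision-level fusion: average two per-modality scores, then threshold."""
    return (prob_a + prob_b) / 2.0 >= threshold


# Toy 2x2 inputs standing in for co-registered CT and MRI slices (made-up values).
ct = [[0.0, 0.8], [0.2, 1.0]]
mri = [[0.4, 0.2], [0.6, 0.0]]

averaged = fuse_pixel_average(ct, mri)
selected = fuse_pixel_max(ct, mri)
positive = fuse_decisions(0.9, 0.3)
```

Pixel-level rules operate on raw intensities and therefore depend on accurate registration, whereas the decision-level rule combines only the outputs of separate per-modality models.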
Cite This Article
TY  - JOUR
AU  - Barola, Vandit Akhilesh
AU  - Singh, Prabhishek
AU  - Diwakar, Manoj
PY  - 2025
DA  - 2025/11/07
TI  - A Recent Survey on Multi-modal Medical Image Fusion
JO  - Biomedical Informatics and Smart Healthcare
T2  - Biomedical Informatics and Smart Healthcare
JF  - Biomedical Informatics and Smart Healthcare
VL  - 1
IS  - 3
SP  - 89
EP  - 97
DO  - 10.62762/BISH.2025.414869
UR  - https://www.icck.org/article/abs/BISH.2025.414869
KW  - biomedical informatics
KW  - smart healthcare
KW  - artificial intelligence in medicine
KW  - precision medicine
KW  - digital health
SN  - 3068-5524
PB  - Institute of Central Computation and Knowledge
LA  - English
ER  - 
@article{Barola2025A,
author = {Vandit Akhilesh Barola and Prabhishek Singh and Manoj Diwakar},
title = {A Recent Survey on Multi-modal Medical Image Fusion},
journal = {Biomedical Informatics and Smart Healthcare},
year = {2025},
volume = {1},
number = {3},
pages = {89-97},
doi = {10.62762/BISH.2025.414869},
url = {https://www.icck.org/article/abs/BISH.2025.414869},
keywords = {biomedical informatics, smart healthcare, artificial intelligence in medicine, precision medicine, digital health},
issn = {3068-5524},
publisher = {Institute of Central Computation and Knowledge}
}
Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and Permissions
Copyright © 2025 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.