RFS-codec: A Novel Encoding Approach to Store Image Data in DNA
Abstract
DNA data storage is a promising technology that draws on computer simulation and offers high-density, durable storage of digital information. Storing massive image data in a small amount of DNA without losing the original data is challenging, because nonspecific hybridization errors occur frequently and severely affect the durability of the stored data. This work proposes a novel approach, RFS-codec, comprising an image fraction strategy that splits image data and an innovative codec method that encodes the resulting fractions into DNA. The fraction strategy delivers a cost-effective solution for image storage in DNA. The codec method provides an encryption mechanism that converts binary data into DNA bases while avoiding hybridization errors and satisfying the critical bio-coding constraints responsible for DNA storage durability. The robustness of RFS-codec is evaluated against GC-content and homopolymer constraints. In experiments, different image data are encoded and decoded successfully with an average density of 1.8 bit/nt. The results demonstrate substantial advantages of RFS-codec for constructing cost-effective, scalable, and durable DNA data storage.
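To make the terms in the abstract concrete, the following minimal Python sketch shows what checking GC-content and homopolymer constraints and computing a bit/nt density can look like. It is not the RFS-codec method: the fixed two-bit mapping in `BITS_TO_BASE`, the 40-60% GC window, the maximum run length of 3, and all function names are assumptions chosen only for illustration.

```python
from itertools import groupby

# Hypothetical fixed mapping, two bits per base. Illustration only;
# this is NOT the encoding rule used by RFS-codec.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(bits: str) -> str:
    """Map an even-length bit string to DNA bases, two bits per base."""
    if len(bits) % 2:
        raise ValueError("bit string length must be even")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(strand: str) -> str:
    """Invert encode(): map each base back to its two bits."""
    return "".join(BASE_TO_BITS[base] for base in strand)


def gc_content(strand: str) -> float:
    """Fraction of bases in a non-empty strand that are G or C."""
    return sum(base in "GC" for base in strand) / len(strand)


def max_homopolymer(strand: str) -> int:
    """Length of the longest run of one repeated base in a non-empty strand."""
    return max(len(list(run)) for _, run in groupby(strand))


def satisfies_constraints(strand: str, gc_low: float = 0.4,
                          gc_high: float = 0.6, max_run: int = 3) -> bool:
    """Check the assumed GC window and the assumed homopolymer limit."""
    return (gc_low <= gc_content(strand) <= gc_high
            and max_homopolymer(strand) <= max_run)


def density(num_bits: int, num_nt: int) -> float:
    """Information density in bits per nucleotide."""
    return num_bits / num_nt


payload = "0001101100011011"
strand = encode(payload)
print(strand, gc_content(strand), max_homopolymer(strand))
print(satisfies_constraints(strand), density(len(payload), len(strand)))
```

An unconstrained two-bit mapping reaches the theoretical maximum of 2 bit/nt but cannot guarantee either constraint: an all-zero payload becomes a run of `A`, with zero GC content. A constraint-aware codec generally gives up some density to rule such strands out, which is consistent with the 1.8 bit/nt average reported in the abstract.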
Keywords
DNA data storage, image fraction, codec approach, bio-coding constraints
Cite This Article
@article{Rasool2025RFScodec,
author = {Abdur Rasool},
title = {RFS-codec: A Novel Encoding Approach to Store Image Data in DNA},
journal = {Journal of Artificial Intelligence in Bioinformatics},
year = {2025},
volume = {1},
number = {1},
pages = {41-50},
doi = {10.62762/JAIB.2025.146324},
url = {https://www.icck.org/article/abs/JAIB.2025.146324},
keywords = {DNA data storage, image fraction, codec approach, bio-coding constraints},
issn = {3068-7535},
publisher = {Institute of Central Computation and Knowledge}
}
Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and Permissions
Copyright © 2025 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.