RFS-codec: A Novel Encoding Approach to Store Image Data in DNA
Research Article  ·  Published: 30 June 2025
Journal of Artificial Intelligence in Bioinformatics
Volume 1, Issue 1, 2025: 41-50
Research Article Open Access

1 Department of Information and Computer Sciences, University of Hawaii at Manoa, Honolulu, HI 96822, United States
* Corresponding Author: Abdur Rasool, [email protected]

Abstract

DNA data storage is a promising technology that utilizes computer simulation and offers high-density, durable storage of digital information. Storing massive image data in a small amount of DNA without losing the original data is challenging, because nonspecific hybridization errors occur frequently and severely affect the durability of the stored data. This work proposes a novel approach, RFS-codec, comprising an image fraction strategy that splits image data and an innovative codec method that encodes it into DNA. The fraction strategy delivers a cost-effective solution for image storage in DNA. The codec method offers an encryption mechanism that converts binary data into DNA bases while avoiding hybridization errors and satisfying the critical bio-coding constraints responsible for DNA storage durability. The robustness of RFS-codec is evaluated against GC-content and homopolymer constraints. Experimentally, different image data are encoded and decoded successfully at an average density of 1.8 bits/nt. The results demonstrate substantial advantages of RFS-codec in constructing cost-effective, scalable, and durable DNA data storage.
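
The quantities named in the abstract (GC content, homopolymer run length, and density in bits per nucleotide) can be illustrated with a minimal Python sketch. It uses a naive fixed 2-bit-per-base mapping that is assumed here purely for illustration; it is not the RFS-codec mapping, which the abstract does not specify, and all function names are hypothetical.

```python
from itertools import groupby

# Hypothetical naive mapping (2 bits per base). This is NOT the RFS-codec
# mapping; it only serves to produce sequences for the metrics below.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def encode(data: bytes) -> str:
    """Map each pair of bits to one DNA base."""
    bits = "".join(f"{byte:08b}" for byte in data)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def decode(seq: str) -> bytes:
    """Invert encode(): map each base back to two bits, then to bytes."""
    bits = "".join(BASE_TO_BITS[base] for base in seq)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def max_homopolymer(seq: str) -> int:
    """Length of the longest run of identical consecutive bases."""
    return max((len(list(run)) for _, run in groupby(seq)), default=0)


def density(num_bits: int, num_nt: int) -> float:
    """Information density in bits per nucleotide."""
    return num_bits / num_nt


if __name__ == "__main__":
    payload = b"DNA"
    strand = encode(payload)
    assert decode(strand) == payload
    print(strand, gc_content(strand), max_homopolymer(strand),
          density(8 * len(payload), len(strand)))
```

A naive mapping like this one reaches 2 bits/nt but gives no control over GC content or homopolymer runs. A constrained codec typically spends part of that capacity to satisfy such constraints, which is consistent with the reported average density of 1.8 bits/nt lying below 2.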

Keywords

DNA data storage, image fraction, codec approach, bio-coding constraints

Data Availability Statement

Data will be made available on request.

Funding

This work received no funding.

Conflicts of Interest

The author declares no conflicts of interest.

Ethical Approval and Consent to Participate

Not applicable.

Cite This Article

APA Style
Rasool, A. (2025). RFS-codec: A Novel Encoding Approach to Store Image Data in DNA. Journal of Artificial Intelligence in Bioinformatics, 1(1), 41–50. https://doi.org/10.62762/JAIB.2025.146324
Publisher's Note

ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions

CC BY Copyright © 2025 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
ISSN: 3068-7535 (Online)