Academic Editor
Jian Lan
Xi'an Jiaotong University, China
Chinese Journal of Information Fusion, Volume 3, Issue 1, 2026: 31-45

Open Access | Research Article | 08 February 2026
Transformer Fusing Chromosome Conformation and Genomic Information for Soybean Trait Prediction
Ailing Chen, Qingke Zou, Xidi Yang, Jie Zhou*
1 College of Mathematics, Sichuan University, Chengdu 610064, China
2 State Key Laboratory of Swine and Poultry Breeding Industry, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu 611130, China
* Corresponding Author: Jie Zhou, [email protected]
ARK: ark:/57805/cjif.2025.226807
Received: 20 June 2025, Accepted: 27 December 2025, Published: 08 February 2026  
Abstract
Genomic information is increasingly used to predict crop traits, and advanced genomic prediction techniques have contributed to substantial improvements in both crop yield and quality. However, traditional genomic prediction methods are limited in capturing long-range dependencies and in exploiting prior information from chromosome structure. This work proposes two novel Transformer models that fuse chromosome conformation and genomic information. The first, the chromosomal self-attention fusion model, introduces chromosome conformation information into the self-attention mechanism of the Transformer to capture cross-chromosomal interactions more precisely. The second, the chromatin interaction squeeze-excitation model, extracts a global feature for each chromosome from all single nucleotide polymorphism (SNP) sites on that chromosome, then uses the chromatin interaction matrix to perform a weighted fusion of these global features, enabling effective use of inter-chromosomal information. In addition, two novel metrics are introduced to assess the internal self-attention mechanism. Together, they quantify the concentration of attention and measure the alignment between the attention distribution and the chromosomal interaction priors. Experiments show that the two proposed models exhibit significant advantages in predicting soybean oil and protein content.
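The abstract gives no formulas, so the following NumPy sketch only illustrates the general ideas and is not the authors' implementation. Everything in it is an assumption made for illustration: the function names, the additive log-prior bias inside the attention scores, the parameter-free sigmoid gate, the entropy-based concentration score, and the cosine-based alignment score. It treats each of the 20 soybean chromosomes as one token and uses a random symmetric matrix in place of a real chromatin interaction matrix.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def prior_biased_attention(X, Wq, Wk, Wv, prior, lam=1.0):
    """Single-head self-attention over chromosome tokens.
    ASSUMPTION: the interaction prior enters as an additive log-bias."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])
    scores = scores + lam * np.log(prior + 1e-9)
    A = softmax(scores, axis=-1)
    return A @ V, A

def interaction_squeeze_excitation(H, prior):
    """H: (n_chrom, n_snp, d) SNP features grouped by chromosome
    (ASSUMPTION: chromosomes padded to the same number of SNPs).
    Learned excitation layers are omitted for brevity."""
    z = H.mean(axis=1)                             # squeeze: (n_chrom, d)
    P = prior / prior.sum(axis=-1, keepdims=True)  # row-normalised prior
    z_fused = P @ z                                # interaction-weighted fusion
    gate = 1.0 / (1.0 + np.exp(-z_fused))          # gate values in (0, 1)
    return H * gate[:, None, :]                    # rescale every SNP feature

def attention_concentration(A):
    """ASSUMED metric: 1 - normalised row entropy (0 = uniform, 1 = one-hot)."""
    H = -(A * np.log(A + 1e-12)).sum(axis=-1)
    return float(1.0 - (H / np.log(A.shape[-1])).mean())

def prior_alignment(A, prior):
    """ASSUMED metric: mean row-wise cosine similarity between the
    attention rows and the row-normalised interaction prior."""
    P = prior / prior.sum(axis=-1, keepdims=True)
    num = (A * P).sum(axis=-1)
    den = np.linalg.norm(A, axis=-1) * np.linalg.norm(P, axis=-1)
    return float((num / den).mean())

# Toy data: 20 chromosome tokens, 8 features, 5 SNPs per chromosome.
rng = np.random.default_rng(0)
n_chrom, n_snp, d = 20, 5, 8
X = rng.normal(size=(n_chrom, d))
Wq, Wk, Wv = (rng.normal(scale=d ** -0.5, size=(d, d)) for _ in range(3))
prior = rng.random((n_chrom, n_chrom)) + np.eye(n_chrom)
prior = (prior + prior.T) / 2                      # symmetric, strictly positive

out, A = prior_biased_attention(X, Wq, Wk, Wv, prior)
H = rng.normal(size=(n_chrom, n_snp, d))
gated = interaction_squeeze_excitation(H, prior)
print(attention_concentration(A), prior_alignment(A, prior))
```

Adding the log of the prior before the softmax is one common way to inject a pairwise prior into attention; setting `lam=0` recovers plain scaled dot-product attention, which makes the effect of the prior easy to ablate in this toy setting.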

Keywords
transformer
information fusion
chromosome interaction
genomic prediction
soybean traits

Data Availability Statement
Data will be made available on request.

Funding
This work was supported in part by the Sichuan Science and Technology Program under Grant 2024NSFSC0444, and in part by the Fundamental Research Funds for the Central Universities under Grant SCU2023D008.

Conflicts of Interest
The authors declare no conflicts of interest.

AI Use Statement
The authors declare that no generative AI was used in the preparation of this manuscript.

Ethical Approval and Consent to Participate
Not applicable.


Cite This Article
APA Style
Chen, A., Zou, Q., Yang, X., & Zhou, J. (2026). Transformer Fusing Chromosome Conformation and Genomic Information for Soybean Trait Prediction. Chinese Journal of Information Fusion, 3(1), 31–45. https://doi.org/10.62762/CJIF.2025.226807
RIS Format
TY  - JOUR
AU  - Chen, Ailing
AU  - Zou, Qingke
AU  - Yang, Xidi
AU  - Zhou, Jie
PY  - 2026
DA  - 2026/02/08
TI  - Transformer Fusing Chromosome Conformation and Genomic Information for Soybean Trait Prediction
JO  - Chinese Journal of Information Fusion
T2  - Chinese Journal of Information Fusion
JF  - Chinese Journal of Information Fusion
VL  - 3
IS  - 1
SP  - 31
EP  - 45
DO  - 10.62762/CJIF.2025.226807
UR  - https://www.icck.org/article/abs/CJIF.2025.226807
KW  - transformer
KW  - information fusion
KW  - chromosome interaction
KW  - genomic prediction
KW  - soybean traits
SN  - 2998-3371
PB  - Institute of Central Computation and Knowledge
LA  - English
ER  - 
BibTeX Format
@article{Chen2026Transforme,
  author = {Ailing Chen and Qingke Zou and Xidi Yang and Jie Zhou},
  title = {Transformer Fusing Chromosome Conformation and Genomic Information for Soybean Trait Prediction},
  journal = {Chinese Journal of Information Fusion},
  year = {2026},
  volume = {3},
  number = {1},
  pages = {31-45},
  doi = {10.62762/CJIF.2025.226807},
  url = {https://www.icck.org/article/abs/CJIF.2025.226807},
  keywords = {transformer, information fusion, chromosome interaction, genomic prediction, soybean traits},
  issn = {2998-3371},
  publisher = {Institute of Central Computation and Knowledge}
}

Publisher's Note
ICCK stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and Permissions
CC BY Copyright © 2026 by the Author(s). Published by Institute of Central Computation and Knowledge. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/), which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made.
Chinese Journal of Information Fusion

ISSN: 2998-3371 (Online) | ISSN: 2998-3363 (Print)

Email: [email protected]

Portico

All published articles are preserved here permanently:
https://www.portico.org/publishers/icck/