Machine learning and deep learning have been employed in genomic selection (GS) to expedite the identification of superior genotypes and accelerate breeding cycles. However, a significant challenge for current data-driven deep learning models in GS is their low robustness and interpretability. To address this challenge, we developed Cropformer, a deep learning framework for predicting crop phenotypes and exploring downstream tasks. The framework combines convolutional neural networks with multiple self-attention mechanisms to improve accuracy. Here, Cropformer's ability to predict complex phenotypic traits was extensively evaluated on more than 20 traits across five major crops: maize, rice, wheat, foxtail millet, and tomato. Evaluation results show that Cropformer outperforms other GS methods in precision and robustness. Compared to the runner-up model, Cropformer's prediction accuracy improved by up to 7.5%. Additionally, Cropformer enhances the ability to analyze and mine genes associated with traits. With Cropformer, we identify dozens of single nucleotide polymorphisms (SNPs) with potential effects on maize phenotypic traits and reveal key genetic variations underlying these differences. Cropformer makes considerable advances in predictive performance and assisted gene identification, representing a powerful general approach to facilitating the genomic design of crop breeding. Cropformer is freely accessible at https://cgris.net/cropformer.
Keywords: Deep learning; Genomic selection; Multiple self-attention mechanisms; Phenotypic prediction.
Copyright © 2024 The Author(s). Published by Elsevier Inc. All rights reserved.