The origin of genes from non-coding sequences is a long-standing and fundamental question in biology. However, how de novo genes originate and integrate into existing pathways to regulate phenotypic variation remains largely unknown. Here, based on transcriptional and translational evidence, we selected seven of 782 de novo genes for functional exploration. We then found that SWK, a de novo gene that originated from a non-coding sequence in Arabidopsis thaliana, plays a role in seed germination under osmotic stress. SWK is primarily expressed in dry seeds, imbibing seeds, and siliques. SWK can be fully translated into an 8 kDa protein that localizes mainly to the nucleus. Intriguingly, SWK was integrated into an extant pathway that modulates hydrogen peroxide content (the folate synthesis pathway) via the upstream gene cytHPPK/DHPS, an Arabidopsis-specific gene that originated from the duplication of mitHPPK/DHPS, and the downstream gene GSTF9, thereby improving seed germination under osmotic stress. In addition, we demonstrated that the presence of SWK may be associated with drought tolerance in natural populations of Arabidopsis. Overall, our study highlights how a de novo gene originated and became integrated into existing pathways to regulate stress adaptation.
Keywords: Arabidopsis thaliana; adaptive evolution; de novo gene; regulatory networks.
© The Author(s) 2024. Published by Oxford University Press on behalf of Society for Molecular Biology and Evolution.