Repetitive sequences are ubiquitous components of eukaryotic genomes, affecting genome size and evolution as well as gene regulation. Among them, short interspersed nuclear elements (SINEs) are non-coding retrotransposons usually shorter than 1000 bp. They contain only a few short conserved structural motifs, in particular an internal promoter derived from cellular RNAs and a mostly AT-rich 3' tail, whereas the remaining regions are highly variable. SINEs emerge and vanish during evolution, and often diversify into numerous families and subfamilies that are usually specific to only a limited number of species. In contrast, at the 3' end of multiple plant SINEs we detected the highly conserved 'Angio-domain'. This 37 bp segment defines the Angio-SINE superfamily, which encompasses 24 plant SINE families widely distributed across 13 orders within the plant kingdom. We retrieved 28 433 full-length Angio-SINE copies, frequently located in genes, from genome assemblies of 46 plant species. Compensatory mutations in and adjacent to the Angio-domain imply selective constraints maintaining its RNA structure. Angio-SINE families share segmental sequence similarities, indicating a modular evolution with strong Angio-domain preservation. We suggest that the conserved domain contributes to the evolutionary success of Angio-SINEs either through structural interactions between SINE RNA and proteins that increase their transpositional efficiency, or by enhancing their accumulation in genes.
Keywords: SINE superfamily; angiosperms; comparative genomics; retrotransposon; short interspersed nuclear element; transposable element.
© 2019 The Authors. The Plant Journal published by Society for Experimental Biology and John Wiley & Sons Ltd.