Background: Genome-wide association studies (GWAS) have identified thousands of variants associated with autism spectrum disorder (ASD), but because most are noncoding, it remains unclear which are causal. Consequently, reliable diagnostic biomarkers are lacking. RNA-seq analysis captures transcriptomic patterns, a layer of biomolecular complexity that GWAS cannot. Integrating DNA- and RNA-level analyses may therefore reveal causal genes and useful biomarkers for ASD.
Methods: We performed gene-based association studies using an adaptive test method with GWAS summary statistics from two large Psychiatric Genomics Consortium (PGC) datasets (ASD2019: 18,382 cases and 27,969 controls; ASD2017: 6,197 cases and 7,377 controls). We also investigated differential expression for genes identified with the adaptive test using an RNA-seq dataset (GSE30573: 3 cases and 3 controls) and DESeq2.
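The abstract does not say which adaptive test was used. As an illustration only, the sketch below shows one common way to build an adaptive gene-based test from GWAS summary statistics: sum-of-powered-score statistics are computed from a gene's SNP z-scores, their null distribution is simulated from a linkage disequilibrium (LD) matrix, and the smallest per-power p-value is recalibrated against the same simulations. The function name `adaptive_gene_test`, the choice of powers, the simulation count, and the toy LD matrix are all assumptions for this example, not details taken from the study.

```python
import numpy as np

def adaptive_gene_test(z, ld, powers=(1, 2, 4, np.inf), n_sim=2000, seed=0):
    """Monte Carlo adaptive test for one gene (illustrative sketch only).

    z  : SNP-level z-scores for the gene (from GWAS summary statistics)
    ld : SNP-by-SNP LD (correlation) matrix, assumed to come from a
         reference panel; it must be positive semidefinite
    """
    z = np.asarray(z, dtype=float)
    ld = np.asarray(ld, dtype=float)
    rng = np.random.default_rng(seed)

    # Under the null, the z-scores are approximately MVN(0, LD).
    null_z = rng.multivariate_normal(np.zeros(len(z)), ld, size=n_sim)

    def spu(scores, power):
        # Sum of powered scores; an infinite power means max |z|.
        if np.isinf(power):
            return np.abs(scores).max(axis=-1)
        return np.abs((scores ** power).sum(axis=-1))

    obs = np.array([spu(z, p) for p in powers])        # shape (P,)
    null = np.array([spu(null_z, p) for p in powers])  # shape (P, n_sim)

    # Per-power empirical p-values for the observed statistics.
    p_obs = (1 + (null >= obs[:, None]).sum(axis=1)) / (n_sim + 1)

    # Per-power p-values for each null draw, ranked against the other draws
    # (rank 0 is the largest statistic).
    ranks = (-null).argsort(axis=1).argsort(axis=1)
    p_null = (ranks + 1) / n_sim

    # Adaptive step: take the minimum p-value across powers, then
    # calibrate it against the null distribution of that minimum.
    min_obs = p_obs.min()
    min_null = p_null.min(axis=0)
    p_adaptive = (1 + (min_null <= min_obs).sum()) / (n_sim + 1)
    return p_adaptive, dict(zip(powers, p_obs))

# Toy usage: five SNPs with an AR(1)-style LD pattern (made-up values).
idx = np.arange(5)
toy_ld = 0.5 ** np.abs(np.subtract.outer(idx, idx))
p_value, per_power = adaptive_gene_test([2.1, 2.8, 3.0, 1.7, 2.4], toy_ld)
print(p_value, per_power)
```

Because the null draws depend entirely on the LD matrix, the accuracy of such a test rests on how well the reference panel matches the study population. The smallest attainable p-value is also bounded by the number of simulations, so resolving p-values on a genome-wide scale would require far more draws than this toy default.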
Results: We identified 5 genes significantly associated with ASD in ASD2019 (KIZ-AS1, p = 8.67×10⁻¹⁰; KIZ, p = 1.16×10⁻⁹; XRN2, p = 7.73×10⁻⁹; SOX7, p = 2.22×10⁻⁷; LOC101929229 (also known as PINX1-DT), p = 2.14×10⁻⁶). Two of the five genes were replicated in ASD2017: SOX7 (p = 0.00087) and LOC101929229 (p = 0.009), and KIZ was close to the boundary for replication (p = 0.06). We identified significant expression differences for SOX7 (p = 0.0017, adjusted p = 0.0085), LOC101929229 (p = 5.83×10⁻⁷, adjusted p = 1.18×10⁻⁵), and KIZ (p = 0.00099, adjusted p = 0.0055). SOX7 encodes a transcription factor that regulates developmental pathways, alterations in which may contribute to ASD.
Limitations: A limitation of the gene-based analysis is its reliance on a reference population for estimating linkage disequilibrium between variants. The similarity of this reference population to the study population is crucial to the accuracy of many gene-based analyses, including those performed in this study. As a result, the generalizability of our findings is limited to European populations, as this was our reference of choice. Future work includes a tighter integration of DNA and RNA information as well as extensions to non-European populations that have been under-researched.
Conclusions: SOX7 and related SOX family genes encode transcription factors that are critical to the downregulation of the canonical Wnt/β-catenin signaling pathway, an important developmental signaling pathway. This role lends biological plausibility to the association between SOX7 and ASD suggested by these findings.
Keywords: Genome-wide association studies (GWAS); RNA-seq data analysis; autism spectrum disorder (ASD); gene-based association test.