The androgen receptor (AR) gene encodes a nuclear receptor that functions as a steroid hormone-activated transcription factor. Its coding region contains a CAG repeat that has been studied intensively because repeat size correlates inversely with AR transcriptional activity. Several studies have reported associations between different (CAG)n sizes and the risk of androgen-linked diseases. We aimed to clarify the mechanisms by which CAG alleles of new sizes originate, using a strategy based on the analysis of the associated haplotype diversity. We genotyped 374 control individuals of European and Asian ancestry and reconstructed the haplotypes associated with the CAG repeat, defined by 10 SNPs and 6 flanking STRs. The most powerful SNPs for tagging AR lineages are rs7061037-rs12012620 in Europeans and rs34191540-rs6625187-rs2768578 in Asians. In the most frequent AR lineage, (CAG)18 alleles seem to have been generated by a multistep mutation mechanism, most probably from longer alleles. We further noticed that the DXS1194-DXS1111 haplotype, which is in linkage disequilibrium with the expanded AR-(CAG)n alleles responsible for spinal bulbar muscular atrophy (SBMA), is rare among our controls; nevertheless, the haplotype strategy described here may be used to clarify the origin of expansions in other populations, as well as in future association studies.
Keywords: SBMA; evolution; haplotype; polyglutamine diseases.
© 2014 Wiley Periodicals, Inc.