Very little is currently known about the major histocompatibility complex (MHC) region of cynomolgus macaques (Macaca fascicularis; Mafa) from Chinese breeding centers. We performed comprehensive MHC class I haplotype analysis of 100 cynomolgus macaques from two centers, with animals of different reported geographic origins (Vietnamese, Cambodian, and Cambodian/Indonesian mixed-origin). Many of the samples came from related animals (sire, dam, and progeny sets), making it possible to characterize lineage-level haplotypes in these animals. We identified 52 Mafa-A and 74 Mafa-B haplotypes in this cohort, many of which were restricted to specific sample origins. We also characterized full-length MHC class I transcripts using Pacific Biosciences (PacBio) RS II single-molecule real-time (SMRT) sequencing. This technology allows for complete read-through of unfragmented MHC class I transcripts (~1100 bp in length), so no assembly is required to unambiguously resolve novel full-length sequences. Overall, we identified a total of 311 full-length transcripts in a subset of 72 cynomolgus macaques from these Chinese breeding facilities; 130 of these sequences were novel, and an additional 115 extended existing short database sequences to span the complete open reading frame. These results significantly expand the number of Mafa-A, Mafa-B, and Mafa-I full-length alleles in the official cynomolgus macaque MHC class I database. The PacBio technique described here represents a general method for full-length allele discovery and genotyping that can be extended to other complex immune loci such as MHC class II, killer immunoglobulin-like receptors, and Fc gamma receptors.
Keywords: MHC class I; Macaca fascicularis; PacBio long-amplicon sequencing; RNA transcript-based haplotypes.