The complement protein C4 is a non-enzymatic component of the C3 and C5 convertases and is thus essential for the propagation of the classical complement pathway. The covalent binding of C4 to immunoglobulins and immune complexes (IC) also enhances the solubilization of immune aggregates and the clearance of IC through complement receptor 1 (CR1) on erythrocytes. Human C4 is the most polymorphic protein of the complement system. In this review, we summarize the current concepts on the 1-2-3 loci model of C4A and C4B genes in the population, factors affecting the expression levels of C4 transcripts and proteins, and the structural, functional and serological diversities of the C4A and C4B proteins. The diversities and polymorphisms of the mouse homologues Slp and C4 proteins are described and contrasted with their human homologues. The human C4 genes are located in the MHC class III region on chromosome 6. Each human C4 gene consists of 41 exons coding for a 5.4-kb transcript. The long gene is 20.6 kb and the short gene is 14.2 kb. In the Caucasian population, 55% of the MHC haplotypes have the 2-locus C4A-C4B configuration and 45% have an unequal number of C4A and C4B genes. Moreover, three-quarters of C4 genes are long genes that harbor the 6.4-kb endogenous retrovirus HERV-K(C4) in intron 9. Duplication of a C4 gene always occurs together with that of its adjacent genes RP, CYP21 and TNX, which together form a genetic unit termed an RCCX module. Monomodular, bimodular and trimodular RCCX structures with 1, 2 and 3 complement C4 genes have frequencies of 17%, 69% and 14%, respectively. Partial deficiencies of C4A and C4B, primarily due to the presence of monomodular haplotypes and homo-expression of C4A proteins from bimodular structures, have a combined frequency of 31.6%.
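The gene sizes and RCCX frequencies quoted above lend themselves to a quick arithmetic check. The following is a minimal sketch in Python, not part of the original analysis: it confirms that the long and short genes differ by the size of HERV-K(C4), computes the mean number of C4 genes per haplotype, and estimates the distribution of C4 gene number per diploid genome. The diploid estimate assumes random pairing of haplotypes, which is an illustrative assumption and not a claim made in the text. All function and variable names are hypothetical.

```python
# Figures quoted in the text (sizes in kb).
LONG_GENE_KB = 20.6
SHORT_GENE_KB = 14.2
HERV_K_KB = 6.4

# RCCX haplotype frequencies: C4 genes per haplotype -> frequency.
RCCX_FREQ = {1: 0.17, 2: 0.69, 3: 0.14}


def size_difference_kb():
    """Size difference between the long and short C4 genes, in kb."""
    return round(LONG_GENE_KB - SHORT_GENE_KB, 1)


def mean_genes_per_haplotype(freq):
    """Frequency-weighted mean number of C4 genes per haplotype."""
    return sum(n * f for n, f in freq.items())


def diploid_gene_count_distribution(freq):
    """Distribution of total C4 genes per diploid genome.

    ASSUMPTION (not from the text): the two haplotypes of an individual
    are drawn independently at the population frequencies.
    """
    dist = {}
    for n1, f1 in freq.items():
        for n2, f2 in freq.items():
            total = n1 + n2
            dist[total] = dist.get(total, 0.0) + f1 * f2
    return dist


if __name__ == "__main__":
    print("long - short (kb):", size_difference_kb())
    print("mean C4 genes per haplotype:", round(mean_genes_per_haplotype(RCCX_FREQ), 2))
    for total, p in sorted(diploid_gene_count_distribution(RCCX_FREQ).items()):
        print(f"{total} C4 genes per diploid genome: {p:.4f}")
```

Under the random-pairing assumption, four C4 genes per diploid genome is the most common outcome, with two to six genes spanning the full range. Observed population figures may differ where haplotypes are not independently paired.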
Multiple structural isoforms of each C4A and C4B allotype exist in the circulation because of the imperfect and incomplete proteolytic processing of the precursor protein to form the beta-alpha-gamma structures. Immunofixation experiments of C4A and C4B demonstrate >41 allotypes in the two classes of proteins. A compilation of polymorphic sites from limited C4 sequences revealed the presence of 24 polymorphic residues, mostly clustered C-terminal to the thioester bond within the C4d region of the alpha-chain. The covalent binding affinities of the thioester carbonyl group of C4A and C4B appear to be modulated by four isotypic residues at positions 1101, 1102, 1105 and 1106. Site-directed mutagenesis experiments revealed that D1106 is responsible for the effective binding of C4A to form amide bonds with immune aggregates or protein antigens, and that H1106 of C4B catalyzes the transacylation of the thioester carbonyl group to form ester bonds with carbohydrate antigens. The expression of C4 is inducible or enhanced by gamma-interferon. The liver is the main organ that synthesizes and secretes C4A and C4B into the circulation, but there are many extra-hepatic sites producing moderate quantities of C4 for local defense. The plasma protein levels of C4A and C4B are mainly determined by the corresponding gene dosage. However, C4B proteins encoded by monomodular short genes may have relatively higher concentrations than those from long C4A genes. The 5' regulatory sequence of a C4 gene contains an Sp1 site and three E-boxes but no TATA box. The sequences beyond -1524 nt may be completely different, as the C4 genes at RCCX module I have RP1-specific sequences, while those at modules II, III and IV have TNXA-specific sequences.
The remarkable genetic diversity of human C4A and C4B probably promotes the exchange of genetic information to create and maintain the quantitative and qualitative variations of C4A and C4B proteins in the population, as driven by the selection pressure against a great variety of microbes. An undesirable accompanying byproduct of this phenomenon is the inherent deleterious recombinations among the RCCX constituents leading to autoimmune and genetic disorders.