PCR-assisted binding site selection was used to define the sequence characteristics of high-affinity YY1 binding sites. Compilation of the sequences of 189 selected oligonucleotides containing high-affinity YY1 binding sites revealed two types of core sequence: ACAT and CCAT. ACAT cores were flanked by additional invariant nucleotides, forming the consensus GACATNTT. A search of the 73 kb human beta-like globin cluster with this consensus revealed eight matching motifs, six of which were located 1-3 kb upstream of the gamma and beta genes. CCAT-type cores were more variable in their surrounding sequence context; the consensus VDCCATNWY fit 89% of the selected CCAT-containing oligonucleotides. A search of the human beta-like globin cluster with CCAT consensus sequences revealed 171 potential YY1 binding sites. Several of these were tested directly in gel shift assays and confirmed as high-affinity YY1 binding sites. Finally, a strategy called motif-based phylogenetic analysis was employed to determine which of the 179 total sites (8 ACAT-type and 171 CCAT-type) are evolutionarily conserved. This analysis permits the detection of functionally conserved binding sites despite sequence differences between the two species compared. The 21 conserved sites identified will serve as important starting points for further dissection of the possible role of YY1 in globin gene regulation.
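The consensus searches described above can be illustrated with a minimal sketch. It assumes the consensus strings use standard IUPAC ambiguity codes (N = any base, V = A/C/G, D = A/G/T, W = A/T, Y = C/T) and that both strands are scanned; the abstract does not state how the original search was implemented, so the function names (`consensus_to_regex`, `reverse_complement`, `find_sites`) and the toy sequences are purely illustrative and are not taken from the globin cluster.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a regex that reports overlapping hits."""
    body = "".join(IUPAC[code] for code in consensus.upper())
    # The lookahead lets finditer return matches that overlap each other.
    return re.compile("(?=(" + body + "))")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, consensus):
    """Return sorted (start, strand, matched_sequence) tuples.

    `start` is always a 0-based coordinate on the forward strand.
    Scanning both strands is an assumption of this sketch.
    """
    seq = seq.upper()
    pattern = consensus_to_regex(consensus)
    k, n = len(consensus), len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in pattern.finditer(seq)]
    for m in pattern.finditer(reverse_complement(seq)):
        # Convert the reverse-strand coordinate back to a forward-strand start.
        hits.append((n - m.start() - k, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    toy = "TTGACATCTTAA"  # hypothetical sequence containing one ACAT-type site
    print(find_sites(toy, "GACATNTT"))
    print(find_sites("AGCCATGTC", "VDCCATNWY"))
```

A regex with a lookahead is used here only because it is compact; a position weight matrix built from the 189 selected oligonucleotides would be a different and more quantitative approach than the consensus matching sketched above.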