A three-level hierarchical mixture model for classification is presented that models the following data generation process: (1) the data are generated by a finite number of sources (clusters), and (2) the generation mechanism of each source assumes the existence of individual internal class-labeled sources (subclusters of the external cluster). The model estimates the posterior probability of class membership similar to a mixture of experts classifier. In order to learn the parameters of the model, we have developed a general training approach based on maximum likelihood that results in two efficient training algorithms. Compared to other classification mixture models, the proposed hierarchical model exhibits several advantages and provides improved classification performance as indicated by the experimental results.