Purpose: The goal of this study was to determine whether the results obtained from a 25-utterance conversational language sample were as reliable as those obtained from a 50-utterance sample.
Method: Robust conversational language samples were collected from 220 children with typically developing language (106 boys, 114 girls) ranging in age from 3;2 to 7;10 (years;months). The language samples were randomly assigned to one of two conditions: a 25-utterance condition and a 50-utterance condition. Transcripts were examined for three metrics: mean length of utterance (MLU-SUGAR), words per sentence, and clauses per sentence.
Results: Data were analyzed using two methods. A linear mixed-model analysis was used to assess absolute and relative reliability, and the Bland-Altman procedure was used to assess absolute reliability and clinical acceptability. Results of the mixed-model analysis indicated that MLU-SUGAR and words per sentence demonstrated relative reliability; however, none of the metrics demonstrated absolute reliability. In contrast, the Bland-Altman plots indicated that all three metrics demonstrated absolute reliability because 94%-96% of participants' scores fell within the limits of agreement. Taken together, the results suggested that the statistically significant differences indicated by the mixed-model analysis were not clinically significant.
Conclusion: These results highlighted the importance of using different methods of analysis in studies of reliability. The findings indicated that reliable language sample results can be obtained from 25-utterance samples. Furthermore, because the methods used in this study build on practices already in use (e.g., collecting samples of ≤ 50 utterances) and require only minimal changes to current practice, they are feasible for school-based clinicians, could be easily integrated into clinical practice, and could increase the use of evidence-based assessment practices in schools.
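
As an illustration of the Bland-Altman procedure named in the Results, the following minimal Python sketch computes the mean difference (bias), the 95% limits of agreement (bias ± 1.96 standard deviations of the differences), and the proportion of paired differences that fall within those limits. It is a sketch under stated assumptions, not the authors' analysis: it assumes each child contributes one score per condition, the function name `bland_altman` is hypothetical, and the score lists are invented placeholder values rather than study data.

```python
import statistics


def bland_altman(scores_a, scores_b):
    """Return (bias, lower_limit, upper_limit, proportion_within) for paired scores.

    Assumes scores_a[i] and scores_b[i] come from the same participant.
    """
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need two equal-length lists with at least two pairs")

    # Per-participant difference between the two conditions.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]

    bias = statistics.mean(diffs)   # mean difference
    sd = statistics.stdev(diffs)    # sample standard deviation of the differences

    # Conventional 95% limits of agreement.
    lower = bias - 1.96 * sd
    upper = bias + 1.96 * sd

    # Share of participants whose difference lies inside the limits.
    within = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, lower, upper, within


# Hypothetical placeholder scores (NOT data from the study).
mlu_25 = [4.0, 5.2, 6.1, 3.8, 7.0, 5.5]   # assumed 25-utterance scores
mlu_50 = [4.2, 5.0, 6.4, 3.9, 6.7, 5.6]   # assumed 50-utterance scores

bias, lower, upper, within = bland_altman(mlu_25, mlu_50)
print(f"bias={bias:.3f}, limits=({lower:.3f}, {upper:.3f}), within={within:.0%}")
```

In this framing, a proportion such as the 94%-96% reported above corresponds to `within`. Whether the width of the limits is clinically acceptable remains a separate judgment that the calculation itself does not make.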