Cluster size tests used in analyses of brain images can be more sensitive than intensity-based tests. Random field (RF) theory has been widely used to implement such tests; however, their behavior is not well understood, especially when the RF assumptions are in doubt. In this paper, we carry out a simulation study of cluster size tests under varying smoothness, thresholds, and degrees of freedom, comparing the performance of RF methods to that of the permutation test, which is known to be exact. For Gaussian images, we find that the RF methods are generally conservative, especially at low smoothness and low thresholds. For t images, the RF tests are conservative at lower thresholds and do not perform well unless the threshold is high and the images are sufficiently smooth. The permutation test performs well in all settings, though the discreteness of cluster size must be accounted for. We make specific recommendations on when permutation tests are to be preferred to RF tests.