Background: Longitudinal prevalence (ie, the proportion of time with the disease) is used to describe morbidity from diarrhea and other episodic conditions. The aim of this analysis was to compare estimates of longitudinal prevalence based on intermittent sampling at regular intervals with 24- or 48-hour recall, with estimates based on continuous surveillance.
Methods: Based on 2 real datasets from Brazil and Guatemala, we developed a simulated dataset representing the diarrhea morbidity of 10,000 individuals followed over 365 days.
Results: Both the model and the real datasets showed that the standard deviation of the longitudinal prevalence increases as the number of days sampled decreases, so that a study sampling only a fraction of days would require a larger sample size. However, because diarrhea is correlated between consecutive days, sampling at 7- to 14-day intervals results in a relatively small loss of precision and power compared with daily morbidity records, especially when the average diarrheal episode is long. A study based on morbidity data for every seventh day may require only a 5%-24% larger sample size than a study with daily records, depending on the average duration of episodes. Using a recall period of 48 hours instead of 24 hours increases power if the average episode is short.
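The relationship between sampling interval, episode duration, and required sample size can be illustrated with a minimal simulation sketch. This is not the authors' model: it assumes a two-state (well/ill) daily Markov chain with geometrically distributed episode lengths and a gamma-distributed individual risk of starting an episode. All function names and parameter values (`p_start_mean`, `risk_shape`, the durations) are hypothetical choices for illustration. The sample size inflation is taken as the ratio of variances, which assumes a comparison of group means with a fixed effect size, so it should not be expected to reproduce the 5%-24% figures reported above.

```python
import numpy as np


def simulate_diarrhea(n_people, n_days, mean_duration, rng,
                      p_start_mean=0.02, risk_shape=2.0):
    """Simulate daily illness status as a boolean array (n_people, n_days).

    Assumed model (not from the paper): each person follows a two-state
    Markov chain. Episodes end each day with probability 1/mean_duration,
    and start with a person-specific probability drawn from a gamma
    distribution with mean p_start_mean.
    """
    p_end = 1.0 / mean_duration
    p_start = rng.gamma(risk_shape, p_start_mean / risk_shape, n_people)
    p_start = np.minimum(p_start, 1.0)  # keep probabilities valid

    ill = np.empty((n_people, n_days), dtype=bool)
    # Start each person at their own stationary prevalence.
    ill[:, 0] = rng.random(n_people) < p_start / (p_start + p_end)
    for day in range(1, n_days):
        u = rng.random(n_people)
        was_ill = ill[:, day - 1]
        # Ill people stay ill unless the episode ends; well people may start one.
        ill[:, day] = np.where(was_ill, u >= p_end, u < p_start)
    return ill


def longitudinal_prevalence(ill, interval=1):
    """Proportion of sampled days with illness, sampling every `interval` days."""
    return ill[:, ::interval].mean(axis=1)


def sample_size_inflation(ill, interval):
    """Variance ratio of intermittent vs daily longitudinal prevalence.

    For a comparison of means with a fixed effect size, the required
    sample size is proportional to the variance, so this ratio
    approximates the relative sample size needed.
    """
    var_daily = longitudinal_prevalence(ill, 1).var(ddof=1)
    var_sampled = longitudinal_prevalence(ill, interval).var(ddof=1)
    return var_sampled / var_daily


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    for mean_duration in (2, 4, 8):  # hypothetical average episode lengths (days)
        ill = simulate_diarrhea(10_000, 365, mean_duration, rng)
        ratio = sample_size_inflation(ill, interval=7)
        print(f"mean episode {mean_duration} d: "
              f"relative sample size for 7-day sampling = {ratio:.2f}")
```

Under these assumptions, longer average episodes give stronger day-to-day correlation, so less information is lost by sampling only every seventh day. Greater between-person variation in risk also pulls the ratio toward 1, because that component of the variance does not depend on how many days are sampled.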
Conclusions: The results question the necessity of continuous surveillance to estimate longitudinal prevalence. In addition to savings in cost and staff time, intermittent sampling of morbidity may improve validity by minimizing recall error and reducing the influence of surveillance on participants' behavior.