Consumer sleep tracking is increasingly prevalent, yet few studies have evaluated the accuracy of the devices used. We sought to evaluate the accuracy of three devices (Oura Ring Gen3, Fitbit Sense 2, and Apple Watch Series 8) against the gold-standard sleep assessment, polysomnography (PSG). Thirty-five participants (aged 20-50 years) without a sleep disorder were enrolled in a single-night inpatient study, during which they wore the Oura Ring, Fitbit, and Apple Watch while being monitored with PSG. For detecting sleep vs. wake, sensitivity was ≥95% for all devices. For discriminating between sleep stages, sensitivity ranged from 50 to 86%, as follows: Oura Ring sensitivity 76.0-79.5% and precision 77.0-79.5%; Fitbit sensitivity 61.7-78.0% and precision 72.8-73.2%; and Apple Watch sensitivity 50.5-86.1% and precision 72.7-87.8%. The Oura Ring did not differ from PSG in its estimates of wake, light sleep, deep sleep, or REM sleep. The Fitbit overestimated light sleep (18 min; p < 0.001) and underestimated deep sleep (15 min; p < 0.001). The Apple Watch underestimated the duration of wake (7 min; p < 0.01) and deep sleep (43 min; p < 0.001) and overestimated light sleep (45 min; p < 0.001). In adults with healthy sleep, all three devices estimated sleep duration similarly to PSG and showed moderate to substantial agreement with PSG-derived sleep stages.
Keywords: Apple Watch; Fitbit; Oura ring; consumer sleep tracking devices; polysomnography; sleep technology; validation.
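As an illustration of the per-stage sensitivity and precision reported above, the following minimal sketch shows how such values can be computed from paired epoch-by-epoch stage labels. It is not the study's analysis code: the function names (`per_stage_metrics`, `cohen_kappa`), the four-stage label set, and the toy data in the usage note are all assumptions made for illustration. Cohen's kappa is included only because "moderate to substantial agreement" is wording commonly attached to kappa values; whether the study used that statistic is likewise an assumption.

```python
from collections import Counter

# Hypothetical stage labels; the study's actual label encoding is not given.
STAGES = ("wake", "light", "deep", "rem")


def per_stage_metrics(psg, device):
    """Return {stage: (sensitivity, precision)} from paired epoch labels.

    sensitivity = epochs both score as the stage / epochs PSG scores as the stage
    precision   = epochs both score as the stage / epochs the device scores as the stage
    """
    if len(psg) != len(device):
        raise ValueError("label sequences must have the same length")
    pairs = Counter(zip(psg, device))  # (PSG label, device label) -> epoch count
    metrics = {}
    for stage in STAGES:
        true_pos = pairs[(stage, stage)]
        psg_total = sum(n for (ref, _), n in pairs.items() if ref == stage)
        dev_total = sum(n for (_, dev), n in pairs.items() if dev == stage)
        sensitivity = true_pos / psg_total if psg_total else float("nan")
        precision = true_pos / dev_total if dev_total else float("nan")
        metrics[stage] = (sensitivity, precision)
    return metrics


def cohen_kappa(psg, device):
    """Chance-corrected agreement between two label sequences."""
    if len(psg) != len(device) or not psg:
        raise ValueError("label sequences must be non-empty and equal in length")
    n = len(psg)
    observed = sum(a == b for a, b in zip(psg, device)) / n
    psg_counts, dev_counts = Counter(psg), Counter(device)
    labels = set(psg) | set(device)
    expected = sum(psg_counts[s] * dev_counts[s] for s in labels) / (n * n)
    if expected == 1.0:
        return float("nan")  # both raters used a single identical label
    return (observed - expected) / (1 - expected)
```

For example, with made-up labels `psg = ["wake", "light", "light", "deep", "deep", "rem", "rem", "light"]` and `device = ["wake", "light", "deep", "deep", "light", "rem", "light", "light"]`, `per_stage_metrics(psg, device)["light"]` gives a sensitivity of 2/3 and a precision of 1/2, and `cohen_kappa(psg, device)` gives 7/15 (about 0.47).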