Purpose: Saliency models that predict observers' visual attention to facial differences could enable psychosocial interventions to help patients and their families anticipate staring behaviors. The purpose of this study was to assess the ability of existing saliency models to predict observers' visual attention to acquired facial differences arising from head and neck cancer and its treatment.
Approach: Saliency maps predicted by graph-based visual saliency (GBVS), an artificial neural network (ANN), and a face-specific model were compared to observer fixation maps generated from eye-tracking of lay observers presented with clinical facial photographs of patients with a visible or functional impairment manifesting in the head and neck region. We used a linear mixed-effects model to investigate observer and stimulus factors associated with the saliency models' accuracy.
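As an illustration of the comparison described in the Approach, the sketch below scores a predicted saliency map against an observer fixation map using the Pearson correlation coefficient. The abstract does not state which agreement metric the study used, so the metric, the function name `pearson_cc`, and the toy 3x3 maps are assumptions made for illustration only.

```python
import math


def pearson_cc(saliency, fixation):
    """Pearson correlation between two equally sized 2-D maps (lists of rows).

    Hypothetical helper: the choice of metric is an assumption, not the
    study's stated method.
    """
    s = [v for row in saliency for v in row]
    f = [v for row in fixation for v in row]
    if len(s) != len(f) or not s:
        raise ValueError("maps must be non-empty and the same size")
    mean_s = sum(s) / len(s)
    mean_f = sum(f) / len(f)
    cov = sum((a - mean_s) * (b - mean_f) for a, b in zip(s, f))
    var_s = sum((a - mean_s) ** 2 for a in s)
    var_f = sum((b - mean_f) ** 2 for b in f)
    if var_s == 0 or var_f == 0:
        raise ValueError("correlation is undefined for a constant map")
    return cov / math.sqrt(var_s * var_f)


# Toy 3x3 maps with made-up values (not data from the study).
fixation = [[0.0, 0.1, 0.0],
            [0.1, 0.9, 0.1],
            [0.0, 0.1, 0.0]]
centered = [[0.0, 0.2, 0.0],    # prediction that peaks where observers looked
            [0.2, 1.0, 0.2],
            [0.0, 0.2, 0.0]]
off_target = [[1.0, 0.2, 0.0],  # prediction that peaks on an irrelevant corner
              [0.2, 0.0, 0.0],
              [0.0, 0.0, 0.0]]

print("centered vs fixation:  ", pearson_cc(centered, fixation))
print("off-target vs fixation:", pearson_cc(off_target, fixation))
```

In this toy case the center-weighted prediction correlates strongly with the fixation map, while the off-target prediction correlates negatively. A per-image, per-observer score of this kind is the sort of outcome a mixed-effects analysis could then model.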
Results: The GBVS model predicted many irrelevant regions (e.g., shirt collars) as salient. The ANN model underestimated observers' attention to facial differences relative to the central region of the face. The face-specific saliency model was more accurate on this task than GBVS and the ANN; however, it underestimated the saliency of deviations from the typical structure of human faces. The linear mixed-effects model revealed that the location of the facial difference (midface versus periphery) was significantly associated with saliency model performance. Interobserver variability also significantly affected model performance.
Conclusions: Existing saliency models are not adequate for predicting observers' visual attention to facial differences. Extensions of face-specific saliency models are needed to accurately predict the saliency of acquired facial differences arising from head and neck cancer and its treatment.
Keywords: body image; eye tracking; facial difference; head and neck cancer; saliency model.
© 2023 Society of Photo-Optical Instrumentation Engineers (SPIE).