In the present study we applied a paradigm often used in face-voice affect perception to solo music improvisation, in order to examine how the emotional valence of sound and gesture is integrated when an emotion is perceived. Three brief emotion-expressing excerpts performed by a drummer and three by a saxophonist were selected. From these congruent bimodal displays, the audio-only, visual-only, and audiovisually incongruent conditions (obtained by recombining the two signals both within and between instruments) were derived. In Experiment 1, twenty musical novices judged the perceived emotion and rated its strength. The results indicated that sound dominated the visual signal in the perception of affective expression, though this dominance was more evident for the saxophone. In Experiment 2, a further sixteen musical novices were asked to attend either to the musicians' movements or to the sound when judging the perceived emotions. The results showed no effect of visual information when participants judged the sound. In contrast, when they judged the emotional content of the visual information, performance worsened in the incongruent condition that combined discordant auditory and visual emotions from the same instrument. The effect of emotionally discordant information thus emerged only when the auditory and visual signals belonged to the same categorical event, despite their temporal mismatch. This suggests that the integration of emotional information may be reinforced by its semantic attributes but may be independent of temporal features.