Audio-based recognition dropped to 93.6% accuracy when joy and surprise were included in the recognition task alongside anger. We have observed that the proposed rule-based combined audiovisual emotion detection technique improves the recognition accuracy of all six universal emotions even if ...
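The excerpt reports results but not the fusion rules themselves. Purely as a hedged illustration of what rule-based decision-level audio-visual fusion can look like, the sketch below combines per-modality classifier posteriors with simple hand-written rules; the emotion labels, thresholds, and the `audio_probs`/`visual_probs` inputs are assumptions for illustration, not the cited method.

```python
# Hypothetical sketch of rule-based decision-level audio-visual fusion.
# The rules and the agree_bonus value are illustrative assumptions only.
from typing import Dict

EMOTIONS = ["anger", "disgust", "fear", "joy", "sadness", "surprise"]

def fuse_decisions(audio_probs: Dict[str, float],
                   visual_probs: Dict[str, float],
                   agree_bonus: float = 0.1) -> str:
    """Combine per-modality posteriors (keyed by EMOTIONS) with simple rules."""
    audio_label = max(audio_probs, key=audio_probs.get)
    visual_label = max(visual_probs, key=visual_probs.get)

    # Rule 1: if both modalities agree, accept the shared label.
    if audio_label == visual_label:
        return audio_label

    # Rule 2: otherwise sum the posteriors, slightly boosting any emotion
    # that appears in the top two of both modalities, and take the argmax.
    fused = {e: audio_probs[e] + visual_probs[e] for e in EMOTIONS}
    top2_audio = sorted(audio_probs, key=audio_probs.get, reverse=True)[:2]
    top2_visual = sorted(visual_probs, key=visual_probs.get, reverse=True)[:2]
    for e in set(top2_audio) & set(top2_visual):
        fused[e] += agree_bonus
    return max(fused, key=fused.get)
```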
Bejani M, Gharavian D, Charkari NM (2014) Audiovisual emotion recognition using ANOVA feature selection method and multi-classifier neural networks. Neural Comput & Applic 24(2):399-412.
Emotion expression associated with human communication is known to be a multimodal process. In this work, we investigate the way that emotional information is conveyed by facial and vocal modalities, and how these modalities can be effectively combined to achieve improved emotion recognition accuracy. ...
To this end, this paper proposes an infrastructure that combines the potential of emotion-aware big data and cloud technology towards 5G. Building on this infrastructure, a bimodal big data emotion recognition system is proposed, with speech and facial video as its modalities. ...
Index Terms: audiovisual, multilevel, fusion, emotion. G. Chetty, M. Wagner, R. Goecke: A multilevel fusion approach for audiovisual emotion recognition. In: AVSP, 2008, pp. 115-120. doi:10.1002/9781118910566.ch17
A basic audio-visual speech emotion recognition system is composed of four components: audio feature extraction, visual feature extraction, feature selection, and classification. The structure of a standard audio-visual emotion recognition system is illustrated in Figure 1. ...
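As a minimal sketch of this four-stage pipeline (and only a sketch: the feature extractors, selector, and classifier below are placeholder choices, not those of any cited system), the components can be wired together as follows.

```python
# Minimal sketch of the standard audio-visual emotion recognition pipeline:
# audio features -> visual features -> feature selection -> classification.
# All component choices here are placeholders for illustration.
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

def extract_audio_features(waveforms: np.ndarray) -> np.ndarray:
    """Placeholder acoustic descriptor: per-clip mean and std of the waveform
    (stand-in for e.g. prosodic or spectral statistics); expects (n_clips, n_samples)."""
    return np.stack([waveforms.mean(axis=1), waveforms.std(axis=1)], axis=1)

def extract_visual_features(face_frames: np.ndarray) -> np.ndarray:
    """Placeholder visual descriptor: per-clip pixel mean and std
    (stand-in for e.g. facial landmark or appearance features)."""
    flat = face_frames.reshape(len(face_frames), -1)
    return np.stack([flat.mean(axis=1), flat.std(axis=1)], axis=1)

def build_recognizer(k: int = 2):
    # Feature selection followed by a classifier, as in the standard pipeline.
    return make_pipeline(SelectKBest(f_classif, k=k), SVC(kernel="rbf"))

# Usage: concatenate the two modalities' features, then select and classify.
# X = np.hstack([extract_audio_features(audio), extract_visual_features(video)])
# model = build_recognizer(k=2).fit(X, labels)
```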
Emotion recognition is challenging due to the emotional gap between emotions and audio-visual features. Motivated by the powerful feature learning ability of deep neural networks, this paper proposes to bridge the emotional gap by using a hybrid deep model, which first produces audio-visual segment ...
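The abstract breaks off before the model details. A common instantiation of such a hybrid deep model is a per-modality encoder applied to short segments, with the fused segment features passed through a recurrent layer for utterance-level prediction; the PyTorch sketch below shows only that general pattern, with arbitrary dimensions and layer choices that are not taken from the excerpt.

```python
# Illustrative hybrid deep model over audio-visual segment features:
# per-modality encoders, concatenation, and an LSTM over the segment sequence.
# Architecture and dimensions are assumptions for illustration only.
import torch
import torch.nn as nn

class HybridAVModel(nn.Module):
    def __init__(self, audio_dim=40, visual_dim=512, hidden=128, n_classes=6):
        super().__init__()
        self.audio_enc = nn.Sequential(nn.Linear(audio_dim, hidden), nn.ReLU())
        self.visual_enc = nn.Sequential(nn.Linear(visual_dim, hidden), nn.ReLU())
        # Temporal model over the sequence of fused segment features.
        self.rnn = nn.LSTM(2 * hidden, hidden, batch_first=True)
        self.classifier = nn.Linear(hidden, n_classes)

    def forward(self, audio_segs, visual_segs):
        # audio_segs: (batch, n_segments, audio_dim)
        # visual_segs: (batch, n_segments, visual_dim)
        seg_feats = torch.cat(
            [self.audio_enc(audio_segs), self.visual_enc(visual_segs)], dim=-1)
        _, (h, _) = self.rnn(seg_feats)    # utterance-level summary state
        return self.classifier(h[-1])      # emotion logits

# Usage with random tensors standing in for real segment features:
# logits = HybridAVModel()(torch.randn(4, 10, 40), torch.randn(4, 10, 512))
```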
Keywords: recognition, audio, affect, visual, audiovisual, multimodal. IEEE Transactions on Multimedia, Vol. 9, No. 2, February 2007. Audio-Visual Affect Recognition. Zhihong Zeng, Jilin Tu, Ming Liu, Thomas S. Huang, Brian Pianfetti, Dan Roth, and Stephen Levinson. Abstract: The ability of a computer to detect and appro...
Automatic emotion recognition systems predict high-level affective content from low-level human-centered signal cues. These systems have seen great improvements in classification accuracy, due in part to advances in feature selection methods. However, many of these feature selection methods capture only ...
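The excerpt is cut off mid-claim, so the specific limitation it targets is not shown. Purely as context for the feature-selection step it refers to, the snippet below shows a typical univariate selector (mutual information via scikit-learn), which scores each feature independently of the others; the array shapes and data are placeholders.

```python
# Generic univariate feature selection over extracted audio-visual features.
# Each feature is scored independently of the others; data are placeholders.
import numpy as np
from sklearn.feature_selection import SelectKBest, mutual_info_classif

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 300))      # 200 clips, 300 audio-visual features
y = rng.integers(0, 6, size=200)     # six universal emotion labels

selector = SelectKBest(mutual_info_classif, k=50)
X_selected = selector.fit_transform(X, y)
print(X_selected.shape)              # (200, 50)
```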
Moreover, how to fully utilize both audio and visual information is still an open problem. In this paper, we propose a novel multimodal fusion attention network for audio-visual emotion recognition based on adaptive and multi-level factorized bilinear pooling (FBP). First, for the audio stream,...
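Factorized bilinear pooling approximates a full bilinear (outer-product) interaction between the audio and visual feature vectors with two low-rank projections, an element-wise product, sum pooling over the rank dimension, and power plus L2 normalization. The PyTorch sketch below shows that core FBP operation with arbitrary dimensions; it does not reproduce the adaptive, multi-level attention variant proposed in the excerpt.

```python
# Core factorized bilinear pooling (FBP) of an audio and a visual feature
# vector: low-rank projections, element-wise product, sum pooling over the
# rank dimension, then power and L2 normalization. Dimensions are arbitrary.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FactorizedBilinearPooling(nn.Module):
    def __init__(self, audio_dim=128, visual_dim=256, out_dim=64, rank=4):
        super().__init__()
        self.rank = rank
        self.out_dim = out_dim
        self.proj_a = nn.Linear(audio_dim, out_dim * rank)
        self.proj_v = nn.Linear(visual_dim, out_dim * rank)

    def forward(self, a, v):
        # a: (batch, audio_dim), v: (batch, visual_dim)
        joint = self.proj_a(a) * self.proj_v(v)                       # (batch, out_dim*rank)
        joint = joint.view(-1, self.out_dim, self.rank).sum(dim=2)    # sum pooling over rank
        joint = torch.sign(joint) * torch.sqrt(joint.abs() + 1e-12)   # signed power normalization
        return F.normalize(joint, dim=1)                              # L2 normalization

# Usage: fused = FactorizedBilinearPooling()(torch.randn(8, 128), torch.randn(8, 256))
```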