3,238 Hits in 4.9 sec

Research on Speech Emotion Recognition Based on AA-CBGRU Network

Yu Yan, Xizhong Shen
2022 Electronics  
series information in the current speech emotion classification model, an AA-CBGRU network model is proposed for speech emotion recognition.  ...  with residual blocks, then uses the BGRU network with an attention layer to mine deep time series information, and finally uses the full connection layer to achieve the final emotion recognition.  ...  Conclusions This paper designs a reasonable and feasible speech emotion recognition system with the help of an advanced convolutional neural network and BGRU with attention mechanism.  ... 
doi:10.3390/electronics11091409 doaj:1593bbffd0e3471dbbfc8a336f9afaac fatcat:3tzuy6pxzfd3nfqdga5ujs5h5m
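
The pipeline this abstract outlines, a CNN front-end with residual blocks, a bidirectional GRU with an attention layer, and a final fully connected classifier, can be sketched roughly as below in PyTorch. This is a minimal illustration of that general architecture, not the authors' AA-CBGRU configuration; the layer counts, channel sizes, and additive attention pooling are assumptions made for the example.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """2-D convolutional residual block over a log-Mel spectrogram."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1)
        self.bn1 = nn.BatchNorm2d(channels)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)            # skip connection

class CnnBiGruAttention(nn.Module):
    """CNN front-end -> BiGRU -> additive attention pooling -> FC classifier."""
    def __init__(self, n_mels=64, hidden=128, n_classes=4):
        super().__init__()
        self.stem = nn.Conv2d(1, 32, 3, padding=1)
        self.res = ResidualBlock(32)
        self.pool = nn.AdaptiveAvgPool2d((8, None))   # squeeze frequency, keep time
        self.gru = nn.GRU(32 * 8, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden, 1)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, spec):                   # spec: (batch, 1, n_mels, time)
        h = self.pool(self.res(torch.relu(self.stem(spec))))
        h = h.permute(0, 3, 1, 2).flatten(2)   # (batch, time, channels * freq)
        h, _ = self.gru(h)                     # (batch, time, 2 * hidden)
        w = torch.softmax(self.attn(h), dim=1)
        utterance = (w * h).sum(dim=1)         # attention-weighted pooling over time
        return self.fc(utterance)
```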

A Classroom Emotion Recognition Model Based on a Convolutional Neural Network Speech Emotion Algorithm

Qinying Yuan, Sheng Bin
2022 Occupational Therapy International  
In this paper, we construct a model of convolutional neural network speech emotion algorithm, analyze the classroom identified by the neural network with a certain degree of confidence together with the  ...  model based on convolutional neural network speech emotion algorithm according to these characteristics.  ...  Then, a native recurrent neural network model is constructed for classroom text emotion recognition using GRU computational units, text recurrent encoder, or TRE model for short.  ... 
doi:10.1155/2022/9563877 pmid:35912313 pmcid:PMC9282988 fatcat:f5eyvu7dezctdj4noaax7wecte

Speaker Attentive Speech Emotion Recognition [article]

Clément Le Moine, Nicolas Obin, Axel Roebel
2021 arXiv   pre-print
The Speech Emotion Recognition (SER) task has seen significant improvements over recent years with the advent of Deep Neural Networks (DNNs).  ...  In this paper, we present novel work based on the idea of teaching the emotion recognition network about speaker identity.  ...  This research is supported by the MoVe project: "MOdelling of speech attitudes and application to an expressive conversational agent", and funded by the Paris Region Ph2D grant.  ... 
arXiv:2104.07288v1 fatcat:kdl6ca6iebgkrnoqbkj4gaeexu

A Comparative Study of Data Augmentation Techniques for Deep Learning Based Emotion Recognition [article]

Ravi Shankar, Abdouh Harouna Kenfack, Arjun Somayazulu, Archana Venkataraman
2022 arXiv   pre-print
Automated emotion recognition in speech is a long-standing problem.  ...  Our results demonstrate that long-range dependencies in the speech signal are critical for emotion recognition and that speed/rate augmentation offers the most robust performance gain across models.  ...  Shama (deeksha1@jhu.edu) and Natalie Aw (naw1@jhu.edu) for their helpful suggestions in designing experiments and setting them up on the computing cluster.  ... 
arXiv:2211.05047v1 fatcat:7dnexr7hzjd2dc7wt7hiacrjju
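
The speed/rate augmentation this abstract highlights amounts to resampling the waveform along the time axis. A minimal NumPy sketch of that idea follows; the function name and the sampling range in the usage note are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def speed_augment(waveform: np.ndarray, rate: float) -> np.ndarray:
    """Change playback speed of a mono waveform by resampling along time.

    rate > 1.0 speeds the utterance up (shorter output),
    rate < 1.0 slows it down (longer output). Pitch shifts along with speed,
    which is the usual behaviour of simple rate augmentation.
    """
    n_out = int(round(len(waveform) / rate))
    old_idx = np.arange(len(waveform))
    new_idx = np.linspace(0, len(waveform) - 1, num=n_out)
    return np.interp(new_idx, old_idx, waveform).astype(waveform.dtype)

# Example: perturb each training utterance by a random rate in [0.9, 1.1].
# augmented = speed_augment(wave, rate=np.random.uniform(0.9, 1.1))
```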

A Method upon Deep Learning for Speech Emotion Recognition

Nhat Truong Pham, Ngoc Minh Duc Dang, Sy Dung Nguyen
2021 Journal of Advanced Engineering and Computation  
All representations are fed into a self-attention mechanism with bidirectional recurrent neural networks to learn long-term global features and exploit context for each time step.  ...  Feature extraction and emotion classification play significant roles in speech emotion recognition.  ...  Acknowledgement: The authors would like to thank the Vietnam National Foundation for Science and Technology Development (NAFOSTED) under grant number 107.01-2019.328.  ... 
doi:10.25073/jaec.202044.311 fatcat:ykgxnz3yrnebvpvo7h5pb7s33m
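
Feeding frame-level features through a bidirectional recurrent encoder and then a self-attention layer, as this abstract describes, can be sketched as below. The feature dimension, hidden size, head count, and the mean pooling at the end are assumptions for illustration, not the authors' exact model.

```python
import torch
import torch.nn as nn

class BiRnnSelfAttention(nn.Module):
    """BiLSTM encoder followed by multi-head self-attention over all time steps."""
    def __init__(self, n_feats=40, hidden=128, n_heads=4, n_classes=4):
        super().__init__()
        self.rnn = nn.LSTM(n_feats, hidden, batch_first=True, bidirectional=True)
        self.attn = nn.MultiheadAttention(2 * hidden, n_heads, batch_first=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, feats):            # feats: (batch, time, n_feats)
        h, _ = self.rnn(feats)           # (batch, time, 2 * hidden)
        ctx, _ = self.attn(h, h, h)      # every step attends to the whole utterance
        return self.fc(ctx.mean(dim=1))  # average pooling over time
```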

Speech Emotion Recognition Using 3D Convolutions and Attention-based Sliding Recurrent Networks with Auditory Front-ends

Zhichao Peng, Xingfeng Li, Zhi Zhu, Masashi Unoki, Jianwu Dang, Masato Akagi
2020 IEEE Access  
and attention-based sliding recurrent neural networks (ASRNNs) for emotion recognition.  ...  INDEX TERMS Auditory front-ends, 3D convolutions, joint spectral-temporal representations, attention-based sliding recurrent networks, speech emotion recognition.  ...  Then, a temporal attention model is used to capture the important information related to emotion in each utterance. 1) SLIDING RECURRENT NEURAL NETWORKS: The sliding recurrent neural networks (SRNNs)  ... 
doi:10.1109/access.2020.2967791 fatcat:ewbn7nn4prbxlm22hizebvla3y

Parallelized convolutional recurrent neural network with spectral features for speech emotion recognition

Pengxu Jiang, Hongliang Fu, Huawei Tao, Peizhi Lei, Li Zhao
2019 IEEE Access  
To better acquire emotional features in speech signals, a parallelized convolutional recurrent neural network (PCRN) with spectral features is proposed for speech emotion recognition.  ...  INDEX TERMS Speech emotion recognition, parallelized convolutional recurrent neural network, convolutional neural network, long short-term memory.  ...  [13] proposed a 3-D attention-based convolutional recurrent neural network (ACRNN) for speech emotion recognition; they combine a CNN with an LSTM, and 3-D spectral features of segments are used as input  ... 
doi:10.1109/access.2019.2927384 fatcat:5kkafezat5ah7ix4m7vj6xpbke
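
A parallelized convolutional recurrent design, in the general sense used here, runs a CNN branch and a recurrent branch side by side on the same spectral features and concatenates their embeddings before classification. A rough PyTorch sketch of that idea follows; the branch depths and sizes are assumptions, not the published PCRN.

```python
import torch
import torch.nn as nn

class ParallelCRNN(nn.Module):
    """Two parallel branches over the same spectral features: a CNN branch for
    local spectro-temporal patterns and an LSTM branch for sequence dynamics;
    their embeddings are concatenated before the classifier."""
    def __init__(self, n_feats=40, hidden=128, n_classes=4):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),            # global pooling -> (batch, 64, 1, 1)
        )
        self.lstm = nn.LSTM(n_feats, hidden, batch_first=True)
        self.fc = nn.Linear(64 + hidden, n_classes)

    def forward(self, feats):                   # feats: (batch, time, n_feats)
        cnn_emb = self.cnn(feats.unsqueeze(1)).flatten(1)
        _, (h_n, _) = self.lstm(feats)          # last hidden state of the LSTM branch
        return self.fc(torch.cat([cnn_emb, h_n[-1]], dim=1))
```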

Exploring Spatio-Temporal Representations by Integrating Attention-based Bidirectional-LSTM-RNNs and FCNs for Speech Emotion Recognition

Ziping Zhao, Yu Zheng, Zixing Zhang, Haishuai Wang, Yiqin Zhao, Chao Li
2018 Interspeech 2018  
How to model spatio-temporal dynamics for speech emotion recognition effectively is still under active investigation.  ...  In this paper, we propose a method to tackle the problem of emotion-relevant feature extraction from speech by leveraging Attention-based Bidirectional Long Short-Term Memory Recurrent Neural Networks  ...  In [18], a convolutional recurrent neural network (CRNN) architecture is proposed for large-vocabulary speech recognition by combining a CNN and an LSTM-RNN.  ... 
doi:10.21437/interspeech.2018-1477 dblp:conf/interspeech/ZhaoZZWZL18 fatcat:syliqnnrdbhg3fouzruawoew3q

Speech Emotion Recognition Using Scalogram Based Deep Structure

2020 International Journal of Engineering  
Here, an SER method has been proposed based on a concatenated Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN).  ...  ABSTRACT: Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications.  ...  In this paper, an approach has been proposed to recognize emotion in speech signals using deep convolutional and recurrent networks with attention mechanisms.  ... 
doi:10.5829/ije.2020.33.02b.13 fatcat:aa3jbj3nrrbnzc7ezi5r3drjoy

Reusing Neural Speech Representations for Auditory Emotion Recognition [article]

Egor Lakomkin, Cornelius Weber, Sven Magg, Stefan Wermter
2018 arXiv   pre-print
Our experiments on the IEMOCAP dataset show ~10% relative improvements in accuracy and F1-score over the baseline recurrent neural network, which is trained end-to-end for emotion recognition.  ...  Acoustic emotion recognition aims to categorize the affective state of the speaker and is still a difficult task for machine learning models.  ...  ASR model: Our ASR model (see Figure 2) is a combination of convolutional and recurrent layers inspired by the DeepSpeech (Hannun et al., 2014) architecture for speech recognition.  ... 
arXiv:1803.11508v1 fatcat:5i7wdcyczbfrfkfv6kugsyyqlm

A Path Signature Approach for Speech Emotion Recognition

Bo Wang, Maria Liakata, Hao Ni, Terry Lyons, Alejo J. Nevado-Holgado, Kate Saunders
2019 Interspeech 2019  
Index Terms: speech emotion recognition, path signature feature, convolutional neural network. (Footnote 1: Following Rough Path theory notation, a path refers to a continuous function mapping from a compact time ...)  A simple tree-based convolutional neural network (TBCNN) is used for learning the structural information stemming from dyadic path-tree signatures.  ...  or recurrent neural networks for learning longer context and salient features [3, 4, 6, 7].  ... 
doi:10.21437/interspeech.2019-2624 dblp:conf/interspeech/0034LNLNS19 fatcat:jsycn3rg45fnlcwsslua7i3vza

Attention Driven Fusion for Multi-Modal Emotion Recognition

Darshana Priyasad, Tharindu Fernando, Simon Denman, Sridha Sridharan, Clinton Fookes
2020 ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)  
Baseline systems model emotion information in text and acoustic modes independently using Deep Convolutional Neural Networks (DCNN) and Recurrent Neural Networks (RNN), followed by applying attention,  ... 
doi:10.1109/icassp40776.2020.9054441 dblp:conf/icassp/PriyasadFDSF20 fatcat:nplbb7rnkveatkt4okse6ib37u
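
Attention-driven fusion of modality embeddings, in the generic sense described in the snippet, can be illustrated as below: acoustic and text embeddings are weighted by learned attention scores rather than simply concatenated. The dimensions and the single-layer scoring function are assumptions for the example, not the paper's fusion module.

```python
import torch
import torch.nn as nn

class AttentionFusion(nn.Module):
    """Fuse an acoustic embedding and a text embedding with learned attention
    weights instead of plain concatenation."""
    def __init__(self, dim=128, n_classes=4):
        super().__init__()
        self.score = nn.Linear(dim, 1)
        self.fc = nn.Linear(dim, n_classes)

    def forward(self, audio_emb, text_emb):      # each: (batch, dim)
        stacked = torch.stack([audio_emb, text_emb], dim=1)   # (batch, 2, dim)
        w = torch.softmax(self.score(stacked), dim=1)         # per-modality weights
        fused = (w * stacked).sum(dim=1)                      # weighted combination
        return self.fc(fused)
```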

An Ensemble Model for Multi-Level Speech Emotion Recognition

Chunjun Zheng, Chunli Wang, Ning Jia
2019 Applied Sciences  
A convolutional recurrent neural network (CRNN) combined with an attention mechanism is used for the internal training of experts. (2) By designing an ensemble learning model, each expert can play  ...  Because emotional expression is often correlated with the global features, local features, and model design of speech, it is often difficult to find a universal solution for effective speech emotion recognition  ...  Acknowledgments: This paper is funded by the Natural Science Foundation of Liaoning Province, Research on Emotional Analysis and Evaluation Model of Speech Reading Based on Machine Learning (20180551068  ... 
doi:10.3390/app10010205 fatcat:2fefwnhgandnfbdfa6beycgzwi
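
Ensemble learning over several independently trained experts, as in the snippet above, typically reduces at inference time to combining the experts' class probabilities. A minimal probability-averaging sketch follows; the combination rule is an assumption for illustration, and the paper's ensemble design may differ.

```python
import torch

def ensemble_predict(experts, feats):
    """Average the softmax outputs of several independently trained expert
    classifiers (any iterable of modules that accept `feats`) and pick the
    class with the highest mean probability."""
    with torch.no_grad():
        probs = torch.stack([torch.softmax(m(feats), dim=-1) for m in experts])
    return probs.mean(dim=0).argmax(dim=-1)
```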

EdgeRNN: A Compact Speech Recognition Network with Spatio-temporal Features for Edge Computing

Shunzhi Yang, Zheng Gong, Kai Ye, Yungen Wei, Zhenhua Huang, Zheng Huang
2020 IEEE Access  
In this paper, we propose a compact speech recognition network with spatio-temporal features for edge computing, named EdgeRNN.  ...  Alternatively, EdgeRNN uses a 1-Dimensional Convolutional Neural Network (1-D CNN) to process the overall spatial information of each frequency domain of the acoustic features.  ...  For example, Sun and Wu [14] combined attention mechanisms with a sparse autoencoder for speech emotion recognition.  ... 
doi:10.1109/access.2020.2990974 fatcat:quhnet2xkbbd7nqfrtgsujr3na
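
The EdgeRNN snippet describes a 1-D CNN applied across the frequency axis of each frame, with a recurrent network aggregating over time. A compact PyTorch sketch of that layout is given below; the kernel size, channel count, and single-layer GRU are assumptions, not the published EdgeRNN.

```python
import torch
import torch.nn as nn

class EdgeStyleRNN(nn.Module):
    """Compact model: per-frame 1-D convolution across frequency bins,
    then a single-layer GRU over time for temporal aggregation."""
    def __init__(self, n_bins=40, hidden=64, n_classes=4):
        super().__init__()
        self.conv = nn.Conv1d(1, 16, kernel_size=5, padding=2)   # over frequency
        self.gru = nn.GRU(16 * n_bins, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, feats):                    # feats: (batch, time, n_bins)
        b, t, f = feats.shape
        x = feats.reshape(b * t, 1, f)           # treat every frame independently
        x = torch.relu(self.conv(x)).reshape(b, t, -1)
        _, h_n = self.gru(x)                     # summary of the whole utterance
        return self.fc(h_n[-1])
```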

Deep Learning Techniques for Speech Emotion Recognition, from Databases to Models

Babak Joze Abbaschian, Daniel Sierra-Sosa, Adel Elmaghraby
2021 Sensors  
Ultimately, we present a multi-aspect comparison between practical neural network approaches in speech emotion recognition.  ...  The advancements in neural networks and the on-demand need for accurate and near real-time Speech Emotion Recognition (SER) in human–computer interactions make it mandatory to compare available methods  ...  convolutional and recurrent neural networks as a deep learning method.  ... 
doi:10.3390/s21041249 pmid:33578714 pmcid:PMC7916477 fatcat:nj5ihjhvnfcxtk7hu3n4zx4bka
Showing results 1 — 15 out of 3,238 results