Evaluation of a Sliding Window mechanism as Data Augmentation over Emotion Detection on Speech


  • Matheus Almeida Farias da Silva
  • Rafael Lima de Carvalho
  • Tiago da Silva Almeida




Speech Emotion Recognition, Voice Processing, Machine Learning, Deep Neural Networks


Emotion analysis is an important field of study, with many applications in security, finance, and politics. Although it is a subjective branch of study, emotion analysis can be simulated by Machine Learning algorithms trained for this purpose: using cataloged audio datasets, they can recognize patterns in these media that may be related to the corresponding emotion. Neural Network algorithms can recognize these emotions from audio alone, a task known as Speech Emotion Recognition (SER). However, Neural Network algorithms generally obtain uneven average recognition rates when applied to different audio datasets. This research evaluates a Data Augmentation method called Sliding Window, which generates more data samples in order to increase the average classification rates. The method was applied to three public datasets: EMO-DB, SAVEE, and RAVDESS. The experiments showed that it increases recognition rates by about 11.95% on EMO-DB, 22.76% on SAVEE, and 18.82% on RAVDESS when compared to other approaches in the literature.
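
The abstract does not give the details of the windowing, so the following is only a minimal sketch of how a sliding window can act as data augmentation: each labelled recording is cut into overlapping fixed-length segments, and every segment inherits the label of its recording. The function name `sliding_window_augment`, the parameters `window_size` and `hop_size`, and the choice to keep short signals whole are assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence, Tuple


def sliding_window_augment(
    signal: Sequence[float],
    label: str,
    window_size: int,
    hop_size: int,
) -> List[Tuple[Sequence[float], str]]:
    """Split one labelled signal into overlapping fixed-length windows.

    Hypothetical helper: window_size and hop_size are illustrative
    parameters, not values taken from the paper.
    """
    if window_size <= 0 or hop_size <= 0:
        raise ValueError("window_size and hop_size must be positive")
    # Assumption: a signal no longer than one window is kept whole.
    if len(signal) <= window_size:
        return [(signal, label)]
    samples = []
    # Advance the window start by hop_size; only full windows are kept.
    for start in range(0, len(signal) - window_size + 1, hop_size):
        samples.append((signal[start:start + window_size], label))
    return samples


# Toy example: a 10-sample "recording" with window 4 and hop 2.
toy_signal = list(range(10))
augmented = sliding_window_augment(toy_signal, "anger", window_size=4, hop_size=2)
for segment, emotion in augmented:
    print(segment, emotion)
```

In this toy case one recording yields four training samples, which is the sense in which the window generates more data. In practice the window would slide over raw audio samples or over frames of extracted features, and a hop smaller than the window controls how much neighbouring segments overlap.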




How to Cite

Farias da Silva, M.A. et al. 2021. Evaluation of a Sliding Window mechanism as Data Augmentation over Emotion Detection on Speech. Academic Journal on Computing, Engineering and Applied Mathematics. 2, 1 (Apr. 2021), 11–18. DOI:https://doi.org/10.20873/uft.2675-3588.2021.v2n1.p11-18.



Research Articles