SPEECH BASED EMOTION RECOGNITION USING VOICE

VamshidharSingh 15,856 views 26 slides Jun 27, 2021

About This Presentation

Using voice as input, we detect human emotions.


Slide Content

SPEECH BASED EMOTION RECOGNITION

“SPEECH BASED EMOTION RECOGNITION”
MAJOR PROJECT REVIEW
BY
G. HANNAH SANJANA 17P71A1209
MANASA MITTIPALLY 17P71A1218
VAMSHIDHAR SINGH 17P71A1248
UNDER THE GUIDANCE OF MRS M. SUPRIYA, ASSOCIATE PROFESSOR
DEPARTMENT OF INFORMATION TECHNOLOGY
SWAMI VIVEKANANDA INSTITUTE OF TECHNOLOGY
Mahbub College Campus, R.P Road, Secunderabad-03 (Affiliated to JNTUH)
2017-2021

CONTENTS:
1. Abstract
2. Introduction
3. Existing System
4. Disadvantages of Existing System
5. Proposed System
6. Advantages of Proposed System
7. System Specifications
8. UML Diagrams
9. Output Screens
10. Conclusion

ABSTRACT Speech emotion recognition is a trending research topic, with the main motive of improving human-machine interaction. At present, most work in this area extracts discriminatory features for the purpose of classifying emotions into various categories. Much of it relies on the utterance of words, which is used for lexical analysis for emotion recognition. In our project, a technique is used to classify emotions into 'Angry', 'Calm', 'Fearful', 'Happy', and 'Sad' categories.

ABSTRACT In previous works, the maximum cross-correlation between audio files was computed to label the speech data into one of only a few (three) emotion categories. Another work, developed in MATLAB, identifies the emotion of any audio file passed as an argument; a variety of classifiers from the MATLAB Classification Learner toolbox are used, again to classify only a few emotion categories. The proposed technique paves the way for a real-time prototype for speech emotion recognition built with open-source tools.

INTRODUCTION Speech emotion recognition is a technology that extracts emotion features from computer speech signals, compares them, and analyzes the feature parameters and the resulting emotion changes. Recognizing emotions from audio signals requires feature extraction and classifier training. The feature vector is composed of audio signal elements that characterize the specific properties of the speaker (such as pitch, volume, and energy), which is essential for training the classifier model to accurately recognize specific emotions.
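The features named above (energy, pitch, and so on) can be computed directly from a signal frame. The following is a minimal NumPy sketch, not the project's actual pipeline: frame energy, zero-crossing rate, and a crude autocorrelation-based pitch estimate, demonstrated on a synthetic 200 Hz tone standing in for a voiced frame.

```python
import numpy as np

def frame_features(signal, sr, frame_len=1024):
    """Simple per-frame features: energy, zero-crossing rate, and a
    crude pitch estimate via autocorrelation (illustrative only)."""
    frame = signal[:frame_len].astype(float)
    energy = float(np.sum(frame ** 2))                         # frame energy
    zcr = float(np.mean(np.abs(np.diff(np.sign(frame)))) / 2)  # zero-crossing rate
    # Autocorrelation pitch: strongest lag in a plausible voice range (50-400 Hz)
    ac = np.correlate(frame, frame, mode="full")[frame_len - 1:]
    lo, hi = sr // 400, sr // 50
    pitch = sr / (lo + int(np.argmax(ac[lo:hi])))
    return energy, zcr, pitch

# Synthetic 200 Hz tone as a stand-in for a voiced speech frame
sr = 16000
t = np.arange(1024) / sr
tone = np.sin(2 * np.pi * 200 * t)
energy, zcr, pitch = frame_features(tone, sr)
```

For the 200 Hz tone, the autocorrelation peaks one period (80 samples) out, so the pitch estimate recovers roughly 200 Hz.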

EXISTING SYSTEM The existing work in this area reveals that most of it relies on lexical analysis for emotion recognition, which has been used to classify emotions into three categories: Angry, Happy, and Neutral. The maximum cross-correlation between the discrete-time sequences of the audio signals is computed, and the highest degree of correlation between the testing audio file and a training audio file is used as an integral parameter for identifying a particular emotion type. A second technique extracts discriminatory features and uses a Cubic SVM classifier to recognize Angry, Happy, and Neutral emotion segments only.
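The cross-correlation idea described above can be sketched as follows. This is a NumPy illustration of the labeling rule, not the original MATLAB code: each reference recording is normalized, the peak cross-correlation with the test signal is computed, and the label of the best-matching reference wins. The synthetic signals and emotion names here are placeholders.

```python
import numpy as np

def best_match(test_sig, references):
    """Label a test signal with the emotion of the reference that yields
    the highest peak cross-correlation (the existing-system idea)."""
    a = test_sig - test_sig.mean()
    a = a / (np.linalg.norm(a) + 1e-12)        # unit-normalize for comparability
    scores = {}
    for emotion, ref in references.items():
        b = ref - ref.mean()
        b = b / (np.linalg.norm(b) + 1e-12)
        scores[emotion] = float(np.max(np.correlate(a, b, mode="full")))
    return max(scores, key=scores.get)

rng = np.random.default_rng(0)
happy = rng.standard_normal(500)
refs = {
    "Happy": happy,
    "Angry": rng.standard_normal(500),
    "Neutral": rng.standard_normal(500),
}
# A noisy copy of the "Happy" reference should correlate best with it
label = best_match(happy + 0.05 * rng.standard_normal(500), refs)
```

This also makes the scaling problem visible: every test file must be correlated against the whole reference set, which is exactly the slowness criticized on the next slide.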

DISADVANTAGES OF EXISTING SYSTEM:
- The system is very static in nature and performs poorly in real-time settings.
- The system is very slow, as it compares the correlations of the complete dataset against a single audio file.
- Variable-length audio files cannot be handled.
- Long pre-processing steps are required before the model can interpret the audio signal.
- Expensive and not upgradable.

PROPOSED SYSTEM In this project, MFCC (Mel-frequency cepstral coefficients) has been used as the feature for classifying speech data into various emotion categories, employing artificial neural networks. Using neural networks gives us the advantage of classifying many different types of emotion from variable-length audio signals in a real-time environment. This technique manages to establish a good balance between computational volume and performance accuracy for real-time processing.
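To make the MFCC feature concrete, here is a simplified single-frame computation in pure NumPy: window, power spectrum, triangular mel filterbank, log, then a DCT-II to get the cepstral coefficients. The filter counts and frame size are assumed for illustration; the project itself would typically rely on a library implementation rather than this sketch.

```python
import numpy as np

def hz_to_mel(f):
    return 2595 * np.log10(1 + f / 700)

def mel_to_hz(m):
    return 700 * (10 ** (m / 2595) - 1)

def mfcc_frame(frame, sr, n_mels=26, n_mfcc=13):
    """Simplified MFCC for one frame: window -> power spectrum ->
    mel filterbank -> log -> DCT-II. Illustrative, not library-exact."""
    n = len(frame)
    spec = np.abs(np.fft.rfft(frame * np.hamming(n))) ** 2   # power spectrum
    # Triangular filters spaced evenly on the mel scale
    mels = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n + 1) * mel_to_hz(mels) / sr).astype(int)
    fbank = np.zeros((n_mels, len(spec)))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)  # rising edge
        fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)  # falling edge
    logmel = np.log(fbank @ spec + 1e-10)
    # DCT-II decorrelates the log-mel energies into cepstral coefficients
    k = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_mfcc), (2 * k + 1) / (2 * n_mels)))
    return dct @ logmel

sr = 16000
frame = np.sin(2 * np.pi * 440 * np.arange(512) / sr)  # toy input frame
coeffs = mfcc_frame(frame, sr)
```

Each frame thus collapses to a short vector (13 values here), which is what makes variable-length audio tractable: any recording becomes a sequence of fixed-size feature vectors for the network.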

ADVANTAGES OF PROPOSED SYSTEM:
- Can be implemented on any hardware that supports the Python language.
- Very fast in processing audio and easy to use.
- Variable-length audio files are understood by the system.

SYSTEM SPECIFICATIONS:
HARDWARE:
Processor : Core i3
Hard disk : 250 GB
RAM : 8 GB
SOFTWARE:
Operating system : Windows 10
Programming language : Python 3.8.6

System Architecture:

CNN Model for Speech Recognition:
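The architecture diagram on this slide is an image and was not captured in the transcript. As a rough stand-in, here is a minimal NumPy forward pass of a 1-D CNN over MFCC frames, with assumed layer sizes and untrained random weights; the actual model, its layer counts, and its filter sizes are not specified in the text.

```python
import numpy as np

rng = np.random.default_rng(42)
EMOTIONS = ["Angry", "Calm", "Fearful", "Happy", "Sad"]

def conv1d(x, w, b):
    """Valid 1-D convolution: x is (in_ch, T), w is (out_ch, in_ch, k)."""
    out_ch, in_ch, k = w.shape
    T = x.shape[1] - k + 1
    y = np.zeros((out_ch, T))
    for t in range(T):
        y[:, t] = np.tensordot(w, x[:, t:t + k], axes=([1, 2], [0, 1])) + b
    return y

def forward(mfcc, params):
    """Conv -> ReLU -> global average pool -> dense -> softmax."""
    h = np.maximum(conv1d(mfcc, params["w1"], params["b1"]), 0)  # conv + ReLU
    pooled = h.mean(axis=1)                # global pooling handles any length
    logits = params["w2"] @ pooled + params["b2"]                # dense layer
    e = np.exp(logits - logits.max())
    return e / e.sum()                                           # softmax

params = {
    "w1": rng.standard_normal((8, 13, 5)) * 0.1, "b1": np.zeros(8),
    "w2": rng.standard_normal((5, 8)) * 0.1,     "b2": np.zeros(5),
}
mfcc = rng.standard_normal((13, 100))   # 13 MFCCs x 100 frames (demo input)
probs = forward(mfcc, params)
prediction = EMOTIONS[int(np.argmax(probs))]
```

The global pooling step is what lets one network score audio clips of any duration, matching the variable-length advantage claimed earlier.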

UML DIAGRAMS

USE CASE DIAGRAM:

SEQUENCE DIAGRAM:

CLASS DIAGRAM:

ACTIVITY DIAGRAM:

OUTPUT SCREENS

GUI FOR EMOTION RECOGNISER:

OUTPUT FOR HAPPY EMOTION:

OUTPUT FOR DIFFERENT EMOTIONS:

OUTPUT FOR DIFFERENT EMOTIONS:

CONCLUSION The CNN model was trained, and based on it we were able to predict a person's emotions from speech. The trained model gives an F1 score of 91.04%. 'Happy', 'Sad', 'Fearful', 'Calm', and 'Angry' are the five different emotions recognized by this project. This speech-based emotion recognition can be used to understand the opinions/sentiments people express regarding a product, a political opinion, etc., by giving audio as the input to this model.
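For reference, the F1 score quoted above is the harmonic mean of precision and recall. The confusion counts below are hypothetical, chosen only to show how a score near 91% arises from true positives, false positives, and false negatives; they are not the project's actual results.

```python
# F1 = 2PR / (P + R) = 2TP / (2TP + FP + FN)
# Hypothetical per-class confusion counts (illustration only)
tp, fp, fn = 61, 5, 7
precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
```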

FUTURE ENHANCEMENTS
- Making the system more accurate.
- Adding further emotions, such as Disgusted, Surprised, etc.
- Integrating the system with different platforms.

THANK YOU!