IVE 2024 Short Course Lecture 10 - Multimodal Emotion Recognition in Conversational Settings


About This Presentation

IVE 2024 short course Lecture 10 on Multimodal Emotion Recognition in Conversational Settings.

Lecture taught by Nastaran Saffaryazdi on July 17th 2024 at the University of South Australia.


Slide Content

Multimodal Emotion Recognition in Conversational Settings
Nastaran Saffaryazdi
Empathic Computing Lab

Motivation
●Emotions are multimodal processes that play a crucial role in our everyday lives
●Recognizing emotions is becoming more essential in a wide range of application domains
○Healthcare
○Education
○Entertainment
○Advertisement
○Customer services

Motivation
●Many of these application areas include human-human or human-machine conversations
●Conversational emotion recognition has focused mainly on facial expressions and text
●Human behavior can be controlled or faked
●Designing a general model using behavioral changes is difficult because of cultural or language differences
●Physiological data are more reliable, but the signals themselves are very weak

Research Focus
How can we combine various behavioral and physiological cues to recognize emotions in human conversations and enhance empathy in human-machine interactions?

Research Questions
RQ1: How can human body responses be employed to identify emotions?
RQ2: How can the data be obtained, captured, and processed simultaneously from multiple sensors?
RQ3: Can a combination of physiological cues be used to recognize emotions in conversations accurately?
RQ4: Can we increase the level of empathy between humans and machines using neural and physiological signals?

RQ1
What are the human body's responses to emotional stimuli, and how can these diverse responses be employed to identify emotions?
Reviewing and replicating existing research in human emotion recognition using various modalities.

Multimodal Emotion Recognition
●Behavioral datasets
○CMU-MOSEI
○SEMAINE
○IEMOCAP
○…
●Multimodal datasets with neural or physiological signals
○DEAP
○MAHNOB-HCI
○RECOLA
○SEED-IV
○AMIGOS
No research has specifically studied brain activity and physiological signals for recognizing emotion in human-human and human-machine conversations.

Study 1: Multimodal Emotion Recognition in Watching Video
23 participants, 21 to 44 years old (mean = 30, SD = 6), 13 females, 10 males, watching video stimuli
Saffaryazdi, N., Wasim, S. T., Dileep, K., Nia, A. F., Nanayakkara, S., Broadbent, E., & Billinghurst, M. (2022). Using facial micro-expressions in combination with EEG and physiological signals for emotion recognition. Frontiers in Psychology, 13, 864047.

Sensors
●OpenBCI EEG cap
○EEG
●Shimmer3 GSR+ module
○EDA
○PPG
●RealSense camera
○Facial video


RQ1 Summary
●Identified various modalities and recognition methods
●Fused facial micro-expressions with EEG and physiological signals
●Identified the limitations and challenges
○Acquiring data from multiple sensors simultaneously
○Real-time data monitoring
○Automatic scenario running
○Personality differences

RQ2
How can the required data for emotion recognition be obtained, captured, and processed simultaneously in conversation from multiple sensors?
Developing software for simultaneously acquiring, visualizing, and processing multimodal data.

Octopus Sensing
●Octopus-sensing
○Simple unified interface for
■Simultaneous data acquisition
■Simultaneous data recording
○Study design components
●Octopus-sensing-monitoring
○Real-time monitoring
●Octopus-sensing-visualizer
○Offline synchronous data visualizer
●Octopus-sensing-processing
○Real-time processing

Octopus Sensing
●Multiplatform
●Open-source (https://github.com/octopus-sensing)
●https://octopus-sensing.nastaran-saffar.me/
●Supports various sensors
a. OpenBCI
b. Brainflow
c. Shimmer3
d. Camera
e. Audio
f. Network (Unity and MATLAB)
Saffaryazdi, N., Gharibnavaz, A., & Billinghurst, M. (2022). Octopus Sensing: A Python library for human behavior studies. Journal of Open Source Software, 7(71), 4045.


How to use
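A minimal recording sketch following the usage pattern published in the Octopus Sensing documentation and JOSS paper; the module paths, class names (DeviceCoordinator, Shimmer3Streaming, CameraStreaming), and constructor parameters are assumptions about the current API and may differ between library versions:

from octopus_sensing.device_coordinator import DeviceCoordinator
from octopus_sensing.devices import Shimmer3Streaming, CameraStreaming
from octopus_sensing.common.message_creators import start_message, stop_message

# Register the devices once; the coordinator handles synchronized recording
device_coordinator = DeviceCoordinator()
device_coordinator.add_devices([
    Shimmer3Streaming(name="shimmer", output_path="./output"),            # EDA + PPG
    CameraStreaming(camera_no=0, name="camera", output_path="./output"),  # facial video
])

# Mark the start and end of each stimulus so all streams share the same event markers
device_coordinator.dispatch(start_message("experiment-1", "stimulus-1"))
# ... present the stimulus / run the conversation ...
device_coordinator.dispatch(stop_message("experiment-1", "stimulus-1"))

device_coordinator.terminate()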

Octopus-sensing-monitoring
●Monitor data from any machine on the same network
●monitoring = MonitoringEndpoint(device_coordinator)
 monitoring.start()
●pipenv run octopus-sensing-monitoring
●http://machine-IP:8080
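A hedged sketch of how the monitoring endpoint could be wired into the recording script above; the MonitoringEndpoint import path and the stop() call are assumptions based on the library's documented layout and may differ between versions:

from octopus_sensing.monitoring_endpoint import MonitoringEndpoint  # assumed module path

# device_coordinator is the DeviceCoordinator from the recording sketch above
monitoring = MonitoringEndpoint(device_coordinator)
monitoring.start()

# ... run the study while another machine on the same network runs
# "pipenv run octopus-sensing-monitoring" and opens http://machine-IP:8080 ...

monitoring.stop()  # assumed counterpart of start()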

Octopus-sensing-visualizer
●pipenv run octopus-sensing-visualizer
●http://localhost:8080
●Visualizes raw or processed data using a config file

Octopus-sensing-processing

RQ3
Can a combination of physiological cues be used to recognize emotions in conversations accurately?
Conducting user studies and creating datasets of multimodal data in various settings to compare human responses and explore recognition methods in different settings.

Multimodal Emotion Recognition in Conversation
●Creating a conversational setting in which people could feel emotions spontaneously
●10-minute conversation for each emotion topic
●Self-report (beginning, middle, end): arousal, valence, emotion

Study 2: Multimodal Emotion Recognition in Conversation
23 participants, 21 to 44 years old (mean = 30, SD = 6), 13 females, 10 males, face-to-face conversation
Saffaryazdi, N., Goonesekera, Y., Saffaryazdi, N., Hailemariam, N. D., Temesgen, E. G., Nanayakkara, S., ... & Billinghurst, M. (2022, March). Emotion recognition in conversations using brain and physiological signals. In 27th International Conference on Intelligent User Interfaces (pp. 229-242).

Study 2 - Result
●Self-report vs. target emotion

Study 2 - Result
●Recognition F-score

Study 3: Face-to-Face vs. Remote Conversations (Comparing Various Conversational Settings)
15 participants, 21 to 36 years old (mean = 28.6, SD = 5.7), 7 females, 8 males, face-to-face and Zoom conversations
Saffaryazdi, N., Kirkcaldy, N., Lee, G., Loveys, K., Broadbent, E., & Billinghurst, M. (2024). Exploring the impact of computer-mediated emotional interactions on human facial and physiological responses. Telematics and Informatics Reports, 14, 100131.

Study 3: Features
Data | Features
EDA | Phasic statistics, tonic statistics, and peak statistics
PPG | Heart rate variability time-domain features, extracted with the heartpy and neurokit2 libraries
Face | OpenFace action units 1, 2, 4, 5, 6, 7, 9, 10, 12, 14, 15, 17, 20, 23, 25, 26, 28, and 45
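A hedged sketch of how the EDA and PPG features in this table might be computed with neurokit2; the function names are real neurokit2 API, but the sampling rate and the exact feature set are illustrative assumptions rather than the study's actual pipeline:

import neurokit2 as nk

SAMPLING_RATE = 128  # assumption: real rates depend on the Shimmer3 configuration

def eda_features(eda_signal):
    # Decompose EDA into tonic and phasic components and summarize them
    signals, info = nk.eda_process(eda_signal, sampling_rate=SAMPLING_RATE)
    return {
        "tonic_mean": signals["EDA_Tonic"].mean(),
        "phasic_mean": signals["EDA_Phasic"].mean(),
        "phasic_std": signals["EDA_Phasic"].std(),
        "n_scr_peaks": signals["SCR_Peaks"].sum(),
    }

def ppg_hrv_features(ppg_signal):
    # Detect systolic peaks in the PPG signal, then compute HRV time-domain features
    signals, info = nk.ppg_process(ppg_signal, sampling_rate=SAMPLING_RATE)
    hrv = nk.hrv_time(info, sampling_rate=SAMPLING_RATE)
    return hrv.iloc[0].to_dict()  # e.g. HRV_MeanNN, HRV_SDNN, HRV_RMSSD, ...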

Study 3: Result
●3-way repeated-measures ART ANOVA
●Physiological differences between face-to-face and remote conversations
○Facial action units
■Action units associated with negative emotions were higher face-to-face
■Action units associated with positive emotions were higher in the remote condition
○EDA (tonic, phasic, and peak statistics)
■Reactions were substantial and immediate face-to-face (phasic mean higher face-to-face)
○PPG (HRV time-domain features)
■Higher HRV in remote conversations -> lower stress level, enhanced emotion regulation, more engagement

Study 3: Result
●One-way and two-way repeated-measures ART ANOVA
●Significant empathy factors
○Interviewer to participant

Study 3: Result
●Emotion recognition (see the sketch below)
○Feature extraction
○Random forest classifier
○Leave-one-subject-out cross-validation
●High accuracy -> high similarity between responses in the two settings
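A minimal sketch of this classification setup using scikit-learn's RandomForestClassifier with leave-one-subject-out evaluation via LeaveOneGroupOut; the feature, label, and participant arrays are hypothetical placeholders:

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score

# X: one row of EDA/PPG/face features per sample, y: arousal or valence label,
# groups: participant ID for each sample (hypothetical file names)
X = np.load("features.npy")
y = np.load("labels.npy")
groups = np.load("participants.npy")

clf = RandomForestClassifier(n_estimators=100, random_state=0)
logo = LeaveOneGroupOut()  # each fold holds out all samples of one participant
scores = cross_val_score(clf, X, y, cv=logo, groups=groups, scoring="f1_macro")
print(f"Mean F1 across held-out participants: {scores.mean():.3f}")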

Findings (RQ3)
●People showed more positive facial expressions in remote conversations
●People felt stronger and more immediate emotions in the face-to-face condition
●People felt a lower level of stress in the remote condition
●Limitations
○Sample size
○Effect of the interviewer
○Familiarity with remote conversations
●Cross-usage of the multimodal datasets is quite successful
●Physiological emotion recognition is effective for conversational emotion recognition
●We can use these datasets to train models for real-time emotion recognition

RQ4
Can we increase the level of empathy between humans and machines by using neural and physiological signals to detect emotions in real-time during conversations?
●Developing a real-time emotion recognition system using multimodal data (see the sketch below)
●Prototyping an empathetic conversational agent by feeding it the detected emotions in real-time
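A schematic sketch of what such a real-time loop could look like: buffer a short window of physiological samples, extract features, classify, and forward the predicted emotion to the conversational agent. Every name here (read_latest_window, extract_features, send_to_agent, the 5-second window) is a hypothetical placeholder, not the actual system's API:

import time
import joblib

WINDOW_SECONDS = 5  # hypothetical analysis window

model = joblib.load("emotion_model.pkl")  # e.g. a random forest trained offline

def realtime_loop(read_latest_window, extract_features, send_to_agent):
    # read_latest_window(seconds): returns the last few seconds of EDA/PPG/EEG samples
    # extract_features(window): the same feature extraction used for training
    # send_to_agent(label): pushes the predicted emotion to the conversational agent
    while True:
        window = read_latest_window(WINDOW_SECONDS)
        features = extract_features(window)
        label = model.predict([features])[0]
        send_to_agent(label)
        time.sleep(WINDOW_SECONDS)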

Study 4: Interaction with a Digital Human from Soul Machines
Human-computer conversation: 23 participants, 21 to 44 years old (mean = 30, SD = 6), 17 females, 6 males, human-digital human conversation

Human-Machine Conversation (Study 4)
●Interaction between human and digital human
■Neutral
■Empathetic
●Real-time emotion recognition based on physiological cues
●Evaluating interaction with the digital human
■Empathy factors
■Quality of interaction
■Degree of rapport
■Degree of liking
■Degree of social presence

Study 4 - Result
●Emotion recognition in real-time
○Arousal 69.1% and valence 57.3%
●Induction method evaluation

Real-time expression evaluation
●Appropriateness of reflected emotions in different agents
○Appropriate emotion (F(1, 166) = 10, p < 0.002)
○Appropriate time (F(1, 166) = 6, p < 0.01)

Empathy evaluation
●Overall Empathy: F(1, 166) = 27, p < 0.001
●Cognitive empathy: F(1, 166) = 4.7, p < 0.03
●Affective empathy: F(1, 166) = 5.4, p < 0.02

Human-Agent Rapport
●Degree of Rapport (DoR)
○F(1, 42) = 8.38, p = 0.006
●Degree of Liking (DoL)
○F(1, 42) = 6.64, p < 0.01
●Degree of Social Presence (DSP)
○Not significantly different
●Quality of Interaction (QoI)
○Not significantly different

Physiological Responses
●EEG
○No significant differences
○Shared neural processes
●EDA
○Higher skin conductance in interaction with the empathetic agent
■Higher emotional arousal (excitement, engagement)
●PPG
○Higher HRV in interaction with the empathetic agent
■Better mental health
■Lower anxiety
■Improved regulation of emotional responses
■Higher empathy
■Increased attention/engagement

Conclusions
●Multimodal emotion recognition using physiological modalities is a promising approach
●Four multimodal datasets collected in various settings
●The Octopus Sensing software suite
●Improved empathy between humans and machines using neural and physiological data

Thank you!