Research Publications
Capturing Mismatch between Textual and Acoustic Emotion Expressions for Mood Identification in Bipolar Disorder
In Interspeech, Sep 2023
Emotion is expressed through language and through vocal and facial expressions. A lack of emotional alignment between modalities is a symptom of mental disorders. We propose to quantify the mismatch between emotion expressed through language and through acoustics, which we refer to as Emotional Mismatch (EMM). EMM patterns differ between symptomatic and euthymic moods, and EMM statistics serve as an effective feature for mood recognition, reducing annotation cost while preserving mood identification performance.
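A minimal sketch of how such a mismatch score could be computed, assuming hypothetical per-utterance activation/valence predictions from separate text and audio emotion models; the distance measure and summary statistics below are illustrative choices, not necessarily the paper's exact EMM formulation:

```python
import numpy as np

def emotional_mismatch(text_preds, audio_preds):
    """Quantify per-utterance disagreement between emotion predicted
    from the transcript and emotion predicted from the audio signal.

    text_preds, audio_preds: arrays of shape (n_utterances, 2)
    holding (activation, valence) estimates on the same scale.
    """
    text_preds = np.asarray(text_preds, dtype=float)
    audio_preds = np.asarray(audio_preds, dtype=float)
    # Euclidean distance in activation/valence space as one possible
    # mismatch measure (an illustrative choice).
    emm = np.linalg.norm(text_preds - audio_preds, axis=1)
    # Summary statistics over a session can then serve as mood features.
    return {"mean": emm.mean(), "std": emm.std(), "max": emm.max()}
```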
Designing Interfaces for Delivering and Obtaining Generation Explanation Annotations
In Submission, Mar 2023
CAPSTONE: Capability Assessment Protocol for Systematic Testing of Natural Language Models Expertise
In Submission, Mar 2023
Prompt-based language models introduce uncertainty into classification and require users to try multiple prompts with varying temperatures to find the best fit. However, this approach fails to capture implicit differences between prompts or to provide an adequate vocabulary for describing them. To address this, we propose a text annotation framework that provides a structured approach to prompt definition and annotation. Better validation structures and structured prompts are necessary for using prompt-based systems at scale for labeling or retrieval.
Human-Centered Metric Design to Promote Generalizable and Debiased Emotion Recognition
In arXiv, Nov 2022
Designing metrics for emotion recognition is challenging because the task depends on subjective human perception. This paper proposes a template for deriving human-centered, automatic, optimizable evaluation metrics for emotion recognition models. The template uses model explanations and sociolinguistic wordlists and can be applied to a sample or to a whole dataset. The proposed metrics, which capture generalizability and debiasing improvement, are tested across three models, datasets, and sensitive variables. The metrics correlate with the models' performance and biased representations, and can be used to train models with increased generalizability, decreased bias, or both. The template is the first to provide quantifiable metrics for training and evaluating generalizability and bias in emotion recognition models.
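As a rough illustration of the kind of metric such a template could yield, the sketch below scores how much of an explanation's weight falls on terms from a sociolinguistic wordlist; the function and example wordlist are hypothetical stand-ins, not the paper's exact formulation:

```python
def wordlist_attribution_share(tokens, saliencies, wordlist):
    """Fraction of total explanation weight assigned to wordlist terms.

    tokens: list of str tokens for one input sentence.
    saliencies: list of non-negative attribution scores, one per token
        (e.g., from a saliency-based explanation method).
    wordlist: set of sensitive or domain-specific terms.
    """
    total = sum(saliencies)
    if total == 0:
        return 0.0
    flagged = sum(s for t, s in zip(tokens, saliencies) if t.lower() in wordlist)
    return flagged / total

# A high share on, e.g., gendered terms could point to a biased representation;
# averaging over a dataset gives an automatic, optimizable metric.
gendered = {"he", "she", "him", "her"}
print(wordlist_attribution_share(["she", "sounds", "angry"], [0.6, 0.1, 0.3], gendered))
```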
Mind the Gap: On the Value of Silence Representations to Lexical-Based Speech Emotion Recognition
In Interspeech, Sep 2022
Silence is crucial in speech perception, conveying emphasis and emotion. However, little research has examined how silence affects lexical-based emotion recognition. We present a novel framework that fuses linguistic and silence representations for emotion recognition in naturalistic speech. We investigate two methods of representing silence, both of which improve performance. Modeling silence as a token in a transformer language model significantly improves performance on the MSP-Podcast dataset. Analyses show that silence increases the attention placed on its surrounding words.
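A minimal sketch of the silence-as-token idea, assuming word-level timestamps are available; the pause threshold and token name below are illustrative, not the paper's exact configuration:

```python
SILENCE_TOKEN = "[SILENCE]"   # would also be added to the tokenizer's vocabulary
PAUSE_THRESHOLD = 0.5         # seconds; illustrative threshold

def insert_silence_tokens(words, starts, ends, threshold=PAUSE_THRESHOLD):
    """Rebuild a transcript, marking long inter-word pauses with a token
    so a transformer language model can attend to silence explicitly."""
    tokens = []
    for i, word in enumerate(words):
        if i > 0 and starts[i] - ends[i - 1] >= threshold:
            tokens.append(SILENCE_TOKEN)
        tokens.append(word)
    return " ".join(tokens)

# Example output: "I [SILENCE] don't know"
print(insert_silence_tokens(["I", "don't", "know"], [0.0, 1.2, 1.5], [0.3, 1.4, 1.9]))
```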
Controlled Evaluation of Explanations: What Might Have Influenced Your Model Explanation Efficacy Evaluation?
In Submission, Mar 2022
Factors affecting explanation efficacy include the algorithm used and the end user. NLP papers focus on algorithms for generating explanations but overlook these other factors. This paper examines how evaluations of saliency-based explanation methods for machine learning models change under controlled variables. We aim to provide a standardized list of variables for evaluating these explanations and show that state-of-the-art algorithms can rank differently when the evaluation criteria are controlled for.
Noise-Based Augmentation Techniques for Emotion Datasets: What Do We Recommend?
In ACL-SRW, 2020
Emotion datasets are limited in size, and multiple noise-based data augmentation approaches have been proposed to counteract this challenge in other speech domains. However, unlike in speech recognition and speaker verification, the underlying label of emotion data may change when noise is added. In this work, we propose a set of recommendations for noise-based augmentation of emotion datasets, based on human and machine performance evaluations of realistic noisy samples generated with multiple categories of environmental and synthetic noise.
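For context, noise-based augmentation typically mixes a noise recording into the clean signal at a target signal-to-noise ratio; the generic sketch below shows that mixing step (not the paper's exact generation procedure):

```python
import numpy as np

def add_noise_at_snr(clean, noise, snr_db):
    """Mix `noise` into `clean` at the requested SNR (in dB).

    Both inputs are 1-D float arrays at the same sample rate; the noise
    is tiled or truncated to match the length of the clean signal.
    """
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)
    noise = np.resize(noise, clean.shape)            # match lengths
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12
    # Scale the noise so 10*log10(clean_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise
```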
MuSE: Multimodal Stressed Emotion Dataset
In LREC, May 2020
This paper presents a dataset, Multimodal Stressed Emotion (MuSE), to study the multimodal interplay between the presence of stress and expressions of affect. We describe the data collection protocol, the possible areas of use, and the annotations for the emotional content of the recordings.
Privacy Enhanced Multimodal Neural Representations for Emotion Recognition
In AAAI and NeurIPS-W, Feb 2020
Identifying Mood Episodes Using Dialogue Features from Clinical Interviews
In Interspeech, Sep 2019
Mental health professionals assess symptom severity through semi-structured clinical interviews. During these interviews, they observe their patients’ spoken behaviors, including both what the patients say and how they say it. In this work, we move beyond acoustic and lexical information, investigating how higher-level interactive patterns also change during mood episodes.
MuSE-ing on the Impact of Utterance Ordering on Crowdsourced Emotion Annotations
In ICASSP, May 2019
Emotion expression and perception are inherently subjective. There is generally not a single annotation that can be unambiguously declared “correct.” As a result, annotations are colored by the manner in which they were collected, i.e., with or without context.
The PRIORI Emotion Dataset: Linking Mood to Emotion Detected In-the-Wild
In Interspeech, Sep 2018
This paper presents critical steps in developing a pipeline for mood state monitoring, including (a) a new in-the-wild emotion dataset, the PRIORI Emotion Dataset, (b) activation/valence emotion recognition baselines, and (c) the establishment of emotion as a meta-feature for mood state monitoring.
'Hang in there': Lexical and Visual Analysis to Identify Posts Warranting Empathetic Responses
In FLAIRS, Dec 2017
Saying "You deserved it!" to "I failed the test" is not a good idea. In this paper, we propose a method supported by hand-crafted features to judge if the discourse or statement requires an empathetic response.
'The Truth and Nothing But The Truth': Multimodal Analysis for Deception Detection
In ICDM-W, Jul 2017
We propose a data-driven method (SVMs) for automatic deception detection on real-life trial data, using visual cues (OpenFace features) and verbal cues (bag of words).
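A minimal sketch of that kind of pipeline using scikit-learn, with placeholder transcripts and pre-extracted visual feature vectors standing in for the actual trial data and OpenFace outputs:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

# Placeholder data: transcripts, per-clip visual feature vectors (e.g., averaged
# OpenFace action-unit activations), and deceptive/truthful labels.
transcripts = ["i did not take the money", "i was at home all evening"]
visual_feats = np.array([[0.2, 0.7, 0.1], [0.5, 0.3, 0.4]])
labels = [1, 0]  # 1 = deceptive, 0 = truthful

# Bag-of-words verbal cues concatenated with visual cues, fed to an SVM.
bow = CountVectorizer()
verbal_feats = bow.fit_transform(transcripts).toarray()
features = np.hstack([verbal_feats, visual_feats])

clf = SVC(kernel="linear").fit(features, labels)
print(clf.predict(features))
```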