Research Publications

Capturing Mismatch between Textual and Acoustic Emotion Expressions for Mood Identification in Bipolar Disorder

In Interspeech, Sep 2023

Minxue Niu, Amrit Romana, Mimansa Jaiswal, Melvin McInnis, Emily Mower Provost

Emotion Recognition
Mental Health
Text
Speech and Audio
Metric Design

Emotion is expressed through language as well as vocal and facial expressions, and a lack of emotional alignment between modalities can be a symptom of mental disorders. We propose to quantify the mismatch between emotion expressed through language and through acoustics, which we refer to as Emotional Mismatch (EMM). EMM patterns differ between symptomatic and euthymic moods, and EMM statistics serve as an effective feature for mood recognition, reducing annotation cost while preserving mood identification performance.
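A minimal sketch of the EMM idea, assuming per-utterance valence scores are already available from a text model and an audio model; the scoring inputs and feature names below are hypothetical, not the authors' implementation.

```python
import numpy as np

def emm_features(text_valence: np.ndarray, audio_valence: np.ndarray) -> dict:
    """Summarize the gap between text- and audio-derived emotion scores."""
    mismatch = np.abs(text_valence - audio_valence)  # per-utterance EMM
    return {
        "emm_mean": float(mismatch.mean()),
        "emm_std": float(mismatch.std()),
        "emm_max": float(mismatch.max()),
    }

# Toy example: one call's utterances, scored in [-1, 1] by each modality.
text_v = np.array([0.6, 0.2, -0.1, 0.5])
audio_v = np.array([-0.3, 0.1, -0.4, 0.0])
print(emm_features(text_v, audio_v))  # summary statistics for a mood classifier
```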

Designing Interfaces for Delivering and Obtaining Generation Explanation Annotations

In Submission, Mar 2023

Mimansa Jaiswal

Text
Data Annotation
Design

We design a user interface through which human annotators can provide explanations for text data. Such explanations can improve the transparency and interpretability of machine learning models, as well as their performance.

CAPSTONE: Capability Assessment Protocol for Systematic Testing of Natural Language Models Expertise

In Submission, Mar 2023

Mimansa Jaiswal

Text
Evaluation
Metric Design
Schema
Interpretation
Data Annotation
Foundation Models

Prompt-based language models introduce uncertainty into classification: users must try multiple prompts at varying temperatures to find the best fit, yet this trial-and-error approach cannot capture implicit differences between prompts or supply an adequate vocabulary for describing them. To address this, we propose a text annotation framework that provides a structured approach to prompt definition and annotation. Better validation structures and structured prompts are necessary before prompt-based systems can be used at scale for labeling or retrieval.
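A minimal sketch of the prompt-variation problem the paper describes: the same input labeled under several prompt templates and temperatures. The `query_model` function is a hypothetical stand-in for any LLM completion call.

```python
from collections import Counter
from itertools import product

PROMPTS = [
    "Classify the sentiment of: {text}\nAnswer:",
    "Is the following positive or negative?\n{text}\nLabel:",
]
TEMPERATURES = [0.0, 0.7]

def query_model(prompt: str, temperature: float) -> str:
    # Placeholder: a real system would call a language model here.
    return "positive"

def label_with_variants(text: str) -> Counter:
    """Collect labels across every (prompt, temperature) combination."""
    votes = Counter()
    for template, temp in product(PROMPTS, TEMPERATURES):
        votes[query_model(template.format(text=text), temp)] += 1
    return votes  # disagreement across variants is the uncertainty CAPSTONE targets

print(label_with_variants("The movie was fine, I guess."))
```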

Human-Centered Metric Design to Promote Generalizable and Debiased Emotion Recognition

In arXiv, Nov 2022

Mimansa Jaiswal, Emily Mower Provost

Debiasing
Emotion Recognition
Text
Model Training
Empirical Analysis
Generalization
Evaluation
Metric Design
Interpretation

Evaluating emotion recognition models is challenging because the task depends on subjective human perception. This paper proposes a template formulation that derives human-centered, automatic, optimizable evaluation metrics for emotion recognition models. The template uses model explanations and sociolinguistic wordlists, and can be applied to a single sample or a whole dataset. The proposed metrics measure generalizability and debiasing improvement, and are tested across three models, datasets, and sensitive variables. The metrics correlate with the models' performance and biased representations, and can be used to train models with increased generalizability, decreased bias, or both. To our knowledge, the template is the first to provide quantifiable metrics for training and evaluating generalizability and bias in emotion recognition models.
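A minimal sketch of one metric in the spirit of the template: how much of an explanation's saliency mass lands on sensitive wordlist terms. The wordlist and saliency values below are illustrative, not taken from the paper.

```python
GENDER_WORDS = {"he", "she", "his", "her", "man", "woman"}

def sensitive_saliency_share(tokens: list[str], saliency: list[float]) -> float:
    """Fraction of total (absolute) saliency assigned to sensitive-wordlist tokens."""
    total = sum(abs(s) for s in saliency) or 1.0
    hit = sum(abs(s) for t, s in zip(tokens, saliency) if t.lower() in GENDER_WORDS)
    return hit / total  # lower suggests less reliance on sensitive terms

tokens = ["She", "sounded", "really", "happy", "today"]
saliency = [0.40, 0.05, 0.10, 0.35, 0.10]  # e.g., from a saliency method
print(sensitive_saliency_share(tokens, saliency))  # 0.4
```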

Mind the Gap: On the Value of Silence Representations to Lexical-Based Speech Emotion Recognition

In Interspeech, Sep 2022

Matthew Perez, Mimansa Jaiswal, Minxue Niu, Cristina Gorrostieta, Matthew Roddy, Kye Taylor, Reza Lotfian, John Kane, Emily Mower Provost

Emotion Recognition
Text
Speech and Audio
Model Training
Interpretation

Silence is crucial in speech perception, conveying emphasis and emotion, yet little research has examined its effect on lexical-based emotion recognition. We present a novel framework that fuses linguistic and silence representations for emotion recognition in naturalistic speech. We investigate two methods of representing silence, both of which improve performance; in particular, modeling silence as a token in a transformer language model significantly improves performance on the MSP-Podcast dataset. Our analyses show that silence emphasizes the attention given to its surrounding words.
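A minimal sketch of representing silence as a token: a special marker is inserted into the transcript wherever the gap between word timestamps exceeds a threshold. The token name and pause threshold here are assumptions, not the paper's exact settings.

```python
SILENCE_TOKEN = "[SILENCE]"
MIN_PAUSE_S = 0.5  # hypothetical pause threshold

def tokens_with_silence(words: list[tuple[str, float, float]]) -> list[str]:
    """words: (word, start_s, end_s) triples from a forced aligner or ASR output."""
    out, prev_end = [], None
    for word, start, end in words:
        if prev_end is not None and start - prev_end >= MIN_PAUSE_S:
            out.append(SILENCE_TOKEN)  # the pause becomes a token the LM can attend to
        out.append(word)
        prev_end = end
    return out

aligned = [("I", 0.0, 0.2), ("just", 0.3, 0.5), ("can't", 1.4, 1.7), ("believe", 1.8, 2.2)]
print(tokens_with_silence(aligned))
# ['I', 'just', '[SILENCE]', "can't", 'believe']
```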

Controlled Evaluation of Explanations: What Might Have Influenced Your Model Explanation Efficacy Evaluation?

In Submission, Mar 2022

Mimansa Jaiswal, Minxue Niu

Text
Evaluation
Metric Design
Schema
Interpretation
Data Annotation

Explanation efficacy depends on many factors, including the algorithm used to generate the explanation and the end user consuming it. NLP papers tend to focus on algorithms for generating explanations while overlooking these other factors. This paper examines how saliency-based explanation methods for machine learning models behave under controlled variables. We provide a standardized list of variables for evaluating such explanations, and show that state-of-the-art algorithms can rank differently once evaluation criteria are controlled for.
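A minimal sketch of the ranking-instability point: the same explanation methods ordered by two different evaluation criteria. The scores are made up purely to illustrate how the "best" method depends on the criterion.

```python
SCORES = {
    # method: {criterion: score} -- all values illustrative
    "integrated_gradients": {"plausibility": 0.71, "faithfulness": 0.58},
    "lime":                 {"plausibility": 0.64, "faithfulness": 0.66},
    "attention":            {"plausibility": 0.77, "faithfulness": 0.41},
}

def rank_by(criterion: str) -> list[str]:
    return sorted(SCORES, key=lambda m: SCORES[m][criterion], reverse=True)

print(rank_by("plausibility"))  # attention ranked first
print(rank_by("faithfulness"))  # lime ranked first: the ranking flips with the criterion
```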

Noise-Based Augmentation Techniques for Emotion Datasets: What Do We Recommend?

In ACL-SRW, 2020

Mimansa Jaiswal, Emily Mower Provost

Data Augmentation
Emotion Recognition
Speech and Audio
Empirical Analysis

Multiple noise-based data augmentation approaches have been proposed to counteract the challenge of limited data in other speech domains. But, unlike in speech recognition and speaker verification, the underlying label of emotion data may change when noise is added. In this work, we propose a set of recommendations for noise-based augmentation of emotion datasets, based on human and machine evaluation of realistic noisy samples generated with multiple categories of environmental and synthetic noise.
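A minimal sketch of noise-based augmentation: mixing a noise signal into a clean waveform at a target signal-to-noise ratio. A real pipeline would use recorded environmental noise; here both signals are synthetic arrays.

```python
import numpy as np

def mix_at_snr(clean: np.ndarray, noise: np.ndarray, snr_db: float) -> np.ndarray:
    """Scale `noise` so the mixture has the requested signal-to-noise ratio."""
    noise = noise[: len(clean)]
    clean_power = np.mean(clean**2)
    noise_power = np.mean(noise**2) + 1e-12
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise

rng = np.random.default_rng(0)
speech = np.sin(2 * np.pi * 220 * np.linspace(0, 1, 16000))  # stand-in "speech"
noisy = mix_at_snr(speech, rng.normal(size=16000), snr_db=10.0)
```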

MuSE: Multimodal Stressed Emotion Dataset

In LREC, May 2020

Mimansa Jaiswal, Cristian-Paul Bara, Yuanhang Luo, Rada Mihalcea, Mihai Burzo, Emily Mower Provost

Data Collection
Confounding Factors
Emotion Recognition
Speech and Audio

This paper presents a dataset, Multimodal Stressed Emotion (MuSE), to study the multimodal interplay between the presence of stress and expressions of affect. We describe the data collection protocol, the possible areas of use, and the annotations for the emotional content of the recordings.

Privacy Enhanced Multimodal Neural Representations for Emotion Recognition

In AAAI and NeurIPS-W, Feb 2020

Mimansa Jaiswal, Emily Mower Provost

Confounding Factors
Emotion Recognition
Speech and Audio
Text
Model Training

Emotion recognition models can inadvertently encode sensitive demographic information, such as gender, alongside emotion. This paper investigates an adversarial training paradigm that learns multimodal neural representations from which such private information is harder to recover, while preserving emotion recognition performance.
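A minimal sketch of the adversarial idea behind privacy-enhanced representations: a gradient-reversal layer lets an encoder be trained against a demographic classifier. The layer sizes and heads are illustrative, not the paper's architecture.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        return x.clone()

    @staticmethod
    def backward(ctx, grad):
        return -grad  # flip gradients so the encoder unlearns the private attribute

encoder = nn.Sequential(nn.Linear(40, 64), nn.ReLU())
emotion_head = nn.Linear(64, 4)  # e.g., 4 emotion classes
gender_head = nn.Linear(64, 2)   # adversary predicting a private attribute

x = torch.randn(8, 40)           # stand-in feature batch
z = encoder(x)
emotion_logits = emotion_head(z)                    # trained normally
private_logits = gender_head(GradReverse.apply(z))  # trained adversarially
```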

Identifying Mood Episodes Using Dialogue Features from Clinical Interviews

In Interspeech, Sep 2019

Zakaria Aldeneh, Mimansa Jaiswal, Emily Mower Provost

Emotion Recognition
Text
Model Training
Speech and Audio
Empirical Analysis
Mental Health
Dialogue

Mental health professionals assess symptom severity through semi-structured clinical interviews. During these interviews, they observe their patients’ spoken behaviors, including both what the patients say and how they say it. In this work, we move beyond acoustic and lexical information, investigating how higher-level interactive patterns also change during mood episodes.
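A minimal sketch of what "higher-level interactive patterns" could look like in code: simple dialogue features computed from turn boundaries in a clinician-patient interview. The feature names below are illustrative, not the paper's feature set.

```python
def dialogue_features(turns: list[tuple[str, float, float]]) -> dict:
    """turns: (speaker, start_s, end_s) tuples, ordered in time."""
    patient = [(s, e) for spk, s, e in turns if spk == "patient"]
    gaps = [turns[i + 1][1] - turns[i][2] for i in range(len(turns) - 1)]
    return {
        "patient_talk_time": sum(e - s for s, e in patient),
        "avg_switch_gap": sum(gaps) / len(gaps) if gaps else 0.0,
        "num_turns": len(turns),
    }

interview = [("clinician", 0.0, 4.0), ("patient", 4.6, 12.0), ("clinician", 12.2, 15.0)]
print(dialogue_features(interview))
```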

MuSE-ing on the Impact of Utterance Ordering on Crowdsourced Emotion Annotations

In ICASSP, May 2019

Mimansa Jaiswal, Zakaria Aldeneh, Cristian-Paul Bara, Yuanhang Luo, Mihai Burzo, Rada Mihalcea, Emily Mower Provost

Emotion Recognition
Data Annotation
Empirical Analysis
Crowdsourcing

Emotion expression and perception are inherently subjective. There is generally not a single annotation that can be unambiguously declared “correct.” As a result, annotations are colored by the manner in which they were collected, i.e., with or without context.

The PRIORI Emotion Dataset: Linking Mood to Emotion Detected In-the-Wild

In Interspeech, Sep 2018

Soheil Khorram, Mimansa Jaiswal, John Gideon, Melvin McInnis, Emily Mower Provost

Emotion Recognition
Model Training
Speech and Audio
Empirical Analysis
Mental Health

This paper presents critical steps in developing a pipeline for mood state monitoring, including (a) a new in-the-wild emotion dataset, the PRIORI Emotion Dataset, (b) activation/valence emotion recognition baselines, and (c) the establishment of emotion as a meta-feature for mood state monitoring.
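A minimal sketch of emotion as a meta-feature: segment-level activation and valence predictions summarized into features a mood classifier could consume. The values and feature choices are illustrative.

```python
import numpy as np

def mood_meta_features(activation: np.ndarray, valence: np.ndarray) -> np.ndarray:
    """Aggregate segment-level emotion predictions across a recording period."""
    return np.array([
        activation.mean(), activation.std(),
        valence.mean(), valence.std(),
        np.mean(valence > 0),  # fraction of positive-valence segments
    ])

act = np.array([0.8, 0.6, 0.9, 0.7])    # predicted activation per segment
val = np.array([-0.2, -0.5, 0.1, -0.3])  # predicted valence per segment
print(mood_meta_features(act, val))      # input features for mood detection
```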

'Hang in there': Lexical and Visual Analysis to Identify Posts Warranting Empathetic Responses

In FLAIRS, Dec 2017

Mimansa Jaiswal, Sairam Tabibu, Erik Cambria

Emotion Recognition
Mental Health
Text

Saying "You deserved it!" to "I failed the test" is not a good idea. In this paper, we propose a method supported by hand-crafted features to judge if the discourse or statement requires an empathetic response.

'The Truth and Nothing But The Truth': Multimodal Analysis for Deception Detection

In ICDM-W, Jul 2017

Mimansa Jaiswal, Sairam Tabibu, Rajiv Bajpai

Emotion Recognition
Mental Health
Multimodal
Text
Speech and Audio

We propose a data-driven method (SVMs) for automatic deception detection in real-life trial data using visual cues (OpenFace) and verbal cues (Bag-of-Words).
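A minimal sketch of the fusion approach: visual descriptors concatenated with bag-of-words counts, then fed to an SVM. The features here are random placeholders standing in for OpenFace outputs and transcript vectors.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 60
visual = rng.normal(size=(n, 17))        # stand-in for OpenFace facial features
bow = rng.integers(0, 3, size=(n, 100))  # stand-in for Bag-of-Words counts
X = np.hstack([visual, bow])             # simple feature-level fusion
y = rng.integers(0, 2, size=n)           # 1 = deceptive, 0 = truthful

clf = SVC(kernel="linear").fit(X, y)
print(clf.predict(X[:5]))  # both modalities drive the decision
```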
