Research Publications
Capturing Mismatch between Textual and Acoustic Emotion Expressions for Mood Identification in Bipolar Disorder
In Interspeech, Sep 2023
Emotion is expressed through language and through vocal and facial expressions. A lack of emotional alignment between modalities is a symptom of mental disorders. We propose to quantify the mismatch between emotion expressed through language and through acoustics, which we refer to as Emotional Mismatch (EMM). EMM patterns differ between symptomatic and euthymic moods, and EMM statistics serve as an effective feature for mood recognition, reducing annotation cost while preserving mood identification performance.
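A minimal sketch of how such a mismatch score could be computed, assuming hypothetical per-utterance activation/valence predictions from separate text and audio emotion models; the distance measure and summary statistics below are illustrative choices, not necessarily the paper's exact EMM formulation:

```python
import numpy as np

def emotional_mismatch(text_preds, audio_preds):
    """Quantify per-utterance disagreement between emotion predicted
    from the transcript and emotion predicted from the audio signal.

    text_preds, audio_preds: arrays of shape (n_utterances, 2)
    holding (activation, valence) estimates on the same scale.
    """
    text_preds = np.asarray(text_preds, dtype=float)
    audio_preds = np.asarray(audio_preds, dtype=float)
    # Euclidean distance in activation/valence space as one possible
    # mismatch measure (an illustrative choice).
    emm = np.linalg.norm(text_preds - audio_preds, axis=1)
    # Summary statistics over a session can then serve as mood features.
    return {"mean": emm.mean(), "std": emm.std(), "max": emm.max()}
```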
Designing Interfaces for Delivering and Obtaining Generation Explanation Annotations
In Submission, Mar 2023
CAPSTONE: Capability Assessment Protocol for Systematic Testing of Natural Language Models Expertise
In Submission, Mar 2023
Prompt-based language models introduce uncertainty into classification and require users to try multiple prompts with varying temperatures to find the best fit. However, this approach fails to capture implicit differences between prompts or to provide an adequate vocabulary for describing them. To address this, we propose a text annotation framework that provides a structured approach to prompt definition and annotation. Better validation structures and structured prompts are necessary for using prompt-based systems at scale for labeling or retrieval.
Human-Centered Metric Design to Promote Generalizable and Debiased Emotion Recognition
In arXiv, Nov 2022
Designing metrics for emotion recognition is challenging because the task depends on subjective human perception. This paper proposes a template for deriving human-centered, automatic, optimizable evaluation metrics for emotion recognition models. The template uses model explanations and sociolinguistic wordlists and can be applied to a sample or to a whole dataset. The proposed metrics, which capture generalizability and debiasing improvement, are tested across three models, datasets, and sensitive variables. The metrics correlate with the models' performance and biased representations, and can be used to train models with increased generalizability, decreased bias, or both. The template is the first to provide quantifiable metrics for training and evaluating generalizability and bias in emotion recognition models.
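As a rough illustration of the kind of metric such a template could yield, the sketch below scores how much of an explanation's weight falls on terms from a sociolinguistic wordlist; the function and example wordlist are hypothetical stand-ins, not the paper's exact formulation:

```python
def wordlist_attribution_share(tokens, saliencies, wordlist):
    """Fraction of total explanation weight assigned to wordlist terms.

    tokens: list of str tokens for one input sentence.
    saliencies: list of non-negative attribution scores, one per token
        (e.g., from a saliency-based explanation method).
    wordlist: set of sensitive or domain-specific terms.
    """
    total = sum(saliencies)
    if total == 0:
        return 0.0
    flagged = sum(s for t, s in zip(tokens, saliencies) if t.lower() in wordlist)
    return flagged / total

# A high share on, e.g., gendered terms could point to a biased representation;
# averaging over a dataset gives an automatic, optimizable metric.
gendered = {"he", "she", "him", "her"}
print(wordlist_attribution_share(["she", "sounds", "angry"], [0.6, 0.1, 0.3], gendered))
```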
Mind the Gap: On the Value of Silence Representations to Lexical-Based Speech Emotion Recognition
In Interspeech, Sep 2022
Silence is crucial in speech perception, conveying emphasis and emotion. However, little research has examined how silence affects lexical-based emotion recognition. We present a novel framework that fuses linguistic and silence representations for emotion recognition in naturalistic speech. We investigate two methods of representing silence, both of which improve performance. Modeling silence as a token in a transformer language model significantly improves performance on the MSP-Podcast dataset. Analyses show that silence increases the attention placed on its surrounding words.
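A minimal sketch of the silence-as-token idea, assuming word-level timestamps are available; the pause threshold and token name below are illustrative, not the paper's exact configuration:

```python
SILENCE_TOKEN = "[SILENCE]"   # would also be added to the tokenizer's vocabulary
PAUSE_THRESHOLD = 0.5         # seconds; illustrative threshold

def insert_silence_tokens(words, starts, ends, threshold=PAUSE_THRESHOLD):
    """Rebuild a transcript, marking long inter-word pauses with a token
    so a transformer language model can attend to silence explicitly."""
    tokens = []
    for i, word in enumerate(words):
        if i > 0 and starts[i] - ends[i - 1] >= threshold:
            tokens.append(SILENCE_TOKEN)
        tokens.append(word)
    return " ".join(tokens)

# Example output: "I [SILENCE] don't know"
print(insert_silence_tokens(["I", "don't", "know"], [0.0, 1.2, 1.5], [0.3, 1.4, 1.9]))
```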
Controlled Evaluation of Explanations: What Might Have Influenced Your Model Explanation Efficacy Evaluation?
In Submission, Mar 2022
Factors affecting explanation efficacy include the algorithm used and the end user. NLP papers focus on algorithms for generating explanations but overlook these other factors. This paper examines how evaluations of saliency-based explanation methods for machine learning models change under controlled variables. We aim to provide a standardized list of variables for evaluating these explanations and show that state-of-the-art algorithms can rank differently when the evaluation criteria are controlled for.
Noise-Based Augmentation Techniques for Emotion Datasets: What Do We Recommend?
In ACL-SRW, 2020
Emotion datasets are limited in size, and multiple noise-based data augmentation approaches have been proposed to counteract this challenge in other speech domains. However, unlike in speech recognition and speaker verification, the underlying label of emotion data may change when noise is added. In this work, we propose a set of recommendations for noise-based augmentation of emotion datasets, based on human and machine performance evaluations of realistic noisy samples generated with multiple categories of environmental and synthetic noise.
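For context, noise-based augmentation typically mixes a noise recording into the clean signal at a target signal-to-noise ratio; the generic sketch below shows that mixing step (not the paper's exact generation procedure):

```python
import numpy as np

def add_noise_at_snr(clean, noise, snr_db):
    """Mix `noise` into `clean` at the requested SNR (in dB).

    Both inputs are 1-D float arrays at the same sample rate; the noise
    is tiled or truncated to match the length of the clean signal.
    """
    clean = np.asarray(clean, dtype=float)
    noise = np.asarray(noise, dtype=float)
    noise = np.resize(noise, clean.shape)            # match lengths
    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2) + 1e-12
    # Scale the noise so 10*log10(clean_power / scaled_noise_power) == snr_db.
    scale = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + scale * noise
```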
MuSE: Multimodal Stressed Emotion Dataset
In LREC, May 2020
This paper presents a dataset, Multimodal Stressed Emotion (MuSE), to study the multimodal interplay between the presence of stress and expressions of affect. We describe the data collection protocol, the possible areas of use, and the annotations for the emotional content of the recordings.
Privacy Enhanced Multimodal Neural Representations for Emotion Recognition
In AAAI and NeurIPS-W, Feb 2020
Identifying Mood Episodes Using Dialogue Features from Clinical Interviews
In Interspeech, Sep 2019
Mental health professionals assess symptom severity through semi-structured clinical interviews. During these interviews, they observe their patients’ spoken behaviors, including both what the patients say and how they say it. In this work, we move beyond acoustic and lexical information, investigating how higher-level interactive patterns also change during mood episodes.
MuSE-ing on the Impact of Utterance Ordering on Crowdsourced Emotion Annotations
In ICASSP, May 2019
Emotion expression and perception are inherently subjective. There is generally not a single annotation that can be unambiguously declared “correct.” As a result, annotations are colored by the manner in which they were collected, i.e., with or without context.
The PRIORI Emotion Dataset: Linking Mood to Emotion Detected In-the-Wild
In Interspeech, Sep 2018
This paper presents critical steps in developing a pipeline for mood state monitoring, including (a) a new in-the-wild emotion dataset, the PRIORI Emotion Dataset, (b) activation/valence emotion recognition baselines, and (c) the establishment of emotion as a meta-feature for mood state monitoring.
'Hang in there': Lexical and Visual Analysis to Identify Posts Warranting Empathetic Responses
In FLAIRS, Dec 2017
Saying "You deserved it!" to "I failed the test" is not a good idea. In this paper, we propose a method supported by hand-crafted features to judge if the discourse or statement requires an empathetic response.
'The Truth and Nothing But The Truth': Multimodal Analysis for Deception Detection
In ICDM-W, Jul 2017
We propose a data-driven method (SVMs) for automatic deception detection on real-life trial data, using visual cues (OpenFace features) and verbal cues (bag of words).
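A minimal sketch of that kind of pipeline using scikit-learn, with placeholder transcripts and pre-extracted visual feature vectors standing in for the actual trial data and OpenFace outputs:

```python
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import SVC

# Placeholder data: transcripts, per-clip visual feature vectors (e.g., averaged
# OpenFace action-unit activations), and deceptive/truthful labels.
transcripts = ["i did not take the money", "i was at home all evening"]
visual_feats = np.array([[0.2, 0.7, 0.1], [0.5, 0.3, 0.4]])
labels = [1, 0]  # 1 = deceptive, 0 = truthful

# Bag-of-words verbal cues concatenated with visual cues, fed to an SVM.
bow = CountVectorizer()
verbal_feats = bow.fit_transform(transcripts).toarray()
features = np.hstack([verbal_feats, visual_feats])

clf = SVC(kernel="linear").fit(features, labels)
print(clf.predict(features))
```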