Search Articles


Search Results (1-10 of 23)



Speech Emotion Recognition in Mental Health: Systematic Review of Voice-Based Applications


Database queries were performed using the PubMed, IEEE Xplore, arXiv, and ScienceDirect databases up until February 2025, with the following keyword search: (“emotion recognition” OR “affective computing” OR “emotional analysis”) AND (“psychiatry” OR “psychology”) AND (“speech” OR “voice”). During the screening process, 2 authors applied the eligibility criteria and selected the studies to be included in the systematic review.

Eric Jordan, Raphaël Terrisse, Valeria Lucarini, Motasem Alrahabi, Marie-Odile Krebs, Julien Desclés, Christophe Lemey

JMIR Ment Health 2025;12:e74260


Collection and Analysis of Repeated Speech Samples: Methodological Framework and Example Protocol


Potential confounding factors related to the speaker include hormonal variations within the menstrual cycle [12,13], fatigue [14], voice use habits [15-17], emotion [18], and hydration [19]. Systematic changes with age, menopause, and medication use have also been reported [20-23].

Nicholas Cummins, Lauren Louise White, Zahia Rahman, Catriona Lucas, Tian Pan, Ewan Carr, Faith Matcham, Johnny Downs, Richard Dobson, Thomas F Quatieri, Judith Dineley

JMIR Res Protoc 2025;14:e69431


Acoustic and Natural Language Markers for Bipolar Disorder: A Pilot, mHealth Cross-Sectional Study


Clinicians are trained to recognize variations in language and voice, along with gestures and facial expressions, implicitly assessing both coherence and organization of speech and natural language features. However, this process is inevitably vulnerable to inconsistencies and biases.

Cristina Crocamo, Riccardo Matteo Cioni, Aurelia Canestro, Christian Nasti, Dario Palpella, Susanna Piacenti, Alessandra Bartoccetti, Martina Re, Valentina Simonetti, Chiara Barattieri di San Pietro, Maria Bulgheroni, Francesco Bartoli, Giuseppe Carrà

JMIR Form Res 2025;9:e65555


Acoustic Features for Identifying Suicide Risk in Crisis Hotline Callers: Machine Learning Approach


However, voice messages become particularly important in special contexts such as in crisis hotline calls [24]. As a suicide prevention method, crisis hotlines play a crucial role in early detection and response to suicide risk [25]. The World Health Organization estimates that there are more than 1000 crisis hotlines worldwide.

Zhengyuan Su, Huadong Jiang, Ying Yang, Xiangqing Hou, Yanli Su, Li Yang

J Med Internet Res 2025;27:e67772


Speech and Language Therapists’ Perspectives of Virtual Reality as a Clinical Tool for Autism: Cross-Sectional Survey


‘Whose voice?’ Co-creating a technology research roadmap with autism stakeholders

Jodie Mills, Orla Duffy

JMIR Rehabil Assist Technol 2025;12:e63235


Longitudinal Changes in Pitch-Related Acoustic Characteristics of the Voice Throughout the Menstrual Cycle: Observational Study


One of the drawbacks of these studies is that, apart from a single study [15], voice recordings were collected at a single time point within a menstrual phase; thus, day-to-day changes in the voice were not assessed. Ovulation itself results from longitudinal changes in hormone levels relative to the preceding days. Therefore, it is essential to trace day-to-day changes in acoustic characteristics to understand the effect of ovulation on voice signals.

Jaycee Kaufman, Jouhyun Jeon, Jessica Oreskovic, Anirudh Thommandram, Yan Fossat

JMIR Form Res 2025;9:e65448


Use of Deep Neural Networks to Predict Obesity With Short Audio Recordings: Development and Usability Study


While the broad ramifications of obesity are well documented, recent scientific inquiries have begun to elucidate the potential alterations in voice characteristics that may be concurrent with obesity [5,6]. Several mechanisms are postulated to explain these alterations in vocal attributes. The deposition of adipose tissue near the vocal folds and larynx may influence vocal resonance and pitch, often resulting in variations in voice quality [7].

Jingyi Huang, Peiqi Guo, Sheng Zhang, Mengmeng Ji, Ruopeng An

JMIR AI 2024;3:e54885


Impact of Audio Data Compression on Feature Extraction for Vocal Biomarker Detection: Validation Study


These biomarkers are unique characteristics or acoustic patterns of an individual’s voice that can hold valuable information about their physical and mental well-being [2]. Human voice production requires the coordination of multiple biological systems; perturbations in these systems induced by various conditions or diseases can result in alterations in the characteristics of the human voice [3].

Jessica Oreskovic, Jaycee Kaufman, Yan Fossat

JMIR Biomed Eng 2024;9:e56246


Investigation of Deepfake Voice Detection Using Speech Pause Patterns: Algorithm Development and Validation


Deepfakes are generated through the aggregation of substantial data sets, including voice recordings, images, and video segments [3]. This research specifically targets the detection of audio deepfakes, relying solely on voice data for both deepfake development and detection method testing.

Nikhil Valsan Kulangareth, Jaycee Kaufman, Jessica Oreskovic, Yan Fossat

JMIR Biomed Eng 2024;9:e56245


Using Wearable Devices and Speech Data for Personalized Machine Learning in Early Detection of Mental Disorders: Protocol for a Participatory Research Study


Smartwatch activity data [5-8] and voice [9-11] have been demonstrated to predict depression, anxiety, and stress with high levels of accuracy. The novelty of our study lies in three pillars: First, our study is designed to collect longitudinal data from each participant. In contrast, previous studies on detecting depression from voice relied on data sets consisting of a single data point per participant, feeding a one-size-fits-all model.

Ramon E Diaz-Ramos, Isabella Noriega, Luis A Trejo, Eleni Stroulia, Bo Cao

JMIR Res Protoc 2023;12:e48210