Whispers of Sound: Enhancing Information Extraction from Depression Patients' Unstructured Data Through Audio and Text Emotion Recognition and Llama Fine-tuning

EasyChair Preprint 14991

18 pages • Date: September 21, 2024

Lin Gan, Xiaoyang Gao, Yifan Huang and Tao Yang

Abstract

Mental health issues present significant global challenges, affecting over 20% of adults at some point in their lives. While large language models have shown promise in various fields, their application in mental health remains underexplored. This study assesses how effectively these models can be applied to mental health, using the DAIC-WOZ text dataset and the RAVDESS audio dataset. Because text data lacks non-verbal cues and contains ambiguous terms, audio data was incorporated during training to address these gaps. This integration enhanced the models' ability to comprehend, extract, and summarize complex information, particularly in depression assessments. Additionally, technical optimizations, such as increasing the model's max_length to 8192, reduced GPU memory usage by 40%-50% and improved context processing, leading to substantial gains in handling complex mental health data.

Keyphrases: Depression, Llama, audio, fine-tuning, mental health, text

Links:

https://easychair.org/publications/preprint/XcFm

BibTeX entry

BibTeX does not have the right entry for preprints. This is a hack for producing the correct reference:

@booklet{EasyChair:14991,
  author    = {Lin Gan and Xiaoyang Gao and Yifan Huang and Tao Yang},
  title     = {Whispers of Sound: Enhancing Information Extraction from Depression Patients' Unstructured Data Through Audio and Text Emotion Recognition and Llama Fine-tuning},
  howpublished = {EasyChair Preprint 14991},
  year      = {EasyChair, 2024}}
