Whispers of Sound: Enhancing Information Extraction from Depression Patients' Unstructured Data Through Audio and Text Emotion Recognition and Llama Fine-tuning

EasyChair Preprint 14991
18 pages • Date: September 21, 2024

Abstract

Mental health issues present significant global challenges, affecting over 20% of adults at some point in their lives. While large language models have shown promise in various fields, their application in mental health remains underexplored. This study assesses how effectively these models can be applied to mental health, using the DAIC-WOZ text datasets and RAVDESS audio datasets. Given the challenges of missing non-verbal cues and ambiguous terms in text data, audio data was incorporated during training to address these gaps. This integration enhanced the models' ability to comprehend, extract, and summarize complex information, particularly in depression assessments. Additionally, technical optimizations, such as increasing the model's max_length to 8192, reduced GPU memory usage by 40%-50% and improved context processing, leading to substantial gains in handling complex mental health data.

Keyphrases: Depression, Llama, audio, fine-tuning, mental health, text