
Speech-Signal-Based Depression Detection

Final Report

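Approval number: 61901265

Project category: Young Scientists Fund

Funding amount: 250,000 CNY

Principal investigator: Mengyue Wu (吴梦玥)

Host institution: Shanghai Jiao Tong University

Discipline: F0117, Multimedia information processing

Completion year: 2022

Approval year: 2019

Project status: Completed

Project participants:

Keywords: affective computing; speech information processing; rich audio analysis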


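Chinese Abstract

The number of people with depression has grown sharply, yet psychiatrists take a long time to train and are in short supply. This project therefore studies depression detection from Chinese speech. Depression perception currently faces three challenges: first, there is no large-scale, standardized audio dataset of Chinese-speaking depression patients; second, the ability to perceive depressive emotion is limited; third, depression detection is neither robust nor sufficiently effective. For dataset construction, the project proposes a method that combines collection and generation to effectively expand the audio data of depression patients, and introduces social attribute labels to provide a solid basis for fine-grained depression detection. To address the limited perceptual power of a single modality and the difficulty of collecting multimodal data, it proposes a speech-based multi-information joint method that mines the acoustic, textual, and structured behavioral information in conversational speech. To address the limited power of audio in depression detection, it proposes a multi-label, fine-grained depression detection model that explores the synergy between depressive emotion and social attribute labels, achieves deeper interaction between audio data and depression severity, and improves the robustness of depression detection.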

English Abstract

The number of people with depression has increased significantly. However, psychiatrists take a long time to train and there are not enough of them. This project therefore studies depression detection in Chinese speech. At present, the perception of depressive emotion faces three challenges: first, there is no large-scale standardized audio dataset of Chinese depressed patients; second, the perception of depressive emotion is limited; third, depression detection is neither robust nor effective. For dataset construction, the project combines acquisition and generation to effectively expand the audio data of depressed patients. At the same time, it introduces social attribute tags, which provide a solid foundation for fine-grained depression detection. To address the limited capability of a single modality and the difficulty of acquiring multimodal data, a speech-based multi-information method is proposed to mine the speech, text, and structured behavior information in dialogue speech. To address the limited ability of audio in depression detection, a multi-label, fine-grained depression detection model is proposed to explore the synergistic relationship between depression and social attribute tags, achieve a deeper interaction between audio data and depression severity, and improve the robustness of depression detection.

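This project targets automatic depression detection from speech signals. The research proceeded as planned and focused on building depression datasets and on audio- and text-based depression detection and classification, ending with validation in real-world settings.

On the data side, the project collected speech from 1,211 people, including 708 patients with major depressive disorder, for a total of 580 hours, which is the largest depression speech dataset known to date. It also explored a new mode of human-machine consultation, adopted a novel three-stage data collection procedure, and publicly released the first depression consultation dataset (1,339 dialogues). In addition, the research found that another source of difficulty in detecting mental disorders is the subjectivity and uncertainty of how symptoms are expressed. The team therefore first extracted a symptom-to-disorder knowledge graph from the international standard diagnostic manual for mental disorders (DSM-5), then built the first dataset of mental disorder symptoms expressed in natural language. These results have been open-sourced.

On the detection algorithm side, the project studied self-supervised audio feature extraction, which addresses the inefficiency and poor transferability of earlier hand-crafted features and also improves detection performance over ordinary high-dimensional audio features; the work drew wide attention after publication. The project also proposed a hierarchical attention network model for natural language expressions, which improves the accuracy of early depression diagnosis.

Finally, for real-world application, the project proposed modeling methods for depressed speech in complex, high-noise environments, and studied voice activity detection under weak supervision as well as cross-modal acoustic scene understanding. A working depression detection system was then developed and deployed on both H5 (mobile web) and mini-program platforms, and it was promoted and used in demonstration applications during the pandemic. The project funded 16 papers in total, 15 of them in CCF A/B venues; 3 invention patent applications were filed and 3 patents were granted.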

Journal Papers

Voice Activity Detection in the Wild: A Data-Driven Approach Using Teacher-Student Training

DOI: 10.1109/taslp.2021.3073596

Published: 2021-05

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Impact factor: --

Authors: Heinrich Dinkel; Shuai Wang; Xuenan Xu; Mengyue Wu; Kai Yu

Corresponding author: Kai Yu

Towards Duration Robust Weakly Supervised Sound Event Detection

DOI: 10.1109/taslp.2021.3054313

Published: 2021-01-01

Journal: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Impact factor: 5.4

Authors: Heinrich Dinkel; Mengyue Wu; Kai Yu

Corresponding author: Kai Yu
