US-German Research Proposal: ADaptive low-latency SPEEch Decoding and synthesis using intracranial signals (ADSPEED)

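Basic information

Award number:
2011595
Principal investigator:
Dean Krusienski
Amount:
$604,800 (approx.)
Host institution:
Virginia Commonwealth University
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2021
Funding country:
United States
Start and end dates:
2021-01-01 to 2025-12-31
Project status:
Not yet concluded

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2011595&HistoricalAwards=false
Keywords: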
US German Research Proposal ADaptive

Project abstract

Recent research has demonstrated that it is possible to synthesize intelligible speech sounds directly from invasive measurements of brain activity. However, these approaches have a perceptible delay between brain activity and audible speech output, preventing a natural spoken communication. Furthermore, the approaches generally require pre-recorded speech and thus cannot be directly applied to people who are unable to speak and generate such recordings. This project aims to develop methods for synthesizing speech from brain activity without perceptible processing delay that do not rely on pre-recorded speech from the user. The ultimate goal is to develop a system that restores natural spoken communication to the millions of people who suffer from severe speech disorders, including those with complete loss of speech. The project is organized into three research thrusts. The first thrust focuses on asynchronous and acoustics-free model training, where novel surrogates to the user's vocalized speech will be created using approaches based on dynamic time warping and the inference of intended inner-speech acoustics from corresponding textual representations. The second thrust focuses on online validation and user adaptation, where the existing low-latency speech decoding and synthesis scheme, which is not inherently adaptable, will be validated in a closed-loop fashion using online human-subject experiments. This will provide valuable insights into how the user responds and adapts to the artificial, synthesized speech output. The third thrust focuses on the development and testing of low-latency system-user co-adaptation schemes. Co-adaptation, where both the user and system adapt to optimize the synthesized output, is crucial for revealing the elusive representations of inner (i.e., imagined or attempted) speech in the absence of a reliable surrogate for modeling. 
As a result, this research will simultaneously advance the understanding of the neural representations of inner speech and, in turn, co-adaptive inner speech decoding toward the development of practical closed-loop speech neuroprosthetics. A companion project is being funded by the Federal Ministry of Education and Research, Germany (BMBF). This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outputs

Journal papers (9)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Towards Closed-Loop Speech Synthesis from Stereotactic EEG: A Unit Selection Approach

DOI:
10.1109/icassp43922.2022.9747300
Publication date:
2022
Journal:
IEEE ICASSP
Impact factor:
0
Authors:
Angrick, Miguel; Ottenhoff, Maarten; Diener, Lorenz; Ivucic, Darius; Ivucic, Gabriel; Goulis, Sophocles; Colon, Albert J.; Wagner, Louis; Krusienski, Dean J.; Kubben, Pieter L.
Corresponding author:
Kubben, Pieter L.

Contributions of Stereotactic EEG Electrodes in Grey and White Matter to Speech Activity Detection

DOI:
10.1109/embc48229.2022.9871464
Publication date:
2022-07
Journal:
2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC)
Impact factor:
0
Authors:
P. Z. Soroush; Christian Herff; S. Riès; J. Shih; Tanja Schultz; D. Krusienski
Corresponding author:
P. Z. Soroush; Christian Herff; S. Riès; J. Shih; Tanja Schultz; D. Krusienski

An Interpretable Deep Learning Model for Speech Activity Detection Using Electrocorticographic Signals

DOI:
10.1109/tnsre.2022.3207624
Publication date:
2022
Journal:
IEEE Transactions on Neural Systems and Rehabilitation Engineering
Impact factor:
4.9
Authors:
Stuart, Morgan; Lesaja, Srdjan; Shih, Jerry J.; Schultz, Tanja; Manic, Milos; Krusienski, Dean J.
Corresponding author:
Krusienski, Dean J.

Self-Supervised Learning of Neural Speech Representations From Unlabeled Intracranial Signals

DOI:
10.1109/access.2022.3230688
Publication date:
2022
Journal:
IEEE Access
Impact factor:
3.9
Authors:
Lesaja, Srdjan; Stuart, Morgan; Shih, Jerry J.; Soroush, Pedram Z.; Schultz, Tanja; Manic, Milos; Krusienski, Dean J.
Corresponding author:
Krusienski, Dean J.

The nested hierarchy of overt, mouthed, and imagined speech activity evident in intracranial recordings