Computational Methods for Speech Analysis
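Basic Information
- Award number: 2120087
- Principal investigator:
- Award amount: $249,300
- Host institution:
- Host institution country: United States
- Project category: Standard Grant
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-08-01 to 2024-07-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract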
This research project will develop tools for testing hypotheses about human communication. Researchers generally study human communication from textual transcripts which omit vocal tone. The project will directly address the disconnect between the data-generating process - in which speakers and listeners use the auditory channel to convey both textual and non-textual signals - and the widespread practice of discarding speech audio. The investigators will extend their prior speech model, the Model of Audio and Speech Structure, to address some limitations of the model. In particular, the statistical extensions will accommodate multiple speakers and allow for the joint modeling of text and tone. To demonstrate the value of the statistical extensions, the model will be applied to two original video corpora - police body-worn camera footage and campaign speeches for federal office. New software will be developed that makes it easy for researchers to quickly annotate a large amount of speech audio. The browser-based tools will enable automatic and manual segmentation, along with labeling. Multiple graduate students will gain experience in computationally intensive research and software development. The tools to be developed will be incorporated into ongoing public-private collaborations to improve oversight of police officers in the field.
This research project will extend the Model of Audio and Speech Structure (MASS), which analyzes conversation as a nested stochastic process in which (i) the flow of conversation unfolds as a sequence of utterances transitioning between speakers and their vocal tones, based on contextual covariates; and (ii) the auditory signal within each utterance unfolds as a hidden Markov model that transitions between phonemes which generate sound. The model enables social scientists to test hypotheses about how conversations are structured by fixed covariates (e.g., speaker gender, conversation role) and time-varying covariates (e.g., exogenous external stimuli, endogenous conversation trajectory such as the previous speaker's tone). In its current implementation, however, MASS has two key limitations. First, it uses resource-intensive human annotations of tone for each speaker, which limits application to contexts with many unique speakers, such as police body-worn camera footage. This project will develop extensions allowing the model to borrow strength by partial pooling across speakers with similar speech profiles. Second, MASS incorporates text as externally given metadata. The project will develop a new approach for joint modeling of text and audio which will incorporate a dynamic topic model into the flow-of-conversation layer of MASS. The investigators will conduct two applications to demonstrate the value of the multi-speaker and joint text-audio modeling extensions.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
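The nested structure described in the abstract can be illustrated with a toy example. The following is a minimal sketch and not the investigators' implementation. It assumes a multinomial-logit form for the covariate-dependent transitions between conversation modes (speaker-tone pairs), and it assumes discrete acoustic symbols in the within-utterance hidden Markov model. Neither assumption is stated in the abstract, and every function name and parameter value below is hypothetical.

```python
import math


def softmax(scores):
    """Convert real-valued scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def mode_transition_probs(prev_mode, covariates, coefs):
    """Upper layer (assumed form): probability of each next mode.

    coefs[prev_mode][next_mode] holds one weight per covariate, so the
    transition out of prev_mode depends on the contextual covariates.
    """
    scores = [
        sum(w * x for w, x in zip(weights, covariates))
        for weights in coefs[prev_mode]
    ]
    return softmax(scores)


def hmm_likelihood(obs, init, trans, emit):
    """Lower layer: forward algorithm for a discrete-emission HMM.

    Returns the probability of the observed symbol sequence. It is
    unscaled, so it is only suitable for short toy sequences.
    """
    n = len(init)
    alpha = [init[s] * emit[s][obs[0]] for s in range(n)]
    for symbol in obs[1:]:
        alpha = [
            sum(alpha[r] * trans[r][s] for r in range(n)) * emit[s][symbol]
            for s in range(n)
        ]
    return sum(alpha)


# Hypothetical toy values: two modes, covariates = [intercept, indicator].
COEFS = [
    [[0.0, 0.0], [-1.0, 2.0]],  # weights for transitions out of mode 0
    [[0.0, 0.0], [0.5, 1.0]],   # weights for transitions out of mode 1
]
INIT = [0.6, 0.4]
TRANS = [[0.7, 0.3], [0.4, 0.6]]
EMIT = [[0.9, 0.1], [0.2, 0.8]]

next_mode_probs = mode_transition_probs(0, [1.0, 1.0], COEFS)
utterance_likelihood = hmm_likelihood([0, 1], INIT, TRANS, EMIT)
```

In a fuller model each mode would carry its own HMM parameters, so that the likelihood of an utterance's audio under each mode can be combined with the upper-layer transition probabilities. The sketch uses a single parameter set only to keep the example short.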
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Christopher Lucas
Form-function mismatches in (formally) definite English noun phrases: Towards a diachronic account
- DOI: 10.1075/la.171.12luc
- Year published: 2011
- Journal:
- Impact factor: 0
- Authors: Christopher Lucas
- Corresponding author: Christopher Lucas
Toward quantitative forecasts of volcanic ash dispersal: Using satellite retrievals for optimal estimation of source terms
- DOI:
- Year published: 2017
- Journal:
- Impact factor: 0
- Authors: M. Zidikheri; Christopher Lucas; Rodney J. Potts
- Corresponding author: Rodney J. Potts
Contact-induced grammatical change: Towards an explicit account
- DOI: 10.1075/dia.29.3.01luc
- Year published: 2012
- Journal:
- Impact factor: 0.7
- Authors: Christopher Lucas
- Corresponding author: Christopher Lucas
On Wilmsen on the development of postverbal negation in dialectal Arabic
- DOI: 10.13173/zeitarabling.67.0044
- Year published: 2018
- Journal:
- Impact factor: 1.1
- Authors: Christopher Lucas
- Corresponding author: Christopher Lucas
Can Transformer be Too Compositional? Analysing Idiom Processing in Neural Machine Translation
- DOI:
- Year published: 2022
- Journal:
- Impact factor: 0
- Authors: Verna Dankers; Christopher Lucas; Ivan Titov
- Corresponding author: Ivan Titov
Other grants held by Christopher Lucas
XMaS: The National Material Science Beamline Research Facility at the ESRF
- Grant number: EP/Y031164/1
- Fiscal year: 2024
- Funding amount: $249,300
- Project category: Research Grant
Dissecting macrophage regulation of lung epithelial regeneration
- Grant number: MR/X019314/1
- Fiscal year: 2023
- Funding amount: $249,300
- Project category: Fellowship
XMaS Capital Equipment Upgrade
- Grant number: EP/X035131/1
- Fiscal year: 2023
- Funding amount: $249,300
- Project category: Research Grant
Inflammation in Covid-19: Exploration of Critical Aspects of Pathogenesis (ICECAP)
- Grant number: MR/V028790/1
- Fiscal year: 2020
- Funding amount: $249,300
- Project category: Research Grant
XMaS: The UK Materials Science Facility at the ESRF
- Grant number: EP/S020802/1
- Fiscal year: 2018
- Funding amount: $249,300
- Project category: Research Grant
Arabic and contact-induced language change
- Grant number: AH/P014089/1
- Fiscal year: 2017
- Funding amount: $249,300
- Project category: Fellowship
Developing Electrochemical Structure-Function Relationships in Non-aqueous Electrolytes
- Grant number: EP/K002236/1
- Fiscal year: 2012
- Funding amount: $249,300
- Project category: Research Grant
Combined Atomic Imaging and Diffraction Studies of the Electrooxidation of Supported Metal Multilayers
- Grant number: EP/G068372/1
- Fiscal year: 2009
- Funding amount: $249,300
- Project category: Research Grant
Atomic-scale Structural Studies of the Electrochemical Interface
- Grant number: EP/F036418/1
- Fiscal year: 2008
- Funding amount: $249,300
- Project category: Research Grant
Exploiting XMaS Studies of Highly Correlated Electron Systems, Real Surfaces and Biomaterials
- Grant number: EP/F000766/1
- Fiscal year: 2007
- Funding amount: $249,300
- Project category: Research Grant
Similar NSFC grants
Computational Methods for Analyzing Toponome Data
- Grant number: 60601030
- Year approved: 2006
- Funding amount: CNY 170,000
- Project category: Young Scientists Fund
Similar overseas grants
Perceptual Methods for Speech Communication
- Grant number: RGPIN-2016-04412
- Fiscal year: 2022
- Funding amount: $249,300
- Project category: Discovery Grants Program - Individual
Human and synthesised speech structures: compositional approaches to analysis, generative methods and musical transformation
- Grant number: 2749034
- Fiscal year: 2022
- Funding amount: $249,300
- Project category: Studentship
Improving machine learning methods for classifying human brain responses (EEG) to speech sounds
- Grant number: 564301-2021
- Fiscal year: 2021
- Funding amount: $249,300
- Project category: University Undergraduate Student Research Awards
Transfer characteristics of emotional speech information toward elderly persons with hearing loss and development of novel speech morphing methods
- Grant number: 21K19794
- Fiscal year: 2021
- Funding amount: $249,300
- Project category: Grant-in-Aid for Challenging Research (Exploratory)
Perceptual Methods for Speech Communication
- Grant number: RGPIN-2016-04412
- Fiscal year: 2021
- Funding amount: $249,300
- Project category: Discovery Grants Program - Individual
Perceptual Methods for Speech Communication
- Grant number: RGPIN-2016-04412
- Fiscal year: 2019
- Funding amount: $249,300
- Project category: Discovery Grants Program - Individual
Using Psychophysical Methods to Understand and Improve Speech Recognition in Cochlear Implant Users
- Grant number: 10219806
- Fiscal year: 2019
- Funding amount: $249,300
- Project category:
Development of speech enhancement methods for conveying emotions equivalent to face-to-face communication
- Grant number: 19K20618
- Fiscal year: 2019
- Funding amount: $249,300
- Project category: Grant-in-Aid for Early-Career Scientists
Using Psychophysical Methods to Understand and Improve Speech Recognition in Cochlear Implant Users
- Grant number: 9921335
- Fiscal year: 2019
- Funding amount: $249,300
- Project category:
Using Psychophysical Methods to Understand and Improve Speech Recognition in Cochlear Implant Users
- Grant number: 10456756
- Fiscal year: 2019
- Funding amount: $249,300
- Project category: