International: An Analysis of Speaker Diarization Systems Errors
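Basic information
- Award number: 1135365
- Principal investigator:
- Amount: $15,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2011
- Funding country: United States
- Period: 2011-08-01 to 2012-07-31
- Status: Completed
- Source:
- Keywords: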
Project abstract
This project will support the three-month visit of a US PhD student to the Idiap Research Institute in Switzerland, a leading international laboratory that has developed a state-of-the-art diarization system. Speaker diarization is the task of determining "who spoke when" without a priori knowledge of the number of speakers or speaker identities. The focus of the current effort is to perform error analysis in audio-only speaker diarization for the meeting domain. There are two main areas of interest. The first is to build a framework to analyze speaker diarization performance on specific types of segments (e.g., speaker changes, interruptions, overlapped speech, short utterances, long utterances, etc.). By analyzing where speaker diarization systems perform poorly, speaker diarization researchers can focus on improving performance during those problematic types of segments. The second area is to compare speaker diarization performance across systems. The project has substantial broader impacts. Speaker diarization is a useful step in meeting analysis. Considering the time people spend in meetings, improved speaker diarization could be useful for a broad portion of the population. While the goal is to characterize current speaker diarization errors, the knowledge gained from this work will be useful for improving future speaker diarization systems. In particular, by comparing where errors occur across multiple systems, the speaker diarization community can gain insight into the strengths and weaknesses of the various systems, which could lead to a more novel way of combining systems to improve speaker diarization performance. In addition, the project will support the development of an international network of collaborators for a US graduate student.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Nelson Morgan
Updated MINDS report on speech recognition and understanding, Part 2 [DSP Education]
- DOI: 10.1109/msp.2009.932707
- Year: 2009
- Journal:
- Impact factor: 0
- Authors: J. Baker; Li Deng; S. Khudanpur; Chin; James R. Glass; Nelson Morgan; Douglas D. O'Shaughnessy
- Corresponding author: Douglas D. O'Shaughnessy
Updated MINDS Report on Speech Recognition and Understanding
- DOI: 10.1016/s1567-4231(09)70205-9
- Year: 2009
- Journal:
- Impact factor: 14.9
- Authors: J. Baker; Li Deng; S. Khudanpur; Chin; James R. Glass; Nelson Morgan
- Corresponding author: Nelson Morgan
Writing programs that scale with increasing numbers of cores should be as easy as writing programs for sequential computers
- DOI:
- Year: 2018
- Journal:
- Impact factor: 0
- Authors: K. Asanović; Rastislav Bodík; James Demmel; T. Keaveny; K. Keutzer; J. Kubiatowicz; Nelson Morgan; David A. Patterson; Koushik Sen; J. Wawrzynek; David Wessel; K. Yelick
- Corresponding author: K. Yelick
Other grants by Nelson Morgan
RI: Small: Collaborative Research: Towards Modeling Source Separation from Measured Cortical Responses
- Award number: 1320260
- Fiscal year: 2013
- Amount: $15,000
- Award type: Standard Grant
EAGER: Collaborative Research: Towards Modeling Human Speech Confusions in Noise
- Award number: 1248047
- Fiscal year: 2012
- Amount: $15,000
- Award type: Standard Grant
CI-P: Towards a Consensus Representation for Understanding Structure of Multiparty Conversations
- Award number: 0958561
- Fiscal year: 2010
- Amount: $15,000
- Award type: Standard Grant
OIA/MRI: Acquisition of a Computational Server for Large Vocabulary Connectionist Speech Recognition
- Award number: 0521210
- Fiscal year: 2005
- Amount: $15,000
- Award type: Standard Grant
ITR/PE+SY: Mapping Meetings: Language Technology to make Sense of Human Interaction
- Award number: 0121396
- Fiscal year: 2001
- Amount: $15,000
- Award type: Standard Grant
SGER: Incorporating Higher-Level Information into Dynamic Pronounciation Modeling for ASR
- Award number: 9713346
- Fiscal year: 1997
- Amount: $15,000
- Award type: Standard Grant
Robust Speech Recognition Using Vector Computing
- Award number: 9612778
- Fiscal year: 1997
- Amount: $15,000
- Award type: Standard Grant
Automatic Speech Recognition Based on Syllable-length Acoustic Models
- Award number: 9712579
- Fiscal year: 1997
- Amount: $15,000
- Award type: Continuing Grant
A System for Connectionist Speech Recognition Research
- Award number: 9311980
- Fiscal year: 1993
- Amount: $15,000
- Award type: Continuing Grant
Application of Signal Processing CAD to the Digital Realization of Artificial Neural Networks
- Award number: 8922354
- Fiscal year: 1990
- Amount: $15,000
- Award type: Continuing Grant
Similar National Natural Science Foundation of China (NSFC) grants
Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
- Award number:
- Approval year: 2024
- Amount:
- Award type: Collaborative Innovation Research Team
Intelligent Patent Analysis for Optimized Technology Stack Selection: Blockchain Business Registry Case Demonstration
- Award number:
- Approval year: 2024
- Amount:
- Award type: Research Fund for International Scientists
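Meta-analysis-based study of a yield-increase model for cotton irrigation in Xinjiang
- Award number: 41601604
- Approval year: 2016
- Amount: CNY 220,000
- Award type: Young Scientists Fund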
Research on meta-analysis methods for large-scale microarray datasets
- Award number: 31100958
- Approval year: 2011
- Amount: CNY 200,000
- Award type: Young Scientists Fund
Elucidating the artemisinin biosynthetic pathway using retrobiosynthetic NMR analysis
- Award number: 30470153
- Approval year: 2004
- Amount: CNY 220,000
- Award type: General Program
Similar overseas grants
An Analysis of Distance and Direct Discourse: Differences in Speaker-Specific Information and Communication Process
- Award number: 20K13086
- Fiscal year: 2020
- Amount: $15,000
- Award type: Grant-in-Aid for Early-Career Scientists
Qualitative Analysis of Foreign Language Learning from Multilingual (Multilingual Speaker) Proficient Experience
- Award number: 19K14053
- Fiscal year: 2019
- Amount: $15,000
- Award type: Grant-in-Aid for Early-Career Scientists
Analysis of Speech Style and Speaker Attributes in Narrative and Daily Conversational Sentences
- Award number: 18K11991
- Fiscal year: 2018
- Amount: $15,000
- Award type: Grant-in-Aid for Scientific Research (C)
RI: Medium: Assessing Speaker and Teacher Effectiveness through Gestural Analysis, EEG Recordings, and Eye Tracking
- Award number: 1513853
- Fiscal year: 2015
- Amount: $15,000
- Award type: Standard Grant
A study of voice conversion based on sophisticated control of speaker identity founded on tensor analysis.
- Award number: 23800015
- Fiscal year: 2011
- Amount: $15,000
- Award type: Grant-in-Aid for Research Activity Start-up
Analysis of Intra-Speaker Variation and Development of Distributed Speaker Recognition System
- Award number: 17300065
- Fiscal year: 2005
- Amount: $15,000
- Award type: Grant-in-Aid for Scientific Research (B)
Analysis and improvement of a high performance digital speaker system
- Award number: 16560310
- Fiscal year: 2004
- Amount: $15,000
- Award type: Grant-in-Aid for Scientific Research (C)
A voice analysis and synthesis Model based on speaker identity
- Award number: 238586-2000
- Fiscal year: 2002
- Amount: $15,000
- Award type: Industrial Research Fellowships
A voice analysis and synthesis Model based on speaker identity
- Award number: 238586-2000
- Fiscal year: 2001
- Amount: $15,000
- Award type: Industrial Research Fellowships
A voice analysis and synthesis Model based on speaker identity
- Award number: 238586-2000
- Fiscal year: 2000
- Amount: $15,000
- Award type: Industrial Research Fellowships













