CI-ADDO-NEW: Collaborative Research: The Speech Recognition Virtual Kitchen

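Basic information

Award number:
1305365
Principal investigator:
Florian Metze
Amount:
approx. $542,400
Awardee institution:
Carnegie-Mellon University
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2013
Funding country:
United States
Award period:
2013-09-01 to 2017-08-31
Status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1305365&HistoricalAwards=false
Keywords:
CI ADDO NEW Collaborative Research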

Project abstract

The Speech Recognition Virtual Kitchen

Performing successful research on end-to-end speech processing problems requires the integration of many individual tools (e.g., for data cleaning, acoustic model training, language modeling, data analysis, real-time audio, decoding, parsing, and synthesis). It is difficult for new researchers to get started in the field, simply because a typical lab environment consists of a hodgepodge of tools suited to a particular computing setup. This environment is hard to recreate, because few people are expert in the theory and practice of all these fields and able to debug and replicate experiments from scratch.

This research infrastructure project creates a "kitchen" environment based on Virtual Machines (VMs) to promote community sharing of research techniques, and provides solid reference systems as a tool for education, research, and evaluation. We liken VMs to a "kitchen" because they provide an environment into which one can install "appliances" (e.g., toolkits), "recipes" (scripts for creating state-of-the-art systems using these tools), and "ingredients" (spoken language data). The kitchen even holds "reference dishes" in the form of complete experiments with baseline runs, log files, etc., together with all that is needed to recreate and modify them.

The project is developing a community and repository by (a) building pilot VMs, (b) engaging the community in using and continuing to develop them on its own, and (c) evaluating the impact of providing VMs for education and research. We envision researchers as well as students downloading a VM, reproducing the baseline experiment, implementing changes, posting their results in the community, discussing them with other users who have worked on the same VM, merging improvements back into the VM, which is then redistributed, and finally publishing easily reproducible results.

Work on curriculum and project development will support the creation of engaging activities aimed specifically at undergraduate and graduate students.

Project outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Florian Metze

Fine-Grained Grounding for Multimodal Speech Recognition

DOI:
10.18653/v1/2020.findings-emnlp.242
Year:
2020
Journal:
International Journal of Innovative Science and Research Technology (IJISRT)
Impact factor:
0
Authors:
Tejas Srinivasan; Ramon Sanabria; Florian Metze; Desmond Elliott
Corresponding author:
Desmond Elliott

Robust audio-codebooks for large-scale event detection in consumer videos

DOI:
Year:
2013
Journal:
Interspeech
Impact factor:
0
Authors:
Shourabh Rawat; Peter F. Schulam; Susanne Burger; Duo Ding; Yipei Wang; Florian Metze
Corresponding author:
Florian Metze

Subspace mixture model for low-resource speech recognition in cross-lingual settings

DOI:
10.1109/icassp.2013.6639088
Year:
2013
Journal:
2013 IEEE International Conference on Acoustics, Speech and Signal Processing
Impact factor:
0
Authors:
Yajie Miao; Florian Metze; A. Waibel
Corresponding author:
A. Waibel

Multimodal Speech Recognition with Unstructured Audio Masking

DOI:
10.18653/v1/2020.nlpbt-1.2
Year:
2020
Journal:
ArXiv
Impact factor:
0
Authors:
Tejas Srinivasan; Ramon Sanabria; Florian Metze; Desmond Elliott
Corresponding author:
Desmond Elliott

Hierarchical Phone Recognition with Compositional Phonetics

DOI:
10.21437/interspeech.2021-1803
Year:
2021
Journal:
ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)
Impact factor:
0
Authors:
Xinjian Li; Juncheng Li; Florian Metze; A. Black
Corresponding author:
A. Black