
CCRI: Medium: Developing a Multi-Channel Naturalistic Audio Corpora for the Natural Language Processing Research Community

Basic information

Award number:
2016725
Principal investigator:
John Hansen
Amount:
$1.2115 million
Awardee institution:
University of Texas at Dallas
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2020
Funding country:
United States
Period:
2020-09-01 to 2024-08-31
Status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2016725&HistoricalAwards=false
Keywords:
CCRI Medium Developing Multi Channel

Abstract

This project is focused on developing a massive audio resource of real-world speech communications related to task solving by engineers and scientists working to ensure the success of one of history’s greatest scientific and technical accomplishments, the NASA Apollo missions. Major challenges exist in exploring team-based voice communications due to the lack of any comprehensive time-synchronized and transcribed audio resource. The research team will create the framework needed to recover and make available the entire team communications. This includes digitizing the extensive 50-year-old Apollo 30-track analog tape collection and advancing the necessary speech technology tools/resources to automatically generate meta-data that includes when speech occurs, all text transcripts, speaker identity, as well as aspects relating to speaker traits. The resource is expected to encompass 150,000 hours of audio from all Apollo missions. The research team will provide open user access to explore and navigate the community resource using an interactive web platform: Explore Apollo, as well as to download audio and corresponding meta-data: Fearless Steps – Explore Apollo. The project also includes ‘Finding Waldo’, a resource to identify all instances of individual NASA members across Apollo missions and to make these available to surviving personnel and family members as a tribute to the ‘Heroes behind the Heroes of Apollo’. The focused research communities include speech technology, speech and language communications, team psychology in social sciences, education/STEM, historians and preservation archivists. Outreach efforts will be through mini-workshops, tutorials, community challenges, and special sessions across various fields, providing opportunities to distribute the resource and receive user feedback to enhance it.
This resource will give engineers, scientists, educators and historians unique data to develop new theories and models for how people work and respond rapidly to challenging problems, as well as promote science and math based (STEM) goals for space, history and team-based learning.
This project is focused on developing a massive audio resource of real-world team-based speech communications by engineers and scientists working to ensure the success of one of history’s greatest technical achievements, the NASA Apollo missions. There is significant need from the speech technology community for access to natural big-data speech corpora to develop next-generation technologies. A critical challenge is the ability to employ audio that is team and task based and not simulated. This project will establish a sustainable multi-speaker task-based corpus generation process based on the recovery of Apollo missions, encompassing up to 150,000 hours of audio. Research activities include (i) establishing the framework needed to digitize the 50-year-old Apollo 30-track analog tape collection, and (ii) advancing speech technology tools/resources to automatically generate meta-data that include speech activity, speech recognition transcript generation, speaker identity, as well as aspects relating to speaker traits. Specific advancements will address acoustic and expanded lexicon/language model requirements to encompass communication traits for NASA engineers. The research team will provide open user access to explore and navigate the community resource using an interactive web platform: Explore Apollo, and to download audio and meta-data: Fearless Steps – Explore Apollo. The resource is significantly enhanced by advancing extensive machine learning speech technologies in transcript/meta-data generation for audio speaker diarization, the process of determining “who spoke, what, and when”.
The technology offers a unique opportunity to provide portions of history, and tangible pieces of technology for multi-purpose use. The open-access resource provides freely available meta-data to propose and develop algorithms for speech activity detection, keyword spotting, speaker variability, sentiment, accent, language identification, multimodal systems, conversational analysis, speaker turn detection and individual as well as team assessment. The concept of ‘Where’s Waldo’ is used as a metaphor to pay tribute and give personal recognition to the thousands of notable members across the Apollo missions, in addition to using deep learning strategies to develop effective speaker tagging and hot-spot detection systems. This will impact the lives of the Apollo members and their families and provide additional education resources for future generations while also serving as a historical archive.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project outputs

Journal articles (25)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Challenges in Metadata Creation for Massive Naturalistic Team-Based Audio Data

DOI:
10.21437/interspeech.2022-11243
Published:
2022
Journal:
ISCA INTERSPEECH-2022
Impact factor:
0
Authors:
Belitz, Chelzy;Hansen, John H.L.
Corresponding author:
Hansen, John H.L.

Speaker tracking across a massive naturalistic audio corpus: Apollo-11

DOI:
10.1121/10.0008574
Published:
2021
Journal:
The Journal of the Acoustical Society of America
Impact factor:
0
Authors:
Chandra Shekar, Meena;Hansen, John H.
Corresponding author:
Hansen, John H.

Automatic Measurement of Teachers' Talk: Indicators of Location and Quality in Science Activities

DOI:
Published:
2020
Journal:
CRIEI-2020: Conf. on Research Innovations in Early Intervention
Impact factor:
0
Authors:
Buzhardt, J.;Irvin, D.W.;Hansen, J.H.L.;Kothalkar, P.;Consolver, K.;Luo, Y.;Rous, B.
Corresponding author:
Rous, B.

Fearless Steps Apollo: Towards Community Resource Development for Science, Technology, Education, and Historical Preservation

DOI:
Published:
2024
Journal:
and Signal Processing
Impact factor:
0
Authors:
Hansen, J.H.L.;Joglekar, A.;Shekar, M.M.C.;Chen, S.-J.;Liu X.
Corresponding author:
Liu X.

FEARLESS STEPS: ADVANCEMENTS IN SPEECH TECHNOLOGY AND CORPUS DEVELOPMENT FOR NATURALISTIC AUDIO

DOI:
Published:
2023
Journal:
NASA Human Research Program Investigators Conference
Impact factor:
0
Authors:
Joglekar, A.;Hansen, J.H.L.;Yousefi, M.;Chandra Shekar, M.;Chen, S.-J.;Belitz, C.
Corresponding author:
Belitz, C.

Other publications by John Hansen

An energy and power-aware approach to high-level synthesis of asynchronous systems

DOI:
10.1109/iccad.2010.5654169
Published:
2010
Journal:
2010 IEEE/ACM International Conference on Computer-Aided Design (ICCAD)
Impact factor:
0
Authors:
John Hansen;Montek Singh
Corresponding author:
Montek Singh

Springer Publisher

DOI:
Published:
2004
Journal:
DSP in Mobile and Vehicular Systems
Impact factor:
0
Authors:
Huseyin Abut;John Hansen;Kazuya Takeda (Eds.)
Corresponding author:
Kazuya Takeda (Eds.)

Pedometer Use as Motivation for Physical Activity in Cardiac Tele-Rehabilitation

DOI:
10.5334/ijic.2288
Published:
2015
Journal:
International Journal of Integrated Care
Impact factor:
2.4
Authors:
C. Thorup;Mette Grønkjær;H. Spindler;J. Andreasen;John Hansen;B. Dinesen;Gitte Nielsen;E. E. Sørensen
Corresponding author:
E. E. Sørensen