CAREER: Landmark-Based Speech Recognition in Music and Speech Backgrounds

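Basic Information

Award number:
0132900
Principal investigator:
Mark Hasegawa-Johnson
Amount:
$395,800 (approx.)
Awardee institution:
University of Illinois at Urbana-Champaign
Awardee country:
United States
Award type:
Continuing Grant
Fiscal year:
2002
Funding country:
United States
Period:
2002-07-01 to 2008-06-30
Status:
Completed

Source: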
https://www.nsf.gov/awardsearch/showAward?AWD_ID=0132900&HistoricalAwards=false
Keywords:
CAREER Landmark Based Speech Recognition

Abstract

This is a Faculty Early Career Development (CAREER) award. The research will develop speech recognition and auditory scene analysis models that are probability distributions whose parameters can be trained from data and whose internal structures are capable of abstracting the perceptual response patterns of human listeners. Two broad research questions will be explored: (1) Can probability models representing the pitch, envelope, and timing of an acoustic source be computed and integrated in a tractable manner? (2) What are the theoretical and empirical requirements for the partitioning, training, and recognition scoring of probability models for landmark-based acoustic features? Landmarks in speech are identifiable points in the flow of sound over time, such as consonant releases and closures, vowel centers, and glide extrema. The educational component of this project includes significant curriculum development at both the undergraduate and graduate levels, and a strong investment in the mentoring of undergraduate and graduate research trainees.

This CAREER award recognizes and supports the early career-development activities of a teacher-scholar who is likely to become an academic leader of the twenty-first century. This is fundamental scientific research in acoustics and computer science, but it addresses the very practical problem that computers are still far worse at recognizing speech than human beings are. Speech recognition technology has already become an important industry, but it will become far more important in the future as mobile computing and computer-mediated communications make it necessary for millions of people to control machines verbally rather than by means of keyboards. The educational component of this work will train graduate students to be teachers and communicators, as well as researchers, thus preparing them to help build the base of personnel needed in this exciting, growing area.