Automated Large-Scale Phonetic Analysis: DASS Pilot
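Basic information
- Award number: 1625680
- Principal investigator: William Kretzschmar
- Amount: $377,300
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2016
- Funding country: United States
- Project period: 2016-08-15 to 2021-03-31
- Project status: Completed
- Source:
- Keywords: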
Project abstract
Generalizations about language contained in dictionaries and grammars hide the extensive variation in the way that speakers actually use language. Modern technology, however, now makes it possible to extract variation in pronunciation from spoken interviews by automated means. This research project uses available software to process sixty-four interviews with speakers from Florida, Georgia, Tennessee, Alabama, Mississippi, Louisiana, Arkansas, and Texas, recorded from 1968 to 1983. These interviews constitute a geographic and social sample of speakers across the Gulf States. All of the interview transcriptions, the vowel pronunciation data, and the visualizations will be presented on the website of the Linguistic Atlas Project. Detailed data on actual speaker variation bears on the industrial methods currently used for speech recognition and speech synthesis.
This project will be the first large-scale test of the complex systems model against acoustic phonetic data. The legacy interviews consist of over 200 GB of files containing 372 hours of digital audio. In the first stage of the research, vowel pronunciations will be extracted from a list of seventy-eight different words that were elicitation targets in the interviews, plus additional words found to occur frequently in the interviews, such as color terms, up to a total of three hundred words. The resulting data set will have approximately 22,500 vowel tokens per interview and nearly 1,500,000 tokens overall, a very large corpus of data on Southern American English. The second stage of the project will create visualizations of these tokens to determine the dimensions of variation in the realization of vowels per speaker, social category, and geographic area. The science of complex systems will be employed as a model in the analysis; it predicts that the wide range of realizations that occurs in the groups under analysis will be self-organized into nonlinear distributional patterns.
The extraction and display of the full range of vowel variation has the potential to improve industrial methods used for both speech recognition and speech synthesis, as it offers a detailed view of actual variation for speakers and groups rather than assuming a consistent or "average" realization of vowels.
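The corpus figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the function name `corpus_estimates` and the derived per-interview and per-hour rates are assumptions introduced here, not quantities reported by the project.

```python
# Figures quoted in the abstract.
INTERVIEWS = 64                 # "sixty-four interviews"
TOKENS_PER_INTERVIEW = 22_500   # "approximately 22,500 vowel tokens per interview"
AUDIO_HOURS = 372               # "372 hours of digital audio"


def corpus_estimates(interviews, tokens_per_interview, audio_hours):
    """Derive rough corpus-level totals from the per-interview figures.

    Hypothetical helper for illustration; the derived rates are simple
    averages and say nothing about how tokens are actually distributed.
    """
    total_tokens = interviews * tokens_per_interview
    return {
        "total_tokens": total_tokens,
        "hours_per_interview": audio_hours / interviews,
        "tokens_per_audio_hour": total_tokens / audio_hours,
    }


estimates = corpus_estimates(INTERVIEWS, TOKENS_PER_INTERVIEW, AUDIO_HOURS)
print(estimates["total_tokens"])  # 1440000
```

The product, 1,440,000 tokens, is consistent with the abstract's "nearly 1,500,000 tokens" given that the per-interview figure is itself approximate.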
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by William Kretzschmar
Conference: Language Variety in the South
- Award number: 2313720
- Fiscal year: 2023
- Amount: $377,300
- Project type: Standard Grant
Doctoral Dissertation Research: Investigating the Local Construction of Identity: Sociophonetic Variation in Smoky Mountain African American Women's Speech
- Award number: 0446888
- Fiscal year: 2005
- Amount: $377,300
- Project type: Standard Grant
Colorado Field Research for Linguistic Atlas of the Western States
- Award number: 0115654
- Fiscal year: 2001
- Amount: $377,300
- Project type: Standard Grant
Collaborative Research on the Geography of English Dialect Features by Self-Organizing Maps
- Award number: 9975657
- Fiscal year: 1999
- Amount: $377,300
- Project type: Standard Grant
Historical Databases of African American English and Gullah
- Award number: 9729149
- Fiscal year: 1998
- Amount: $377,300
- Project type: Standard Grant
Charting Linguistic Features by Density Estimation
- Award number: 9222279
- Fiscal year: 1993
- Amount: $377,300
- Project type: Standard Grant
Computer Tools for Phonetic Analysis: LAMSAS
- Award number: 8819749
- Fiscal year: 1989
- Amount: $377,300
- Project type: Standard Grant
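Similar NSFC (National Natural Science Foundation of China) grants
Dissecting the molecular genetic network of LARGE6, a key regulator of grain number per panicle in rice
- Award number:
- Approval year: 2022
- Amount: CNY 300,000
- Project type: Young Scientists Fund
Properties of topological quasiparticles in quantum spin liquids: quantum Monte Carlo and a new large-N theory
- Award number:
- Approval year: 2020
- Amount: CNY 620,000
- Project type: General Program
Molecular mechanism by which the Large Grain gene regulates seed weight in Brassica napus
- Award number: 31972875
- Approval year: 2019
- Amount: CNY 580,000
- Project type: General Program
Study of a retinal neovascularization model in Large PB/PB mice
- Award number: 30971650
- Approval year: 2009
- Amount: CNY 80,000
- Project type: General Program
Posterior localization of the discs large gene in the Drosophila oocyte and its mechanism of action in body-axis polarity formation
- Award number: 30800648
- Approval year: 2008
- Amount: CNY 200,000
- Project type: Young Scientists Fund
Molecular regulation by the LARGE gene of α-DG glycosylation and expression in oral cancer cells
- Award number: 30772435
- Approval year: 2007
- Amount: CNY 290,000
- Project type: General Program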
Similar overseas grants
A Smart Automated TEM Facility for Large Scale Analysis of Atomic Structure and Chemistry
- Award number: EP/X041204/1
- Fiscal year: 2023
- Amount: $377,300
- Project type: Research Grant
Enhancing Automated Software Evolution via Building and Utilizing Large-Scale Software Evolution Corpora
- Award number: 22H03567
- Fiscal year: 2022
- Amount: $377,300
- Project type: Grant-in-Aid for Scientific Research (B)
Automated monitoring and debugging of large scale manycore heterogeneous systems
- Award number: 507883-2016
- Fiscal year: 2020
- Amount: $377,300
- Project type: Collaborative Research and Development Grants
Development of An Automated High-Throughput Dried Blood Spot Assay to Facilitate Large Scale Screening for Type 1 Diabetes Risk
- Award number: 10020787
- Fiscal year: 2019
- Amount: $377,300
- Project type:
AI-DCL: Collaborative Research: EAGER: Understanding and Alleviating Potential Biases in Large Scale Employee Selection Systems: The Case of Automated Video Interviews
- Award number: 1921111
- Fiscal year: 2019
- Amount: $377,300
- Project type: Standard Grant
Automated monitoring and debugging of large scale manycore heterogeneous systems
- Award number: 507883-2016
- Fiscal year: 2019
- Amount: $377,300
- Project type: Collaborative Research and Development Grants
SBIR Phase I: Large Scale And Automated Unsupervised Machine Learning For Anomaly Detection
- Award number: 1843988
- Fiscal year: 2019
- Amount: $377,300
- Project type: Standard Grant
Development of An Automated High-Throughput Dried Blood Spot Assay to Facilitate Large Scale Screening for Type 1 Diabetes Risk
- Award number: 9910015
- Fiscal year: 2019
- Amount: $377,300
- Project type:
AI-DCL: Collaborative Research: EAGER: Understanding and Alleviating Potential Biases in Large Scale Employee Selection Systems: The Case of Automated Video Interviews
- Award number: 1921087
- Fiscal year: 2019
- Amount: $377,300
- Project type: Standard Grant
Automated monitoring and debugging of large scale manycore heterogeneous systems
- Award number: 507883-2016
- Fiscal year: 2018
- Amount: $377,300
- Project type: Collaborative Research and Development Grants