
Automated Large-Scale Phonetic Analysis: DASS Pilot


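Basic Information

Award number:
1625680
Principal investigator:
William Kretzschmar
Amount:
approximately $377,300
Institution:
University of Georgia Research Foundation Inc
Institution country:
United States
Award type:
Standard Grant
Fiscal year:
2016
Funding country:
United States
Start and end dates:
2016-08-15 to 2021-03-31
Project status:
Completed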

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1625680&HistoricalAwards=false
Keywords:
Automated Large Scale Phonetic Analysis

Project Abstract

Generalizations about language contained in dictionaries and grammars hide the extensive variation in the way that speakers actually use language. However, modern technology now makes it possible to use automated means to extract variation in pronunciation from spoken interviews. This research project uses available software to process sixty-four interviews with speakers from Florida, Georgia, Tennessee, Alabama, Mississippi, Louisiana, Arkansas, and Texas recorded from 1968 to 1983. These interviews constitute a geographic and social sample of speakers across the Gulf States. All of the transcriptions of the interviews, the vowel pronunciation data, and the visualizations will be presented on the website of the Linguistic Atlas Project. Detailed data on actual speaker variation addresses the industrial methods currently used for speech recognition and speech synthesis.
This project will be the first large-scale test of the complex systems model against acoustic phonetic data. The legacy interviews consist of over 200 GB of files containing 372 hours of digital audio interviews. In the first stage of the research, vowel pronunciations will be extracted from a list of seventy-eight different words that were elicitation targets in the interviews, plus additional words found to occur frequently in the interviews such as color terms, up to a total of three hundred words. The resulting data set will have approximately 22,500 vowel tokens per interview, nearly 1,500,000 tokens across the data set, a very large corpus of data on Southern American English. The second stage of the project will create visualizations of these tokens to determine the dimensions of variation in the realization of vowels per speaker, social category, and geographic area. The science of complex systems will be employed as a model in the analysis, which predicts that the wide range of realizations that occurs in the groups under analysis will be self-organized into nonlinear distributional patterns.
The extraction and display of the full range of vowel variation has the potential to improve industrial methods used for both speech recognition and speech synthesis, as it offers a detailed view of actual variation for speakers and groups rather than assuming a consistent or "average" realization of vowels.
