STTR Phase I: Small Footprint Speech Synthesis
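Basic information
- Award number: 0441125
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2005
- Funding country: United States
- Project period: 2005-01-01 to 2006-06-30
- Project status: Completed
- Source:
- Keywords:
Project abstract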
This Small Business Technology Transfer (STTR) Phase I project aims to develop and implement a new text-to-speech synthesis (TTS) algorithm that will (i) dramatically reduce disk and memory requirements at a given speech quality level and (ii) minimize the amount of voice recordings needed to create a new synthetic voice. Most current TTS systems operate by concatenating segments of recorded speech (acoustic units). A challenge for TTS is coarticulation: the dependency of the acoustic manifestations of a phoneme on its neighbors. Current TTS systems use multi-phone acoustic units such as diphones, which preserve coarticulatory patterns naturally present in speech. However, this approach requires a large amount of recordings and produces systems with large footprints. Biospeech proposes a uniphone approach that addresses coarticulation with an explicit model. The method uses complex spectral vectors (basis vectors) representing brief segments of speech inside single phonemes, and decomposes each into two components: a formant vector and a spectral balance vector. To generate speech, the formant and spectral balance vectors derived from the basis vectors of successive phonemes are subjected to separate, and hence generally asynchronous, interpolation operations using time-varying weights. The resulting formant and spectral balance trajectories are recombined into a trajectory in complex spectral space, which is then converted into output speech with the inverse Fourier transform. Asynchronicity is necessitated by the quasi-independence of the articulators underlying different spectral features (e.g., frication, formant frequencies).
First, the proposed work has implications for other speech technologies, including Automatic Speech Recognition (ASR). Current ASR technologies address coarticulation by using multi-phone units, typically triphones. The number of triphones in English is over 70,000, so these systems require a large amount of training recordings. The proposed model could dramatically affect the amount of recordings required for system training. Second, TTS has generally recognized societal benefits for universal access, education, and information access by voice. For example, TTS-based augmentative devices are available for individuals who have lost their voice, and reading machines for the blind have been available for several decades. Third, the approach will make higher-quality TTS more available for smaller devices. For example, voice-based caller ID on low-end mobile telephones is currently not possible due to memory limitations. Fourth, it enables voice adaptation with a minimum of recordings. This will enable building personalized TTS systems for individuals with speech disorders who can only intermittently produce normal speech sounds, or for individuals who are about to undergo surgery that will irreversibly alter their speech. The method proposed by Biospeech requires recordings of valid samples of only each phoneme (fewer than 50) rather than each diphone (2,000 or more).
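The asynchronous interpolation step described in the abstract can be illustrated with a minimal Python sketch. It is not Biospeech's implementation: the abstract does not specify how basis vectors are decomposed, how the weights vary over time, or how the two trajectories are recombined. The sketch therefore takes the formant and spectral balance vectors as given inputs, and assumes sigmoid weight curves, an element-wise product for recombination, and frame-by-frame inversion without overlap-add. All function and parameter names (`sigmoid_weights`, `synthesize_transition`, `formant_mid`, `balance_mid`) are hypothetical.

```python
import numpy as np


def sigmoid_weights(n_frames, midpoint, slope=0.5):
    """Time-varying weight rising from ~0 to ~1 around frame `midpoint`.

    The sigmoid shape is an assumption; the abstract only says the
    weights vary over time.
    """
    t = np.arange(n_frames)
    return 1.0 / (1.0 + np.exp(-slope * (t - midpoint)))


def interpolate(start, end, weights):
    """Blend two vectors frame by frame; returns one row per frame."""
    w = weights[:, None]
    return (1.0 - w) * start + w * end


def synthesize_transition(formant_a, formant_b, balance_a, balance_b,
                          n_frames=20, formant_mid=8, balance_mid=12):
    """Sketch of a transition from phoneme A to phoneme B.

    Inputs are complex half-spectra (n_bins values each) standing in
    for the formant and spectral balance vectors of the two phonemes.
    """
    # Separate weight curves with different midpoints make the two
    # interpolations asynchronous.
    formant_traj = interpolate(
        formant_a, formant_b, sigmoid_weights(n_frames, formant_mid))
    balance_traj = interpolate(
        balance_a, balance_b, sigmoid_weights(n_frames, balance_mid))

    # Assumed recombination rule: element-wise product of the two
    # trajectories gives a trajectory in complex spectral space.
    spectrum_traj = formant_traj * balance_traj

    # Inverse real FFT per frame; each row is one short time-domain
    # frame of length 2 * (n_bins - 1). Overlap-add is omitted.
    frames = np.fft.irfft(spectrum_traj, axis=1)
    return formant_traj, balance_traj, frames


if __name__ == "__main__":
    n_bins = 33
    rng = np.random.default_rng(0)
    # Placeholder vectors; real ones would come from recorded phonemes.
    vecs = [rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins)
            for _ in range(4)]
    _, _, frames = synthesize_transition(*vecs)
    print(frames.shape)
```

Because `formant_mid` and `balance_mid` differ, the formant trajectory reaches the second phoneme's target earlier than the spectral balance trajectory does, which is the kind of asynchrony the abstract attributes to quasi-independent articulators.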
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Alexander Kain
RI: Medium: Collaborative Research: Semi-Supervised Discriminative Training of Language Models
- Award number: 0964102
- Fiscal year: 2010
- Amount: --
- Project type: Continuing Grant
Collaborative Research: CDI-Type I: Computational Models for the Automatic Recognition of Non-Human Primate Social Behaviors
- Award number: 1027834
- Fiscal year: 2010
- Amount: --
- Project type: Standard Grant
HCC: Medium: Synthesis and Perception of Speaker Identity
- Award number: 0964468
- Fiscal year: 2010
- Amount: --
- Project type: Standard Grant
RI: Small: Modeling Coarticulation for Automatic Speech Recognition
- Award number: 0915754
- Fiscal year: 2009
- Amount: --
- Project type: Continuing Grant
HCC: High-Quality Compression, Enhancement, and Personalization of Text-to-Speech Voices
- Award number: 0713617
- Fiscal year: 2007
- Amount: --
- Project type: Continuing Grant
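Similar NSFC (National Natural Science Foundation of China) grants
Baryogenesis, Dark Matter and Nanohertz Gravitational Waves from a Dark Supercooled Phase Transition
- Award number: 24ZR1429700
- Approval year: 2024
- Amount: CNY 0
- Project type: Provincial/municipal project
Phase 2 Upgrade of the ATLAS Experiment Detector
- Award number: 11961141014
- Approval year: 2019
- Amount: CNY 33,500,000
- Project type: International (Regional) Cooperation and Exchange Program
Temperature-Pressure Stability Field and Crystal Structure of the Hydrous Mantle Phase E
- Award number: 41802035
- Approval year: 2018
- Amount: CNY 120,000
- Project type: Young Scientists Fund
High-Sensitivity Quantitative Measurement with Phase-OTDR Based on Digitally Enhanced Interferometry
- Award number: 61675216
- Approval year: 2016
- Amount: CNY 600,000
- Project type: General Program
Reliability Models for Multi-State Systems Based on Phase-Type Distributions
- Award number: 71501183
- Approval year: 2015
- Amount: CNY 174,000
- Project type: Young Scientists Fund
Critical Semi-Solid Formation Conditions and Growth Mechanism of Nano (I-Phase + α-Mg) Quasi-Eutectic
- Award number: 51201142
- Approval year: 2012
- Amount: CNY 250,000
- Project type: Young Scientists Fund
Data-Fitting Methods for Continuous Phase-Type Distributions and Their Applications
- Award number: 11101428
- Approval year: 2011
- Amount: CNY 230,000
- Project type: Young Scientists Fund
Anisotropy of Electronic Behavior in D-Phase Quasicrystals
- Award number: 19374069
- Approval year: 1993
- Amount: CNY 64,000
- Project type: General Program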
Similar overseas grants
Phase Ib/II study of safety and efficacy of EZH2 inhibitor, tazemetostat, and PD-1 blockade for treatment of advanced non-small cell lung cancer
- Award number: 10481965
- Fiscal year: 2024
- Amount: --
- Project type:
SBIR Phase I: CAS: Advanced Thermal Oxidizer to Cost-effectively Control Greenhouse Emissions from Small Sources
- Award number: 2326861
- Fiscal year: 2024
- Amount: --
- Project type: Standard Grant
SBIR Phase I: A wave attenuation technology for oyster reef restoration and small dock protection
- Award number: 2223944
- Fiscal year: 2023
- Amount: --
- Project type: Standard Grant
The role of liquid-liquid phase separation in the mechanism of small RNA amplification
- Award number: 23H02412
- Fiscal year: 2023
- Amount: --
- Project type: Grant-in-Aid for Scientific Research (B)
SBIR Phase II: Accelerating R&D through Streamlined Machine Learning Algorithms for Small Data Applications in Advanced Manufacturing
- Award number: 2325045
- Fiscal year: 2023
- Amount: --
- Project type: Cooperative Agreement
SBIR Phase I: A physics-based machine learning platform for crystal structure prediction of small drug molecules
- Award number: 2227936
- Fiscal year: 2023
- Amount: --
- Project type: Standard Grant
Liquid-liquid phase-separation of small molecules and proteins
- Award number: 2821139
- Fiscal year: 2023
- Amount: --
- Project type: Studentship
SBIR Phase II: Liquid Oxygen (LOX) - Methane Engine for Small Satellite Launch Vehicles
- Award number: 2303613
- Fiscal year: 2023
- Amount: --
- Project type: Cooperative Agreement
SBIR Phase II: Automated Perception for Robotic Chopsticks Manipulating Small and Large Objects in Constrained Spaces
- Award number: 2321919
- Fiscal year: 2023
- Amount: --
- Project type: Cooperative Agreement
High resolution laser spectroscopy of small gas-phase metal-containing molecules
- Award number: RGPIN-2016-03980
- Fiscal year: 2022
- Amount: --
- Project type: Discovery Grants Program - Individual