
Independent Component Analysis for Speech Signal Enhancement and Representation

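Basic information

Grant reference:
EP/F036132/1
Principal investigator:
Peter Jancovic
Amount:
$453,800
Host institution:
University of Birmingham
Host institution country:
United Kingdom
Project type:
Research Grant
Fiscal year:
2008
Funding country:
United Kingdom
Start and end dates:
2008 to (no data)
Project status:
Completed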

Source:
https://gtr.ukri.org/projects?ref=EP%2FF036132%2F1
Keywords:
Independent Component Analysis Speech Signal

Project abstract

While current automatic speech and speaker recognition systems can reach high performance in carefully controlled environments, their performance degrades rapidly when they are applied in real-world situations because of background environmental noise. There are three approaches to dealing with additive background noise: speech signal enhancement, noise-robust speech feature extraction and noise compensation. This proposal is concerned with signal enhancement and noise-robust feature extraction.

The goal of speech enhancement is to estimate the original signal from a given noise-corrupted signal. Several techniques have been proposed in the past decades, such as spectral subtraction and Wiener filtering. Recently, the use of the maximum a posteriori (MAP) technique has been proposed, and it has shown superior performance compared to the other techniques. The MAP estimation is usually carried out in a linear transformation domain. In our recent research, we have proposed a novel MAP-based algorithm which performs the enhancement in the Independent Component Analysis (ICA) transformation domain, and demonstrated that the use of ICA can lead to better performance than other transformations when the signal and noise have non-Gaussian distributions. The denoising capability of the proposed algorithm improves with increasing non-Gaussianity of the signal and noise.

The purpose of signal representation is to explicitly represent the information in the signal which is embedded in statistical dependencies. This is typically performed by using a linear transformation. In our recent work, we have analyzed the effectiveness of signal representation using an ICA transformation estimated from clean signals, and demonstrated that such a representation is most effective for non-Gaussian signals that are clean or corrupted by Gaussian noise, and that its effectiveness increases with increasing non-Gaussianity of the signal. We have also demonstrated that the use of such an ICA transformation is not optimal for signals corrupted by non-Gaussian noise.

Our previous studies summarized above provide a solid theoretical foundation for the development of richer classes of speech signal enhancement and representation techniques capable of better exploiting the statistical properties of the signal and noise and employing specific properties of speech signals. Our proposed research aims to: i) develop speech enhancement techniques employing multiple distribution models of the signal and multiple transformations in order to better account for the variability of speech signals; ii) incorporate specific properties of speech signals within these signal enhancement techniques; iii) investigate an effective signal representation under non-Gaussian noise corruption. The performance of the developed speech enhancement techniques will first be evaluated in terms of low-level measures and listening experiments. Then, the proposed techniques will be evaluated in terms of recognition accuracy when employed for speech and speaker recognition. We aim to achieve significant performance improvements on standard datasets (AURORA2, TIMIT, Resource Management).
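
The idea of MAP enhancement in a linear transformation domain, as described in the abstract, can be illustrated with a minimal sketch. This is not the project's algorithm: it assumes a Laplacian (non-Gaussian) prior on the transform coefficients and additive Gaussian noise, for which the MAP estimate reduces to soft thresholding. A random orthogonal matrix `W` stands in for a learned ICA basis, and all names (`soft_threshold`, `map_denoise`, `signal_std`, `noise_std`) are hypothetical.

```python
import numpy as np

def soft_threshold(y, t):
    """Shrink each coefficient toward zero by t, clipping at zero."""
    return np.sign(y) * np.maximum(np.abs(y) - t, 0.0)

def map_denoise(x_noisy, W, signal_std, noise_std):
    """MAP denoising in the domain of an orthogonal transform W.

    Assumes (illustrative only) Laplacian coefficients with standard
    deviation signal_std and Gaussian noise with standard deviation
    noise_std; under these assumptions the MAP estimate is a soft
    threshold at sqrt(2) * noise_std**2 / signal_std.
    """
    y = W @ x_noisy                                  # analysis: to transform domain
    t = np.sqrt(2.0) * noise_std**2 / signal_std     # MAP threshold
    s_hat = soft_threshold(y, t)                     # coefficient-wise shrinkage
    return W.T @ s_hat                               # synthesis: back to signal domain

# Synthetic data: sparse coefficients mixed by an orthogonal matrix.
rng = np.random.default_rng(0)
dim, n_frames = 16, 2000
W, _ = np.linalg.qr(rng.standard_normal((dim, dim)))  # stand-in for an ICA basis
signal_std, noise_std = 1.0, 1.0

# Laplace scale b gives variance 2 * b**2, so b = signal_std / sqrt(2).
s = rng.laplace(scale=signal_std / np.sqrt(2.0), size=(dim, n_frames))
x_clean = W.T @ s
x_noisy = x_clean + noise_std * rng.standard_normal(x_clean.shape)

x_hat = map_denoise(x_noisy, W, signal_std, noise_std)

mse_noisy = np.mean((x_noisy - x_clean) ** 2)
mse_denoised = np.mean((x_hat - x_clean) ** 2)
print(f"MSE noisy: {mse_noisy:.3f}, MSE denoised: {mse_denoised:.3f}")
```

In a real system the transform would be estimated from speech frames (for example with an ICA algorithm), and the priors for signal and noise would be chosen to match their actual distributions rather than fixed as above.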
