
時変な特徴の高精度推定と認識に関する研究

Research on High-Accuracy Estimation and Recognition of Time-Varying Features

Basic information

Grant number:
06808032
Principal investigator:
宮永喜一
Amount:
$12,200
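Host institution:
Hokkaido University
Host institution country:
Japan
Project category:
Grant-in-Aid for General Scientific Research (C)
Fiscal year:
1994
Funding country:
Japan
Project period:
1994 to (no end date recorded)
Project status:
Completed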

Project abstract

1. As a first step toward developing an adaptive algorithm for estimating time-varying speech spectra, we designed a new time-varying stochastic model. The model is defined to be quite faithful to the speech production model and is constructed so that voiced and unvoiced speech can be analyzed separately. An estimation method was developed together with the model, and we showed that combining the two yields a natural representation of time-varying features and makes their estimation possible.

2. Next, for speech recognition, we designed a self-organizing neural network that also takes temporal variation into account. A self-organizing recognition system first clusters its input features automatically according to some distance measure. The feature data considered here were time-varying spectra and waveform energy. Data sets spanning a certain time width were therefore taken as the processing target and were self-organized and clustered by a multilayer clustering network capable of representing the Markov property in the time domain. This clustering can track temporal variation and is given an evaluation criterion that allows clusters to be created, merged, and deleted to the extent that no misrecognition of the training data results. As a result, generalization to unseen data was improved. In experiments, recognition results comparable to those of conventional recognition methods were obtained even when the training time was shortened considerably.

3. Continuous speech recognition experiments were carried out using the newly installed equipment. To check the validity of the training data rigorously, a graphical display in the two-dimensional time-spectrum space was required. MATLAB and related software tools made this display efficient, and the evaluation was ultimately checked by hand, producing good training data. As a result, speaker-independent recognition on the widely used ATR speech data reached approximately 87% for /b,g,d/ despite the small data set, a recognition rate comparable to that of the method regarded as the most accurate among conventional approaches. We also confirmed that training is several hundred times or more faster than with conventional methods.
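
The clustering idea in item 2 of the abstract can be illustrated with a minimal sketch. It is not the project's network: the abstract gives no equations, so the Euclidean distance, the thresholds `create_radius`, `merge_radius`, and `min_count`, and every function name below are assumptions chosen for illustration. The sketch stacks consecutive feature frames into windowed vectors, creates a cluster when no same-label cluster is close enough, merges nearby same-label clusters, and deletes a sparse cluster only if every training sample is still classified correctly afterwards. The multilayer, Markov-style structure mentioned in the abstract is not modeled here.

```python
import math


def stack_frames(frames, width):
    """Concatenate `width` consecutive frames into one vector per window."""
    return [
        [x for frame in frames[i:i + width] for x in frame]
        for i in range(len(frames) - width + 1)
    ]


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class Cluster:
    """A labeled centroid with a count of absorbed vectors (hypothetical)."""

    def __init__(self, vector, label):
        self.centroid = list(vector)
        self.label = label
        self.count = 1

    def absorb(self, vector):
        """Move the centroid to the running mean of all absorbed vectors."""
        self.count += 1
        for i, x in enumerate(vector):
            self.centroid[i] += (x - self.centroid[i]) / self.count


def classify(clusters, vector):
    """Return the label of the nearest centroid."""
    return min(clusters, key=lambda c: euclidean(c.centroid, vector)).label


def train(samples, create_radius=1.0, merge_radius=0.5, min_count=2):
    """Cluster (vector, label) pairs with creation, merging, and deletion."""
    samples = list(samples)
    clusters = []
    for vector, label in samples:
        same = [c for c in clusters if c.label == label]
        nearest = min(same, key=lambda c: euclidean(c.centroid, vector),
                      default=None)
        if nearest is None or euclidean(nearest.centroid, vector) > create_radius:
            clusters.append(Cluster(vector, label))  # creation
        else:
            nearest.absorb(vector)  # adaptation of an existing cluster

    # Merging: fuse same-label clusters whose centroids lie close together.
    merged = []
    for c in clusters:
        target = next(
            (m for m in merged
             if m.label == c.label
             and euclidean(m.centroid, c.centroid) <= merge_radius),
            None,
        )
        if target is None:
            merged.append(c)
        else:
            total = target.count + c.count
            target.centroid = [
                (a * target.count + b * c.count) / total
                for a, b in zip(target.centroid, c.centroid)
            ]
            target.count = total

    # Deletion: drop a sparse cluster only if no training sample is then
    # misclassified (the criterion assumed here for illustration).
    kept = list(merged)
    for c in merged:
        if c.count >= min_count or len(kept) == 1:
            continue
        trial = [k for k in kept if k is not c]
        if all(classify(trial, v) == y for v, y in samples):
            kept = trial
    return kept
```

In use, windowed vectors from `stack_frames` would be paired with phoneme labels such as "b", "g", or "d" and passed to `train`; `classify` then assigns the label of the nearest surviving centroid. Because training is a single pass plus two cheap clean-up loops, this kind of scheme is inexpensive to train, though the sketch makes no claim about matching the speed-up reported in the abstract.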