複数のバイオメトリクス個人情報を利用したロバストな話者認識手法に関する研究

Research on Robust Speaker Recognition Methods Using Multiple Types of Biometric Personal Information

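Basic information

Grant number:
14780274
Principal investigator:
Chiyomi Miyajima (宮島千代美)
Amount:
$14,700
Host institution:
Nagoya University (2003); Nagoya Institute of Technology (2002)
Host institution country:
Japan
Project category:
Grant-in-Aid for Young Scientists (B)
Fiscal year:
2002
Funding country:
Japan
Project period:
2002 to 2003
Project status:
Completed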

Project abstract

本研究では,音声や行動様式に含まれる個人性を利用したバイオメトリクス個人認識について以下の検討を行った.(1)混合因子分析に基づく話者モデルのパラメータの共有構造について我々は前年度までに,話者認識のモデルを混合因子分析に基づいて構築することによって,従来の混合正規分布に基づく話者モデルに比べて高い認識性能が得られることを報告した.本年度は,この混合因子分析における共分散行列のパラメータの共有方法の違いについて検討した.混合因子分析における共分散行列の因子負荷量,もしくは対角成分のパラメータを混合要素間で共有する場合と,パラメータを共有しない場合の三つの方法について比較した結果,対角要素のパラメータのみを共有する場合に最も良い認識結果が得られた.また,比較的小さい因子数でも高い認識性能が得られることがわかった.(2)最小分類誤り学習による話者モデルのオンライン学習法について音声で人を識別する家庭用ペットロボットのためのオンライン話者識別学習について検討した.ロボットが話者を誤って識別した場合に,不正解であるという情報のみがロボットへフィードバックされる状況を想定し,不正解であるという情報を有効利用するための最小分類誤り学習を提案した.実験の結果,不正解の情報を利用しない場合に比べ,約1.5倍の速度で学習ができることがわかった.また,過去に入力された音声データを複数まとめて,再度学習に利用することによって,より高速な学習が可能であることがわかった.(3)運転行動信号を用いた個人認識について自動車のアクセルやブレーキ,ハンドル操作などの運転行動に表れる個人性を利用して運転者を認識できれば,運転者に合わせた運転支援や車内環境の自動設定などへの応用が期待できる.アクセルペダル・ブレーキペダル踏力の分布を混合正規分布でモデル化し,30名の運転者の認識実験を行った結果,アクセル,もしくはブレーキのみでは30%程度の識別率であったのに対し,これらの信号の時間変化を動的特徴量として加え,さらにアクセルとブレーキを組み合わせて用いることにより73%まで識別率が向上した.また,動的特徴量を求める時間窓幅について検討した結果,800ms程度が最も有効であることがわかった.識別実験に加えて照合実験も行ったが,識別実験と同様の特徴量が有効であり,8%の等誤り率が得られた.運転行動信号を用いた個人認識の研究はこれまでに報告されていないが,本研究によってその可能性が示された.

This study investigated biometric person recognition that exploits the individuality contained in speech and behavioral patterns, as follows.

(1) Parameter-sharing structure of speaker models based on mixtures of factor analyzers. By the previous fiscal year we had reported that building speaker recognition models on mixtures of factor analyzers yields higher recognition performance than conventional speaker models based on Gaussian mixture models. This fiscal year we examined different ways of sharing the covariance-matrix parameters in the mixture of factor analyzers. We compared three methods: sharing the factor loadings of the covariance matrix across mixture components, sharing the diagonal-component parameters across mixture components, and sharing no parameters. The best recognition results were obtained when only the diagonal-element parameters were shared. We also found that high recognition performance can be obtained with a relatively small number of factors.

(2) Online learning of speaker models by minimum classification error training. We examined online speaker-identification learning for a household pet robot that identifies people by voice. We assumed a situation in which, when the robot misidentifies a speaker, the only feedback it receives is that its answer was incorrect, and we proposed a minimum classification error training method that makes effective use of this information. Experiments showed that learning is about 1.5 times faster than when the incorrect-answer information is not used. We also found that learning becomes faster still when several previously input speech samples are grouped together and reused for training.

(3) Person recognition using driving behavior signals. If a driver can be recognized from the individuality that appears in driving behavior, such as accelerator, brake, and steering-wheel operation, applications such as driver-adapted driving assistance and automatic configuration of the in-vehicle environment can be expected. We modeled the distributions of accelerator-pedal and brake-pedal pressure with Gaussian mixture models and ran recognition experiments with 30 drivers. With the accelerator or the brake signal alone, the identification rate was about 30%. Adding the temporal changes of these signals as dynamic features, and combining the accelerator and brake signals, raised the identification rate to 73%. An examination of the time-window width used to compute the dynamic features showed that about 800 ms was the most effective. In addition to the identification experiments we ran verification experiments; the same features were effective there, and an equal error rate of 8% was obtained. Person recognition using driving behavior signals had not previously been reported, and this study demonstrated its feasibility.
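As a rough illustration of the dynamic features mentioned in (3), the sketch below computes a regression-based delta (local slope) of a pedal signal over a centered window of about 800 ms and stacks it with the raw accelerator and brake values. The abstract does not state how the dynamic features were computed or what the sampling rate was, so the regression formula, the edge handling, and the names `delta_features`, `stack_features`, and `sample_rate_hz` are assumptions made here for illustration only.

```python
def delta_features(signal, sample_rate_hz, window_ms=800.0):
    """Regression-based delta over a centered window (assumed formula).

    delta[t] = sum_k k * (x[t+k] - x[t-k]) / (2 * sum_k k^2), k = 1..K,
    where the full window of 2K + 1 samples spans roughly window_ms.
    Samples beyond either end of the signal are replaced by the end value.
    """
    # Half-width K in samples; at least 1 so the slope is always defined.
    half = max(1, round(window_ms / 1000.0 * sample_rate_hz / 2))
    denom = 2 * sum(k * k for k in range(1, half + 1))
    n = len(signal)
    deltas = []
    for t in range(n):
        acc = 0.0
        for k in range(1, half + 1):
            ahead = signal[min(n - 1, t + k)]
            behind = signal[max(0, t - k)]
            acc += k * (ahead - behind)
        deltas.append(acc / denom)
    return deltas


def stack_features(accel, brake, sample_rate_hz, window_ms=800.0):
    """Build per-frame vectors [accel, brake, delta_accel, delta_brake]."""
    d_acc = delta_features(accel, sample_rate_hz, window_ms)
    d_brk = delta_features(brake, sample_rate_hz, window_ms)
    return [list(frame) for frame in zip(accel, brake, d_acc, d_brk)]
```

Frame vectors of this kind could then be modeled per driver with a Gaussian mixture model, as the abstract describes; the units of the delta are signal units per sample, so a different sampling rate changes its scale.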

Project outcomes

Journal papers (5)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

山本啓善, 南角吉彦, 宮島千代美, 徳田恵一, 北村正: "混合因子分析に基づく話者識別モデルのパラメータ共有構造"情報処理学会音声言語処理研究会研究報告. vol.2003 no.124. 91-96 (2003)

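Hiroyoshi Yamamoto, Yoshihiko Nankaku, Chiyomi Miyajima, Keiichi Tokuda, Tadashi Kitamura: "Parameter-sharing structure of speaker identification models based on mixtures of factor analyzers", IPSJ SIG Technical Report on Spoken Language Processing, vol. 2003, no. 124, pp. 91-96 (2003)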