Research on an Interactive Person Recognition System Based on Fused Analysis of Speech and Facial Information

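Basic information

Grant number:
06780358
Principal investigator:
Masafumi Matsumura (松村雅史)
Amount:
$6,400
Host institution:
Osaka Electro-Communication University
Host institution country:
Japan
Project category:
Grant-in-Aid for Encouragement of Young Scientists (A)
Fiscal year:
1994
Funding country:
Japan
Project period:
1994 to (no data)
Project status:
Completed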

Project abstract

This study aims to develop a system that recognizes the person operating a terminal by extracting, with high precision, the individual characteristics of the voice and the facial features during speech, and by using them jointly or selectively. Specifically, the goals are to develop an audio-visual fusion sensing system equipped with multiple visual and acoustic sensors, and to extract speaker-individuality information based on analysis of the speech production process. The results are as follows.

1. Development of an audio-visual fusion sensing system: The system places multiple video cameras and microphones on the terminal. First, a method was developed for estimating the position of the sound source (the lips) using four microphones. The method estimates, from the cross-correlation function of the microphone signals, the phase difference between signals that arises from the difference in distance between each microphone and the source, and uses it to identify the source position. The position of a sound source 50 cm from the terminal was estimated to within an error of 2.4 cm. Next, for the case where the source position is known, an adaptive filter was developed that extracts the source signal from microphone signals containing ambient noise, improving the signal-to-noise ratio (S/N) of the speaker's voice extracted from the ambient noise.

2. Lip position estimation from color face images: A method was proposed for estimating the lip position from a color face image. Exploiting the fact that the lips are redder than the skin, the method estimates the position of the lip region by applying an HSI transform to the color face image. In a lip position estimation experiment using face images of six subjects, a 100% identification rate was obtained.

3. Extraction of individuality information based on analysis of the speech production process: Using magnetic resonance imaging (MRI), precise measurement of the vocal tract shape including the tooth crowns was achieved for the first time, yielding vocal tract shape data during the production of fricative consonants. In addition, the normal stress on the tongue-palate contact surface, which determines consonant intelligibility, was successfully measured. Furthermore, the acoustic characteristics of the vocal tract and nasal cavity were estimated and shown to agree with the analysis of real speech, and the features obtained from this analysis of the speech production process were shown to be effective parameters for personal identification.
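
The sound source localization step in result 1 rests on estimating the delay between microphone signals from their cross-correlation. The abstract gives no implementation details, so the following is only a minimal sketch of that general idea for a single pair of microphones, not the project's actual method. The function names, the speed of sound, the sample rate, and the synthetic test signal are all assumptions made for illustration.

```python
import random

SPEED_OF_SOUND_M_S = 343.0  # assumed value for air at roughly 20 degrees C


def cross_correlation(x, y, lag):
    """Sum of x[n] * y[n + lag] over the samples where both signals exist."""
    total = 0.0
    for n in range(len(x)):
        m = n + lag
        if 0 <= m < len(y):
            total += x[n] * y[m]
    return total


def estimate_delay_samples(x, y, max_lag):
    """Return the lag in samples at which y best matches x.

    A positive result means y is a delayed copy of x, i.e. the sound
    reached the second microphone later than the first.
    """
    lags = range(-max_lag, max_lag + 1)
    return max(lags, key=lambda lag: cross_correlation(x, y, lag))


def path_difference_m(delay_samples, sample_rate_hz):
    """Convert a delay in samples into a difference in path length (metres)."""
    return delay_samples / sample_rate_hz * SPEED_OF_SOUND_M_S


# Synthetic demonstration (hypothetical data): a noise-like source reaches
# microphone B five samples after it reaches microphone A.
rng = random.Random(0)
source = [rng.uniform(-1.0, 1.0) for _ in range(400)]
true_delay = 5
mic_a = source
mic_b = [0.0] * true_delay + source[:-true_delay]

estimated = estimate_delay_samples(mic_a, mic_b, max_lag=20)
print("estimated delay (samples):", estimated)
print("path difference (m):", path_difference_m(estimated, sample_rate_hz=48000))
```

Each microphone pair yields one path-length difference. With four microphones, as in the abstract, several such differences would presumably be combined geometrically to fix the source position, but how the project did this is not stated.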