
Ad Hoc Microphone Array Signal Processing Using a Phase Difference Model

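Basic Information

Grant number:
22KJ2545
Principal investigator:
Yoshiki Masuyama (升山義紀)
Amount:
$19,800
Host institution:
Tokyo Metropolitan University
Host institution country:
Japan
Project category:
Grant-in-Aid for JSPS Fellows
Fiscal year:
2023
Funding country:
Japan
Project period:
2023-03-08 to 2024-03-31
Project status:
Completed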

Project Abstract

This research project addresses ad hoc array signal processing, in which distributed devices with recording capability, such as smartphones and tablet PCs, cooperate to perform array signal processing. In an ad hoc array, mismatches in sampling frequency between devices make the inter-microphone phase difference nonstationary, whereas it is stationary in a conventional array. As a result, existing array signal processing methods cannot be applied as they are. This fiscal year, continuing from the previous year, we worked on estimating and compensating for the sampling frequency mismatch that causes this nonstationarity. Conventional methods take the sampling frequency of one microphone as the reference and estimate the offset of each non-reference microphone from that reference individually. In contrast, the proposed method estimates the sampling frequencies of all non-reference microphones jointly, based on a probabilistic model of the entire multichannel signal in the ad hoc array. This brings the consistency among non-reference microphones, which conventional methods did not consider, into the optimization criterion, and we confirmed that estimation accuracy improves. We also confirmed that the proposed method maintains source separation performance even under conditions where that performance degrades without sampling frequency compensation.

In addition, with a view to producing minutes of meetings and similar events, one of the main applications of ad hoc array signal processing, we worked on joint training of speech enhancement/separation and speech recognition. For speech recognition in particular, we achieved high performance by using self-supervised learning representation (SSLR) models, which have attracted attention in recent years. For multichannel speech enhancement, we compared a range of beamformers. Combining a WPD beamformer with an SSLR model achieved especially low word error rates across a variety of noisy and reverberant environments.
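The abstract's central observation, that a sampling frequency mismatch turns a constant inter-microphone phase difference into a time-varying one, can be illustrated with a small toy simulation. The sketch below is not the project's method. It assumes a single pure tone, no propagation delay between the two devices, and hypothetical values (16 kHz nominal rate, 1 kHz tone, 50 ppm offset); all function names are made up for illustration.

```python
import cmath
import math


def stft_bin(x, start, length, freq_hz, fs):
    """Hann-windowed DFT coefficient of x[start:start+length] at freq_hz."""
    acc = 0j
    for n in range(length):
        window = 0.5 - 0.5 * math.cos(2.0 * math.pi * n / length)
        acc += window * x[start + n] * cmath.exp(-2j * math.pi * freq_hz * n / fs)
    return acc


def phase_difference_per_frame(sro, fs=16000.0, tone_hz=1000.0,
                               frame_len=1024, hop=512, num_frames=20):
    """Inter-channel phase difference (radians) of a tone, one value per frame.

    Channel 1 samples at fs; channel 2 samples at fs * (1 + sro).
    Both channels are analysed as if they were sampled at the nominal rate fs.
    """
    total = hop * (num_frames - 1) + frame_len
    ch1 = [math.cos(2.0 * math.pi * tone_hz * n / fs) for n in range(total)]
    ch2 = [math.cos(2.0 * math.pi * tone_hz * n / (fs * (1.0 + sro)))
           for n in range(total)]
    diffs = []
    for m in range(num_frames):
        c1 = stft_bin(ch1, m * hop, frame_len, tone_hz, fs)
        c2 = stft_bin(ch2, m * hop, frame_len, tone_hz, fs)
        # Phase of channel 2 relative to channel 1, wrapped to (-pi, pi].
        diffs.append(cmath.phase(c2 * c1.conjugate()))
    return diffs


def mean_drift(diffs):
    """Average change in phase difference between consecutive frames."""
    steps = [b - a for a, b in zip(diffs, diffs[1:])]
    return sum(steps) / len(steps)


def predicted_drift(sro, fs=16000.0, tone_hz=1000.0, hop=512):
    """Per-frame phase drift expected for a pure tone in this toy model."""
    return 2.0 * math.pi * tone_hz * hop / fs * (1.0 / (1.0 + sro) - 1.0)


if __name__ == "__main__":
    for sro in (0.0, 50e-6):
        drift = mean_drift(phase_difference_per_frame(sro))
        print(f"SRO = {sro * 1e6:.0f} ppm: "
              f"measured drift {drift:.6f} rad/frame, "
              f"predicted {predicted_drift(sro):.6f} rad/frame")
```

With no offset, the phase difference stays fixed from frame to frame. With the assumed 50 ppm offset it drifts by roughly -0.01 rad per frame in this setup, so a spatial model that expects a constant phase difference gradually stops matching the data. Estimating and compensating the offset, as the abstract describes, aims to remove this drift.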

Project Outcomes

Journal papers (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

End-to-End Integration of Speech Recognition, Dereverberation, Beamforming, and Self-Supervised Learning Representation

DOI:
10.1109/slt54892.2023.10023199
Published:
2022-10
Venue:
2022 IEEE Spoken Language Technology Workshop (SLT)
Impact factor:
0
Authors:
Yoshiki Masuyama;Xuankai Chang;Samuele Cornell;Shinji Watanabe;Nobutaka Ono
Corresponding authors:
Yoshiki Masuyama;Xuankai Chang;Samuele Cornell;Shinji Watanabe;Nobutaka Ono

Joint Optimization of Sampling Rate Offsets Based on Entire Signal Relationship Among Distributed Microphones