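Grant Details
Research on Small-Sample Multilingual Air Traffic Control Speech Recognition for Complex Real-Time Command Environments
Final Report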
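Grant number: 62001315
Project category: Young Scientists Fund
Funding amount: CNY 240,000
Principal investigator: Lin Yi (林毅)
Host institution:
Discipline: Multimedia information processing
Completion year: 2023
Approval year: 2020
Project status: Completed
Project participants: Lin Yi (林毅)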
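Abstract (translated from Chinese)
Controller-pilot voice communications carry rich real-time traffic dynamics and are the most concentrated expression of the human-in-the-loop nature of air traffic control (ATC). Owing to the control regime and the current level of technology, ATC speech involves multiple speakers sharing one channel, multilingual communication, complex radio noise, and difficult annotation, and recognition methods for it have not yet been studied specifically. This project studies small-sample multilingual speech recognition for complex ATC scenarios. Self-supervised learning is used to address multilingual speech feature representation under varying noise backgrounds and speech rates. An acoustic vocabulary is designed, with the goal of optimizing the pronunciation scale, to address mixed multilingual recognition. A feature learning network is combined with an acoustic model to achieve end-to-end multilingual recognition in ATC scenarios, from the time-domain speech signal to readable text. Unsupervised pretraining of the backbone network followed by supervised optimization of the whole model is proposed to address model optimization with a small annotated corpus. A feature-disentangling adversarial learning framework is proposed to study a dynamic decoding scheme that fuses the language model, real-time situational semantics, and context. Starting from the recognition of communication speech in real-time ATC command scenarios, this research can enrich the sources of air traffic situational awareness, assist controller decision-making, and support related ATC research to raise the level of intelligence.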
English Abstract
As is well known, pilot-controller communication carries rich real-time traffic dynamics and is the most concentrated expression of the human-in-the-loop nature of air traffic control (ATC) systems. Owing to the management system and the current level of technology, automatic speech recognition (ASR) for ATC has several domain-specific difficulties: a channel shared by multiple speakers, multilingual speech, complex radiotelephony noise, and few transcribed samples. Consequently, ASR for the ATC domain has not yet been studied specifically. In this work, a multilingual ASR approach for a small ATC corpus is studied by analyzing these domain specificities. First, a self-supervised method is proposed to learn multilingual speech representations under varying background noise and speech rates. A dedicated vocabulary is built to integrate multilingual ASR into a single model by optimizing a unified pronunciation scale. By combining the representation learning network with an improved acoustic architecture, an end-to-end model is formulated to translate the raw speech waveform into multilingual human-readable text. An unsupervised method, in which a masking strategy is applied to the input speech features, is proposed to pretrain the backbone network of the ASR model. Finally, the ASR task is accomplished by jointly optimizing the whole network in a supervised manner. In addition, a disentangled-feature adversarial learning framework is studied to formulate a dynamic decoding algorithm that considers the language model, real-time situational semantics, and the contextual information of ATC communication. The study can enrich the means of sensing traffic dynamics by transcribing real-time ATC speech in the air traffic environment. The traffic dynamics obtained can support effective decision-making and air-traffic-related studies, and can further advance research on intelligent air traffic control.
Journal Papers
DOI: 10.1016/j.asoc.2021.107847
Published: 2021-09-05
Journal: Applied Soft Computing
Impact factor: 8.7
Authors: Lin, Yi; Yang, Bo; Zhang, Yi
Corresponding author: Zhang, Yi
DOI: 10.1016/j.cja.2022.08.020
Published: 2022-08
Journal: Chinese Journal of Aeronautics
Impact factor: 5.7
Authors: Yi Lin; Min-zhi Ruan; Kunjie Cai; Dan Li; Ziqiang Zeng; Fan Li; Bo Yang
Corresponding authors: Yi Lin; Min-zhi Ruan; Kunjie Cai; Dan Li; Ziqiang Zeng; Fan Li; Bo Yang
DOI: 10.1145/3572792
Published: 2021-11
Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Impact factor: 2
Authors: Dongyue Guo; Jianwei Zhang; Bo Yang; Yi Lin
Corresponding authors: Dongyue Guo; Jianwei Zhang; Bo Yang; Yi Lin
DOI: 10.1016/j.inffus.2023.101924
Published: 2023
Journal: Information Fusion
Impact factor: --
Authors: Zhen Yan; Hongyu Yang; Dongyue Guo; Yi Lin
Corresponding author: Yi Lin
DOI: 10.1016/j.knosys.2022.108232
Published: 2022
Journal: Knowledge-Based Systems
Impact factor: --
Authors: Jianwei Zhang; Pan Zhang; Dongyue Guo; Yang Zhou; Yuankai Wu; Bo Yang; Yi Lin
Corresponding author: Yi Lin
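Research on Air Traffic Control Speech Understanding Fusing Audio-Visual Perception and Situational Context, Driven by Complex Tasks
  • Grant number: 62371323
  • Project category: General Program
  • Funding amount: CNY 500,000
  • Approval year: 2023
  • Principal investigator: Lin Yi (林毅)
  • Host institution: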