Improving audio-visual speech recognition with augmented facial-mapping.

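Basic information

  • Grant number:
    1964209
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project category:
    Studentship
  • Fiscal year:
    2017
  • Funding country:
    United Kingdom
  • Duration:
    2017 to (no data)
  • Project status:
    Completed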

Project abstract

Research questions

  • Can audio-visual speech recognition be improved through the augmentation of emerging facial mapping technology?
  • Can the application of real-time 3D face mapping and sound compartmentalisation improve audio-visual speech recognition accuracy?

Potential applications

At the time of writing, no known research exists on the use of the TrueDepth camera's facial recognition for audio-visual speech recognition. This may be due to the infancy of the technology. The potential applications for an improved integrated audio-visual speech recognition system are:

  • Improved human-computer interaction for AI systems.
  • A cheaper means of autonomous speech therapy.
  • Language learning.

Objectives and aims

This research will focus on machine learning principles to develop a more effective end-to-end solution for speech and facial (visual speech) recognition algorithms. This will then be used to improve human accuracy and communication in these areas, through a precise feedback engine. The objective is to effectively integrate the latest infrared and proximity sensors used for real-time face mapping, in order to improve audio-visual speech recognition.

Methodology

As this research is inherently interdisciplinary between computer science and linguistics, it will first investigate current deep learning audio-visual speech recognition methodologies and broader historical speechreading and natural language processing techniques. It will then explore the individual accuracy of Apple's TrueDepth camera in terms of its potential application for visual speech recognition. The TrueDepth system is primarily used for facial recognition and animation, and is essentially the same technology contained within Microsoft's 3D tracking Kinect accessory. This has since been miniaturised and improved by a middleware layer of machine learning software, to achieve the real-time mapping and articulation of 37 facial features with millimetre accuracy.

This research will first test the TrueDepth camera's recognition accuracy for a set of visemes (visual phonemes) by recording a large native language learning dataset and iterating over it with a supervised deep learning algorithm. Once an acceptable level of viseme recognition accuracy is achieved, this will then be combined with an existing audio-based speech recognition engine. The final stage will assess whether the augmentation with the TrueDepth camera system results in a statistically significant improvement when tested against standalone speech recognition engines.
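
The abstract does not say how the final-stage comparison would be carried out. As a minimal sketch only, assuming both systems are scored on the same test utterances and each utterance is marked simply as recognised correctly or not, one common choice for such paired outcomes is an exact McNemar test. The function name `mcnemar_exact` and the example data below are hypothetical and are not taken from the project.

```python
from math import comb


def mcnemar_exact(audio_only_correct, audio_visual_correct):
    """Two-sided exact McNemar test on paired per-utterance outcomes.

    Both arguments are equal-length sequences of booleans saying whether each
    test utterance was recognised correctly by the respective system.
    Returns (b, c, p_value): b counts utterances only the audio-only system
    got right, c counts utterances only the audio-visual system got right.
    """
    if len(audio_only_correct) != len(audio_visual_correct):
        raise ValueError("both systems must be scored on the same utterances")

    pairs = list(zip(audio_only_correct, audio_visual_correct))
    b = sum(1 for a, v in pairs if a and not v)
    c = sum(1 for a, v in pairs if v and not a)

    n = b + c  # only discordant pairs carry information about a difference
    if n == 0:
        return b, c, 1.0

    # Under the null hypothesis each discordant pair is a fair coin flip,
    # so the smaller count follows Binomial(n, 0.5).
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return b, c, min(1.0, 2 * tail)


# Hypothetical scores for 17 utterances (illustrative data, not project results).
audio_only = [True] * 5 + [True] * 1 + [False] * 9 + [False] * 2
audio_visual = [True] * 5 + [False] * 1 + [True] * 9 + [False] * 2

b, c, p_value = mcnemar_exact(audio_only, audio_visual)
print(f"audio-only wins: {b}, audio-visual wins: {c}, p = {p_value:.4f}")
```

Only the utterances on which the two systems disagree enter the test, which is why a paired design of this kind can detect a difference with fewer recordings than comparing two independent accuracy figures.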

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Similar overseas grants

Multisensory Augmented Reality as a bridge to audio-only accommodations for inclusive STEM interactive digital media
  • Grant number:
    10693600
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
EduSay™ - developing a digital, audio-visual and kinesthetic English pronunciation training programme for international students and professionals; upskilling communications for education, employability, UK productivity and integration
  • Grant number:
    10063001
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Collaborative R&D
Empowering Archivists: Applying New Tools and Approaches for Better Representation of Women in Audio-Visual Collections
  • Grant number:
    AH/Y007328/1
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Research Grant
User-centric Audio-Visual Scene Understanding for Augmented Reality Smart Glasses in the Wild
  • Grant number:
    23K16912
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Early-Career Scientists
Maps as a service: A systematic approach to the production of tactile and audio/vibrational maps for visually impaired users
  • Grant number:
    10720207
  • Fiscal year:
    2023
  • Funding amount:
    --
  • Project category:
Audio-visual poetics for the environmental pollutions: A research on the documentaries and expressions of "Kogai" films
  • Grant number:
    22H00613
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Using eye tracking to examine audio-visual rhythm perception in infants
  • Grant number:
    572614-2022
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    University Undergraduate Student Research Awards
Emotional McGurk: Developing a novel tool to examine audio-visual integration of affective signals
  • Grant number:
    574638-2022
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    University Undergraduate Student Research Awards
Neural Rendering of object-based audio-visual scenes
  • Grant number:
    2644080
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Studentship
Ghosts amongst us: an audio-visual exploration of haunting in Palestine
  • Grant number:
    2733997
  • Fiscal year:
    2022
  • Funding amount:
    --
  • Project category:
    Studentship