CAREER: Inclusive, Private Mobile Input and Interaction Using Lip Reading
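Basic information
- Award number: 2239633
- Principal investigator:
- Amount: $636,300
- Host institution:
- Host institution country: United States
- Project category: Continuing Grant
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-04-15 to 2028-03-31
- Project status: Ongoing
- Source:
- Keywords:
Project abstract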
Speech and whisper input on mobile devices can offer fast and seamless hands-free input and interaction to a wide range of users, including people with low vision and blindness. But there are many scenarios where speech and whisper are not viable because of ambient noise, privacy and security concerns, or simply the wish not to disturb other people. A system that understands speech by visually interpreting lip movements, known as image-based lip reading or silent speech, can mitigate many of these challenges. However, silent speech recognition systems are typically slower and more error-prone than common speech recognition models, and they may require hardware that is impractical in real-world scenarios. Hence, to date this approach has not been investigated as a serious alternative mode of interaction on mobile devices, and it is unknown how best to design the user interface for silent speech or which types of feedback can enhance its usability. Silent speech has also not been well studied with people without sight. This research will develop an efficient real-time lip reader that uses the front camera of a mobile device to capture the motion of the lips and interpret it as text. A particular focus is on the design of an intuitive user interface that provides a range of visual, auditory, and tactile feedback to facilitate error-free text entry. Even broader impacts will derive from providing access to mobile devices to a wider range of users, such as people who have speech disorders or are mute. Ultimately, project outcomes could be exploited in virtual reality, automotive user interfaces, and many other systems to increase their usability, privacy, security, and accessibility.
The real-time lip reader will slice and overlap live video feeds from a mobile camera to recognize one phoneme at a time as the user silently speaks, using a deep 3D convolutional neural network (3D-CNN), a recurrent network, and the connectionist temporal classification (CTC) loss. It will be augmented with a refiner channel that will detect, auto-correct, and provide feedback on both character-level and word-level errors using a deep denoising autoencoder (DDA) and custom language models. A range of auditory and tactile feedback will be developed to facilitate error-free input and an uninterrupted camera view for people with low vision and blindness. The project will also develop multi-modal error correction approaches by exploiting speech, silent speech, and touch interactions. Finally, it will build a silent speech recognition API for the design and development of accessible mobile input and interaction techniques.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
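The abstract's recognition pipeline rests on two generic steps that can be illustrated without any model: cutting the frame stream into overlapping windows, and collapsing per-step labels in the way CTC decoding does (merge repeats, then drop blanks). The sketch below is only an illustration under stated assumptions. The function names, the window size, the stride, and the blank symbol are all hypothetical, since the abstract does not specify them, and the 3D-CNN and recurrent network that would produce the per-step labels are left out entirely.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")

BLANK = "_"  # hypothetical blank symbol; the abstract does not name one


def sliding_windows(frames: Sequence[T], size: int, stride: int) -> List[Sequence[T]]:
    """Cut a frame sequence into fixed-size windows.

    Consecutive windows overlap whenever stride < size. Trailing frames
    that do not fill a whole window are dropped.
    """
    if size <= 0 or stride <= 0:
        raise ValueError("size and stride must be positive")
    return [frames[i:i + size] for i in range(0, len(frames) - size + 1, stride)]


def ctc_greedy_collapse(labels: Sequence[str], blank: str = BLANK) -> List[str]:
    """Collapse per-step labels CTC-style: merge repeats, then drop blanks."""
    out: List[str] = []
    prev: Optional[str] = None
    for label in labels:
        # Emit a label only when it differs from the previous step
        # and is not the blank symbol.
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return out


# Assumed values for illustration: 10 frames, windows of 4 frames, stride of 2.
windows = sliding_windows(list(range(10)), size=4, stride=2)

# Assumed per-step labels, as a trained network might emit them.
decoded = ctc_greedy_collapse(list("hh_e_ll_l_oo"))
```

With these assumed values, each window shares two frames with its neighbor, and the blank between the two `l` runs is what lets the collapse step keep a doubled letter instead of merging it.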
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Ahmed Arif
Securitization, Covered Bonds, and Risk Appetite of Bank: Does Skin in the Game Matter?
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Ahmed Arif
- Corresponding author: Ahmed Arif
Factors causing stress among Pakistani working women
- DOI:
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Ahmed Arif; S. Naveed; R. Aslam
- Corresponding author: R. Aslam
Determinants of Dividend Policy: A Sectoral Analysis from Pakistan
- DOI:
- Published: 2013
- Journal:
- Impact factor: 0
- Authors: Ahmed Arif; Fatima Akbarshah
- Corresponding author: Fatima Akbarshah
Deciphering Securitisation and Covered Bonds : Economic analysis and regulations
- DOI:
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Ahmed Arif
- Corresponding author: Ahmed Arif
Road environment characteristics and fatal crash injury during the rush and non-rush hour periods in the U.S: Model testing and cluster analysis.
- DOI:
- Published: 2022
- Journal:
- Impact factor: 3.4
- Authors: O. Adeyemi; Rajib Paul; E. Delmelle; C. DiMaggio; Ahmed Arif
- Corresponding author: Ahmed Arif
Similar overseas grants
Understanding The Political Representation of Men: A Novel Approach to Making Politics More Inclusive
- Award number: EP/Z000246/1
- Fiscal year: 2025
- Amount: $636,300
- Project category: Research Grant
INSPIRE- Intersectional Spaces of Participation: Inclusive, Resilient, Embedded
- Award number: 10106857
- Fiscal year: 2024
- Amount: $636,300
- Project category: EU-Funded
Moving the dial on economic inactivity to build inclusive futures across Northern Ireland
- Award number: ES/Y502388/1
- Fiscal year: 2024
- Amount: $636,300
- Project category: Research Grant
HSI Implementation and Evaluation Project: Blending Socioeconomic-Inclusive Design into Undergraduate Computing Curricula to Build a Larger Computing Workforce
- Award number: 2345334
- Fiscal year: 2024
- Amount: $636,300
- Project category: Continuing Grant
Securing the Future: Inclusive Cybersecurity Education for All
- Award number: 2350448
- Fiscal year: 2024
- Amount: $636,300
- Project category: Standard Grant
Research: Characterizing Gendered Socialization of Early Career Civil Engineers to Promote Inclusive Practices and Retention of a Diverse Workforce
- Award number: 2414042
- Fiscal year: 2024
- Amount: $636,300
- Project category: Standard Grant
NSF Engines Development Award: Accelerating A Just Energy Transition Through Innovative Nature-Inclusive Offshore Wind Farms (CT,DE,MA,MD,NJ,RI,VA)
- Award number: 2315558
- Fiscal year: 2024
- Amount: $636,300
- Project category: Cooperative Agreement
RCN: Incubating Infrastructure for Experimentation on Inclusive STEM Teaching Practices
- Award number: 2322330
- Fiscal year: 2024
- Amount: $636,300
- Project category: Standard Grant
Collaborative Research: GEO OSE Track 2: Project Pythia and Pangeo: Building an inclusive geoscience community through accessible, reusable, and reproducible workflows
- Award number: 2324304
- Fiscal year: 2024
- Amount: $636,300
- Project category: Standard Grant
Household Inclusive Approaches to Domestic Decarbonisation
- Award number: 10087514
- Fiscal year: 2024
- Amount: $636,300
- Project category: Collaborative R&D