Eye Gaze in Salience Modeling for Robust Spoken Language Understanding
Basic Information
- Award number: 0535112
- Principal investigator:
- Amount: --
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2005
- Funding country: United States
- Project period: 2005-11-15 to 2009-10-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract
In spoken dialog systems, interpreting user speech input remains a significant challenge due to limited speech recognition and language understanding performance. The problem is further amplified if a user has an accent or is speaking in a noisy environment. However, previous research has shown that, in multimodal systems, fusing two or more information sources can be an effective means of reducing recognition uncertainties, for example through mutual disambiguation. Inspired by earlier work on multimodal systems, in this project the PI will investigate the role of eye gaze in human-machine conversation, in particular in salience modeling for robust spoken language understanding. Cognitive studies have shown that human eye gaze is one of the reliable indicators of what a person is "thinking about"; specifically, eye gaze is tightly linked to human language processing. Previous psycholinguistic work has shown that almost immediately after hearing a word, the eyes move to the corresponding real-world referent, and right before speaking a word, the eyes move to the mentioned object. Not only is eye gaze highly reliable, it is also an implicit, subconscious reflex accompanying speech: the user does not need to make a conscious decision; the eyes move automatically toward the relevant object, without the user even being aware. Motivated by these psycholinguistic findings, the PI's hypothesis is that during human-machine conversation, user eye gaze coupled with conversation context can signal the part of the physical world (related to the domain and the graphical interface) that is most salient at each point of the communication, and thus can potentially be used to tailor the interpretation of speech input.
Based on this hypothesis, the PI will seek to improve spoken language understanding in conversational interfaces through a new salience-based framework with two objectives: (1) to better understand the role of eye gaze in human language production and its implications for salience modeling in automated input interpretation; and (2) to develop algorithms and systems that apply computational gaze-based salience modeling to robust spoken language understanding. These objectives will be pursued in four directions: (a) investigation, through psycholinguistic studies, of the utility of human eye gaze and its implications for salience modeling during human-machine conversation; (b) development of computational salience models that integrate eye gaze with conversation context to automatically identify the salient part of the physical world at each point of the communication; (c) development of approaches that apply the new salience models to constrain the hypothesis space for robust spoken language understanding; and (d) evaluation of the generality of the new approaches in two different applications: an interior design/training application based on a 3D rendered interface, and an information-seeking application using a 2D map-based interface.

Broader Impacts: The technologies to be developed in this interdisciplinary project can be applied to many applications, such as virtual training systems where users can see the interface and talk to the computer system at the same time. The technologies will benefit a variety of diverse users, particularly individuals who are unable to interact with graphical interfaces with their hands (e.g., motion-disabled users). Since one major application area of the work is e-training and e-learning, the education and outreach impact of the proposed research is potentially profound; the PI will make specific efforts to transfer the research results into classrooms.
The project will also provide a unique opportunity for students in Computer Science, Psychology, and Cognitive Science to work together, and thus will synergize multidisciplinary research activities at Michigan State University.
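As a rough illustration of directions (b) and (c) above, the sketch below shows one way gaze fixations could be fused with recognizer confidence to re-rank speech hypotheses. This is not the project's actual method; the recency-decay scheme, the fusion weight, and all function names are assumptions made purely for illustration.

```python
def gaze_salience(fixations, decay=0.8):
    """Hypothetical salience from gaze: objects fixated more often and
    more recently score higher (exponential recency decay)."""
    scores = {}
    n = len(fixations)
    for i, obj in enumerate(fixations):
        # later fixations (larger i) receive higher weight
        scores[obj] = scores.get(obj, 0.0) + decay ** (n - 1 - i)
    return scores

def rerank(hypotheses, salience, weight=0.5):
    """Re-rank (confidence, words) hypotheses by adding a bonus for the
    most salient object a hypothesis mentions."""
    def score(hyp):
        conf, words = hyp
        sal = max((salience.get(w, 0.0) for w in words), default=0.0)
        return conf + weight * sal
    return sorted(hypotheses, key=score, reverse=True)

# The user glanced at the table, then fixated the lamp twice;
# "ramp" and "lamp" are acoustically confusable.
sal = gaze_salience(["table", "lamp", "lamp"])
hyps = [(0.58, ["move", "the", "ramp"]),
        (0.55, ["move", "the", "lamp"])]
best = rerank(hyps, sal)  # the gaze-salient "lamp" reading now ranks first
```

The point of the sketch is only the shape of the computation: gaze plus context yields a salience distribution over interface objects, which then constrains (here, re-ranks) the recognizer's hypothesis space.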
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Joyce Chai
Improving Coherence of Language Model Generation with Latent Semantic State
- DOI:
- Publication date: 2022
- Journal:
- Impact factor: 0
- Authors: Amanda Askell;Yuntao Bai;Anna Chen;Dawn Drain;Deep Ganguli;T. Henighan;Andy Jones;Benjamin Mann;Nova Dassarma;Nelson El;Zac Hatfield;Danny Hernandez;John Kernion;Kamal Ndousse;Catherine Olsson;Dario Amodei;Tom Brown;J. Clark;Sam Mc;Chris Olah;Jared Kaplan;Nick Ryder;Jared D Subbiah;Prafulla Kaplan;A. Dhariwal;P. Neelakantan;Girish Shyam;Amanda Sastry;Sandhini Askell;Ariel Agarwal;Herbert;Gretchen Krueger;R. Child;Aditya Ramesh;Daniel M. Ziegler;Jeffrey Wu;Christopher Winter;Mark Hesse;Eric Chen;Mateusz Sigler;Scott teusz Litwin;Benjamin Gray;Jack Chess;Christopher Clark;Sam Berner;Alec McCandlish;Ilya Radford;Sutskever Dario;Amodei;Joshua Maynez;Shashi Narayan;Bernd Bohnet;Kurt Shuster;Spencer Poff;Moya Chen;Douwe Kiela;Shane Storks;Qiaozi Gao;Yichi Zhang;Joyce Chai;Niket Tandon;Keisuke Sakaguchi;Bhavana Dalvi;Dheeraj Rajagopal;Peter Clark;Michal Guerquin;Kyle Richardson;Eduard H. Hovy;A. Dataset;Rowan Zellers;Ari Holtzman;Matthew E. Peters;Roozbeh Mottaghi;Aniruddha Kembhavi;Ali Farhadi;Chunting Zhou;Graham Neubig;Jiatao Gu;Mona Diab;Francisco Guzmán;Luke Zettlemoyer
- Corresponding author: Luke Zettlemoyer
A pilot study of pre-operative misoprostol in reducing operative blood loss during hysterectomy
- DOI: 10.1016/j.ejogrb.2011.03.023
- Publication date: 2011-09-01
- Journal:
- Impact factor:
- Authors: Joyce Chai;Edmund Hon;Chiu-Fai Li;Ting-Chung Pun;Shu-Biu Yeung;Pak-Chung Ho
- Corresponding author: Pak-Chung Ho
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
- DOI:
- Publication date: 2024
- Journal:
- Impact factor: 0
- Authors: Jianing Yang;Xuweiyi Chen;Nikhil Madaan;Madhavan Iyengar;Shengyi Qian;D. Fouhey;Joyce Chai
- Corresponding author: Joyce Chai
Continuing Medical Education Postmenopausal Bleeding
- DOI:
- Publication date: 2012
- Journal:
- Impact factor: 0
- Authors: Joyce Chai;Vincent YT Cheung
- Corresponding author: Vincent YT Cheung
BAD: BiAs Detection for Large Language Models in the context of candidate screening
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 0
- Authors: N. Koh;Joseph Plata;Joyce Chai
- Corresponding author: Joyce Chai
Other grants by Joyce Chai
NRI: INT: COLLAB: Collaborative Task Planning and Learning through Language Communication in a Human-Robot Team
- Award number: 1949634
- Fiscal year: 2019
- Funding amount: --
- Project type: Standard Grant
NRI: INT: COLLAB: Collaborative Task Planning and Learning through Language Communication in a Human-Robot Team
- Award number: 1830244
- Fiscal year: 2018
- Funding amount: --
- Project type: Standard Grant
RI: Small: Extending Verb Semantics with Causality towards Physical World
- Award number: 1617682
- Fiscal year: 2016
- Funding amount: --
- Project type: Standard Grant
WORKSHOP: Student Consortium at the 2014 ACM Conference on Intelligent User Interfaces
- Award number: 1415879
- Fiscal year: 2013
- Funding amount: --
- Project type: Standard Grant
NRI-Small: Contextually Grounded Collaborative Discourse for Mediating Shared Basis in Situated Human Robot Dialogue
- Award number: 1208390
- Fiscal year: 2012
- Funding amount: --
- Project type: Standard Grant
EAGER: Shared Gaze in Collaborative Referring
- Award number: 1050004
- Fiscal year: 2010
- Funding amount: --
- Project type: Standard Grant
II-NEW: Towards an Infrastructure for Research on Multimodal Language Processing in Situated Human Robot Dialogue
- Award number: 0957039
- Fiscal year: 2010
- Funding amount: --
- Project type: Standard Grant
SGER: Collaborative Research: Contextual Machine Translation
- Award number: 0840538
- Fiscal year: 2008
- Funding amount: --
- Project type: Standard Grant
CAREER: Learning and Optimization for Robust Multimodal Interpretation in Conversation Systems
- Award number: 0347548
- Fiscal year: 2004
- Funding amount: --
- Project type: Continuing Grant
Similar overseas grants
The Material Gaze: Horizontalizing the Anthropos through Film
- Award number: 2891808
- Fiscal year: 2023
- Funding amount: --
- Project type: Studentship
From lab to math classroom: Utilizing eye gaze and cognitive control tasks to examine the effects of perceptual cues and structure on mathematical performance
- Award number: 2320053
- Fiscal year: 2023
- Funding amount: --
- Project type: Standard Grant
SBIR Phase II: Gaze-independent contactless autorefractor for self-serve eye exam kiosk.
- Award number: 2322305
- Fiscal year: 2023
- Funding amount: --
- Project type: Cooperative Agreement
Multimodal analysis using pen input data and gaze data in science and mathematics e-learning
- Award number: 23K17589
- Fiscal year: 2023
- Funding amount: --
- Project type: Grant-in-Aid for Challenging Research (Exploratory)
Transferring Pharmacists' Safe and Efficient Dispensing Know-How: Identifying Reasons for Success of Proficient Pharmacists Based on Eye Gaze Measurement.
- Award number: 23K04309
- Fiscal year: 2023
- Funding amount: --
- Project type: Grant-in-Aid for Scientific Research (C)
Reversing the Gaze: Knowledge Stories and the Struggles for Community Land Rights in Scotland
- Award number: ES/X010872/1
- Fiscal year: 2023
- Funding amount: --
- Project type: Research Grant
Racial biases in gaze following from infancy to adulthood
- Award number: 2884597
- Fiscal year: 2023
- Funding amount: --
- Project type: Studentship
An Indigenous gaze: looking through and envisioning contemporary Indigenous visual representations in the Andean region
- Award number: 2784450
- Fiscal year: 2023
- Funding amount: --
- Project type: Studentship
A Gaze-enabled 3D Face Morphable Model and Applications for Face Image Synthesis
- Award number: 22KJ0923
- Fiscal year: 2023
- Funding amount: --
- Project type: Grant-in-Aid for JSPS Fellows
Propose sightseeing guidance suitable for each tourist by analyzing eye gaze and AI-based preference analysis during web browsing
- Award number: 23K11635
- Fiscal year: 2023
- Funding amount: --
- Project type: Grant-in-Aid for Scientific Research (C)