Collaborative Research: ITR - [ASE+ECS] - [dmc+int]: DDDAS - Advances in recognition and interpretation of human motion: An Integrated Approach to ASL Recognition

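Basic Information

Award number:
0428231
Principal investigator:
Dimitris Metaxas
Amount:
$849,800 (approx.)
Host institution:
Rutgers University New Brunswick
Host institution country:
United States
Award type:
Standard Grant
Fiscal year:
2004
Funding country:
United States
Period:
2004-10-15 to 2008-09-30
Status:
Completed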

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=0428231&HistoricalAwards=false
Keywords:
Collaborative Research ITR ASE+ECS dmc+int

Project Abstract

NSF ITR title: ITR - [ASE+ECS] - [dmc+int]: DDDAS - Advances in Recognition and Interpretation of Human Motion: An Integrated Approach to ASL Recognition

This project is aimed at advancing the state of the art in the field of computer-based American Sign Language (ASL) recognition. To date, sign language recognition has focused primarily on detecting individual signs (words), articulated primarily with the arms and hands. This is a major limitation, given that critical linguistic information, including grammatical features such as negation, agreement, and question status, is conveyed through "non-manual" linguistic markings. These non-manual markings include facial expressions (such as raised or lowered eyebrows, varying gaze and aperture of the eyes, wrinkling of the nose, and mouth movements) and gestures or periodic movements of the head (such as tilts, nods, and shakes). No system for sign language recognition or generation can succeed without properly modeling the linguistic information produced both manually and non-manually.

The fact that these critical non-manual behaviors occur in parallel with manual signing, and that they are temporally aligned with phrases rather than with individual signs, greatly complicates the task. Further problems arise because of the difficulties of tracking the minute details of human facial movements from video, and the variations in the specific realizations (style) of manual signs and non-manual linguistic markings across different individuals, just as there is variation in the specific ways in which individuals produce a given spoken language.
A comprehensive approach to ASL recognition thus requires the integration of information from multiple data sources with different spatial and temporal scales, the application of linguistic knowledge about both the manual and the non-manual aspects of ASL, and the modeling of interdependencies of activities in the manual and non-manual channels.

This collaborative project brings together the expertise of researchers in the fields of computer vision, linguistics, and recognition to achieve its goals. On the computer vision side, the principal investigators (PIs) will investigate the use of local free-form deformations and novel registration methods to enhance our existing face tracking software, so as to capture the minute details of the facial movements and to improve the robustness of the tracking. The tracking process results in a large number of facial parameters, which the researchers propose to reduce through nonlinear subspace manifold embedding. This embedding reduces the dimensionality of the parameter space and, more importantly, also results in a separation of style and content. Whereas style is specific to each signer, content captures the commonalities across all signers. Hence, by focusing on the content component, the PIs expect to be able to overcome the variations across signers and perform signer-independent recognition.

On the recognition side, the researchers will combine the linguistic knowledge about facial microactions with computational clustering approaches to develop the necessary statistical models for recognition. Initially, these will be based on Hidden Markov Models from previous work by this research group and elsewhere; however, their power to describe the dynamical aspects of human movements is limited. To overcome these limitations, the PIs will research the use of Switching Linear Dynamic Systems, augmented by Coupled Dynamic Bayesian Networks, to model and capture the interactions of the simultaneously occurring microactions.
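
As a rough illustration of the Hidden Markov Model baseline mentioned above, the sketch below scores a short sequence of quantized observations against two small discrete HMMs and picks the more likely one. Everything specific here is an assumption made for the example and does not come from the award record: the class names (`brow_raise`, `head_shake`), the three observation symbols, and all probabilities are hypothetical, and the project's actual models and features are not described in this abstract.

```python
import math

def forward_log_likelihood(obs, start, trans, emit):
    """Return log P(obs | model) for a discrete HMM (scaled forward algorithm)."""
    n_states = len(start)
    # Initialise with the start distribution weighted by the first emission.
    alpha = [start[s] * emit[s][obs[0]] for s in range(n_states)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then weight by the emission.
            alpha = [
                sum(alpha[p] * trans[p][s] for p in range(n_states)) * emit[s][obs[t]]
                for s in range(n_states)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # the model cannot produce this sequence
        log_prob += math.log(scale)
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
    return log_prob

def classify(obs, models):
    """Pick the model name whose HMM gives the sequence the highest likelihood."""
    return max(models, key=lambda name: forward_log_likelihood(obs, *models[name]))

# Hypothetical observation symbols: 0 = neutral, 1 = brows up, 2 = head turned.
# Each model is (start, transition, emission) with two states: onset, then hold.
MODELS = {
    "brow_raise": ([1.0, 0.0],
                   [[0.6, 0.4], [0.0, 1.0]],
                   [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1]]),
    "head_shake": ([1.0, 0.0],
                   [[0.6, 0.4], [0.0, 1.0]],
                   [[0.7, 0.1, 0.2], [0.1, 0.1, 0.8]]),
}

if __name__ == "__main__":
    print(classify([0, 1, 1, 1], MODELS))
    print(classify([0, 2, 2, 2], MODELS))
```

Each candidate marking gets its own HMM, and recognition reduces to comparing likelihoods. A single state chain of this kind treats each channel in isolation, which is consistent with the limitation the abstract attributes to HMMs when several microactions occur at once.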
Linguists and computer scientists will collaborate in exploring the best ways to leverage information about the linguistic organization of ASL for improvement of recognition strategies.

This research will be performed on the existing linguistically annotated corpus of the National Center for Sign Language and Gesture Resources, as well as new data to be collected from 5-8 native ASL signers, which will also be annotated over the course of the project. The annotations will be used for the linguistic modeling; they also provide the "ground truth" for performing and validating the computer vision and recognition research.

Broader impact: The computer-based techniques for ASL can be extended to more general systems for sign language recognition and generation, as well as for interpretation of other types of human movements, such as face gesture recognition for HCI, surveillance, verification of identity, interrogation, interviews, and medical diagnosis applications. The materials to be distributed will benefit researchers in linguistics, computer science, and other domains. There are immediate applications for primary and secondary education of the deaf and training of sign language interpreters. Improvements in multimedia (linguistic) information technology promise to offer expanded employment possibilities for the deaf, as well as improved access to vocational and post-secondary education. Finally, this project itself will provide a huge boost in terms of education, awareness, and encouragement of deaf students by enabling them to work on cutting-edge research that directly affects them and their community.

Project Outcomes

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Dimitris Metaxas

A frame-based model for large manufacturing databases

DOI:
10.1007/bf01471336
Published:
1991-02-01
Journal:
Journal of Intelligent Manufacturing
Impact factor:
7.400
Authors:
Dimitris Metaxas; Timos Sellis
Corresponding author:
Timos Sellis

Algorithmic issues in modeling motion

DOI:
10.1145/592642.592647
Published:
2002
Journal:
ACM Comput. Surv.
Impact factor:
0
Authors:
Pankaj K. Agarwal; Leonidas J. Guibas; H. Edelsbrunner; Jeff Erickson; M. Isard; Sariel Har; J. Hershberger; Christian Jensen; L. Kavraki; Patrice Koehl; Ming Lin; Dinesh Manocha; Dimitris Metaxas; Brian Mirtich; David Mount; S. Muthukrishnan; Dinesh Pai; E. Sacks; J. Snoeyink; Subhash Suri; Ouri E. Wolfson
Corresponding author:
Brian Mirtich (mirtich@merl.com)

A combustion-based technique for fire animation and visualization

DOI:
10.1007/s00371-007-0162-3
Published:
2007-06-28
Journal:
Visual Computer
Impact factor:
2.900
Authors:
Kyungha Min; Dimitris Metaxas
Corresponding author:
Dimitris Metaxas

Multi-Stage Feature Fusion Network for Video Super-Resolution

DOI:
10.1109/tip.2021.3056868
Published:
2021-02
Journal:
IEEE Transactions on Image Processing
Impact factor:
10.6
Authors:
Huihui Song; Wenjie Xu; Dong Liu; Bo Liu; Qingshan Liu; Dimitris Metaxas
Corresponding author:
Dimitris Metaxas

The Traffic Calming Effect of Delineated Bicycle Lanes

DOI:
Published:
2024
Journal:
Journal of Urban Mobility
Impact factor:
0
Authors:
Hannah Younes; Clinton Andrews; Robert B. Noland; Jiahao Xia; Song Wen; Wenwen Zhang; Dimitris Metaxas; Leigh Ann Von Hagen; Jie Gong
Corresponding author:
Jie Gong