HCC: Large Lexicon Gesture Representation, Recognition, and Retrieval

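Basic Information

Award Number:
0705749
Principal Investigator:
Stan Sclaroff
Amount:
--
Host Institution:
Trustees of Boston University
Host Institution Country:
United States
Award Type:
Continuing Grant
Fiscal Year:
2007
Funding Country:
United States
Project Period:
2007-09-15 to 2011-08-31
Project Status:
Completed

Source: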
https://www.nsf.gov/awardsearch/showAward?AWD_ID=0705749&HistoricalAwards=false
Keywords:
HCC Large Lexicon Gesture Representation

Project Abstract

It is estimated that American Sign Language (ASL) is used by up to 2 million people in the United States. Yet many resources that are taken for granted by users of spoken languages are not available to users of ASL, given its visual nature and its lack of a standard written form. For instance, when an ASL user encounters an unknown sign, looking it up in a dictionary is not an option. With existing ASL dictionaries one can easily find what sign corresponds to an English word, but not what English word (or, more generally, what meaning) corresponds to a given sign. Another example is searching for computer files or web pages using keywords, which is now a frequent activity for computer users. At present, no equivalent of keyword search exists for ASL. ASL is not a written language, and the closest equivalent of a text document is a video sequence of ASL narration or communication. No tools are currently available for finding video segments in which specific signs occur. The lack of such tools severely restricts content-based access to video libraries of ASL literature, lore, poems, performances, or courses. The core goal of this research is to push towards making such resources available by advancing the state of the art in vision-based gesture recognition and retrieval. This poses challenging research problems in the areas of computer vision, machine learning, and database indexing.
The effort will focus on the following: developing methods for learning models of sign classes, given only a few training examples per sign, by using a decomposition of signs into phonological elements; designing scalable indexing methods for video lexicons of gestural languages that achieve sign recognition at interactive speeds, in the presence of thousands of classes; creating indexing methods for spotting signs appearing in context in an ASL video database; incorporating linguistic constraints to improve performance of both lower-level vision modules, such as hand pose estimation and upper body tracking, and higher-level learning and indexing modules; and explicitly designing methods that can work with error-prone vision modules that often provide inaccurate or ambiguous outputs. The PIs will create two demonstration systems: an ASL lexicon containing a comprehensive database of ASL signs; and a "Sign Language Google" that can search for specific signs in large databases of ASL video content. The systems will be trained and evaluated using thousands of video sequences of signs performed in isolation and in context by native ASL signers. This usage data will be valuable for studying co-articulation effects and context-dependent sign variations. The signs collected will include the full list of ASL signs appearing in the first three years of standard college ASL curricula.
Broader Impacts: The methods developed in this project will enable sign-based search of ASL literature, lore, poems, performances, and courses from digital video libraries and DVDs, a capability which will have far-reaching implications for improving education, opportunities, and access for the deaf. These algorithms also aim to enable video-based queries of ASL lexicons, and eventually full-fledged dictionaries with metalinguistic information about signs and examples of usage.
By enabling those learning ASL to "look up" a sign they do not know, this technology promises to transform the way students of ASL (both deaf and hearing), parents of deaf children, sign language interpreters, and linguists learn about signs they encounter. The algorithms developed in this effort may well lead to more robust ASL recognition systems that can handle natural signing with a large lexicon of signs. The technology will also advance the state of the art in gesture recognition and synthesis systems. The large linguistically annotated corpus of native ASL produced as part of this effort will itself be an important resource.
