EAGER: Learning Transferable Visual Features

Basic Information

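Award Number:
2041307
Principal Investigator:
YingLi Tian
Amount:
approx. $241,100
Institution:
CUNY City College
Institution Country:
United States
Award Type:
Standard Grant
Fiscal Year:
2020
Funding Country:
United States
Project Period:
2020-09-01 to 2024-08-31
Project Status:
Completed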

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2041307&HistoricalAwards=false
Keywords:
EAGER Learning Transferable Visual Features

Project Abstract

Artificial intelligence and machine learning have shown great promise for many applications in computer vision, multimedia, robotics, autonomous driving, medical imaging analysis, assistive technology, etc. To achieve better performance, training deep neural networks generally requires large-scale labeled data. To avoid the extensive cost of collecting and annotating large-scale data, a major goal of machine learning is to develop new algorithms that learn general features from limited labeled or unlabeled data. This project aims to explore self-supervised methods to learn general visual features across different modalities from large-scale data without using any human-labeled annotations. The learned general visual features can then be transferred to many different applications, such as human activity analysis, 3D scene understanding, and assistive technologies. The research is tightly integrated with graduate/undergraduate education in the City University of New York, a minority-serving institution and one of the most diverse campuses in the United States.

Most prior work on visual feature learning has focused on a single modality of data. This research explores methods of learning transferable visual features from multiple modalities, including text, audio, images, videos, and 3D data, and investigates new loss functions to find optimal features. In particular, the project will conduct the following research tasks: (1) exploration of new algorithms to effectively learn transferable visual features across multiple modalities without requiring human annotations of large-scale data; (2) investigation of effective algorithms and loss functions for bridging the gap among different modalities, to handle their different feature distributions; and (3) evaluation and generalization of the proposed technologies on different applications, including human activity analysis, 3D scene understanding, and medical image processing. The project will result in new algorithms to effectively learn transferable features from multimodal data, including text, images, videos, and 3D data, without depending on data annotations. The work will lead to advances in computer vision and machine learning technologies, and the resulting algorithms will be general and broadly applicable across different real-world applications.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal papers (20)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds

DOI:
10.1109/cvpr46437.2021.01395
Published:
2021-04
Journal:
2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Impact Factor:
0
Authors:
Haiyan Wang; Jiahao Pang; M. Lodhi; Yingli Tian; Dong Tian
Corresponding Authors:
Haiyan Wang; Jiahao Pang; M. Lodhi; Yingli Tian; Dong Tian

PSMNet: Position-aware Stereo Merging Network for Room Layout Estimation

DOI:
10.1109/cvpr52688.2022.00842
Published:
2022-03
Journal:
2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Impact Factor:
0
Authors:
Haiyan Wang; Will Hutchcroft; Yuguang Li; Zhiqiang Wan; Ivaylo Boyadzhiev; Yingli Tian; S. B. Kang
Corresponding Authors:
Haiyan Wang; Will Hutchcroft; Yuguang Li; Zhiqiang Wan; Ivaylo Boyadzhiev; Yingli Tian; S. B. Kang

Medical Image Tampering Detection: a New Dataset and Baseline

DOI:
Published:
2020
Journal:
2020.
Impact Factor:
0
Authors:
Reichman, B; Jing, L; Akin, O; Tian, Y.
Corresponding Author:
Tian, Y.

AI-Driven Robust Kidney and Renal Mass Segmentation and Classification on 3D CT Images

DOI:
10.3390/bioengineering10010116
Published:
2023-01-13
Journal:
Bioengineering (Basel, Switzerland)
Impact Factor:
0
Authors:
Corresponding Authors:

Nonverbal Communication Cue Recognition: A Pathway to More Accessible Communication

DOI:
10.1109/cvprw59228.2023.00600
Published:
2023-06
Journal:
2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)
Impact Factor:
0
Authors:
Z. Shafique; Haiyan Wang; Yingli Tian
Corresponding Authors:
Z. Shafique; Haiyan Wang; Yingli Tian