CAREER: Exploiting Deep Generative Models for Visual Recognition
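Basic Information
- Award number: 2239076
- Principal investigator:
- Amount: $581,900
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-04-01 to 2028-03-31
- Status: Active (not yet concluded)
- Source:
- Keywords:
Project Abstract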
Modern visual recognition systems have achieved impressive results on standard benchmarks and work reliably for common objects and scenes, given massive data and annotations. Unfortunately, current systems struggle to detect rare or unseen objects and fail to adapt to new domains. Researchers, engineers, and domain experts have to capture and annotate huge amounts of real data, which is costly for common objects and impractical for rare objects and corner cases (i.e., cases that arise when multiple unique conditions occur simultaneously). To address these challenges and automatically create and label data that fully depict the corner cases, this project leverages the rich compositional structure and powerful synthesis capacity of large-scale generative models. Using these models, which can quickly synthesize diverse objects and scenes with unknown visual elements (e.g., new poses, weather, and lighting), this project will develop recognition algorithms that can recognize rare or unseen objects and adapt to continuously changing environments. This project has the potential to be transformative for various applications, such as autonomous driving, assistive robots, healthcare, e-commerce, and mixed reality. Furthermore, this research will translate into code, models, courses, and tutorials that are widely accessible to diverse stakeholders, and into education and research programs that engage with the broader community.
Directly using generative models is challenging, as it is highly unlikely that a randomly sampled image will cover a corner case that can improve recognition systems. To synthesize data that more closely resemble the long-tail distribution and new domains, this project will focus on three research thrusts. First, the project addresses learning visual recognition via generative models by exploring different methods of automatically generating data and annotations. Second, the project will analyze visual recognition systems through generative models by synthesizing diverse, continuously evolving test data to interrogate the system and understand its biases. Third, the project will automatically select and adapt generative models to new domains and tasks. These three thrusts are tightly connected: once the algorithms identify hard examples on which the current system fails, these examples can be used to close the loop between training and analysis. Finally, investigators will evaluate the developed methods by comparing methods with and without generative models.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project Outputs
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Ablating Concepts in Text-to-Image Diffusion Models
- DOI: 10.1109/iccv51070.2023.02074
- Published: 2023-03
- Journal:
- Impact factor: 0
- Authors: Nupur Kumari; Bin Zhang; Sheng-Yu Wang; Eli Shechtman; Richard Zhang; Jun-Yan Zhu
- Corresponding authors: Nupur Kumari; Bin Zhang; Sheng-Yu Wang; Eli Shechtman; Richard Zhang; Jun-Yan Zhu
Expressive Text-to-Image Generation with Rich Text
- DOI: 10.1109/iccv51070.2023.00694
- Published: 2023-04
- Journal:
- Impact factor: 0
- Authors: Songwei Ge; Taesung Park; Jun-Yan Zhu; Jia-Bin Huang
- Corresponding authors: Songwei Ge; Taesung Park; Jun-Yan Zhu; Jia-Bin Huang
Content-based Search for Deep Generative Models
- DOI: 10.1145/3610548.3618189
- Published: 2022-10
- Journal:
- Impact factor: 0
- Authors: Daohan Lu; Sheng-Yu Wang; Nupur Kumari; Rohan Agarwal; David Bau; Jun-Yan Zhu
- Corresponding authors: Daohan Lu; Sheng-Yu Wang; Nupur Kumari; Rohan Agarwal; David Bau; Jun-Yan Zhu
Other publications by Jun-Yan Zhu
Expressive Image Generation and Editing with Rich Text
- DOI: 10.1007/s11263-025-02361-2
- Published: 2025-03-14
- Journal:
- Impact factor: 9.300
- Authors: Songwei Ge; Taesung Park; Jun-Yan Zhu; Jia-Bin Huang
- Corresponding author: Jia-Bin Huang
Similar International Grants
Security Evaluation Method Against Deep-Learning-Based Side-Channel Attacks Exploiting Physical Behavior of Cryptographic Hardware
- Award number: 23K11102
- Fiscal year: 2023
- Amount: $581,900
- Award type: Grant-in-Aid for Scientific Research (C)
Exploiting machine-learning to provide dynamical, microphysical, radiative and electrifying insight from observations of deep convective cloud
- Award number: 2888807
- Fiscal year: 2023
- Amount: $581,900
- Award type: Studentship
CRII: CNS: RUI: Exploiting Robust Deep Learning Framework for Wireless Localization Systems in Adversarial IoT Environments
- Award number: 2321763
- Fiscal year: 2022
- Amount: $581,900
- Award type: Standard Grant
Exploiting and Enhancing Programmable Logic for Deep Learning and Datacenter Acceleration
- Award number: RGPIN-2022-04445
- Fiscal year: 2022
- Amount: $581,900
- Award type: Discovery Grants Program - Individual
Yielding and Exploiting Confidence in Deep Learning
- Award number: RGPIN-2019-04737
- Fiscal year: 2022
- Amount: $581,900
- Award type: Discovery Grants Program - Individual
CRII: CNS: RUI: Exploiting Robust Deep Learning Framework for Wireless Localization Systems in Adversarial IoT Environments
- Award number: 2105416
- Fiscal year: 2021
- Amount: $581,900
- Award type: Standard Grant
Yielding and Exploiting Confidence in Deep Learning
- Award number: DGDND-2019-04737
- Fiscal year: 2021
- Amount: $581,900
- Award type: DND/NSERC Discovery Grant Supplement
Yielding and Exploiting Confidence in Deep Learning
- Award number: RGPIN-2019-04737
- Fiscal year: 2021
- Amount: $581,900
- Award type: Discovery Grants Program - Individual
Exploiting standardised tissue-mimicking phantoms to enable deep learning-based estimation of optical tissue properties on experimental photoacoustic data
- Award number: 458342884
- Fiscal year: 2021
- Amount: $581,900
- Award type: WBP Fellowship
Yielding and Exploiting Confidence in Deep Learning
- Award number: DGDND-2019-04737
- Fiscal year: 2020
- Amount: $581,900
- Award type: DND/NSERC Discovery Grant Supplement