
CAREER: New Directions in Deep Representation Learning from Complex Multimodal Data

Basic Information

Award number:
1453651
Principal investigator:
Honglak Lee
Amount:
$488,600
Awardee institution:
Regents of the University of Michigan - Ann Arbor
Institution country:
United States
Project type:
Continuing Grant
Fiscal year:
2015
Funding country:
United States
Project period:
2015-09-01 to 2023-08-31
Project status:
Completed

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1453651&HistoricalAwards=false
Keywords:
CAREER New Directions Deep Representation

Project Abstract

The goal of deep learning is to learn an abstract representation of data with a hierarchical and compositional structure. Deep learning methods can effectively learn discriminative features from high-dimensional input data (e.g., for classification), and have been successfully applied to many real-world problems, such as image classification, speech recognition, and text modeling. Despite these successes, a challenging open question remains: how can we learn a robust deep representation that allows for holistic understanding and high-level reasoning from complex data? This CAREER project aims to address this question and is expected to result in novel deep architectures, graphical models, and algorithmic advances for inference, learning, and optimization in deep representation learning. The research outcomes will be disseminated through publications, talks, and tutorials. In addition to advancing the state of the art in deep learning and the many applications it entails, the project will integrate research and education through 1) developing courses in machine learning that include deep learning as a key topic; 2) mentoring significant graduate and undergraduate research activities; and 3) reaching out to K-12 students via hosting demo sessions and mentoring for science fair/research projects.

This project investigates the following closely interrelated and complementary thrusts. First, it develops deep learning algorithms to disentangle factors of variation from complex data. This is done by modeling higher-order interactions between multiple groups of latent variables with a deep generative model (e.g., modeling face images via interaction of latent factors that correspond to identity, viewpoint, and emotion). In addition to better generalization, this approach is amenable to high-level reasoning, such as making analogies. Modeling higher-order interaction will be approached by learning a sub-manifold for each factor of variation, where correspondence information is used for regularizing the latent representation. The project will also develop weakly-supervised and semi-supervised disentangling algorithms that automatically establish correspondences without manual supervision.

Second, the project develops deep representation learning methods for structured prediction problems. Specifically, it will develop a graphical model with deep representations that can model complex dependencies between output variables. This framework can also be viewed as data-driven modeling of a higher-order prior on structured data, and can be used for modeling higher-order conditional random fields that permit efficient inference and learning. In addition, the project develops stochastic conditional generative models for structured prediction problems that involve uncertainty (i.e., one-to-many mappings).

Third, the project develops novel deep learning algorithms for constructing shared representations from multiple heterogeneous input modalities, such as image and text, audio and video, and multiple sensor streams. The main idea is to separately model the conditional distribution of each input modality given the other modalities. This approach addresses the well-known difficulty of modeling a joint distribution across heterogeneous multimodal input, and provides a theoretical analysis of the conditions under which the approach can recover a consistent generative model. This formulation allows for robust recognition and high-level reasoning from heterogeneous multimodal data.

Overall, these three thrusts are complementary and are expected to play synergistic roles in tackling a broader range of AI problems and moving beyond the current state of the art in deep learning.
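
As a rough illustration of the third thrust's main idea (modeling each modality's conditional distribution given the other, instead of one joint distribution), the following Python sketch fits two linear-Gaussian conditionals on synthetic data. It is not the project's method: the project targets deep models, and the linear model, the synthetic `image_feat` and `text_feat` arrays, and the helper names `fit_conditional` and `predict` are all assumptions made for this example.

```python
import numpy as np

def fit_conditional(src, dst):
    """Least-squares fit of a linear-Gaussian conditional p(dst | src).

    Returns the weight matrix (including a bias row) and the per-dimension
    residual variance. A linear stand-in for a deep conditional model.
    """
    # Append a constant column so the model has an intercept.
    src_aug = np.hstack([src, np.ones((src.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(src_aug, dst, rcond=None)
    residual = dst - src_aug @ weights
    return weights, residual.var(axis=0)

def predict(weights, src):
    """Conditional mean of the target modality given the source modality."""
    src_aug = np.hstack([src, np.ones((src.shape[0], 1))])
    return src_aug @ weights

# Synthetic two-modality data driven by hypothetical shared latent factors.
rng = np.random.default_rng(0)
shared = rng.normal(size=(500, 2))
image_feat = shared @ rng.normal(size=(2, 4)) + 0.05 * rng.normal(size=(500, 4))
text_feat = shared @ rng.normal(size=(2, 3)) + 0.05 * rng.normal(size=(500, 3))

# One conditional per direction; no joint density over both modalities is fit.
w_text_given_image, var_text = fit_conditional(image_feat, text_feat)
w_image_given_text, var_image = fit_conditional(text_feat, image_feat)

# Infer the text features from the image features alone.
text_pred = predict(w_text_given_image, image_feat)
mse = float(np.mean((text_pred - text_feat) ** 2))
baseline = float(np.var(text_feat, axis=0).mean())  # error of predicting the mean
print(f"conditional MSE: {mse:.4f}  mean-only baseline: {baseline:.4f}")
```

Each direction is trained independently, so either modality can be inferred when the other is missing. Whether a pair of separately learned conditionals corresponds to a single consistent joint model is the question the abstract's theoretical analysis addresses; this sketch does not check it.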

Project Outputs

Journal articles (0)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Other publications by Honglak Lee

Foundation models for fast, label-free detection of glioma infiltration

DOI:
10.1038/s41586-024-08169-3
Publication date:
2024-11-13
Journal:
NATURE
Impact factor:
48.500
Authors:
Akhil Kondepudi;Melike Pekmezci;Xinhai Hou;Katie Scotford;Cheng Jiang;Akshay Rao;Edward S. Harake;Asadur Chowdury;Wajd Al-Holou;Lin Wang;Aditya Pandey;Pedro R. Lowenstein;Maria G. Castro;Lisa Irina Koerner;Thomas Roetzer-Pejrimovsky;Georg Widhalm;Sandra Camelo-Piragua;Misha Movahed-Ezazi;Daniel A. Orringer;Honglak Lee;Christian Freudiger;Mitchel Berger;Shawn Hervey-Jumper;Todd Hollon
Corresponding author:
Todd Hollon