Regularized divergences and their gradient flows, generative modeling and structure-preserving learning.
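Basic information
- Award number: 2307115
- Principal investigator:
- Amount: $300,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-08-01 to 2026-07-31
- Status: Active (not yet concluded)
- Source:
- Keywords:
Project abstract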
Generative modeling algorithms underlie many recent and ongoing advances in artificial intelligence, both in popular image and text generation tools and in scientific applications such as materials design, medical imaging, drug discovery, and cosmology, to name a few. The goal of such algorithms is to learn and construct a model starting from data and available knowledge, and then deploy the learned model to generate new predictions. These predictions take the form of new data such as new images, new text, or even new candidate molecules for drug design. This project involves the development of new mathematical tools from information theory, deep learning, differential equations, and probability theory to design, improve, explain, and ultimately trust such learning algorithms. The investigators will also apply these new algorithms to merge data sets from different cancer studies, addressing a critical need to improve data analysis by integrating data on the same disease that are obtained from different studies, technologies, and patient groups. The primary goal of the proposed research is to develop new, reliable machine learning algorithms for settings where data is scarce or expensive to obtain. Graduate students will be trained in this field as part of this research project.
Probability divergences and metrics are mathematical objects designed to measure discrepancies between different probabilistic models, or between models and data, and are especially well suited to very high-dimensional settings. Divergences need to be carefully designed to construct models which best describe the available data. This project will combine tools from optimal transport, information theory, partial differential equations, and deep learning to develop Lipschitz regularized divergences which interpolate between Wasserstein metrics and information-theoretic divergences (e.g., the Kullback-Leibler divergence) and which provide flexible families of loss functions to compare non-absolutely continuous probability measures. In machine learning applications one often needs to build algorithms to model target distributions which are singular, either by their intrinsic nature, such as probabilities concentrated on low-dimensional structures, and/or because they are often only known through data. These new divergences will be combined with deep learning to build gradient flows in a probability space which are capable of transporting any initial distribution to a target data set. These new methods will also be adapted for structure-preserving learning, which arises in applications ranging from medicine to the design of new molecules, where data can exhibit symmetries or physical constraints. This additional knowledge will be taken into account in the probability divergence to build structure-preserving generative algorithms in an efficient way. The essential role of structure in generative algorithms will be studied and quantified whenever data is scarce and/or expensive to obtain. One of the demonstration areas of this research is bioinformatics, where available real datasets, even when they involve the same disease, have low sample size due to budgetary constraints or limited availability of patients, e.g., in the case of rare diseases.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
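The abstract gives no formulas, so the following is only a hedged sketch of what "interpolating between Wasserstein metrics and the Kullback-Leibler divergence" can look like. It assumes one common construction: the Donsker-Varadhan variational formula for the Kullback-Leibler divergence, with the test functions restricted to bounded L-Lipschitz functions. The symbols P, Q, \phi, L, \mathrm{Lip}_L and W_1 are notation introduced here for illustration and are not taken from the award text.

```latex
% Assumed notation (not from the abstract):
%   P, Q           probability measures on a metric space
%   \mathrm{Lip}_L bounded L-Lipschitz test functions
%   W_1            the 1-Wasserstein distance
\[
  D_{\mathrm{KL}}^{L}(P \,\|\, Q)
    \;=\; \sup_{\phi \in \mathrm{Lip}_L}
      \Big\{ \mathbb{E}_P[\phi] - \log \mathbb{E}_Q\big[e^{\phi}\big] \Big\}.
\]
% Restricting the supremum can only lower it, and Jensen's inequality gives
% \log \mathbb{E}_Q[e^{\phi}] \ge \mathbb{E}_Q[\phi], hence the two-sided control:
\[
  D_{\mathrm{KL}}^{L}(P \,\|\, Q)
    \;\le\; \min\big\{\, L\, W_1(P, Q),\; D_{\mathrm{KL}}(P \,\|\, Q) \,\big\}.
\]
```

Under this assumed definition the quantity stays finite whenever W_1(P, Q) is finite, even if P is not absolutely continuous with respect to Q, which is the situation the abstract describes for singular, data-driven target distributions.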
Project outputs
Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Model Uncertainty and Correctability for Directed Graphical Models
- DOI: 10.1137/21m1434453
- Published: 2021-07
- Journal:
- Impact factor: 0
- Authors: P. Birmpa; Jinchao Feng; M. Katsoulakis; Luc Rey-Bellet
- Corresponding authors: P. Birmpa; Jinchao Feng; M. Katsoulakis; Luc Rey-Bellet
Other publications by Luc Rey-Bellet
Open classical systems
- DOI: 10.1007/3-540-33966-3_2
- Published: 2006
- Journal:
- Impact factor: 0
- Authors: Luc Rey-Bellet
- Corresponding author: Luc Rey-Bellet
Other grants held by Luc Rey-Bellet
Robust Uncertainty Quantification and Statistical Learning for Heavy Tails and Rare Events
- Award number: 2008970
- Fiscal year: 2020
- Amount: $300,000
- Award type: Standard Grant
Mathematical and Computational Methods for Non-Equilibrium Systems
- Award number: 1515712
- Fiscal year: 2015
- Amount: $300,000
- Award type: Standard Grant
AMC-SS: Mathematical and Computational in Nonequilibrium Statistical Mechanics.
- Award number: 0605058
- Fiscal year: 2006
- Amount: $300,000
- Award type: Continuing Grant
Mathematical Problems in Nonequilibrium Statistical Mechanics
- Award number: 0306540
- Fiscal year: 2003
- Amount: $300,000
- Award type: Standard Grant
Similar overseas grants
NSF-BSF: Collaborative Research: CIF: Small: Neural Estimation of Statistical Divergences: Theoretical Foundations and Applications to Communication Systems
- Award number: 2308445
- Fiscal year: 2023
- Amount: $300,000
- Award type: Standard Grant
NSF-BSF: Collaborative Research: CIF: Small: Neural Estimation of Statistical Divergences: Theoretical Foundations and Applications to Communication Systems
- Award number: 2308446
- Fiscal year: 2023
- Amount: $300,000
- Award type: Standard Grant
Postdoctoral Fellowship: SPRF: Documenting Dialect Divergences Across Space and Time
- Award number: 2313787
- Fiscal year: 2023
- Amount: $300,000
- Award type: Fellowship Award
Divergences in the mitochondrial oxidative phosphorylation process and the role they play on reactive oxygen species production and aging
- Award number: RGPIN-2021-02924
- Fiscal year: 2022
- Amount: $300,000
- Award type: Discovery Grants Program - Individual
Divergences in the mitochondrial oxidative phosphorylation process and the role they play on reactive oxygen species production and aging
- Award number: RGPIN-2021-02924
- Fiscal year: 2021
- Amount: $300,000
- Award type: Discovery Grants Program - Individual
Quantum divergences, channel discrimination and strong data-processing
- Award number: 2436710
- Fiscal year: 2020
- Amount: $300,000
- Award type: Studentship
Machine learning and statistical methods on infinite-dimensional manifolds
- Award number: 20H04250
- Fiscal year: 2020
- Amount: $300,000
- Award type: Grant-in-Aid for Scientific Research (B)
CAREER: Semantic Divergences Across the Language Barrier
- Award number: 1750695
- Fiscal year: 2018
- Amount: $300,000
- Award type: Continuing Grant
Translating Western Science, Technology and Medicine to Late Ming China: Convergences and Divergences in the Light of the Kunyu gezhi (Investigations of the Earth's Interior; 1640) and the Taixi shuifa (Hydromethods of the Great West; 1612)
- Award number: 397383545
- Fiscal year: 2018
- Amount: $300,000
- Award type: Research Grants