CAREER: Large-Scale Markov Chain Monte Carlo for Reliable Machine Learning

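Basic Information

Award number:
2046760
Principal investigator:
Christopher De Sa
Amount:
About $422,100
Host institution:
Cornell University
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2021
Funding country:
United States
Project period:
2021-03-15 to 2026-02-28
Project status:
Not yet concluded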

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=2046760&HistoricalAwards=false
Keywords:
CAREER Large Scale Markov Chain

Project Abstract

A core capability of intelligence is reasoning about hidden information. Many artificial intelligence (AI) approaches reason about hidden information by constructing a statistical model and then running a statistical inference algorithm to learn hidden information from observed data. But many inference algorithms take a very long time to run when they are learning from a very large amount of data; or, worse, they might run quickly but give the wrong answer. This is problematic as the world trends towards large-scale AI. This project will build new general statistical inference algorithms that will still run efficiently, even on very large datasets and on very complicated models, while having provable reliability guarantees. This will promote the progress of science by making scalable statistical inference reliable. The project will also further education in AI through the development of open-source course resources that give students hands-on experience with how scalability and reliability interact in ML systems.

The project will focus on Markov chain Monte Carlo (MCMC) methods, a class of statistical inference algorithms that work by simulating a random process that converges to a desired statistical model. Markov chain Monte Carlo methods can give very accurate statistical estimates, but can scale poorly to large datasets and complicated models. This project will fix this by building new algorithms that address scaling to large data and large models with data-subsampling and asynchronous parallelism, respectively. Throughout, it will focus on proving theoretical guarantees that expose the trade-off between scalability and reliability for MCMC.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
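
To make the abstract's description of MCMC concrete, the following is a minimal sketch of a random-walk Metropolis sampler, not the project's own algorithm. It assumes a toy model (a Gaussian mean with a Gaussian prior), and every function name and parameter here is hypothetical. Each step evaluates the log posterior over the full dataset, which is the kind of per-step cost that data subsampling aims to reduce.

```python
import math
import random


def log_posterior(mu, data, prior_var=100.0, noise_var=1.0):
    """Unnormalized log posterior of a Gaussian mean under a Gaussian prior."""
    log_prior = -0.5 * mu * mu / prior_var
    log_lik = -0.5 * sum((x - mu) ** 2 for x in data) / noise_var
    return log_prior + log_lik


def metropolis(data, n_steps=5000, step_size=0.2, seed=0):
    """Random-walk Metropolis: propose a move, accept with probability min(1, ratio)."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    samples = []
    for _ in range(n_steps):
        proposal = mu + rng.gauss(0.0, step_size)
        # Full pass over the data on every step (the scaling bottleneck).
        candidate = log_posterior(proposal, data)
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < candidate - current:
            mu, current = proposal, candidate
        samples.append(mu)
    return samples


def make_data(n=200, true_mean=2.0, seed=1):
    """Synthetic observations for the toy model (assumed, not from the source)."""
    rng = random.Random(seed)
    return [rng.gauss(true_mean, 1.0) for _ in range(n)]


def posterior_mean_exact(data, prior_var=100.0, noise_var=1.0):
    """Closed-form posterior mean for this conjugate model, used as a check."""
    n = len(data)
    return (sum(data) / noise_var) / (n / noise_var + 1.0 / prior_var)
```

After discarding an initial burn-in portion of the chain, the average of the remaining samples should land close to `posterior_mean_exact(data)`; comparing the two is a simple reliability check in a case where the right answer is known.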

Project Outputs

Journal articles (2)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

A General Analysis of Example-Selection for Stochastic Gradient Descent

DOI:
Publication date:
2022
Journal:
Impact factor:
0
Authors:
Yucheng Lu;S. Meng;Christopher De Sa
Corresponding authors:
Yucheng Lu;S. Meng;Christopher De Sa

Low-Precision Stochastic Gradient Langevin Dynamics

DOI:
10.48550/arxiv.2206.09909
Publication date:
2022-06
Journal:
Impact factor:
0
Authors:
Ruqi Zhang;A. Wilson;Chris De Sa
Corresponding authors:
Ruqi Zhang;A. Wilson;Chris De Sa

Other publications by Christopher De Sa

‘Tecnologica cosa’: Modeling Storyteller Personalities in Boccaccio’s ‘Decameron’

DOI:
10.18653/v1/2021.latechclfl-1.17
Publication date:
2021
Journal:
ArXiv
Impact factor:
0
Authors:
A. Feder Cooper;Maria Antoniak;Christopher De Sa;Marilyn Migiel;David M. Mimno
Corresponding author:
David M. Mimno