CAREER: Active Machine Learning for Automating Scientific Discovery

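Basic information

Award number:
1845434
Principal investigator:
Roman Garnett
Amount:
approx. $497,700
Host institution:
Washington University
Host institution country:
United States
Award type:
Continuing Grant
Fiscal year:
2019
Funding country:
United States
Start and end dates:
2019-03-15 to 2025-02-28
Project status:
Not yet concluded

Source:
https://www.nsf.gov/awardsearch/showAward?AWD_ID=1845434&HistoricalAwards=false
Keywords:
CAREER Active Machine Learning Automating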

Project abstract

It is often much easier to collect and catalog features of data than to analyze data to determine properties of interest. Such settings are pervasive in the natural sciences and engineering, where in-depth investigation can require human intervention, expensive computer simulation, or costly laboratory experiments. Humanity is at the tipping point of a data revolution, and our ability to collect and store information will likely outpace our capacity to extract useful knowledge from data. Active machine learning provides a solution to this dilemma: we adaptively design expensive experiments guided by statistical models of the underlying process to make the most-effective use of limited resources. Numerous studies have established active machine learning as a promising tool for automating scientific discovery; however, modern procedures are currently difficult for practitioners to adopt. Considerable expertise is required to effectively use the available tools, especially as the field of machine learning continues to develop rapidly. This project will transform the application of active machine learning to problems from science and engineering, developing novel experimental procedures and pioneering new paradigms of scientific discovery. This project will also dramatically increase the availability of these methods to non-experts through automation, facilitating the further integration of machine learning into practice across disciplines. All research will be motivated by problems and validated on data from applications across science and engineering, including materials science, drug discovery, astronomy, and robotics. The project's research objectives will be accompanied by a comprehensive education plan designed to introduce active machine learning to a broad range of future scientists and engineers.
The project will entail two broad themes of inquiry, corresponding to the two critical components of an active learning pipeline: (1) experimental policies and (2) modeling.
(1): The core of an active learning procedure is its policy, which decides which data to analyze. A primary challenge when building an active learning system is developing a computationally efficient and empirically effective policy for the given learning objective. This is not a straightforward task: the optimal procedure is computationally infeasible and natural approximations can suffer from myopic, greedy behavior. This project will improve the performance and theoretical understanding of policies for automated scientific discovery, developing and studying both established and novel paradigms for active scientific discovery. A theme throughout this investigation will be nonmyopic decision making, where one reasons about the impact of each decision on the entire learning task. Algorithmic development will be accompanied by extensive theoretical study, establishing fundamental learning bounds and seeking efficient approximation schemes when possible.
(2): The second thrust of the investigation will be on modeling complex processes from data, as a policy's success hinges on being guided by an informative model. Model selection for active learning is rendered difficult by inherently limited training data, and accounting for model uncertainty is often critical. The project will investigate automated model selection inline with active learning, advancing the nascent field of automated machine learning to create robust, fully automated active learning systems that do not require expert design or tuning.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
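As a toy illustration of the adaptive experiment design the abstract describes, the sketch below runs a pool-based active learning loop with a one-step (myopic) uncertainty-sampling policy. It is not the project's method. It assumes a deliberately simple model (labels flip from negative to positive at a single unknown threshold), and the function name `active_threshold_search`, the oracle, and the example pool are all hypothetical.

```python
from typing import Callable, List, Tuple


def active_threshold_search(
    pool: List[float], oracle: Callable[[float], bool]
) -> Tuple[float, int]:
    """Find the smallest pool item the oracle labels True.

    Assumption (hypothetical model): labels are monotone over the sorted
    pool, i.e. all False items come before all True items.
    Returns (boundary item, number of oracle queries). The boundary is
    float("inf") if no item is labelled True.
    """
    items = sorted(pool)
    # Sentinels: index lo is treated as a known negative, hi as a known positive.
    lo, hi = -1, len(items)
    queries = 0
    while hi - lo > 1:
        # Policy: query the item whose label is most uncertain under the
        # threshold model, which is the midpoint of the unresolved interval.
        mid = (lo + hi) // 2
        queries += 1  # each oracle call stands in for one expensive experiment
        if oracle(items[mid]):
            hi = mid
        else:
            lo = mid
    boundary = items[hi] if hi < len(items) else float("inf")
    return boundary, queries


if __name__ == "__main__":
    candidates = [float(i) for i in range(100)]
    boundary, queries = active_threshold_search(candidates, lambda x: x >= 37)
    print(f"boundary={boundary}, queries={queries} of {len(candidates)} candidates")
```

Under the monotone assumption the loop needs only on the order of log2(n) labels instead of labelling the whole pool. Real problems of the kind listed in the abstract rarely have such a clean model, which is where policy design and model selection become difficult.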

Project outputs

Journal articles (22)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

BINOCULARS for efficient, nonmyopic sequential experimental design

DOI:
Publication date:
2020
Venue:
Proceedings of the 37th International Conference on Machine Learning
Impact factor:
0
Authors:
Jiang, Shali;Chai, Henry;González, Javier;Garnett, Roman
Corresponding author:
Garnett, Roman

Nonmyopic Multiclass Active Search with Diminishing Returns for Diverse Discovery

DOI:
Publication date:
2022-02
Venue:
Impact factor:
0
Authors:
Quan Nguyen;R. Garnett
Corresponding authors:
Quan Nguyen;R. Garnett

Automated measurement of quasar redshift with a Gaussian process

DOI:
10.1093/mnras/staa2826
Publication date:
2020-06
Venue:
Monthly Notices of the Royal Astronomical Society
Impact factor:
4.8
Authors:
Leah Fauber;M. Ho;Simeon Bird;C. Shelton;R. Garnett;Ishita Korde
Corresponding authors:
Leah Fauber;M. Ho;Simeon Bird;C. Shelton;R. Garnett;Ishita Korde

The Behavior and Convergence of Local Bayesian Optimization

DOI:
Publication date:
2023
Venue:
Advances in Neural Information Processing Systems
Impact factor:
0
Authors:
Wu, Kaiwen;Kim, Kyurae;Garnett, Roman;Gardner, Jacob R.
Corresponding author:
Gardner, Jacob R.

Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees

DOI:
Publication date:
2020
Venue:
Advances in Neural Information Processing Systems
Impact factor:
0
Authors:
Jiang, Shali;Jiang, Daniel;Balandat, Maximilian;Karrer, Brian;Gardner, Jacob;Garnett, Roman
Corresponding author:
Garnett, Roman

Other publications by Roman Garnett

Introducing the ‘active search’ method for iterative virtual screening

DOI:
10.1007/s10822-015-9832-9
Publication date:
2015
Venue:
Journal of Computer-Aided Molecular Design
Impact factor:
3.5
Authors:
Roman Garnett;Thomas Gärtner;Martin Vogt;Jürgen Bajorath
Corresponding author:
Jürgen Bajorath

Idiographic Personality Gaussian Process for Psychological Assessment

DOI:
Publication date:
2024
Venue:
Impact factor:
0
Authors:
Yehu Chen;Muchen Xi;Jacob Montgomery;Joshua Jackson;Roman Garnett
Corresponding author:
Roman Garnett

A Greedy Approximation for k-Determinantal Point Processes

DOI:
Publication date:
2024
Venue:
International Conference on Artificial Intelligence and Statistics
Impact factor:
0
Authors:
Julia Grosse;Rahel Fischer;Roman Garnett;Philipp Hennig
Corresponding author:
Philipp Hennig

Bayesian Networks to Assess the Newborn Stool Microbiome

DOI:
Publication date:
2016
Venue:
Impact factor:
0
Authors:
William E. Bennett;Michael R. Brent;Phillip I. Tarr;Roman Garnett
Corresponding author:
Roman Garnett

Predicting unexpected influxes of players in EVE online

DOI:
10.1109/cig.2014.6932878
Publication date:
2014
Venue:
2014 IEEE Conference on Computational Intelligence and Games
Impact factor:
0
Authors:
Roman Garnett;Thomas Gärtner;Timothy Ellersiek;Eyjolfur Gudmondsson;Petur Oskarsson
Corresponding author:
Petur Oskarsson