CAREER: Active Machine Learning for Automating Scientific Discovery
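Basic information
- Award number: 1845434
- Principal investigator:
- Amount: $497,700
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2019
- Funding country: United States
- Duration: 2019-03-15 to 2025-02-28
- Project status: Not concluded
- Source:
- Keywords:
Project abstract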
It is often much easier to collect and catalog features of data than to analyze data to determine properties of interest. Such settings are pervasive in the natural sciences and engineering, where in-depth investigation can require human intervention, expensive computer simulation, or costly laboratory experiments. Humanity is at the tipping point of a data revolution, and our ability to collect and store information will likely outpace our capacity to extract useful knowledge from data. Active machine learning provides a solution to this dilemma: we adaptively design expensive experiments guided by statistical models of the underlying process to make the most-effective use of limited resources. Numerous studies have established active machine learning as a promising tool for automating scientific discovery; however, modern procedures are currently difficult for practitioners to adopt. Considerable expertise is required to effectively use the available tools, especially as the field of machine learning continues to develop rapidly. This project will transform the application of active machine learning to problems from science and engineering, developing novel experimental procedures and pioneering new paradigms of scientific discovery. This project will also dramatically increase the availability of these methods to non-experts through automation, facilitating the further integration of machine learning into practice across disciplines. All research will be motivated by problems and validated on data from applications across science and engineering, including materials science, drug discovery, astronomy, and robotics. 
The project's research objectives will be accompanied by a comprehensive education plan designed to introduce active machine learning to a broad range of future scientists and engineers.
The project will entail two broad themes of inquiry, corresponding to the two critical components of an active learning pipeline: (1) experimental policies and (2) modeling. (1): The core of an active learning procedure is its policy, which decides which data to analyze. A primary challenge when building an active learning system is developing a computationally efficient and empirically effective policy for the given learning objective. This is not a straightforward task: the optimal procedure is computationally infeasible and natural approximations can suffer from myopic, greedy behavior. This project will improve the performance and theoretical understanding of policies for automated scientific discovery, developing and studying both established and novel paradigms for active scientific discovery. A theme throughout this investigation will be nonmyopic decision making, where one reasons about the impact of each decision on the entire learning task. Algorithmic development will be accompanied by extensive theoretical study, establishing fundamental learning bounds and seeking efficient approximation schemes when possible. (2): The second thrust of the investigation will be on modeling complex processes from data, as a policy's success hinges on being guided by an informative model. Model selection for active learning is rendered difficult by inherently limited training data, and accounting for model uncertainty is often critical. The project will investigate automated model selection inline with active learning, advancing the nascent field of automated machine learning to create robust, fully automated active learning systems that do not require expert design or tuning.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
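The two pipeline components named in the abstract, a model and a policy, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration rather than the project's method: the kernel-smoothed probability estimate stands in for the statistical model, the one-step greedy rule stands in for the policy, and the names `kernel_probability`, `greedy_active_search`, `bandwidth`, and `prior` are hypothetical. The greedy rule is exactly the kind of myopic policy the abstract contrasts with nonmyopic decision making, so the sketch shows the baseline, not the proposed improvement.

```python
import math


def kernel_probability(x, observed, bandwidth=5.0, prior=0.1):
    """Illustrative model: kernel-smoothed estimate of P(label = 1 | x).

    One pseudo-observation carrying the prior probability keeps the
    estimate equal to `prior` far away from all observed points.
    """
    weighted_positives = prior
    total_weight = 1.0
    for xi, yi in observed:
        w = math.exp(-((x - xi) / bandwidth) ** 2)
        weighted_positives += w * yi
        total_weight += w
    return weighted_positives / total_weight


def greedy_active_search(pool, oracle, seed, budget, bandwidth=5.0, prior=0.1):
    """Illustrative myopic policy: always query the most promising point.

    `oracle(x)` plays the role of the expensive experiment and returns 0 or 1.
    `seed` is a list of (x, label) pairs that are already known.
    Returns the list of (x, label) pairs queried, in order.
    """
    observed = list(seed)
    seen = {x for x, _ in observed}
    remaining = [x for x in pool if x not in seen]
    queried = []
    for _ in range(min(budget, len(remaining))):
        # Myopic step: score each candidate by its current probability only,
        # ignoring how the outcome would inform later decisions.
        best = max(
            remaining,
            key=lambda x: kernel_probability(x, observed, bandwidth, prior),
        )
        remaining.remove(best)
        label = oracle(best)
        observed.append((best, label))
        queried.append((best, label))
    return queried


if __name__ == "__main__":
    pool = list(range(100))

    def oracle(x):
        # Hidden property of interest: positives occupy one interval.
        return int(40 <= x <= 50)

    result = greedy_active_search(pool, oracle, seed=[(45, 1)], budget=15)
    print("queried:", [x for x, _ in result])
    print("positives found:", sum(y for _, y in result))
```

Starting from a single known positive, the greedy rule exploits the neighbourhood of that point and has no incentive to look for other regions, which is the limitation a nonmyopic policy would try to address by weighing the effect of each query on the remaining budget.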
Project outcomes
Journal articles (22)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
BINOCULARS for efficient, nonmyopic sequential experimental design
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Jiang, Shali; Chai, Henry; González, Javier; Garnett, Roman
- Corresponding author: Garnett, Roman
Nonmyopic Multiclass Active Search with Diminishing Returns for Diverse Discovery
- DOI:
- Published: 2022-02
- Journal:
- Impact factor: 0
- Authors: Quan Nguyen; R. Garnett
- Corresponding authors: Quan Nguyen; R. Garnett
Automated measurement of quasar redshift with a Gaussian process
- DOI: 10.1093/mnras/staa2826
- Published: 2020-06
- Journal:
- Impact factor: 4.8
- Authors: Leah Fauber; M. Ho; Simeon Bird; C. Shelton; R. Garnett; Ishita Korde
- Corresponding authors: Leah Fauber; M. Ho; Simeon Bird; C. Shelton; R. Garnett; Ishita Korde
The Behavior and Convergence of Local Bayesian Optimization
- DOI:
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Wu, Kaiwen; Kim, Kyurae; Garnett, Roman; Gardner, Jacob R.
- Corresponding author: Gardner, Jacob R.
Efficient Nonmyopic Bayesian Optimization via One-Shot Multi-Step Trees
- DOI:
- Published: 2020
- Journal:
- Impact factor: 0
- Authors: Jiang, Shali; Jiang, Daniel; Balandat, Maximilian; Karrer, Brian; Gardner, Jacob; Garnett, Roman
- Corresponding author: Garnett, Roman
Other publications by Roman Garnett
Introducing the ‘active search’ method for iterative virtual screening
- DOI: 10.1007/s10822-015-9832-9
- Published: 2015
- Journal:
- Impact factor: 3.5
- Authors: Roman Garnett; Thomas Gärtner; Martin Vogt; Jürgen Bajorath
- Corresponding author: Jürgen Bajorath
Idiographic Personality Gaussian Process for Psychological Assessment
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Yehu Chen; Muchen Xi; Jacob Montgomery; Joshua Jackson; Roman Garnett
- Corresponding author: Roman Garnett
A Greedy Approximation for k-Determinantal Point Processes
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Julia Grosse; Rahel Fischer; Roman Garnett; Philipp Hennig
- Corresponding author: Philipp Hennig
Bayesian Networks to Assess the Newborn Stool Microbiome
- DOI:
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: William E. Bennett; Michael R. Brent; Phillip I. Tarr; Roman Garnett
- Corresponding author: Roman Garnett
Predicting unexpected influxes of players in EVE online
- DOI: 10.1109/cig.2014.6932878
- Published: 2014
- Journal:
- Impact factor: 0
- Authors: Roman Garnett; Thomas Gärtner; Timothy Ellersiek; Eyjolfur Gudmondsson; Petur Oskarsson
- Corresponding author: Petur Oskarsson
Other grants by Roman Garnett
Collaborative Research: Accelerating the Discovery of Electronic Materials through Human-Computer Active Search
- Award number: 1940224
- Fiscal year: 2019
- Funding amount: $497,700
- Project type: Standard Grant
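Similar NSFC (National Natural Science Foundation of China) grants
Evaluation of active components and extraction process of Yigong San based on UPLC-Q-TOF-MS/MS analysis
- Approval number:
- Approval year: 2025
- Funding amount: 0.0 (units of CNY 10,000)
- Project type: Provincial/municipal project
Effects and mechanisms of yeast intracellular polysaccharides on yeast fermentation performance
- Approval number: JCZRLH202500211
- Approval year: 2025
- Funding amount: 0.0 (units of CNY 10,000)
- Project type: Provincial/municipal project
Regulatory mechanism of ovotransferrin fibrillar self-assembly and its functional activity
- Approval number: JCZRYB202500630
- Approval year: 2025
- Funding amount: 0.0 (units of CNY 10,000)
- Project type: Provincial/municipal project
Design and structural optimization of PROTAC-based small-molecule degraders targeting EFTUD2 and their anti-lung-cancer activity
- Approval number: JCZRYB202501469
- Approval year: 2025
- Funding amount: 0.0 (units of CNY 10,000)
- Project type: Provincial/municipal project
Biofermentation of Jingangteng herbal residue and its antibacterial and growth-promoting activity
- Approval number: JCZRLH202500524
- Approval year: 2025
- Funding amount: 0.0 (units of CNY 10,000)
- Project type: Provincial/municipal project
Active components of Malan root against liver injury and their immunoregulatory mechanisms
- Approval number: JCZRLH202500530
- Approval year: 2025
- Funding amount: 0.0 (units of CNY 10,000)
- Project type: Provincial/municipal project
Development of microwave-responsive implant coatings and their therapeutic effect and mechanism in infected bone defects
- Approval number: JCZRLH202500568
- Approval year: 2025
- Funding amount: 0.0 (units of CNY 10,000)
- Project type: Provincial/municipal project
Mechanism of safflower and its pharmacodynamic substances in treating intervertebral disc degeneration, based on "resolving stasis to promote regeneration" (化瘀生新)
- Approval number: JCZRLH202500261
- Approval year: 2025
- Funding amount: 0.0 (units of CNY 10,000)
- Project type: Provincial/municipal project
Mechanism by which Porphyromonas gingivalis induces macrophage pyroptosis via the ROS/FOXO1/GSDMD axis
- Approval number: JCZRQN202500447
- Approval year: 2025
- Funding amount: 0.0 (units of CNY 10,000)
- Project type: Provincial/municipal project
Multi-factor coupled generation of reactive chlorine species in plateau salt-lake water and their cross-scale damage to nanofiltration membranes for lithium extraction
- Approval number:
- Approval year: 2025
- Funding amount: 0.0 (units of CNY 10,000)
- Project type: Provincial/municipal project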
Similar overseas grants
CAREER: End-to-End Active Region-based Heliospheric Forecasting System Using Multi-spacecraft Data and Machine Learning
- Award number: 2240022
- Fiscal year: 2023
- Funding amount: $497,700
- Project type: Continuing Grant
Active Evaluation of Machine Learning Models
- Award number: 23H03456
- Fiscal year: 2023
- Funding amount: $497,700
- Project type: Grant-in-Aid for Scientific Research (B)
Developing an Innovative Platform for Modeling Active Road User Interactions and Safety: Integration of Computer Vision, Agent-based, and Machine Learning Models
- Award number: RGPIN-2019-06688
- Fiscal year: 2022
- Funding amount: $497,700
- Project type: Discovery Grants Program - Individual
CAREER: Minimize ab initio Tasks in Dynamics Simulations of Chemical Reactions with Active Machine Learning
- Award number: 2144031
- Fiscal year: 2022
- Funding amount: $497,700
- Project type: Standard Grant
Collaborative Research: DMREF: Machine Learning-aided Discovery of Synthesizable, Active and Stable Heterogeneous Catalysts
- Award number: 2306125
- Fiscal year: 2022
- Funding amount: $497,700
- Project type: Standard Grant
Neural Signal Representations for Physics-Based Machine Learning and Active 3D Imaging
- Award number: DGECR-2022-00412
- Fiscal year: 2022
- Funding amount: $497,700
- Project type: Discovery Launch Supplement
Neural Signal Representations for Physics-Based Machine Learning and Active 3D Imaging
- Award number: RGPIN-2022-04829
- Fiscal year: 2022
- Funding amount: $497,700
- Project type: Discovery Grants Program - Individual
Active Forever with our Well-bean Machine
- Award number: 10019300
- Fiscal year: 2022
- Funding amount: $497,700
- Project type: Small Business Research Initiative
Accelerated discovery of synthetic polymers for ribonucleoprotein delivery through the integration of active learning, machine learning, and polymer science
- Award number: 10195432
- Fiscal year: 2021
- Funding amount: $497,700
- Project type:
Mechanics of Active Slide-Ring Networks: from Molecular Motors to Molecular Machine
- Award number: 2023179
- Fiscal year: 2021
- Funding amount: $497,700
- Project type: Standard Grant