Open data-driven infrastructure for building biomolecular force fields for predictive biophysics and drug design
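Basic information
- Grant number: 10412594
- Principal investigator:
- Amount: $177,700
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-03-01 to 2024-02-29
- Project status: Completed
- Source: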
- Keywords: Address; Aging; Architecture; Area; Automobile Driving; Biophysics; Chemicals; Collaborations; Communities; Computer Simulation; Computer software; DNA; Data; Data Set; Development; Drug Design; Ecosystem; Elements; Engineering; Ensure; Face; Funding; Generations; Goals; Industrialization; Infrastructure; Ingestion; Institutes; Learning; Life; Literature; Machine Learning; Metadata; Methods; Modeling; Modernization; Molecular; Pharmaceutical Preparations; Problem Solving; Process; Property; Proteins; RNA; Research; Research Personnel; Resources; Retrieval; Science; Scientist; Source; Supercomputing; System; TensorFlow; Time; United States National Institutes of Health; Work; base; chemical reaction; cluster computing; community based participatory research; computing resources; cost; dashboard; data sharing; design; drug discovery; fundamental research; improved; models and simulation; molecular mechanics; open data; predictive modeling; quantum; quantum chemistry; rapid growth; simulation; software infrastructure; success; supercomputer; tool
Project abstract
PROJECT SUMMARY/ABSTRACT
Current generation molecular simulation models are insufficiently accurate, and current generation tools for building
those models are limited, not automated, and based on aging infrastructure.
Our original R01, “Open Data-driven Infrastructure for Building Biomolecular Force Fields for Predictive Biophysics
and Drug Design,” aims to solve these problems, producing a modern infrastructure for building, applying, and
improving accurate molecular mechanics force fields. As part of our NIH-funded project, we have collaborated
closely with the Molecular Sciences Software Institute (MolSSI) to use the QCArchive ecosystem to generate
and continuously expand very large quantum chemical datasets relevant to biomolecular systems
on a variety of supercomputing resources. QCArchive now contains over 42M quantum chemical calculations
for over 39M molecules, and has become incredibly popular, with over 1.79M accesses/month.
Large quantum chemical datasets relevant to biomolecular systems are incredibly valuable to the AI/ML
community. Data is the key element needed for both fundamental research into ML architectures and constructing
predictive models for downstream use. Unfortunately, quantum chemical datasets are incredibly expensive to
generate, limiting in-house generation of large, useful datasets needed to drive AI/ML research to a few large
companies and researchers with access to sufficient computing resources. While AI/ML quantum chemical
methods have shown immense promise for biomolecular systems, the limited access to large, curated
datasets has greatly hindered researchers from making rapid progress in this area.
We aim to bridge this gap by working closely with MolSSI QCArchive developers to address robustness, scalability,
and data delivery challenges to meet the needs of the biomolecular AI/ML community requiring access
to large quantum chemistry datasets (Aim 1). Additional software developers will enable improvements to the
QCArchive infrastructure to meet the rapidly growing demands of the AI/ML community. As QCArchive is primarily
maintained by a single MolSSI Software Scientist, additional developers are necessary for fully enabling the AI/ML
community to take full advantage of the wealth of data generated by our NIH-funded project directly, as well as the
data actively being generated by the tools our project has engineered to enable distributed, fault-tolerant quantum
chemistry that is rapidly populating QCArchive. We will additionally develop interfaces and dashboards to enable
facile discovery, retrieval, and import of quantum chemical datasets within popular machine learning frameworks
(Aim 2). To ensure our tools are specifically useful for the most promising AI/ML applications, we will collaborate
directly with AI researchers in the OpenMM, TorchMD, and SchNetPack communities actively developing and
deploying quantum machine learning (QML) potentials for biomolecular simulation, with the goal of producing
generally useful tools suitable for the wider community yet capable of driving these high-priority applications.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Michael R Shirts
Open Data-driven Infrastructure for Building Biomolecular Force Field for Predictive Biophysics and Drug Design
- Grant number: 10166314
- Fiscal year: 2020
- Funding amount: $177,700
- Project category:
Open data-driven infrastructure for building biomolecular force fields for predictive biophysics and drug design
- Grant number: 10356089
- Fiscal year: 2020
- Funding amount: $177,700
- Project category:
Open data-driven infrastructure for building biomolecular force fields for predictive biophysics and drug design
- Grant number: 10580156
- Fiscal year: 2020
- Funding amount: $177,700
- Project category:
Open data-driven infrastructure for building biomolecular force fields for predictive biophysics and drug design
- Grant number: 10592758
- Fiscal year: 2020
- Funding amount: $177,700
- Project category:
Open data-driven infrastructure for building biomolecular force fields for predictive biophysics and drug design
- Grant number: 9887804
- Fiscal year: 2020
- Funding amount: $177,700
- Project category:
Drug Binding Free Energies with Implicit Solvent Methods
- Grant number: 7061270
- Fiscal year: 2005
- Funding amount: $177,700
- Project category:
Drug Binding Free Energies with Implicit Solvent Methods
- Grant number: 6934020
- Fiscal year: 2005
- Funding amount: $177,700
- Project category:
Drug Binding Free Energies with Implicit Solvent Methods
- Grant number: 7228984
- Fiscal year: 2005
- Funding amount: $177,700
- Project category:
Similar overseas grants
Genetic Architecture of Aging-Related TDP-43 and Mixed Pathology Dementia
- Grant number: 10658215
- Fiscal year: 2023
- Funding amount: $177,700
- Project category:
Chromatin architecture disruption and the vicious cycle of aging.
- Grant number: 10901040
- Fiscal year: 2023
- Funding amount: $177,700
- Project category:
Impact of Reproductive Aging on the Functional and Structural Architecture of the Human Brain
- Grant number: 10313384
- Fiscal year: 2021
- Funding amount: $177,700
- Project category:
Relationship between clonal architecture of aging hematopoiesis and the risk of developing age-associated myeloid cancers.
- Grant number: 305881
- Fiscal year: 2014
- Funding amount: $177,700
- Project category: Operating Grants
Nucleosome architecture in aging and nuclear receptor activation in the liver
- Grant number: 9442316
- Fiscal year: 2014
- Funding amount: $177,700
- Project category:
Nucleosome architecture in aging and nuclear receptor activation in the liver
- Grant number: 9026604
- Fiscal year: 2014
- Funding amount: $177,700
- Project category:
Nucleosome architecture in aging and nuclear receptor activation in the liver
- Grant number: 8679341
- Fiscal year: 2014
- Funding amount: $177,700
- Project category:
Transcriptional Architecture and Chromatin Landscape of Circadian Clocks in Aging
- Grant number: 8707931
- Fiscal year: 2013
- Funding amount: $177,700
- Project category:
Transcriptional Architecture and Chromatin Landscape of Circadian Clocks in Aging
- Grant number: 9063026
- Fiscal year: 2013
- Funding amount: $177,700
- Project category:
Transcriptional Architecture and Chromatin Landscape of Circadian Clocks in Aging
- Grant number: 8580066
- Fiscal year: 2013
- Funding amount: $177,700
- Project category:


















