Acquiring a GPU server to accelerate developing deep learning methods to reconstruct protein structures from cryo-EM data
Basic information
- Grant number: 10795465
- Principal investigator:
- Amount: $167,200
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-09-20 to 2026-05-31
- Project status: Ongoing
- Source:
- Keywords: 3-Dimensional; Acceleration; Biomedical Research; Consumption; Cryoelectron Microscopy; Data; Data Set; Development; Goals; Grant; High Performance Computing; Human; Image; Intervention; Maps; Memory; Methods; Modeling; Nuclear Magnetic Resonance; Parents; Process; Productivity; Proteins; Publishing; Research; Resolution; Resources; Speed; Structure; Techniques; Technology; Testing; Time; Training; X-Ray Crystallography; artificial intelligence method; deep learning; deep learning model; denoising; density; improved; learning strategy; microscopic imaging; particle; protein complex; protein structure; reconstruction; research and development; structural biology; technology development; tool
Project Summary
The goal of this supplement is to acquire a Dell high-performance computing server with 8 Nvidia A100 Graphics
Processing Units (GPUs) to accelerate the development of deep learning methods to reconstruct protein
structures from cryogenic electron microscopy (cryo-EM) image data accurately and automatically. Cryo-EM
technology can determine the quaternary structure of large protein complexes and assemblies consisting of
many chains, which are difficult or even impossible to determine with traditional techniques such as X-ray
crystallography or nuclear magnetic resonance (NMR). Because cryo-EM has routinely reached high resolution
in recent years, it has been revolutionizing the field of structural biology and is widely used to determine structures
of large protein complexes and assemblies. However, the computational reconstruction of protein structures from
cryo-EM image data is still a time-consuming and labor-intensive process. Advanced artificial intelligence
(AI) methods such as deep learning hold the key to automating this process and improving reconstruction
accuracy. The parent R01 grant of this supplement aims to develop cutting-edge deep learning models such as
2D and 3D transformers to automate the key tasks of reconstructing protein structures from cryo-EM data: (1)
picking protein particles in cryo-EM images (micrographs); (2) denoising cryo-EM density maps built from protein
particle images; (3) reconstructing protein structures from cryo-EM density maps; and (4) integrating the methods
of (1), (2) and (3) as a pipeline to automatically reconstruct high-accuracy protein structures from cryo-EM image
data without human intervention.
Our substantial progress in the first eight months of this project has demonstrated that the proposed
methods are fully feasible and highly promising. However, training and testing the large deep learning
transformer models on big cryo-EM datasets efficiently and effectively requires a large amount of GPU computing
power. With the GPU resources currently available to us, it takes about one year for a developer to complete the
development of one deep learning method. Although this pace can yield significant progress, it is not fast enough
to maximize the potential and impact of the cutting-edge deep learning methods of the parent R01 project. This
supplement will enable us to acquire a high-performance computing server with 8 Nvidia A100 80GB
GPUs to drastically speed up the research in the parent R01 project. This GPU server can reduce the time needed to
complete the development of one deep learning model from about one year to less than two months, and
therefore drastically improve the productivity of the developers and greatly accelerate publishing and releasing
the methods and tools developed in this project. Moreover, the large (80GB) memory of each GPU will enable
us to train high-quality deep transformers consisting of millions of parameters to maximize the accuracy of
reconstructing protein structures from cryo-EM image data.
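As a rough illustration of the figures quoted in the summary, the arithmetic can be sketched as below. The inputs (about 12 months per method today, less than 2 months on the new server, 8 GPUs with 80 GB each) come from the summary; the function names are hypothetical, and "less than two months" is treated as an upper bound of 2 months, so the speedup and throughput values are lower bounds rather than measured results.

```python
# Back-of-the-envelope arithmetic for the figures quoted in the summary.
# Assumption: "less than two months" is modeled as an upper bound of
# 2 months, so the speedup and throughput below are lower bounds.

MONTHS_PER_METHOD_CURRENT = 12.0  # "about one year" on current GPU resources
MONTHS_PER_METHOD_NEW_MAX = 2.0   # "less than two months" on the new server
NUM_GPUS = 8                      # 8 Nvidia A100 GPUs
MEMORY_PER_GPU_GB = 80            # 80 GB of memory per GPU


def min_speedup(current_months: float, new_months_max: float) -> float:
    """Lower bound on the speedup implied by the two development times."""
    return current_months / new_months_max


def min_methods_per_year(months_per_method_max: float) -> float:
    """Lower bound on methods one developer could complete in 12 months."""
    return 12.0 / months_per_method_max


def total_gpu_memory_gb(num_gpus: int, memory_per_gpu_gb: int) -> int:
    """Aggregate GPU memory across the server, in GB."""
    return num_gpus * memory_per_gpu_gb


if __name__ == "__main__":
    print(min_speedup(MONTHS_PER_METHOD_CURRENT, MONTHS_PER_METHOD_NEW_MAX))  # 6.0
    print(min_methods_per_year(MONTHS_PER_METHOD_NEW_MAX))                    # 6.0
    print(total_gpu_memory_gb(NUM_GPUS, MEMORY_PER_GPU_GB))                   # 640
```

Under these assumptions, the stated reduction corresponds to at least a sixfold speedup per method, and the server offers 640 GB of GPU memory in aggregate.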
Project outcomes
Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A large expert-curated cryo-EM image dataset for machine learning protein particle picking.
- DOI: 10.1038/s41597-023-02280-2
- Published: 2023-06-22
- Journal:
- Impact factor: 9.8
- Authors: Dhakal, Ashwin; Gyawali, Rajan; Wang, Liguo; Cheng, Jianlin
- Corresponding author: Cheng, Jianlin
De Novo Atomic Protein Structure Modeling for Cryo-EM Density Maps Using 3D Transformer and Hidden Markov Model.
- DOI: 10.1101/2024.01.02.573943
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Giri, Nabin; Cheng, Jianlin
- Corresponding author: Cheng, Jianlin
CryoVirusDB: A Labeled Cryo-EM Image Dataset for AI-Driven Virus Particle Picking.
- DOI: 10.1101/2023.12.25.573312
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Gyawali, Rajan; Dhakal, Ashwin; Wang, Liguo; Cheng, Jianlin
- Corresponding author: Cheng, Jianlin
Other grants by Jianlin Cheng
Deep learning methods for automated and accurate reconstruction of protein structures from cryo-EM image data
- Grant number: 10459829
- Fiscal year: 2022
- Funding amount: $167,200
- Project category:
Deep learning methods for automated and accurate reconstruction of protein structures from cryo-EM image data
- Grant number: 10707036
- Fiscal year: 2022
- Funding amount: $167,200
- Project category:
Integrated Prediction of Protein Structure at 1D, 2D and 3D Levels
- Grant number: 7863766
- Fiscal year: 2010
- Funding amount: $167,200
- Project category:
Distance-based ab initio protein structure prediction
- Grant number: 10418784
- Fiscal year: 2010
- Funding amount: $167,200
- Project category:
Integrated Prediction of Protein Structure at 1D, 2D and 3D Levels
- Grant number: 8269738
- Fiscal year: 2010
- Funding amount: $167,200
- Project category:
Integrated Prediction and Validation of Protein Structures
- Grant number: 9119094
- Fiscal year: 2010
- Funding amount: $167,200
- Project category:
Distance-based ab initio protein structure prediction
- Grant number: 10627929
- Fiscal year: 2010
- Funding amount: $167,200
- Project category:
Integrated Prediction of Protein Structure at 1D, 2D and 3D Levels
- Grant number: 8476234
- Fiscal year: 2010
- Funding amount: $167,200
- Project category:
Distance-based ab initio protein structure prediction
- Grant number: 10251061
- Fiscal year: 2010
- Funding amount: $167,200
- Project category:
Integrated Prediction of Protein Structure at 1D, 2D and 3D Levels
- Grant number: 8059621
- Fiscal year: 2010
- Funding amount: $167,200
- Project category:
Similar overseas grants
EXCESS: The role of excess topography and peak ground acceleration on earthquake-preconditioning of landslides
- Grant number: NE/Y000080/1
- Fiscal year: 2024
- Funding amount: $167,200
- Project category: Research Grant
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Grant number: 2328975
- Fiscal year: 2024
- Funding amount: $167,200
- Project category: Continuing Grant
SHINE: Origin and Evolution of Compressible Fluctuations in the Solar Wind and Their Role in Solar Wind Heating and Acceleration
- Grant number: 2400967
- Fiscal year: 2024
- Funding amount: $167,200
- Project category: Standard Grant
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Grant number: 2328973
- Fiscal year: 2024
- Funding amount: $167,200
- Project category: Continuing Grant
Market Entry Acceleration of the Murb Wind Turbine into Remote Telecoms Power
- Grant number: 10112700
- Fiscal year: 2024
- Funding amount: $167,200
- Project category: Collaborative R&D
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Grant number: 2328972
- Fiscal year: 2024
- Funding amount: $167,200
- Project category: Continuing Grant
Collaborative Research: A new understanding of droplet breakup: hydrodynamic instability under complex acceleration
- Grant number: 2332916
- Fiscal year: 2024
- Funding amount: $167,200
- Project category: Standard Grant
Collaborative Research: A new understanding of droplet breakup: hydrodynamic instability under complex acceleration
- Grant number: 2332917
- Fiscal year: 2024
- Funding amount: $167,200
- Project category: Standard Grant
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Grant number: 2328974
- Fiscal year: 2024
- Funding amount: $167,200
- Project category: Continuing Grant
Study of the Particle Acceleration and Transport in PWN through X-ray Spectro-polarimetry and GeV Gamma-ray Observations
- Grant number: 23H01186
- Fiscal year: 2023
- Funding amount: $167,200
- Project category: Grant-in-Aid for Scientific Research (B)