Deep learning methods for automated and accurate reconstruction of protein structures from cryo-EM image data

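Basic information

  • Grant number:
    10707036
  • Principal investigator:
  • Amount:
    $302,700
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Project period:
    2022-09-20 to 2026-05-31
  • Project status:
    Ongoing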

Project summary

Cryogenic electron microscopy (cryo-EM) has emerged as a major experimental technology for determining protein structures, having reached atomic resolution (1.2-4 Å) in recent years. Compared to traditional techniques (i.e., X-ray crystallography and nuclear magnetic resonance), cryo-EM has the unique capability of determining the quaternary structures of large protein complexes and assemblies that are difficult or impossible for those techniques to handle. The advance of cryo-EM technology has stimulated a revolution in structural biology, enabling the study of large protein complexes and assemblies that could not be studied well before. However, the computational reconstruction of protein structures from cryo-EM image data is still a time-consuming, labor-intensive, error-prone, and often inaccurate process, owing to the bottleneck in picking protein particles in cryo-EM images, substantial noise in the 3D cryo-EM density maps generated from particle images, and the lack of automated and accurate methods for building protein structures from density maps. We plan to develop advanced deep learning methods to reconstruct protein structures automatically and accurately from cryo-EM image data, leveraging the large amount of high-resolution cryo-EM data accumulated in the field and the latest advances in deep learning technology. We will develop 2D transformer networks built on the attention mechanism, which perform better than traditional convolutional and recurrent neural networks in image processing, to pick single protein particles accurately and automatically in cryo-EM image data via a novel combination of unsupervised and supervised learning. Moreover, we formulate the problem of denoising 3D cryo-EM density maps generated from 2D particle images as a novel machine learning problem and will develop both 3D deep autoencoders and rotation-/translation-equivariant transformer networks to remove noise from cryo-EM density maps.
Furthermore, we will develop end-to-end 3D rotation-/translation-equivariant networks to directly identify the backbone atoms of proteins from 3D density maps without using any known structure as a template; the identified atoms will be used by a novel hidden Markov model to build high-resolution full-atom structures of any protein. The methods will be rigorously evaluated on the large amount of available cryo-EM data and compared with existing methods. All these methods will be integrated to create a fully automated machine learning pipeline, the first of its kind in the field, to reconstruct protein structures from cryo-EM images more accurately than existing methods. We will implement the individual deep learning methods as well as the entire pipeline as open-source packages released on GitHub for the community to use. We will further validate the tools and pipeline by applying them to new cryo-EM data of a group of important membrane protein complexes (i.e., ion channels) to be generated at the Brookhaven National Laboratory.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Jianlin Cheng

Acquiring a GPU server to accelerate developing deep learning methods to reconstruct protein structures from cryo-EM data
  • Grant number:
    10795465
  • Fiscal year:
    2022
  • Funding amount:
    $302,700
  • Project category:
Deep learning methods for automated and accurate reconstruction of protein structures from cryo-EM image data
  • Grant number:
    10459829
  • Fiscal year:
    2022
  • Funding amount:
    $302,700
  • Project category:
Integrated Prediction of Protein Struture at 1D, 2D and 3D Levels
  • Grant number:
    7863766
  • Fiscal year:
    2010
  • Funding amount:
    $302,700
  • Project category:
Distance-based ab initio protein structure prediction
  • Grant number:
    10418784
  • Fiscal year:
    2010
  • Funding amount:
    $302,700
  • Project category:
Integrated Prediction of Protein Struture at 1D, 2D and 3D Levels
  • Grant number:
    8269738
  • Fiscal year:
    2010
  • Funding amount:
    $302,700
  • Project category:
Integrated Prediction and Validation of Protein Structures
  • Grant number:
    9119094
  • Fiscal year:
    2010
  • Funding amount:
    $302,700
  • Project category:
Distance-based ab initio protein structure prediction
  • Grant number:
    10627929
  • Fiscal year:
    2010
  • Funding amount:
    $302,700
  • Project category:
Integrated Prediction of Protein Struture at 1D, 2D and 3D Levels
  • Grant number:
    8476234
  • Fiscal year:
    2010
  • Funding amount:
    $302,700
  • Project category:
Distance-based ab initio protein structure prediction
  • Grant number:
    10251061
  • Fiscal year:
    2010
  • Funding amount:
    $302,700
  • Project category:
Integrated Prediction of Protein Struture at 1D, 2D and 3D Levels
  • Grant number:
    8059621
  • Fiscal year:
    2010
  • Funding amount:
    $302,700
  • Project category:

Similar overseas grants

DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
  • Grant number:
    EP/Y029089/1
  • Fiscal year:
    2024
  • Funding amount:
    $302,700
  • Project category:
    Research Grant
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Grant number:
    2337776
  • Fiscal year:
    2024
  • Funding amount:
    $302,700
  • Project category:
    Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
  • Grant number:
    2338816
  • Fiscal year:
    2024
  • Funding amount:
    $302,700
  • Project category:
    Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
  • Grant number:
    2338846
  • Fiscal year:
    2024
  • Funding amount:
    $302,700
  • Project category:
    Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
  • Grant number:
    2348261
  • Fiscal year:
    2024
  • Funding amount:
    $302,700
  • Project category:
    Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
  • Grant number:
    2348346
  • Fiscal year:
    2024
  • Funding amount:
    $302,700
  • Project category:
    Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
  • Grant number:
    2348457
  • Fiscal year:
    2024
  • Funding amount:
    $302,700
  • Project category:
    Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
  • Grant number:
    2404989
  • Fiscal year:
    2024
  • Funding amount:
    $302,700
  • Project category:
    Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
  • Grant number:
    2339310
  • Fiscal year:
    2024
  • Funding amount:
    $302,700
  • Project category:
    Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
  • Grant number:
    2339669
  • Fiscal year:
    2024
  • Funding amount:
    $302,700
  • Project category:
    Continuing Grant