Transparent Deep Learning for Directed Protein Evolution

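Basic information

  • Grant number:
    2745409
  • Principal investigator:
  • Amount:
    --
  • Host institution:
  • Host institution country:
    United Kingdom
  • Project type:
    Studentship
  • Fiscal year:
    2022
  • Funding country:
    United Kingdom
  • Duration:
    2022 to (no data)
  • Project status:
    Not concluded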

Project abstract

Protein engineering is a complex process that requires finding an amino acid sequence associated with a desired function. Because the design space grows exponentially with the number of residues, de novo design is currently an intractable problem. To overcome this complexity, scientists routinely rely on Directed Evolution (DE) [1], an iterative process of random mutagenesis and selection of protein variants. While this process has led to remarkable results, it is extremely slow, low-throughput and expensive, because the probability of generating functional proteins at each step is low. Thus, for the last 30 years, scientists have developed biophysical models and optimisation methods to predict protein structure and function in silico; however, these methods usually do not scale to large proteins and are limited by the accuracy of the underlying biophysical models. Recently, Machine Learning (ML) and, in particular, Deep Learning (DL) have largely overcome these problems by learning functional relationships associated with protein folding and function directly from data [2]. However, it remains challenging to understand how a DL model makes structural and functional predictions [3], which limits the utility of such models for understanding the biological design principles associated with functional proteins.

AIMS AND OBJECTIVES: In collaboration with ZenithAI (OT/ZAI), we propose to design and build transparent and explainable deep learning models for protein design. The protein design space increases exponentially with the number of amino acid positions considered, but functional proteins are extremely rare. Transparent models can therefore provide a principled protein selection method that looks only at important and uncertain amino acid positions, ultimately reducing the burden of experimental screening of protein variants.

WORKPLAN. The project is structured in 3 work packages.
- WP1 - The student will develop a deep learning framework for protein engineering, using state-of-the-art variational and adversarial models coupled with sequence-to-sequence models, trained on curated protein sequence information stratified by species and function.
- WP2 - The student will then develop probabilistic models to quantify uncertainty in designs by exploiting the gradient and weight information learned by the model, ultimately to define a score for prioritising proteins for experimental testing.
- WP3 - The student will use the model to design variants of the human S1PL enzyme, which will then be tested in the lab. S1PL is a central enzyme in the sphingolipid pathway, which is essential for proper cell functioning, and it has a causal role in many diseases, including cancer and neurodegenerative disorders.

TRAINING PROGRAM. The student will receive training in machine learning, statistical learning and deep learning, and will build a competitive profile in biological sequence modelling and design. The student will also be introduced to the emerging field of synthetic biology and will learn modern DNA cloning and assembly techniques and the use of protein expression systems at scale. We also put a strong emphasis on reproducible research; the student will receive training in advanced research software engineering and in reproducible workflows for data analyses.
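
The exponential growth of the design space mentioned in the abstract can be made concrete with a little arithmetic. The sketch below is only an illustration, not part of the project: it assumes the standard 20-letter amino acid alphabet, and the function names are hypothetical. It counts all sequences of a given length and the variants that differ from a parent sequence at exactly k positions, which suggests why restricting a search to a few selected positions shrinks the screening burden.

```python
from math import comb

# Assumption: the standard alphabet of 20 amino acids.
AMINO_ACIDS = 20


def design_space_size(length: int) -> int:
    """Number of distinct amino acid sequences of the given length."""
    return AMINO_ACIDS ** length


def variants_at_distance(length: int, k: int) -> int:
    """Number of sequences that differ from a parent at exactly k positions.

    Choose which k positions change, then pick one of the 19 other
    residues at each changed position.
    """
    return comb(length, k) * (AMINO_ACIDS - 1) ** k


if __name__ == "__main__":
    # Full design space for a few sequence lengths.
    for n in (5, 10, 50):
        print(f"length {n}: {design_space_size(n):.3e} sequences")

    # Mutational neighbourhood of a 300-residue parent (illustrative length).
    for k in (1, 2, 3):
        print(f"{k} substitution(s): {variants_at_distance(300, k)} variants")

    # Varying only 4 selected positions instead of all 300.
    print(f"4 selected positions: {design_space_size(4)} variants")
```

Summing `variants_at_distance(length, k)` over every k from 0 to `length` recovers `design_space_size(length)`, which is a quick consistency check on the two formulas.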

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants

An implantable biosensor microsystem for real-time measurement of circulating biomarkers
  • Grant number:
    2901954
  • Fiscal year:
    2028
  • Funding amount:
    --
  • Project type:
    Studentship
Exploiting the polysaccharide breakdown capacity of the human gut microbiome to develop environmentally sustainable dishwashing solutions
  • Grant number:
    2896097
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project type:
    Studentship
A Robot that Swims Through Granular Materials
  • Grant number:
    2780268
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project type:
    Studentship
Likelihood and impact of severe space weather events on the resilience of nuclear power and safeguards monitoring.
  • Grant number:
    2908918
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project type:
    Studentship
Proton, alpha and gamma irradiation assisted stress corrosion cracking: understanding the fuel-stainless steel interface
  • Grant number:
    2908693
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project type:
    Studentship
Field Assisted Sintering of Nuclear Fuel Simulants
  • Grant number:
    2908917
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project type:
    Studentship
Assessment of new fatigue capable titanium alloys for aerospace applications
  • Grant number:
    2879438
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project type:
    Studentship
Developing a 3D printed skin model using a Dextran - Collagen hydrogel to analyse the cellular and epigenetic effects of interleukin-17 inhibitors in
  • Grant number:
    2890513
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project type:
    Studentship
CDT year 1 so TBC in Oct 2024
  • Grant number:
    2879865
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project type:
    Studentship
Understanding the interplay between the gut microbiome, behavior and urbanisation in wild birds
  • Grant number:
    2876993
  • Fiscal year:
    2027
  • Funding amount:
    --
  • Project type:
    Studentship

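Similar NSFC grants

Research on high-resolution NIR-II fluorescence molecular tomography methods based on Deep Unrolling
  • Grant number:
    12271434
  • Approval year:
    2022
  • Funding amount:
    CNY 460,000
  • Project type:
    General Program
Research on surface-enhanced Raman spectroscopy analysis methods based on the Deep Forest model
  • Grant number:
    2020A151501709
  • Approval year:
    2020
  • Funding amount:
    CNY 100,000
  • Project type:
    Provincial/Municipal Project
Research on key technologies for Deep Web data integration
  • Grant number:
    61872168
  • Approval year:
    2018
  • Funding amount:
    CNY 620,000
  • Project type:
    General Program
Research on deep-learning-based dynamic recognition technology for glacier monitoring in the Sanjiangyuan region
  • Grant number:
    51769027
  • Approval year:
    2017
  • Funding amount:
    CNY 380,000
  • Project type:
    Regional Science Fund Project
Research on Spiking-Deep Learning methods with temporal processing capability
  • Grant number:
    61573081
  • Approval year:
    2015
  • Funding amount:
    CNY 640,000
  • Project type:
    General Program
Research on semantic-computing-based knowledge exploration mechanisms for the massive Deep Web
  • Grant number:
    61272411
  • Approval year:
    2012
  • Funding amount:
    CNY 800,000
  • Project type:
    General Program
Research on key technologies for extracting and integrating query results in Deep Web data integration
  • Grant number:
    61100167
  • Approval year:
    2011
  • Funding amount:
    CNY 200,000
  • Project type:
    Young Scientists Fund
Research on methods for automatic construction of large-scale knowledge bases for the Deep Web
  • Grant number:
    61170020
  • Approval year:
    2011
  • Funding amount:
    CNY 570,000
  • Project type:
    General Program
Research on methods for protecting sensitive aggregate information in the Deep Web
  • Grant number:
    61003054
  • Approval year:
    2010
  • Funding amount:
    CNY 200,000
  • Project type:
    Young Scientists Fund
Research on Deep Web schema matching based on logical reinforcement learning
  • Grant number:
    61070122
  • Approval year:
    2010
  • Funding amount:
    CNY 320,000
  • Project type:
    General Program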

Similar overseas grants

CAREER: Adaptive Deep Learning Systems Towards Edge Intelligence
  • Grant number:
    2338512
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Continuing Grant
CRII: OAC: A Compressor-Assisted Collective Communication Framework for GPU-Based Large-Scale Deep Learning
  • Grant number:
    2348465
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Standard Grant
Development of a safe ultrasound-guided venipuncture method using deep learning
  • Grant number:
    24K13362
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Grant-in-Aid for Scientific Research (C)
MFB: Better Homologous Folding using Computational Linguistics and Deep Learning
  • Grant number:
    2330737
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Standard Grant
Development and clinical application of a deep-learning AI model for bone metastasis detection
  • Grant number:
    24K18754
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Non-invasive prediction of coronary artery disease from resting electrocardiograms using deep learning
  • Grant number:
    24K19024
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Grant-in-Aid for Early-Career Scientists
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
  • Grant number:
    EP/Y029089/1
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Research Grant
Navigating Chemical Space with Natural Language Processing and Deep Learning
  • Grant number:
    EP/Y004167/1
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Research Grant
Developing and Visualising a Retrieval-Augmented Deep Learning Model for Population Health Management
  • Grant number:
    2905946
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Studentship
Deep Learning with Limited Data for Battery Materials Design
  • Grant number:
    EP/Y000552/1
  • Fiscal year:
    2024
  • Funding amount:
    --
  • Project type:
    Research Grant