Discovering interpretable mechanisms explaining high dimensional biomolecular data
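Basic information
- Grant number: 10711988
- Principal investigator:
- Amount: $410,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-09-01 to 2028-07-31
- Project status: Ongoing
- Source: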
- Keywords: Acceleration; Address; Amino Acid Sequence; Amyloid beta-Protein; Antibiotic Resistance; Antibiotics; Automobile Driving; Behavior; Biological Assay; Biology; Cell Fate Control; Collaborations; Complex; Computing Methodologies; Data; Data Set; Dimensions; Directed Molecular Evolution; Disease; Foundations; Future; Goals; Health; Human; Informatics; Intuition; Learning; Libraries; Liquid substance; Methods; Modeling; Network-based; Neurobiology; Neurodegenerative Disorders; Patients; Pattern; Peptide Library; Peptides; Pharmaceutical Preparations; Physics; Proteins; RNA; RNA Sequences; Resources; Running; Sampling; Science; Structure; Time; Training; Work; artificial neural network; beta-Lactamase; deep learning; disease diagnosis; experimental study; high dimensionality; insight; invention; large datasets; molecular dynamics; monomer; mutant; neural network; neurotoxic; protein aggregation; self assembly; simulation; tau Proteins; tau aggregation
Project abstract
Project summary. How protein and RNA sequence encodes folding, aggregation, and function is a fundamental
question with wide-ranging human health implications. Discovering predictive principles for this encoding
requires computational approaches that offer mechanistic insight, especially for the large fraction of intrinsically
disordered proteins for which experimental structural information is limited. Yet the complexity and dimensionality
of this problem pose fundamental challenges to existing computational methods. The axiomatic approach,
modeling behavior from first-principles, is limited by simulation runtime and unknown context-dependent
parameters. Informatics-based approaches such as deep learning could potentially discover principles by
integrating large datasets across scales and complexity. However, these models produce “black box” predictions
that i) are difficult to understand and ii) generalize poorly beyond their training data (i.e., the well-understood regime).
My lab developed methods to overcome limitations of both types of approaches. (1) Axiomatic: we
developed a statistical physics method to exponentially enhance sampling of protein self-assembly from
structurally heterogeneous monomers in molecular dynamics simulations. (2) Informatic: we invented essence
neural networks (ENNs) based on neurobiological principles and demonstrated that they overcome the above
limitations of deep learning on a wide range of learning tasks, including sequence-to-function prediction.
Using both axiomatic and informatic approaches, in the next five years my lab will tackle three instances
of the sequence-structure-function problem: 1) Use enhanced sampling molecular dynamics simulations to
discover transition states of neurotoxic oligomer and fibril formation of Abeta and tau peptide monomers; 2) Use
ENNs to discover the RNA-sequence rules driving RNA-associated tau fibril aggregation in neurodegenerative
disease using tau protein and colocalized RNA sequence datasets; 3) Use ENNs to distill the sequence rules
determining whether a strain or mutant of beta-lactamase protein can neutralize each antibiotic within a diverse
drug panel, and identify potential future antibiotic-resistant mutants. Our long-term goal is to develop an
ENN-based platform for automated transformation of data into axioms. Leveraging well-established collaborations
with colleagues of wide expertise, we will pursue these goals by combining our unique computational approaches
with experimental resources, including time-resolved protein aggregation assays, patient-derived tau fibrils
colocalized with sequence-specific RNA, high-throughput liquid culture antibiotic screens, multiplexed directed
evolution experiments of antibiotic resistance, and large in-house peptide and RNA mutant libraries.
This work lays the foundation for transforming large datasets into human-understandable rules
connecting sequence to function and relating these rules to physical mechanisms of structural dynamics. This in
turn could accelerate disease diagnosis and treatment.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Milo Lin
Similar overseas grants
Rational design of rapidly translatable, highly antigenic and novel recombinant immunogens to address deficiencies of current snakebite treatments
- Grant number: MR/S03398X/2
- Fiscal year: 2024
- Funding amount: $410,000
- Project category: Fellowship
Re-thinking drug nanocrystals as highly loaded vectors to address key unmet therapeutic challenges
- Grant number: EP/Y001486/1
- Fiscal year: 2024
- Funding amount: $410,000
- Project category: Research Grant
CAREER: FEAST (Food Ecosystems And circularity for Sustainable Transformation) framework to address Hidden Hunger
- Grant number: 2338423
- Fiscal year: 2024
- Funding amount: $410,000
- Project category: Continuing Grant
Metrology to address ion suppression in multimodal mass spectrometry imaging with application in oncology
- Grant number: MR/X03657X/1
- Fiscal year: 2024
- Funding amount: $410,000
- Project category: Fellowship
CRII: SHF: A Novel Address Translation Architecture for Virtualized Clouds
- Grant number: 2348066
- Fiscal year: 2024
- Funding amount: $410,000
- Project category: Standard Grant
BIORETS: Convergence Research Experiences for Teachers in Synthetic and Systems Biology to Address Challenges in Food, Health, Energy, and Environment
- Grant number: 2341402
- Fiscal year: 2024
- Funding amount: $410,000
- Project category: Standard Grant
The Abundance Project: Enhancing Cultural & Green Inclusion in Social Prescribing in Southwest London to Address Ethnic Inequalities in Mental Health
- Grant number: AH/Z505481/1
- Fiscal year: 2024
- Funding amount: $410,000
- Project category: Research Grant
ERAMET - Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
- Grant number: 10107647
- Fiscal year: 2024
- Funding amount: $410,000
- Project category: EU-Funded
Ecosystem for rapid adoption of modelling and simulation METhods to address regulatory needs in the development of orphan and paediatric medicines
- Grant number: 10106221
- Fiscal year: 2024
- Funding amount: $410,000
- Project category: EU-Funded
Recite: Building Research by Communities to Address Inequities through Expression
- Grant number: AH/Z505341/1
- Fiscal year: 2024
- Funding amount: $410,000
- Project category: Research Grant