Nonlinear Manifold Learning of Protein Folding Funnels from Delay-Embedded Experimental Measurements
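Basic Information
- Award number: 1841810
- Principal investigator:
- Amount: $162,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2018
- Funding country: United States
- Period: 2018-07-01 to 2021-07-31
- Project status: Completed
- Source:
- Keywords:
Project Abstract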
Proteins are the molecular engines that perform biological functions essential to life. A major milestone in the understanding of protein behavior emerged with the advent of the "new view" of protein folding. This perspective conceives of protein structure, stability, and dynamics as governed by a molecular-level landscape not unlike a relief map describing the surface of the Earth. Each point on the landscape - analogous to latitude and longitude - corresponds to a particular spatial arrangement of protein atoms. The altitude of each point on the map defines protein stability - unstable conformations lie on mountaintops and stable conformations within valley floors. Determining these landscapes is a key goal of protein biology since they are useful in the understanding of natural proteins and design of synthetic proteins as drugs, enzymes, and molecular machines. It is relatively straightforward to calculate these landscapes for small proteins using computer simulations, but it has not been possible to do so from experimental measurements. It is the primary objective of this research to combine mathematical tools from the modeling of dynamical systems with machine learning approaches to analyze high-dimensional datasets to determine approximate protein folding landscapes directly from experimental data. The approach will first be validated in computer simulations of small proteins where the folding landscape is known. Theoretical analyses will place bounds on how close the approximate landscapes are to the true landscapes, and place conditions on the experimental data required for their determination. Ultimately the approach will be applied to experimental measurements of a tuberculosis protein. The computational analysis tool will be released as user-friendly software for free public download. Positive research experiences have great benefits for undergraduate success and retention, and this award will support summer and academic year research opportunities. 
New educational outreach materials will be developed for the University of Illinois "Engineering Open House" to promote awareness of materials science and engineering among middle- and high-school students.
The aim of this work is to integrate nonlinear manifold learning with dynamical systems theory to reconstruct protein folding landscapes from experimental time series measuring a single system observable. The "new view" of protein folding revolutionized understanding of folding as a conformational search over rugged and funneled free energy landscapes parameterized by a small number of emergent collective variables, with transformative implications for the understanding and design of proteins as drugs, enzymes, and molecular machines. It is now relatively routine to determine multidimensional folding landscapes from computer simulations in which all atomic coordinates are known, but it has not been possible to do so from experimental measurements of protein dynamics that are restricted to small numbers of coarse-grained observables. This research project integrates Takens' delay embeddings with nonlinear manifold learning using diffusion maps to first project univariate time series in an experimentally measurable observable into a high-dimensional space in which the dynamics are C1-equivalent to those in real space, and then extract from this space a topologically and geometrically equivalent reconstruction of the folding funnel to that which would have been determined from knowledge of all atomic coordinates. The reconstructed landscape preserves the topology of the true funnel - the metastable configurations and folding pathways - but the topography may be perturbed, i.e., the heights and depths of the free energy peaks and valleys.
The three primary objectives of this work are to (i) validate the approach in molecular dynamics simulations of small proteins for which the true landscape is known, (ii) place conditions on the sampling resolution and signal-to-noise ratio in experimental measurements for robust landscape recovery, and theoretical bounds on the induced topographical perturbations, and (iii) apply the approach to experimental single-molecule Förster resonance energy transfer (smFRET) measurements on the lid-opening and closing dynamics of Mycobacterium tuberculosis protein tyrosine phosphatase (Mtb-PtpB).
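The two-stage idea described in the abstract (delay-embed a single observable, then apply diffusion maps to the embedded points) can be illustrated with a minimal NumPy sketch. This is not the project's software: the function names `delay_embed` and `diffusion_map`, the embedding dimension, the lag, the kernel bandwidth `epsilon`, and the synthetic sine-wave signal standing in for an experimental trace are all assumptions chosen for illustration.

```python
import numpy as np


def delay_embed(x, dim, lag):
    """Stack lagged copies of a scalar series into delay vectors."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * lag
    if n <= 0:
        raise ValueError("time series too short for this dim and lag")
    # Row i is [x[i], x[i + lag], ..., x[i + (dim - 1) * lag]].
    return np.stack([x[k * lag : k * lag + n] for k in range(dim)], axis=1)


def diffusion_map(points, epsilon, n_coords=2):
    """Basic diffusion map: Gaussian kernel, Markov normalisation, top eigenvectors."""
    diffs = points[:, None, :] - points[None, :, :]
    sq_dists = np.sum(diffs ** 2, axis=-1)
    kernel = np.exp(-sq_dists / (2.0 * epsilon))
    degree = kernel.sum(axis=1)
    # Symmetric conjugate of the row-stochastic matrix D^-1 K, so eigh applies.
    sym = kernel / np.sqrt(np.outer(degree, degree))
    evals, evecs = np.linalg.eigh(sym)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]
    # Map back to right eigenvectors of D^-1 K.
    psi = evecs / np.sqrt(degree)[:, None]
    # Drop the trivial constant eigenvector (eigenvalue 1).
    return evals[1 : n_coords + 1], psi[:, 1 : n_coords + 1]


# Synthetic stand-in for a single-observable time series (assumption).
t = np.linspace(0.0, 20.0 * np.pi, 400)
signal = np.sin(t) + 0.5 * np.sin(2.0 * t)

embedded = delay_embed(signal, dim=5, lag=4)
evals, coords = diffusion_map(embedded, epsilon=0.5)
print(embedded.shape, coords.shape)
```

In practice the embedding dimension, lag, and kernel bandwidth have to be tuned to the data, and noisy measurements would likely need smoothing before embedding; the sketch fixes them arbitrarily.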
Project Outcomes
Journal articles (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Learned Reconstruction of Protein Folding Trajectories from Noisy Single-Molecule Time Series
- DOI:
- Published: 2023
- Journal:
- Impact factor: 5.5
- Authors: Topel, Maximilian; Ejaz, Ayesha; Squires, Allison; Ferguson, Andrew L.
- Corresponding author: Ferguson, Andrew L.
Reconstruction of protein structures from single-molecule time series
- DOI: 10.1063/5.0024732
- Published: 2020-11-21
- Journal:
- Impact factor: 4.4
- Authors: Topel, Maximilian; Ferguson, Andrew L.
- Corresponding author: Ferguson, Andrew L.
Statistically optimal continuous free energy surfaces from biased simulations and multistate reweighting
- DOI: 10.1021/acs.jctc.0c00077
- Published: 2020
- Journal:
- Impact factor: 5.5
- Authors: Shirts, Michael R.; Ferguson, Andrew L.
- Corresponding author: Ferguson, Andrew L.
Recovery of Protein Folding Funnels from Single-Molecule Time Series by Delay Embeddings and Manifold Learning
- DOI: 10.1021/acs.jpcb.8b08800
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Wang, Jiang; Ferguson, Andrew L.
- Corresponding author: Ferguson, Andrew L.
Other publications by Andrew Ferguson
Enough is Enough: Policy Uncertainty and Acquisition Abandonment
- DOI: 10.2139/ssrn.3883981
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Andrew Ferguson; Wei; P. Lam
- Corresponding author: P. Lam
‘Know when to fold 'em’: Policy uncertainty and acquisition abandonment
- DOI: 10.1111/acfi.13179
- Published: 2023
- Journal:
- Impact factor: 0
- Authors: Andrew Ferguson; Cecilia Wei Hu; P. Lam
- Corresponding author: P. Lam
Share Purchase Plans in Australia: Issuer Characteristics and Valuation Implications
- DOI: 10.1177/031289620803300205
- Published: 2008
- Journal:
- Impact factor: 4.8
- Authors: P. Brown; Andrew Ferguson; K. Stone
- Corresponding author: K. Stone
Nutrition and Isolation in a Rural US Population over 80 Years Old: A Descriptive Analysis of a Vulnerable Population
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Courtney D Wellman; Andrew Ferguson; Thomas McIntosh; Alperen Korkmaz; Robert B Walker; Adam M. Franks
- Corresponding author: Adam M. Franks
Market reactions to Australian boutique resource investor presentations
- DOI: 10.1016/j.resourpol.2011.07.004
- Published: 2011
- Journal:
- Impact factor: 10.2
- Authors: Andrew Ferguson; T. Scott
- Corresponding author: T. Scott
Other grants by Andrew Ferguson
Collaborative Research: DMREF: Closed-Loop Design of Polymers with Adaptive Networks for Extreme Mechanics
- Award number: 2323730
- Fiscal year: 2023
- Amount: $162,000
- Project type: Standard Grant
Latent Space Simulators for the Efficient Estimation of Long-time Molecular Thermodynamics and Kinetics
- Award number: 2152521
- Fiscal year: 2022
- Amount: $162,000
- Project type: Standard Grant
REU SITE: Research Experience for Undergraduates in Molecular Engineering
- Award number: 2050878
- Fiscal year: 2021
- Amount: $162,000
- Project type: Standard Grant
EAGER: (ST1) Collaborative Research: Exploring the emergence of peptide-based compartments through iterative machine learning, molecular modeling, and cell-free protein synthesis
- Award number: 1939463
- Fiscal year: 2019
- Amount: $162,000
- Project type: Standard Grant
EAGER: Collaborative Research: Type II: Data-Driven Characterization and Engineering of Protein Hydrophobicity
- Award number: 1844505
- Fiscal year: 2019
- Amount: $162,000
- Project type: Standard Grant
Nonlinear dimensionality reduction and enhanced sampling in molecular simulation using auto-associative neural networks
- Award number: 1841805
- Fiscal year: 2018
- Amount: $162,000
- Project type: Standard Grant
CAREER: Teaching Machines to Design Self-Assembling Materials
- Award number: 1841800
- Fiscal year: 2018
- Amount: $162,000
- Project type: Continuing Grant
DMREF: Collaborative Research: Self-assembled peptide-pi-electron supramolecular polymers for bioinspired energy harvesting, transport and management
- Award number: 1841807
- Fiscal year: 2018
- Amount: $162,000
- Project type: Standard Grant
DMREF: Collaborative Research: Self-assembled peptide-pi-electron supramolecular polymers for bioinspired energy harvesting, transport and management
- Award number: 1729011
- Fiscal year: 2017
- Amount: $162,000
- Project type: Standard Grant
Nonlinear dimensionality reduction and enhanced sampling in molecular simulation using auto-associative neural networks
- Award number: 1664426
- Fiscal year: 2017
- Amount: $162,000
- Project type: Standard Grant
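Similar grants from the National Natural Science Foundation of China
Fundamental research on VHF wideband multi-channel frequency-hopping manifold couplers based on high-speed reconfigurable matching networks
- Award number: 61001012
- Year approved: 2010
- Amount: CNY 220,000
- Project type: Young Scientists Fund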
Similar overseas grants
Collaborative Research: CPS: Medium: Data Driven Modeling and Analysis of Energy Conversion Systems -- Manifold Learning and Approximation
- Award number: 2223987
- Fiscal year: 2023
- Amount: $162,000
- Project type: Standard Grant
Collaborative Research: CPS: Medium: Data Driven Modeling and Analysis of Energy Conversion Systems -- Manifold Learning and Approximation
- Award number: 2223985
- Fiscal year: 2023
- Amount: $162,000
- Project type: Standard Grant
Collaborative Research: CPS: Medium: Data Driven Modeling and Analysis of Energy Conversion Systems -- Manifold Learning and Approximation
- Award number: 2223986
- Fiscal year: 2023
- Amount: $162,000
- Project type: Standard Grant
Deep Intrinsic Learning for On-line Process Control of Manufacturing Manifold Data
- Award number: 2121625
- Fiscal year: 2022
- Amount: $162,000
- Project type: Standard Grant
Collaborative Research: New Methods, Theory and Applications for Nonsmooth Manifold-Based Learning
- Award number: 2243650
- Fiscal year: 2022
- Amount: $162,000
- Project type: Standard Grant
Taking On the "Curse of Dimensionality" in Chemical Kinetics: Complex Chemical Reaction Prediction Using Manifold Learning
- Award number: 2227112
- Fiscal year: 2022
- Amount: $162,000
- Project type: Standard Grant
CAREER: Exploiting Low-Dimensional Structures in Data Science: Manifold Learning, Partial Differential Equation Identification, and Neural Networks
- Award number: 2145167
- Fiscal year: 2022
- Amount: $162,000
- Project type: Continuing Grant
Development and Industrial Application of Universal Manifold Learning Algorithm for Realization of Super Smart Society
- Award number: 21H04599
- Fiscal year: 2021
- Amount: $162,000
- Project type: Grant-in-Aid for Scientific Research (A)
Controlling Geometry: Applications in Physics, Biology, and Manifold Learning
- Award number: 2111474
- Fiscal year: 2021
- Amount: $162,000
- Project type: Continuing Grant
Manifold representations and active learning for 21st century biology
- Award number: 10401890
- Fiscal year: 2021
- Amount: $162,000
- Project type: