Statistical methods for higher order dependences to understand protein functions
Basic information
- Grant number: 10492723
- Principal investigator:
- Amount: $203,300
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-09-23 to 2024-08-31
- Project status: Completed
- Source:
- Keywords: Algorithms; Amino Acid Sequence; Amino Acids; Biological; Cells; Characteristics; Complex; Data; Data Set; Databases; Dependence; Drug Design; Measures; Methods; Modeling; Molecular; Molecular Structure; Outcome; Physical environment; Prognosis; Protein Dynamics; Proteins; Public Health; Science; Statistical Methods; Structure; System; Uncertainty; amino group; base; computer framework; discrete data; genomic data; improved; novel; novel strategies; protein function; protein structure; statistics; web site
Project abstract
This proposal brings together a strong team from molecular science and statistics to tackle the important
problem of how to integrate protein structure and sequence information in complex systems. Among the
most important characteristics of these data are the strong correlations buried within them; the
pairwise correlations in sequence data are already routinely used to predict structural contacts. Here,
we are developing novel ways to use huge data sets to extract higher-order dependences, which is now
possible given the large volumes of sequence data available from genomics. In addition, such
higher-order dependences are directly observable in protein structures, where
groups of amino acids interact directly. Importantly, these higher-order dependences reflect the dense
physical environment in the cell, which requires proper statistical characterization. A new model-free
information-theoretic measure is introduced to quantify the higher-order dependences and serves as the
central method in this project. By identifying the major challenges in drawing statistical inference based on
this measure, we develop, evaluate, and improve a new statistical inference and computational framework
for analyzing higher-order dependences in discrete data of a general type, motivated by protein
multiple sequence data. The new computationally efficient framework makes it possible to discover reliable
higher-order dependences while quantifying uncertainty. The preliminary data here combine
information from sequences and structures to yield unexpected results that relate directly to the
dynamics of protein structures. The outcome is an entirely new approach to handling the large volumes
of protein sequence data and other omics data now available, and the enormous volumes about to arrive on
the doorsteps of omics analysts.
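The abstract does not define the project's measure, so the following is only an illustrative sketch of the general idea of a higher-order dependence that pairwise statistics miss. It uses total correlation (the sum of marginal entropies minus the joint entropy), a standard information-theoretic quantity chosen here as an assumption, with a simple plug-in estimator. The toy alignment `msa` and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(columns):
    """Plug-in (empirical) joint entropy, in bits, of one or more aligned columns."""
    joint = Counter(zip(*columns))
    n = sum(joint.values())
    return -sum((c / n) * log2(c / n) for c in joint.values())


def total_correlation(columns):
    """Sum of marginal entropies minus the joint entropy.

    For two columns this equals their mutual information.
    """
    return sum(entropy([col]) for col in columns) - entropy(columns)


# Hypothetical toy alignment: four sequences, three positions.
# Position 3 is determined by positions 1 and 2 together,
# but by neither of them alone.
msa = ["AAA", "ACC", "CAC", "CCA"]
cols = [[seq[i] for seq in msa] for i in range(3)]

pairwise = total_correlation(cols[:2])  # dependence between positions 1 and 2
triple = total_correlation(cols)        # dependence among all three positions

print(f"pairwise: {pairwise:.3f} bits, triple: {triple:.3f} bits")
```

In this toy case every pair of positions looks independent, while the three positions taken together share one bit of dependence. Plug-in estimates like this are biased for small samples and come with no measure of uncertainty, which is the kind of inferential gap the abstract says the proposed framework addresses.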
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Wen Zhou
Statistical methods for higher order dependences to understand protein functions
- Grant number: 10378307
- Fiscal year: 2021
- Funding amount: $203,300
- Project category:
Statistical methods for higher order dependences to understand protein functions
- Grant number: 10707332
- Fiscal year: 2021
- Funding amount: $203,300
- Project category:
Similar overseas grants
Cerebral infarction treatment strategy using collagen-like "triple helix peptide" containing functional amino acid sequence
- Grant number: 23K06972
- Fiscal year: 2023
- Funding amount: $203,300
- Project category: Grant-in-Aid for Scientific Research (C)
Establishment of a screening method for functional microproteins independent of amino acid sequence conservation
- Grant number: 23KJ0939
- Fiscal year: 2023
- Funding amount: $203,300
- Project category: Grant-in-Aid for JSPS Fellows
Effects of amino acid sequence and lipids on the structure and self-association of transmembrane helices
- Grant number: 19K07013
- Fiscal year: 2019
- Funding amount: $203,300
- Project category: Grant-in-Aid for Scientific Research (C)
Construction of electron-transfer amino acid sequence probe with an interaction for protein and cell
- Grant number: 16K05820
- Fiscal year: 2016
- Funding amount: $203,300
- Project category: Grant-in-Aid for Scientific Research (C)
Development of artificial antibody of anti-bitter taste receptor using random amino acid sequence library
- Grant number: 16K08426
- Fiscal year: 2016
- Funding amount: $203,300
- Project category: Grant-in-Aid for Scientific Research (C)
The aa15-17 amino acid sequence in the terminal protein domain of HBV polymerase as a viral factor affecting in vivo as well as in vitro replication activity of the virus.
- Grant number: 25461010
- Fiscal year: 2013
- Funding amount: $203,300
- Project category: Grant-in-Aid for Scientific Research (C)
Amino acid sequence analysis of fossil proteins using mass spectrometry
- Grant number: 23654177
- Fiscal year: 2011
- Funding amount: $203,300
- Project category: Grant-in-Aid for Challenging Exploratory Research
Precise hybrid synthesis of glycoprotein through amino acid sequence-specific introduction of oligosaccharide followed by enzymatic transglycosylation reaction
- Grant number: 22550105
- Fiscal year: 2010
- Funding amount: $203,300
- Project category: Grant-in-Aid for Scientific Research (C)
Estimating selection on amino-acid sequence polymorphisms in Drosophila
- Grant number: NE/D00232X/1
- Fiscal year: 2006
- Funding amount: $203,300
- Project category: Research Grant
Construction of a neural network for detecting novel domains from amino acid sequence information only
- Grant number: 16500189
- Fiscal year: 2004
- Funding amount: $203,300
- Project category: Grant-in-Aid for Scientific Research (C)