Computer Analysis Of Low-complexity Amino Acid And Nucleotide Sequences

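Basic information

  • Grant number:
    7735065
  • Principal investigator:
  • Amount:
    $224,200
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
  • Funding country:
    United States
  • Project period:
  • Project status:
    Not concluded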

Project abstract

The goal of this project is to define and analyze segments of protein and nucleotide sequences showing compositional bias and to understand their structural, functional and evolutionary significance, and their pathology. These sequences include local low complexity regions or domains, including conformationally mobile or intrinsically unstructured regions of proteins, tandemly repeated sequences, and also more generally distributed amino acid content bias. The latter can reflect directional mutation pressures at the genomic level and constraints specific to protein or domain function. Low complexity regions comprise a large proportion of the genome-encoded amino acids, and may contain homopolymeric tracts or mosaics of a few amino acids, or repeated patterns, frequently subtle, including those typical of many non-globular domains and dynamic or intrinsically unstructured segments of proteins. Mathematical definitions and algorithms have been developed to define and identify regions of compositional bias, and to discover and analyze properties of these regions relevant to their structures, interactions, and evolution. These methods are also valuable, for both nucleotide and amino acid sequences, in detecting and eliminating some artifacts in sequence database searches and alignment analysis. Strong background bias is shown by proteins encoded by very AT-rich or GC-rich genomes, which include those of several important infectious disease organisms, raising problems for sequence alignment algorithms. Local regions of low complexity and tandemly repeated amino acid sequences occur in many proteins involved in cellular differentiation and embryonic development, RNA processing, transcriptional regulation, signal transduction and aspects of cellular and extracellular structural integrity.
Experimental data indicate that low complexity segments of proteins are generally non-globular, intrinsically unstructured, or conformationally mobile; however, knowledge of the molecular structures and dynamics of these domains is still very limited. They are generally relatively intractable to investigation by crystallography and NMR, and they still account for less than 1% of the residues in current structural databases. Moreover, current structure prediction methods based on molecular mechanics and dynamics have given inconsistent results when applied to low-complexity amino acid sequences. Accordingly, we are experimenting with ab initio quantum chemical methods to investigate the ensembles of conformational states accessible to these regions of proteins. Together with the limited amount of available high-resolution structural and biophysical data, this approach is starting to raise more focused questions for further experiments. We are currently investigating repeated domains that are under trial as components of malaria vaccines.
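
The abstract does not spell out the project's mathematical definitions, so the following is only a minimal sketch of one common way to flag compositionally biased segments: Shannon entropy of residue composition in a sliding window. The function names, the window length of 12 and the threshold of 2.2 bits are illustrative assumptions, not the project's actual method or parameters.

```python
import math
from collections import Counter


def shannon_entropy(segment: str) -> float:
    """Shannon entropy, in bits per residue, of a segment's residue composition."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_complexity_windows(seq: str, window: int = 12, threshold: float = 2.2):
    """Return merged half-open (start, end) spans covered by low-entropy windows.

    A window is flagged when its entropy is <= threshold; flagged windows
    that overlap or touch are merged into a single span.
    Both parameters are hypothetical defaults chosen for illustration.
    """
    spans = []
    for i in range(len(seq) - window + 1):
        if shannon_entropy(seq[i:i + window]) <= threshold:
            if spans and i <= spans[-1][1]:
                # Extend the previous span instead of starting a new one.
                spans[-1] = (spans[-1][0], i + window)
            else:
                spans.append((i, i + window))
    return spans
```

Calling `low_complexity_windows` on a protein sequence containing a homopolymeric tract returns a span covering that tract plus any flanking positions whose windows are still dominated by the repeated residue. Entropy of composition ignores residue order, so a sketch like this would not by itself detect the subtle repeated patterns the abstract mentions.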

Project outcomes

Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A global compositional complexity measure for biological sequences: AT-rich and GC-rich genomes encode less complex proteins.
  • DOI:
    10.1016/s0097-8485(99)00048-0
  • Publication year:
    2000
  • Journal:
  • Impact factor:
    0
  • Authors:
    Wan, H; Wootton, JC
  • Corresponding author:
    Wootton, JC

Other grants held by JOHN C. WOOTTON

COMPUTER ANALYSIS OF SEQUENCES FROM MICROORGANISMS
  • Grant number:
    6111061
  • Fiscal year:
  • Funding amount:
    $224,200
  • Project category:
Computational Biology and Genetics Of Malaria Parasites
  • Grant number:
    6681329
  • Fiscal year:
  • Funding amount:
    $224,200
  • Project category:
Computational Biology and Genetics Of Malaria and Toxoplasma Parasites
  • Grant number:
    7969203
  • Fiscal year:
  • Funding amount:
    $224,200
  • Project category:
Computational Biology and Genetics Of Malaria and Toxopl
  • Grant number:
    7316231
  • Fiscal year:
  • Funding amount:
    $224,200
  • Project category:
Computational Biology and Genetics Of Malaria Parasites
  • Grant number:
    6988451
  • Fiscal year:
  • Funding amount:
    $224,200
  • Project category:
Computer Analysis Of Low-complexity Amino Acid And Nucle
  • Grant number:
    7316230
  • Fiscal year:
  • Funding amount:
    $224,200
  • Project category:
Computational Biology and Genetics Of Malaria Parasites
  • Grant number:
    6843563
  • Fiscal year:
  • Funding amount:
    $224,200
  • Project category:
Analysis-Low-complexity Amino Acid-Nucleotide Sequences
  • Grant number:
    7148025
  • Fiscal year:
  • Funding amount:
    $224,200
  • Project category:
Computer Analysis Of Low-complexity Amino Acid And Nucleotide Sequences
  • Grant number:
    7594457
  • Fiscal year:
  • Funding amount:
    $224,200
  • Project category:
Computational Biology and Genetics Of Malaria and Toxoplasma Parasites
  • Grant number:
    7735066
  • Fiscal year:
  • Funding amount:
    $224,200
  • Project category:

Similar overseas grants

CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Grant number:
    2337776
  • Fiscal year:
    2024
  • Funding amount:
    $224,200
  • Project category:
    Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
  • Grant number:
    2338816
  • Fiscal year:
    2024
  • Funding amount:
    $224,200
  • Project category:
    Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
  • Grant number:
    2338846
  • Fiscal year:
    2024
  • Funding amount:
    $224,200
  • Project category:
    Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
  • Grant number:
    2348261
  • Fiscal year:
    2024
  • Funding amount:
    $224,200
  • Project category:
    Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
  • Grant number:
    2348346
  • Fiscal year:
    2024
  • Funding amount:
    $224,200
  • Project category:
    Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
  • Grant number:
    2348457
  • Fiscal year:
    2024
  • Funding amount:
    $224,200
  • Project category:
    Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
  • Grant number:
    2404989
  • Fiscal year:
    2024
  • Funding amount:
    $224,200
  • Project category:
    Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
  • Grant number:
    2339310
  • Fiscal year:
    2024
  • Funding amount:
    $224,200
  • Project category:
    Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
  • Grant number:
    2339669
  • Fiscal year:
    2024
  • Funding amount:
    $224,200
  • Project category:
    Continuing Grant
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
  • Grant number:
    EP/Y029089/1
  • Fiscal year:
    2024
  • Funding amount:
    $224,200
  • Project category:
    Research Grant