Statistics of Sequence Comparison
Basic information
- Grant number: 9555728
- Principal investigator:
- Amount: $192,800
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year:
- Funding country: United States
- Project period:
- Project status: Not concluded
- Source:
- Keywords: Acetyltransferase; Amino Acid Sequence; Amino Acids; Appearance; Biochemistry; Biological; Characteristics; Collaborations; Correlation Studies; Coupling; DNA Sequence; Data; Development; Elements; Frequencies; Genome; Goals; Individual; Institutes; Length; Maps; Maryland; Measures; Methods; Modeling; Molecular Biology; Paper; Pattern; Positioning Attribute; Proteins; Publishing; Role; Science; System; Universities; Work; base; improved; medical schools; programs; statistics; three dimensional structure; vector
Project abstract
The current direction of this project, in collaboration with Dr.
Andrew Neuwald of the Institute for Genome Sciences and Department
of Biochemistry & Molecular Biology at the University of Maryland
School of Medicine, continued throughout this year. A previous
focus had been the development of an improved method for multiple
alignment that could identify the common elements shared by large
and diverse protein superfamilies. A central aim this year was
to extend this method to a hierarchical multiple alignment model.
Such a model is based on the fact that large protein superfamilies
frequently have diversified to fulfill distinct functional roles
within different subfamilies. Each subfamily has distinct
structural constraints, which yield distinct amino acid frequency
vectors at particular positions characteristic of that subfamily.
Although, within a subfamily, the amino acids at different
positions may be independent, the changes in frequency vectors
across multiple positions characteristic of each subfamily yield
the appearance of correlation between positions when a simple,
non-hierarchical model of a superfamily is constructed. Earlier
approaches have modeled these apparent correlations directly,
using pairwise coupling terms, but we model them by constructing
an explicit hierarchical model, with individual sequences assigned
to distinct nodes within the hierarchy. We apply the Minimum
Description Length principle to ensure that the hierarchical
models we construct do not overfit the data, but have statistical
support. We completed the development of a hierarchical
multiple alignment program, and applied it to the analysis
of N-acetyltransferases. Based upon statistical correlations,
this approach identified a number of subfamilies, characterized
by protein positions with distinctive amino acid usage, which
suggested specific, previously uncharacterized biological
mechanisms. A paper describing this work was published.
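
The claim that independent positions within subfamilies can look correlated in a pooled, non-hierarchical model can be checked with a small calculation. The sketch below is illustrative only and is not the project's program: it assumes a hypothetical two-letter alphabet, two equally weighted subfamilies, and made-up frequency vectors, and it measures dependence with mutual information.

```python
import math

def mutual_information(joint):
    """Mutual information in bits of a joint distribution given as a nested list."""
    row = [sum(r) for r in joint]
    col = [sum(c) for c in zip(*joint)]
    mi = 0.0
    for i, r in enumerate(joint):
        for j, p in enumerate(r):
            if p > 0:
                mi += p * math.log2(p / (row[i] * col[j]))
    return mi

def independent_joint(freq_pos1, freq_pos2):
    """Joint distribution of two positions that are independent of each other."""
    return [[a * b for b in freq_pos2] for a in freq_pos1]

def pool(joints, weights):
    """Weighted mixture of joint distributions (the flat, superfamily-level view)."""
    n_rows, n_cols = len(joints[0]), len(joints[0][0])
    return [[sum(w * j[r][c] for j, w in zip(joints, weights))
             for c in range(n_cols)] for r in range(n_rows)]

# Hypothetical frequency vectors: each subfamily prefers a different residue
# at both positions, but within a subfamily the positions are independent.
subfamily_a = independent_joint([0.9, 0.1], [0.9, 0.1])
subfamily_b = independent_joint([0.1, 0.9], [0.1, 0.9])

within_mi = [mutual_information(subfamily_a), mutual_information(subfamily_b)]
pooled_mi = mutual_information(pool([subfamily_a, subfamily_b], [0.5, 0.5]))

print("within-subfamily MI:", within_mi)
print("pooled MI:", pooled_mi)
```

Within each subfamily the mutual information is zero up to rounding, while the pooled distribution shows clearly positive mutual information. Assigning sequences to subfamily nodes accounts for this dependence without any pairwise coupling term.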
Another aim of this project, launched last year, was significantly
advanced. The hierarchical models constructed by our approach
include the explicit description of a set of "distinguishing
positions" characteristic of each node in the hierarchy. When
mapped onto available three-dimensional structures, these positions
often cluster together in space, and can aid in the development of
specific hypotheses for the biological mechanisms underlying the
diversification of protein subfamilies. We developed appropriate
measures for the clustering of distinguishing positions, and
derived methods to assess their statistical significance. A paper
describing this work is in press. Work continues on extending the
clustering measures to allow them to capture more biologically
relevant information.
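
One generic way to assess whether a set of positions clusters in space is a permutation test. The sketch below is an assumption-laden illustration, not the measure or significance method developed in this project: it uses mean pairwise distance as a hypothetical clustering measure, random toy coordinates in place of a real structure, and random same-size position sets as the null model.

```python
import itertools
import math
import random

def mean_pairwise_distance(coords, positions):
    """Average Euclidean distance over all pairs of the given positions."""
    pairs = list(itertools.combinations(positions, 2))
    return sum(math.dist(coords[a], coords[b]) for a, b in pairs) / len(pairs)

def clustering_p_value(coords, positions, n_samples=2000, seed=0):
    """Fraction of random same-size position sets that are at least as compact."""
    rng = random.Random(seed)
    observed = mean_pairwise_distance(coords, positions)
    all_positions = list(coords)
    hits = 0
    for _ in range(n_samples):
        sample = rng.sample(all_positions, len(positions))
        if mean_pairwise_distance(coords, sample) <= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_samples + 1)

# Toy "structure": 60 residues placed at random in a 30 x 30 x 30 box
# (hypothetical coordinates, not taken from any real protein).
toy_rng = random.Random(1)
coords = {i: tuple(toy_rng.uniform(0.0, 30.0) for _ in range(3)) for i in range(60)}

# Hypothetical distinguishing positions: residue 0 and its four nearest neighbours.
distinguishing = sorted(coords, key=lambda i: math.dist(coords[i], coords[0]))[:5]

p_value = clustering_p_value(coords, distinguishing)
print("positions:", distinguishing, "p-value:", p_value)
```

A small p-value indicates that the chosen positions are more compact than most random sets of the same size. A real analysis would need a null model that respects chain connectivity and solvent exposure, which this sketch ignores.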
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by STEPHEN F ALTSCHUL
Improvements And Extensions To The Blast Algorithms
- Grant number: 6546809
- Fiscal year:
- Funding amount: $192,800
- Project category:
Improvements And Extensions To The Blast Algorithms
- Grant number: 6843572
- Fiscal year:
- Funding amount: $192,800
- Project category:
IMPROVEMENTS AND EXTENSIONS TO THE BLAST ALGORITHMS
- Grant number: 6432754
- Fiscal year:
- Funding amount: $192,800
- Project category:
Similar overseas grants
Cerebral infarction treatment strategy using collagen-like "triple helix peptide" containing functional amino acid sequence
- Grant number: 23K06972
- Fiscal year: 2023
- Funding amount: $192,800
- Project category: Grant-in-Aid for Scientific Research (C)
Establishment of a screening method for functional microproteins independent of amino acid sequence conservation
- Grant number: 23KJ0939
- Fiscal year: 2023
- Funding amount: $192,800
- Project category: Grant-in-Aid for JSPS Fellows
Effects of amino acid sequence and lipids on the structure and self-association of transmembrane helices
- Grant number: 19K07013
- Fiscal year: 2019
- Funding amount: $192,800
- Project category: Grant-in-Aid for Scientific Research (C)
Construction of electron-transfer amino acid sequence probe with an interaction for protein and cell
- Grant number: 16K05820
- Fiscal year: 2016
- Funding amount: $192,800
- Project category: Grant-in-Aid for Scientific Research (C)
Development of artificial antibody of anti-bitter taste receptor using random amino acid sequence library
- Grant number: 16K08426
- Fiscal year: 2016
- Funding amount: $192,800
- Project category: Grant-in-Aid for Scientific Research (C)
The aa15-17 amino acid sequence in the terminal protein domain of HBV polymerase as a viral factor affecting in vivo as well as in vitro replication activity of the virus.
- Grant number: 25461010
- Fiscal year: 2013
- Funding amount: $192,800
- Project category: Grant-in-Aid for Scientific Research (C)
Amino acid sequence analysis of fossil proteins using mass spectrometry
- Grant number: 23654177
- Fiscal year: 2011
- Funding amount: $192,800
- Project category: Grant-in-Aid for Challenging Exploratory Research
Precise hybrid synthesis of glycoprotein through amino acid sequence-specific introduction of oligosaccharide followed by enzymatic transglycosylation reaction
- Grant number: 22550105
- Fiscal year: 2010
- Funding amount: $192,800
- Project category: Grant-in-Aid for Scientific Research (C)
Estimating selection on amino-acid sequence polymorphisms in Drosophila
- Grant number: NE/D00232X/1
- Fiscal year: 2006
- Funding amount: $192,800
- Project category: Research Grant
Construction of a neural network for detecting novel domains from amino acid sequence information only
- Grant number: 16500189
- Fiscal year: 2004
- Funding amount: $192,800
- Project category: Grant-in-Aid for Scientific Research (C)