ABI Innovation: New methods for multiple sequence alignment with improved accuracy and scalability

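Basic information

  • Award number:
    1458652
  • Principal investigator:
  • Amount:
    $861,600
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2015
  • Funding country:
    United States
  • Duration:
    2015-08-15 to 2021-07-31
  • Project status:
    Completed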

Project abstract

Multiple sequence alignment (MSA) is one of the most basic bioinformatics steps, in which a set of molecular sequences (i.e., DNA, RNA, or amino acid sequences) is arranged in a matrix to identify corresponding positions. MSA calculation is a fundamental first step in many biological analyses. Because of its broad applicability and importance, many MSA methods have been developed and are in wide use today. Unfortunately, many real-world biological datasets have features (large size and fragmentary sequences, for example) that make accurate MSA calculation very difficult. Because poorly estimated alignments result in errors in downstream biological analyses, new MSA techniques are needed that can produce accurate alignments on difficult datasets. This project will develop MSA methods that have greatly improved accuracy and that can analyze the large and heterogeneous sequence datasets being assembled in different biology projects nationally. The project also has a substantial outreach component to women's colleges and minority-serving institutions, and summer software schools to train biologists in the use of the project software.
Multiple sequence alignment (MSA) and phylogeny estimation are two very basic bioinformatics problems that sit at the intersection of machine learning, statistical estimation, and evolutionary and structural biology. MSA has particular importance in constructing evolutionary trees, understanding the function and structure of proteins, detecting interactions between proteins, and even genome assembly. Large-scale MSA and phylogeny estimation also require high-performance computing and parallel algorithms in order to provide adequate scalability. The team will develop new machine learning techniques to greatly improve MSA methods, and hence also phylogeny estimation, since it depends on accurate multiple sequence alignments.
The core of this project is algorithm development, utilizing a variety of machine learning techniques (including Hidden Markov Models), statistical estimation methods (especially Bayesian MCMC and maximum likelihood), and novel algorithmic strategies, all focused on improving scalability and accuracy. More information about the project can be found at: http://tandy.cs.illinois.edu/MSAproject.html

Project outcomes

Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
MAGUS+eHMMs: improved multiple sequence alignment accuracy for fragmentary sequences.
  • DOI:
    10.1093/bioinformatics/btab788
  • Publication date:
    2022-01-27
  • Journal:
  • Impact factor:
    0
  • Authors:
    Shen C; Zaharias P; Warnow T
  • Corresponding author:
    Warnow T

Other publications by Tandy Warnow

EC-SBM synthetic network generator
  • DOI:
    10.1007/s41109-025-00701-2
  • Publication date:
    2025-05-01
  • Journal:
  • Impact factor:
    1.500
  • Authors:
    The-Anh Vu-Le; Lahari Anne; George Chacko; Tandy Warnow
  • Corresponding author:
    Tandy Warnow
A perspective on 16S rRNA operational taxonomic unit clustering using sequence similarity
  • DOI:
    10.1038/npjbiofilms.2016.4
  • Publication date:
    2016-04-20
  • Journal:
  • Impact factor:
    9.200
  • Authors:
    Nam-Phuong Nguyen; Tandy Warnow; Mihai Pop; Bryan White
  • Corresponding author:
    Bryan White
Correction to: The performance of coalescent-based species tree estimation methods under models of missing data
  • DOI:
    10.1186/s12864-020-6540-1
  • Publication date:
    2020-02-10
  • Journal:
  • Impact factor:
    3.700
  • Authors:
    Michael Nute; Jed Chou; Erin K. Molloy; Tandy Warnow
  • Corresponding author:
    Tandy Warnow
Analyzing the Order of Items in Manuscripts of The Canterbury Tales
  • DOI:
    10.1023/a:1021818600001
  • Publication date:
    2003-02-01
  • Journal:
  • Impact factor:
    1.800
  • Authors:
    Matthew Spencer; Barbara Bordalejo; Li-San Wang; Adrian C. Barbrook; Linne R. Mooney; Peter Robinson; Tandy Warnow; Christopher J. Howe
  • Corresponding author:
    Christopher J. Howe
An experimental study of Quartets MaxCut and other supertree methods
  • DOI:
    10.1186/1748-7188-6-7
  • Publication date:
    2011-04-19
  • Journal:
  • Impact factor:
    1.700
  • Authors:
    M Shel Swenson; Rahul Suri; C Randal Linder; Tandy Warnow
  • Corresponding author:
    Tandy Warnow

Other grants awarded to Tandy Warnow

IIBR Informatics: Advancing Bioinformatics Methods using Ensembles of Profile Hidden Markov Models
  • Award number:
    2006069
  • Fiscal year:
    2020
  • Project type:
    Standard Grant
AitF: Full: Collaborative Research: Graph-theoretic algorithms to improve phylogenomic analyses
  • Award number:
    1535977
  • Fiscal year:
    2015
  • Project type:
    Standard Grant
III: AF: Medium: Collaborative Research: Scalable and Highly Accurate Methods for Metagenomics
  • Award number:
    1513629
  • Fiscal year:
    2015
  • Project type:
    Continuing Grant
Collaborative Research: Novel Methodologies for Genome-scale Evolutionary Analysis of Multi-locus data
  • Award number:
    1461364
  • Fiscal year:
    2014
  • Project type:
    Standard Grant
Collaborative Research: Novel Methodologies for Genome-scale Evolutionary Analysis of Multi-locus data
  • Award number:
    1062335
  • Fiscal year:
    2011
  • Project type:
    Standard Grant
Collaborative Research: Large-scale simultaneous multiple alignment and phylogeny estimation
  • Award number:
    0733029
  • Fiscal year:
    2007
  • Project type:
    Continuing Grant
Information Technology Research (ITR): Building the Tree of Life -- A National Resource for Phyloinformatics and Computational Phylogenetics
  • Award number:
    0715370
  • Fiscal year:
    2006
  • Project type:
    Cooperative Agreement
Information Technology Research (ITR): Building the Tree of Life -- A National Resource for Phyloinformatics and Computational Phylogenetics
  • Award number:
    0331654
  • Fiscal year:
    2003
  • Project type:
    Cooperative Agreement
Information Technology Research (ITR): Building the Tree of Life -- A National Resource for Phyloinformatics and Computational Phylogenetics
  • Award number:
    0331453
  • Fiscal year:
    2003
  • Project type:
    Cooperative Agreement
ITR: Collaborative Research, Algorithms for Inferring Reticulate Evolution in Historical Linguistics
  • Award number:
    0312830
  • Fiscal year:
    2003
  • Project type:
    Standard Grant

Similar overseas grants

New Physics in Artificial Spin Ice via Materials Innovation
  • Award number:
    2419407
  • Fiscal year:
    2024
  • Project type:
    Continuing Grant
NSF Engines Development Award: Building a space-based innovation ecosystem in New Mexico (NM)
  • Award number:
    2314657
  • Fiscal year:
    2024
  • Project type:
    Cooperative Agreement
NSF Engines Development Award: Advancing a materials innovation ecosystem for manufacturing sustainability in upstate New York (NY)
  • Award number:
    2315307
  • Fiscal year:
    2024
  • Project type:
    Cooperative Agreement
Exploring sustainable growth models and innovation as a new growth strategy in declining industries
  • Award number:
    23K01523
  • Fiscal year:
    2023
  • Project type:
    Grant-in-Aid for Scientific Research (C)
New GORDON RESEARCH CONFERENCE: Advanced Cell and Tissue Biomanufacturing: Technology Development and Innovation through Convergence
  • Award number:
    10682943
  • Fiscal year:
    2023
  • Project type:
New Physics in Artificial Spin Ice via Materials Innovation
  • Award number:
    2310275
  • Fiscal year:
    2023
  • Project type:
    Continuing Grant
Targeted Infusion Project: Curricular Innovation of Microbiology and Genetics Laboratories - New Strategies for Biology Student Success
  • Award number:
    2306045
  • Fiscal year:
    2023
  • Project type:
    Standard Grant
Implementation of EDI-D Principles and Open Communication Practices for Scientific Innovation and Dissemination - A New Canadian Connective Tissue Conference Workshop
  • Award number:
    487891
  • Fiscal year:
    2023
  • Project type:
    Miscellaneous Programs
Collaborative Research: CCRI: New: An Infrastructure for Sustainable Innovation and Research in Computer Science Education
  • Award number:
    2213790
  • Fiscal year:
    2022
  • Project type:
    Standard Grant
Creation of a New Unified Theory of Tactile Sensation for VR Innovation –Gestalts of Tactile, Heat and Pain Senses–
  • Award number:
    22K19791
  • Fiscal year:
    2022
  • Project type:
    Grant-in-Aid for Challenging Research (Exploratory)