IIBR Informatics: Advancing Bioinformatics Methods using Ensembles of Profile Hidden Markov Models


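Basic Information

  • Award number:
    2006069
  • Principal investigator:
  • Amount:
    $500,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Start and end dates:
    2020-08-15 to 2024-07-31
  • Project status:
    Completed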

Project Abstract

Many steps in biological research pipelines involve the use of machine learning models, and these have become standard tools for many basic problems. Elaborations on basic machine learning models ("ensembles" of machine learning models) can provide improvements in accuracy compared to standard usage, for various biological questions. However, the design of these ensembles has been fairly ad hoc, and their use can be computationally intensive, which reduces their appeal in practice. This project will advance this technology by developing statistically rigorous techniques for building ensembles of machine learning models, with the goal of improving accuracy. The project will also develop methods that use these ensembles for new biological problems, including protein structure and function prediction. Broader impacts include software schools, engagement with under-represented groups, and open-source software.

Profile Hidden Markov Models (i.e., profile HMMs) are probabilistic graphical models that are in wide use in bioinformatics. Research over the last decade has shown that ensembles of profile HMMs (e-HMMs) can provide greater accuracy than a single profile HMM for many applications in bioinformatics, including phylogenetic placement, multiple sequence alignment, and taxonomic identification of metagenomic reads. This project will advance the use of e-HMMs by developing statistically rigorous techniques for building e-HMMs with the goal of improving accuracy and improving understanding of e-HMMs, and will also develop methods that use e-HMMs for protein structure and function prediction. Broader impacts include software schools, engagement with under-represented groups, and open-source software. Project software and papers are available at http://tandy.cs.illinois.edu/eHMMproject.html.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (7)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Large-Scale Multiple Sequence Alignment and the Maximum Weight Trace Alignment Merging Problem
WITCH: Improved Multiple Sequence Alignment Through Weighted Consensus Hidden Markov Model Alignment
  • DOI:
    10.1089/cmb.2021.0585
  • Publication date:
    2022-05-17
  • Journal:
  • Impact factor:
    1.7
  • Authors:
    Shen, Chengze;Park, Minhyuk;Warnow, Tandy
  • Corresponding author:
    Warnow, Tandy
MAGUS+eHMMs: improved multiple sequence alignment accuracy for fragmentary sequences.
  • DOI:
    10.1093/bioinformatics/btab788
  • Publication date:
    2022-01-27
  • Journal:
  • Impact factor:
    0
  • Authors:
    Shen C;Zaharias P;Warnow T
  • Corresponding author:
    Warnow T

Other publications by Tandy Warnow

EC-SBM synthetic network generator
  • DOI:
    10.1007/s41109-025-00701-2
  • Publication date:
    2025-05-01
  • Journal:
  • Impact factor:
    1.500
  • Authors:
    The-Anh Vu-Le;Lahari Anne;George Chacko;Tandy Warnow
  • Corresponding author:
    Tandy Warnow
A perspective on 16S rRNA operational taxonomic unit clustering using sequence similarity
  • DOI:
    10.1038/npjbiofilms.2016.4
  • Publication date:
    2016-04-20
  • Journal:
  • Impact factor:
    9.200
  • Authors:
    Nam-Phuong Nguyen;Tandy Warnow;Mihai Pop;Bryan White
  • Corresponding author:
    Bryan White
Correction to: The performance of coalescent-based species tree estimation methods under models of missing data
  • DOI:
    10.1186/s12864-020-6540-1
  • Publication date:
    2020-02-10
  • Journal:
  • Impact factor:
    3.700
  • Authors:
    Michael Nute;Jed Chou;Erin K. Molloy;Tandy Warnow
  • Corresponding author:
    Tandy Warnow
Analyzing the Order of Items in Manuscripts of The Canterbury Tales
  • DOI:
    10.1023/a:1021818600001
  • Publication date:
    2003-02-01
  • Journal:
  • Impact factor:
    1.800
  • Authors:
    Matthew Spencer;Barbara Bordalejo;Li-San Wang;Adrian C. Barbrook;Linne R. Mooney;Peter Robinson;Tandy Warnow;Christopher J. Howe
  • Corresponding author:
    Christopher J. Howe
An experimental study of Quartets MaxCut and other supertree methods
  • DOI:
    10.1186/1748-7188-6-7
  • Publication date:
    2011-04-19
  • Journal:
  • Impact factor:
    1.700
  • Authors:
    M Shel Swenson;Rahul Suri;C Randal Linder;Tandy Warnow
  • Corresponding author:
    Tandy Warnow

Other grants of Tandy Warnow

AitF: Full: Collaborative Research: Graph-theoretic algorithms to improve phylogenomic analyses
  • Award number:
    1535977
  • Fiscal year:
    2015
  • Award amount:
    $500,000
  • Project type:
    Standard Grant
ABI Innovation: New methods for multiple sequence alignment with improved accuracy and scalability
  • Award number:
    1458652
  • Fiscal year:
    2015
  • Award amount:
    $500,000
  • Project type:
    Standard Grant
III: AF: Medium: Collaborative Research: Scalable and Highly Accurate Methods for Metagenomics
  • Award number:
    1513629
  • Fiscal year:
    2015
  • Award amount:
    $500,000
  • Project type:
    Continuing Grant
Collaborative Research: Novel Methodologies for Genome-scale Evolutionary Analysis of Multi-locus data
  • Award number:
    1461364
  • Fiscal year:
    2014
  • Award amount:
    $500,000
  • Project type:
    Standard Grant
Collaborative Research: Novel Methodologies for Genome-scale Evolutionary Analysis of Multi-locus data
  • Award number:
    1062335
  • Fiscal year:
    2011
  • Award amount:
    $500,000
  • Project type:
    Standard Grant
Collaborative Research: Large-scale simultaneous multiple alignment and phylogeny estimation
  • Award number:
    0733029
  • Fiscal year:
    2007
  • Award amount:
    $500,000
  • Project type:
    Continuing Grant
Information Technology Research (ITR): Building the Tree of Life -- A National Resource for Phyloinformatics and Computational Phylogenetics
  • Award number:
    0715370
  • Fiscal year:
    2006
  • Award amount:
    $500,000
  • Project type:
    Cooperative Agreement
Information Technology Research (ITR): Building the Tree of Life -- A National Resource for Phyloinformatics and Computational Phylogenetics
  • Award number:
    0331654
  • Fiscal year:
    2003
  • Award amount:
    $500,000
  • Project type:
    Cooperative Agreement
Information Technology Research (ITR): Building the Tree of Life -- A National Resource for Phyloinformatics and Computational Phylogenetics
  • Award number:
    0331453
  • Fiscal year:
    2003
  • Award amount:
    $500,000
  • Project type:
    Cooperative Agreement
ITR: Collaborative Research, Algorithms for Inferring Reticulate Evolution in Historical Linguistics
  • Award number:
    0312830
  • Fiscal year:
    2003
  • Award amount:
    $500,000
  • Project type:
    Standard Grant

Similar overseas grants

REU Site: Program for Access to Training in Health Informatics (PATHI)
  • Award number:
    2348793
  • Fiscal year:
    2024
  • Award amount:
    $500,000
  • Project type:
    Standard Grant
Travel: IEEE International Conference on Healthcare Informatics (IEEE ICHI 2024) Doctoral Consortium Travel Scholarship
  • Award number:
    2414093
  • Fiscal year:
    2024
  • Award amount:
    $500,000
  • Project type:
    Standard Grant
Reliable Tensor-Network Fusion Approach to Medical Informatics: Novel Techniques and Benchmarks
  • Award number:
    24K03005
  • Fiscal year:
    2024
  • Award amount:
    $500,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
CAREER: Transforming Personal Informatics Systems to Support Routine Transitions in Healthy Eating
  • Award number:
    2414270
  • Fiscal year:
    2023
  • Award amount:
    $500,000
  • Project type:
    Continuing Grant
Travel: NSF Student Travel Grant for 2023 IEEE-EMBS International Conference on Biomedical and Health Informatics (BHI)
  • Award number:
    2331680
  • Fiscal year:
    2023
  • Award amount:
    $500,000
  • Project type:
    Standard Grant
Development of Informatics Materials with an Awareness of the High School-University connection and a Learning Support Environment for Data-Driven Instruction
  • Award number:
    23H01019
  • Fiscal year:
    2023
  • Award amount:
    $500,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)
Categorical Duality and Semantics Across Mathematics, Informatics and Physics and their Applications to Categorical Machine Learning and Quantum Computing
  • Award number:
    23K13008
  • Fiscal year:
    2023
  • Award amount:
    $500,000
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Pioneering Research of industrial materials informatics for innovative lithium battery anodes
  • Award number:
    23K18465
  • Fiscal year:
    2023
  • Award amount:
    $500,000
  • Project type:
    Grant-in-Aid for Challenging Research (Exploratory)
ACTS (AD Clinical Trial Simulation): Developing Advanced Informatics Approaches for an Alzheimer's Disease Clinical Trial Simulation System
  • Award number:
    10753675
  • Fiscal year:
    2023
  • Award amount:
    $500,000
  • Project type:
Establishment of polymer informatics incorporating polymer-specific hierarchy and search for new electrolytes
  • Award number:
    23H02027
  • Fiscal year:
    2023
  • Award amount:
    $500,000
  • Project type:
    Grant-in-Aid for Scientific Research (B)