III: AF: Medium: Collaborative Research: Scalable and Highly Accurate Methods for Metagenomics

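Basic information

  • Award number:
    1513629
  • Principal investigator:
  • Amount:
    $626,700
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Continuing Grant
  • Fiscal year:
    2015
  • Funding country:
    United States
  • Period:
    2015-09-01 to 2020-08-31
  • Status:
    Completed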

Project abstract

Metagenomic studies of microbial communities can generate millions to billions of sequencing reads. The assignment of accurate taxonomic labels to these sequences is a critical component in many analyses, but is complicated by the fact that the majority of the organisms found in environmental or host-associated communities cannot be easily cultured in a laboratory. Even among the organisms that can be cultured, relatively few have been sequenced, even partially. Thus, many commonly encountered organisms are largely absent from existing databases of known genomes and genes. Providing taxonomic labels to metagenomic sequences, thus, requires extrapolating the knowledge contained in sequence databases to previously unseen DNA strings. Simple similarity-based approaches (e.g., picking the best database hit as the best guess at the taxonomic label) have been shown to be insufficiently accurate, leading to the development of more sophisticated methods. Further developments are necessary to handle the characteristics of emerging sequencing technologies, such as high error rates with large numbers of insertions and deletions. To date, metagenomic taxon identification methods have been evaluated with respect to their ability to estimate the distribution of bacterial taxa (species, genera, families, etc.) within a metagenomic sample. Yet, different scientific and clinical settings may require specific types of analyses, and this one type of evaluation may not be the most appropriate for all settings. For example, in a clinical setting the most important question may be to detect whether a specific pathogen is present, while in a scientific setting the most interesting question may be to be able to determine if an observed read comes from a never-been-seen-before species. New evaluation strategies must be developed that specifically target the specific needs of the application domain. 
All the methods developed in the project will be made into open-source software that is freely available to the scientific public. Researchers will provide training activities each year with funds available to students and postdocs from around the country, and an outreach program to minority-serving institutions and women's colleges. A summer REU program will also be provided at the University of Maryland, College Park.
First, the team will develop a new framework for integrating the formal definition of biological use-cases with evaluation datasets and metrics in order to ensure the software being developed adequately addresses the needs of the end-users. Second, they will develop new approaches for marker-based taxon identification and abundance profiling that can leverage multiple sources of information (e.g., multiple markers) as well as handle the high error rates of third-generation sequencing technologies. These approaches will build upon experience developing TIPP, a taxonomic profiling package recently published by the team that outperforms the leading metagenomic taxonomic profiling software, in particular for novel sequences, or for longer, high-error sequences. Finally, they plan to develop high-performance computing implementations of these methods in order to enable rapid analysis of samples. Speed of analysis is particularly important in clinical settings where medical treatments may depend on the rate at which the method can return an analysis. Speed is also important in non-medical applications where faster analyses enable researchers to perform deeper or broader analyses of microbial communities.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Tandy Warnow

Addressing Polymorphism in Linguistic Phylogenetics
  • DOI:
    10.1111/1467-968x.12289
  • Year published:
    2024
  • Journal:
  • Impact factor:
    0.3
  • Authors:
    Marc E. Canby;Steven N. Evans;Donald Ringe;Tandy Warnow
  • Corresponding author:
    Tandy Warnow

Other grants by Tandy Warnow

IIBR Informatics: Advancing Bioinformatics Methods using Ensembles of Profile Hidden Markov Models
  • Award number:
    2006069
  • Fiscal year:
    2020
  • Amount:
    $626,700
  • Award type:
    Standard Grant
AitF: Full: Collaborative Research: Graph-theoretic algorithms to improve phylogenomic analyses
  • Award number:
    1535977
  • Fiscal year:
    2015
  • Amount:
    $626,700
  • Award type:
    Standard Grant
ABI Innovation: New methods for multiple sequence alignment with improved accuracy and scalability
  • Award number:
    1458652
  • Fiscal year:
    2015
  • Amount:
    $626,700
  • Award type:
    Standard Grant
Collaborative Research: Novel Methodologies for Genome-scale Evolutionary Analysis of Multi-locus data
  • Award number:
    1461364
  • Fiscal year:
    2014
  • Amount:
    $626,700
  • Award type:
    Standard Grant
Collaborative Research: Novel Methodologies for Genome-scale Evolutionary Analysis of Multi-locus data
  • Award number:
    1062335
  • Fiscal year:
    2011
  • Amount:
    $626,700
  • Award type:
    Standard Grant
Collaborative Research: Large-scale simultaneous multiple alignment and phylogeny estimation
  • Award number:
    0733029
  • Fiscal year:
    2007
  • Amount:
    $626,700
  • Award type:
    Continuing Grant
Information Technology Research (ITR): Building the Tree of Life -- A National Resource for Phyloinformatics and Computational Phylogenetics
  • Award number:
    0715370
  • Fiscal year:
    2006
  • Amount:
    $626,700
  • Award type:
    Cooperative Agreement
Information Technology Research (ITR): Building the Tree of Life -- A National Resource for Phyloinformatics and Computational Phylogenetics
  • Award number:
    0331654
  • Fiscal year:
    2003
  • Amount:
    $626,700
  • Award type:
    Cooperative Agreement
Information Technology Research (ITR): Building the Tree of Life -- A National Resource for Phyloinformatics and Computational Phylogenetics
  • Award number:
    0331453
  • Fiscal year:
    2003
  • Amount:
    $626,700
  • Award type:
    Cooperative Agreement
ITR: Collaborative Research, Algorithms for Inferring Reticulate Evolution in Historical Linguistics
  • Award number:
    0312830
  • Fiscal year:
    2003
  • Amount:
    $626,700
  • Award type:
    Standard Grant

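Similar NSFC grants

Molecular mechanism by which H2S-mediated S-sulfhydration of the splicing factor BraU2AF65a promotes flowering in Chinese cabbage
  • Award number:
    32372727
  • Year approved:
    2023
  • Amount:
    CNY 500,000
  • Award type:
    General Program
Study of AF9 promoting multiple organ failure (MOF) in severe acute pancreatitis through ARRB2-MRGPRB2-mediated activation of intestinal resident mast cells
  • Award number:
    82300739
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund
Mechanism of splicing factor U2AF1 mutations in primary drug resistance of acute myeloid leukemia
  • Award number:
    82370157
  • Year approved:
    2023
  • Amount:
    CNY 490,000
  • Award type:
    General Program
Role of mitochondrial reactive oxygen species-mediated premature placental senescence in infant neurodevelopmental delay caused by gestational bisphenol AF exposure
  • Award number:
    82304160
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund
Molecular mechanism by which U2AF2-circMMP1 regulates energy metabolism to promote liver metastasis of colorectal cancer
  • Award number:
    82303789
  • Year approved:
    2023
  • Amount:
    CNY 300,000
  • Award type:
    Young Scientists Fund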

Similar overseas grants

Collaborative Research: AF: Medium: The Communication Cost of Distributed Computation
  • Award number:
    2402836
  • Fiscal year:
    2024
  • Amount:
    $626,700
  • Award type:
    Continuing Grant
Collaborative Research: AF: Medium: Foundations of Oblivious Reconfigurable Networks
  • Award number:
    2402851
  • Fiscal year:
    2024
  • Amount:
    $626,700
  • Award type:
    Continuing Grant
Collaborative Research: AF: Medium: Algorithms Meet Machine Learning: Mitigating Uncertainty in Optimization
  • Award number:
    2422926
  • Fiscal year:
    2024
  • Amount:
    $626,700
  • Award type:
    Continuing Grant
Collaborative Research: AF: Medium: Fast Combinatorial Algorithms for (Dynamic) Matchings and Shortest Paths
  • Award number:
    2402283
  • Fiscal year:
    2024
  • Amount:
    $626,700
  • Award type:
    Continuing Grant
Collaborative Research: AF: Medium: Foundations of Oblivious Reconfigurable Networks
  • Award number:
    2402852
  • Fiscal year:
    2024
  • Amount:
    $626,700
  • Award type:
    Continuing Grant