New algorithms and tools for large-scale genomic analyses

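Basic information

  • Grant number:
    9272425
  • Principal investigator:
  • Amount:
    $490,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
    2012
  • Funding country:
    United States
  • Start and end dates:
    2012-04-19 to 2019-04-30
  • Project status:
    Completed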

Project abstract

 DESCRIPTION (provided by applicant): The exploration and interpretation of large, complex datasets is vital to discovery in genomics. However, researchers now confront a fundamental limitation: unprecedented experiments are possible thanks to modern DNA sequencing technologies, yet existing "genome arithmetic" techniques for comparing and dissecting the resulting datasets are incapable of keeping pace with inexorable growth in dataset size and complexity. Genome arithmetic (GA) represents a powerful and widely used set of techniques that allow one to explore relationships among sets of genome features (e.g., a gene, sequence alignment, ChIP-seq peak, or anything that can be described with chromosome coordinates). GA is used for a broad spectrum of analyses, including the detection of intersecting/overlapping features (e.g., sequence alignments and exons), the description of feature coverage among datasets, and the merging, subtraction, and complementation of feature datasets. GA functionality is used by all genome browsers and data visualization tools, and by analysis software such as GATK and SAMTOOLS. Owing to its power and flexibility, our BEDTOOLS software is extremely popular and is used in a broad range of complex genomic analyses. However, while GA is central to genomic analysis and discovery, the core algorithms employed by all existing tools are inherently incapable of keeping pace with the scale and diversity of modern genomic datasets. Restricted to these approaches, the present analytic bottleneck will become increasingly acute. Therefore, the overall objective of this proposal is to provide the genomics community with innovative new algorithms and software that keep pace with modern genomics experiments and facilitate future discoveries.
 The Specific Aims are to: (1) Create an ecosystem and software that allows researchers to easily integrate diverse genome annotations and datasets into their research. We will develop new tools that make it easy for researchers to reproducibly collect datasets germane to a given experiment. (2) Dramatically expand the utility, flexibility, and performance of BEDTOOLS. We will devise and implement new algorithms for scalable and flexible analysis of large-scale genome datasets. (3) Develop a workbench for visualizing and quantifying the biological significance of relationships among genomic datasets. We will leverage the technologies from Aims 1 and 2 to develop a comprehensive statistical and visualization "workbench" for the R statistical package that will allow researchers to detect biological relationships among genome datasets.
 The proposed research will devise entirely new, scalable approaches for genome arithmetic. This will provide the community with powerful new techniques for exploring and interpreting genomics experiments and give tool developers robust approaches for software development and improvement.
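To make the genome arithmetic operations named in the abstract concrete (intersection, merging, subtraction, and complementation of features described by chromosome coordinates), the following is a minimal Python sketch. It is not the BEDTOOLS implementation and not the algorithms proposed in the grant; it uses deliberately naive scans, which illustrate why scaling becomes a concern on large datasets. The function names, the `(chrom, start, end)` tuple layout with 0-based half-open coordinates, and the example data are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical feature representation: (chrom, start, end), 0-based, half-open.
Interval = Tuple[str, int, int]


def merge(features: List[Interval]) -> List[Interval]:
    """Combine overlapping or book-ended features on the same chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(features):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def intersect(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Return the overlapping part of every (a, b) pair; naive O(len(a) * len(b))."""
    hits: List[Interval] = []
    for chrom_a, start_a, end_a in a:
        for chrom_b, start_b, end_b in b:
            if chrom_a != chrom_b:
                continue
            start, end = max(start_a, start_b), min(end_a, end_b)
            if start < end:
                hits.append((chrom_a, start, end))
    return hits


def subtract(a: List[Interval], b: List[Interval]) -> List[Interval]:
    """Remove from each feature in a every portion covered by a feature in b."""
    mask = merge(b)  # sorted and non-overlapping, so one pass per feature suffices
    result: List[Interval] = []
    for chrom, start, end in a:
        cursor = start
        for m_chrom, m_start, m_end in mask:
            if m_chrom != chrom or m_end <= cursor or m_start >= end:
                continue
            if m_start > cursor:
                result.append((chrom, cursor, m_start))
            cursor = max(cursor, m_end)
        if cursor < end:
            result.append((chrom, cursor, end))
    return result


def complement(features: List[Interval], chrom_sizes: Dict[str, int]) -> List[Interval]:
    """Return the regions of each chromosome not covered by any feature."""
    whole = [(chrom, 0, size) for chrom, size in chrom_sizes.items()]
    return subtract(whole, features)


# Made-up example data, not taken from any real annotation.
exons = [("chr1", 100, 200), ("chr1", 180, 300), ("chr2", 50, 80)]
alignments = [("chr1", 150, 190), ("chr2", 10, 40)]

print(merge(exons))                  # [('chr1', 100, 300), ('chr2', 50, 80)]
print(intersect(exons, alignments))  # [('chr1', 150, 190), ('chr1', 180, 190)]
print(complement(exons, {"chr1": 400, "chr2": 100}))
```

The nested loop in `intersect` compares every pair of features, so its cost grows with the product of the two dataset sizes. Production tools typically avoid this with sorted sweeps or index structures; which approach this project adopted is not stated in the abstract.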

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Aaron R Quinlan

Extending reference assembly models
  • DOI:
    10.1186/s13059-015-0587-3
  • Publication date:
    2015-01-24
  • Journal:
  • Impact factor:
    9.400
  • Authors:
    Deanna M Church;Valerie A Schneider;Karyn Meltz Steinberg;Michael C Schatz;Aaron R Quinlan;Chen-Shan Chin;Paul A Kitts;Bronwen Aken;Gabor T Marth;Michael M Hoffman;Javier Herrero;M Lisandra Zepeda Mendoza;Richard Durbin;Paul Flicek
  • Corresponding author:
    Paul Flicek
Erratum: A reference bacterial genome dataset generated on the MinION™ portable single-molecule nanopore sequencer
  • DOI:
    10.1186/s13742-015-0043-z
  • Publication date:
    2015-02-13
  • Journal:
  • Impact factor:
    3.900
  • Authors:
    Joshua Quick;Aaron R Quinlan;Nicholas J Loman
  • Corresponding author:
    Nicholas J Loman

Other grants held by Aaron R Quinlan

New algorithms and tools for large-scale genomic analyses
  • Grant number:
    10357060
  • Fiscal year:
    2022
  • Funding amount:
    $490,000
  • Project type:
New algorithms and tools for large-scale genomic analyses
  • Grant number:
    10560502
  • Fiscal year:
    2022
  • Funding amount:
    $490,000
  • Project type:
Scalable detection and interpretation of structural variation in human genomes
  • Grant number:
    10576268
  • Fiscal year:
    2020
  • Funding amount:
    $490,000
  • Project type:
Scalable detection and interpretation of structural variation in human genomes
  • Grant number:
    9973582
  • Fiscal year:
    2020
  • Funding amount:
    $490,000
  • Project type:
Scalable detection and interpretation of structural variation in human genomes
  • Grant number:
    10341175
  • Fiscal year:
    2020
  • Funding amount:
    $490,000
  • Project type:
Scalable detection and interpretation of structural variation in human genomes
  • Grant number:
    10153847
  • Fiscal year:
    2020
  • Funding amount:
    $490,000
  • Project type:
Software for exploring all forms of genetic variation in any species
  • Grant number:
    9749979
  • Fiscal year:
    2017
  • Funding amount:
    $490,000
  • Project type:
New algorithms and tools for large-scale genomic analyses
  • Grant number:
    8273206
  • Fiscal year:
    2012
  • Funding amount:
    $490,000
  • Project type:
New algorithms and tools for large-scale genomic analyses
  • Grant number:
    8661785
  • Fiscal year:
    2012
  • Funding amount:
    $490,000
  • Project type:
New algorithms and tools for large-scale genomic analyses
  • Grant number:
    8460819
  • Fiscal year:
    2012
  • Funding amount:
    $490,000
  • Project type:

Similar overseas grants

Medcircuit, the algorithmic software reducing waiting times in emergency department and general practice waiting rooms.
  • Grant number:
    133416
  • Fiscal year:
    2018
  • Funding amount:
    $490,000
  • Project type:
    Feasibility Studies
SHF: Small: Programming Abstractions for Algorithmic Software Synthesis
  • Grant number:
    0916351
  • Fiscal year:
    2009
  • Funding amount:
    $490,000
  • Project type:
    Standard Grant