Analysis Tools and Software for Second Generation Sequencing Data

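Basic information

  • Grant number:
    8280415
  • Principal investigator:
  • Amount:
    $322,100
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
    2010
  • Funding country:
    United States
  • Project period:
    2010-08-11 to 2013-05-17
  • Project status:
    Completed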

Project abstract

DESCRIPTION (provided by applicant): Second-generation sequencing (sec-gen) technology is poised to radically change how genomic data is obtained and used. Capable of sequencing millions of short strands of DNA in parallel, this technology can be used to assemble complex genomes for a small fraction of the price and time of previous technologies. In fact, a recently formed international consortium, the 1000 Genomes Project, plans to sequence the genomes of approximately 1,200 people. Comparative analysis at the sequence level of a large number of samples across multiple populations may be achievable within the next five years.

These datasets also present unprecedented challenges in statistical analysis and data management. For example, a central goal of the 1000 Genomes Project is to quantify across-sample variation at the single-nucleotide level. At this resolution, even small sequencing error rates are significant, especially for rare variants. Furthermore, sec-gen sequencing is a relatively new technology for which potential biases and sources of obscuring variation are not yet fully understood. Therefore, modeling and quantifying the uncertainty inherent in the generation of sequencing reads is of utmost importance. Properly relating this uncertainty to the true underlying variation in the genome, especially variation between and among populations, will be essential for projects that use sec-gen sequencing data to meet their scientific goals.

Although genome sequencing is the application that has received the most attention, sec-gen technology is also being used to produce quantitative measurements for applications previously associated with microarrays. Of these, chromatin immunoprecipitation followed by sequencing (ChIP-Seq) has been the most successful. Existing tools have been developed for analyzing one sample at a time. Methodology for drawing inference from multiple samples has not yet been developed. The demand for such methods will increase rapidly as the technology becomes more economical and multiple samples become standard. Other applications for which statistical methodology is needed are RNA and microRNA transcription analysis. In all these sequencing applications, a number of critical steps are required to convert raw intensity measures into the sequence reads that will be used in downstream analysis. Ad hoc approaches that assign weights to each base call are unsuitable. Our goal is to create a sound and unified statistical and computational methodology for representing and managing uncertainty throughout the sec-gen sequencing data analysis pipeline, built on a robust, modular, and extensible software platform.

PUBLIC HEALTH RELEVANCE: Second-generation sequencing technology is poised to radically change how genomic data is obtained and used. These datasets also present unprecedented challenges in statistical analysis, and modeling and quantifying the uncertainty inherent in the generation of sequencing reads is of utmost importance. We will develop data analysis tools for widely used applications using statistical methods that account for this uncertainty.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Rafael Angel Irizarry

Next Generation Computational Tools for Functional Genomics
  • Grant number:
    9979396
  • Fiscal year:
    2020
  • Funding amount:
    $322,100
  • Project type:
Next Generation Computational Tools for Functional Genomics
  • Grant number:
    10666501
  • Fiscal year:
    2020
  • Funding amount:
    $322,100
  • Project type:
Next Generation Computational Tools for Functional Genomics
  • Grant number:
    10267687
  • Fiscal year:
    2020
  • Funding amount:
    $322,100
  • Project type:
Next Generation Computational Tools for Functional Genomics
  • Grant number:
    10448436
  • Fiscal year:
    2020
  • Funding amount:
    $322,100
  • Project type:
Data Analysis Tools for Emerging High-Throughput Technologies
  • Grant number:
    10461727
  • Fiscal year:
    2019
  • Funding amount:
    $322,100
  • Project type:
Data Analysis Tools for Emerging High-Throughput Technologies
  • Grant number:
    9922327
  • Fiscal year:
    2019
  • Funding amount:
    $322,100
  • Project type:
Data Analysis Tools for Emerging High-Throughput Technologies
  • Grant number:
    10159937
  • Fiscal year:
    2019
  • Funding amount:
    $322,100
  • Project type:
Data Analysis Tools for Emerging High-Throughput Technologies
  • Grant number:
    10612937
  • Fiscal year:
    2019
  • Funding amount:
    $322,100
  • Project type:
Biomedical Data Science Online Curriculum on HarvardX
  • Grant number:
    8829975
  • Fiscal year:
    2014
  • Funding amount:
    $322,100
  • Project type:
Biomedical Data Science Online Curriculum on HarvardX
  • Grant number:
    9130901
  • Fiscal year:
    2014
  • Funding amount:
    $322,100
  • Project type:

Similar overseas grants

CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Grant number:
    2337776
  • Fiscal year:
    2024
  • Funding amount:
    $322,100
  • Project type:
    Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
  • Grant number:
    2338816
  • Fiscal year:
    2024
  • Funding amount:
    $322,100
  • Project type:
    Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
  • Grant number:
    2338846
  • Fiscal year:
    2024
  • Funding amount:
    $322,100
  • Project type:
    Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
  • Grant number:
    2348261
  • Fiscal year:
    2024
  • Funding amount:
    $322,100
  • Project type:
    Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
  • Grant number:
    2348346
  • Fiscal year:
    2024
  • Funding amount:
    $322,100
  • Project type:
    Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
  • Grant number:
    2348457
  • Fiscal year:
    2024
  • Funding amount:
    $322,100
  • Project type:
    Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
  • Grant number:
    2404989
  • Fiscal year:
    2024
  • Funding amount:
    $322,100
  • Project type:
    Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
  • Grant number:
    2339310
  • Fiscal year:
    2024
  • Funding amount:
    $322,100
  • Project type:
    Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
  • Grant number:
    2339669
  • Fiscal year:
    2024
  • Funding amount:
    $322,100
  • Project type:
    Continuing Grant
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
  • Grant number:
    EP/Y029089/1
  • Fiscal year:
    2024
  • Funding amount:
    $322,100
  • Project type:
    Research Grant