Genomics, EHRs, GPUs, and Next Generation Computational Statistics

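Basic information

  • Grant number:
    10672959
  • Principal investigator:
  • Amount:
    $644,300
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
    2011
  • Funding country:
    United States
  • Project period:
    2011-08-26 to 2024-06-30
  • Project status:
    Completed

Project abstract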

The future challenges of statistical genetics are enormous. Data sets continue to grow; studies with 10^6 cases and 10^7 markers have become feasible, but current algorithms and software do not scale to this size. We need to rethink and rebuild many of our statistical analysis techniques and tools to scale effectively. In addition, health data will soon be commonly collected from mobile and wearable devices, dramatically increasing its volume and utility. Precision health and predictive medicine raise the stakes even further. Concurrently, the nature of computing is rapidly changing. To take advantage of hardware advances, particularly ubiquitous parallel computing, new statistical approaches and algorithms and new programming paradigms must be brought online. This renewal proposal targets the application of state-of-the-art statistical techniques and tools to develop genetic analysis algorithms that can scale to studies with millions of subjects, such as the US Department of Veterans Affairs' Million Veteran Program (MVP) and the UK Biobank. Biobank-scale data sets have many benefits, particularly the potential power to detect the subtle effects of each of the many genes involved in common diseases. Another benefit is that these data sets can be more representative of the populace by including large numbers of people from multiple ancestries, different social strata, and all sexes. Analyzing these massive data sets effectively and efficiently requires advances in the current statistical genetics tools. Effective statistical analysis takes many forms: algorithms that converge in fewer iterations, powerful statistics that accommodate all available data, and computational methods that take advantage of massively parallel computing hardware such as graphics processing units (GPUs) and other coprocessors.
We will deliver algorithms that can directly handle biobank-scale data sets for many computationally challenging statistical genetics tasks, including genome-wide association studies (GWAS) with trait data from electronic health records (EHRs). More generally, our algorithm focus will benefit all scientific fields driven by computational statistics and high-dimensional optimization. Of course, for statistical algorithm development to be immediately useful, it must be accompanied by fast, easy-to-use software. We will promptly deliver open-source software that (1) enables interactive and reproducible analyses with informative intermediate results, (2) provides quality graphics, (3) scales to big data analytics, (4) embraces parallel and distributed computing, (5) adapts to rapid hardware evolution, (6) allows cloud computing, and (7) fosters easy communication between clinicians, geneticists, statisticians, and computer scientists. Recent breakthroughs in computer languages bring all these goals within reach. Our overall objective is the design and construction of state-of-the-art statistical genetics algorithms and software for modern, massive genetic and EHR data. Numerical accuracy, computational efficiency, and software sustainability are our priorities. We will deliver a unified, cross-platform, high-level, reproducible, interactive analysis environment that is fast and efficient even for biobank-scale data sets.

Project outputs

Journal articles (10)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
EM vs MM: A Case Study.
Regularized matrix regression.
Interactions Between Adiponectin-Pathway Polymorphisms and Obesity on Postmenopausal Breast Cancer Risk Among African American Women: The WHI SHARe Study.
  • DOI:
    10.3389/fonc.2021.698198
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    4.7
  • Authors:
    Nam GE;Zhang ZF;Rao J;Zhou H;Jung SY
  • Corresponding author:
    Jung SY
Inferring heterogeneous evolutionary processes through time: from sequence substitution to phylogeography.
  • DOI:
    10.1093/sysbio/syu015
  • Publication date:
    2014-07
  • Journal:
  • Impact factor:
    6.5
  • Authors:
    Bielejec F;Lemey P;Baele G;Rambaut A;Suchard MA
  • Corresponding author:
    Suchard MA
BEAST 2: a software platform for Bayesian evolutionary analysis.
  • DOI:
    10.1371/journal.pcbi.1003537
  • Publication date:
    2014-04
  • Journal:
  • Impact factor:
    4.3
  • Authors:
    Bouckaert R;Heled J;Kühnert D;Vaughan T;Wu CH;Xie D;Suchard MA;Rambaut A;Drummond AJ
  • Corresponding author:
    Drummond AJ

Other grants by Eric Sobel

Genomics, EHRs, GPUs, and Next Generation Computational Statistics
  • Grant number:
    10264804
  • Fiscal year:
    2011
  • Amount:
    $644,300
  • Project type:
Genomics GPUs and next generation computational statistics
  • Grant number:
    8539067
  • Fiscal year:
    2011
  • Amount:
    $644,300
  • Project type:
Genomics, EHRs, GPUs, and Next Generation Computational Statistics
  • Grant number:
    10450816
  • Fiscal year:
    2011
  • Amount:
    $644,300
  • Project type:
Genomics GPUs and next generation computational statistics
  • Grant number:
    8324508
  • Fiscal year:
    2011
  • Amount:
    $644,300
  • Project type:
Genomics GPUs and next generation computational statistics
  • Grant number:
    8085977
  • Fiscal year:
    2011
  • Amount:
    $644,300
  • Project type:
Genomics, GPUs, and Next Generation Computational Statistics
  • Grant number:
    9100873
  • Fiscal year:
    2011
  • Amount:
    $644,300
  • Project type:
Genomics, GPUs, and Next Generation Computational Statistics
  • Grant number:
    8888381
  • Fiscal year:
    2011
  • Amount:
    $644,300
  • Project type:
Computer Cluster and Storage to Support Whole Genome Sequencing and Analysis
  • Grant number:
    7595696
  • Fiscal year:
    2009
  • Amount:
    $644,300
  • Project type:
COMPILING AND TESTING STATISTICAL GENETICS APPLICATIONS
  • Grant number:
    7627683
  • Fiscal year:
    2007
  • Amount:
    $644,300
  • Project type:
COMPILING AND TESTING STATISTICAL GENETICS APPLICATIONS
  • Grant number:
    7369416
  • Fiscal year:
    2006
  • Amount:
    $644,300
  • Project type:

Similar overseas grants

AI-based prediction of the blepharoptosis etiologies by means of machine learning algorithmic analysis of length-tensile force chart of levator muscle
  • Grant number:
    22K09863
  • Fiscal year:
    2022
  • Amount:
    $644,300
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Algorithmic analysis of symmetric-key cryptographic primitives
  • Grant number:
    262074-2008
  • Fiscal year:
    2013
  • Amount:
    $644,300
  • Project type:
    Discovery Grants Program - Individual
Algorithmic analysis of symmetric-key cryptographic primitives
  • Grant number:
    262074-2008
  • Fiscal year:
    2012
  • Amount:
    $644,300
  • Project type:
    Discovery Grants Program - Individual
Algorithmic analysis of symmetric-key cryptographic primitives
  • Grant number:
    262074-2008
  • Fiscal year:
    2011
  • Amount:
    $644,300
  • Project type:
    Discovery Grants Program - Individual
Unified Approach for Nanotechnology CAD/Computation by Algorithmic Analysis of Periodic Crystal Structures
  • Grant number:
    22650002
  • Fiscal year:
    2010
  • Amount:
    $644,300
  • Project type:
    Grant-in-Aid for Challenging Exploratory Research
Algorithmic analysis of symmetric-key cryptographic primitives
  • Grant number:
    262074-2008
  • Fiscal year:
    2010
  • Amount:
    $644,300
  • Project type:
    Discovery Grants Program - Individual
Algorithmic analysis of symmetric-key cryptographic primitives
  • Grant number:
    262074-2008
  • Fiscal year:
    2009
  • Amount:
    $644,300
  • Project type:
    Discovery Grants Program - Individual
Algorithmic analysis of symmetric-key cryptographic primitives
  • Grant number:
    262074-2008
  • Fiscal year:
    2008
  • Amount:
    $644,300
  • Project type:
    Discovery Grants Program - Individual
Mathematical & Algorithmic Analysis of Natural and Artificial DNA Sequences
  • Grant number:
    0218568
  • Fiscal year:
    2002
  • Amount:
    $644,300
  • Project type:
    Standard Grant
Algorithmic Analysis and Congestion Control of Connection-Oriented Services in Large Scale Communication Networks.
  • Grant number:
    9404947
  • Fiscal year:
    1994
  • Amount:
    $644,300
  • Project type:
    Standard Grant