III: Medium: Scalable Machine Learning for Genome-Wide Association Analyses

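Basic information

  • Award number:
    1705121
  • Principal investigator:
  • Amount:
    $975,200
  • Host institution:
  • Host institution country:
    United States
  • Award type:
    Continuing Grant
  • Fiscal year:
    2017
  • Funding country:
    United States
  • Start and end dates:
    2017-07-01 to 2022-06-30
  • Project status:
    Completed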

Project abstract

Over the past decade, genome-wide association studies (GWAS) have discovered genetic variants associated with numerous diseases and other complex phenotypes. Despite this success, major gaps remain in our understanding of how genetic changes affect phenotype. These gaps, coupled with advances in high-throughput technologies for measuring genetic variation, have motivated GWAS of increasingly large scale. However, the statistical and computational challenges posed by the scale and complexity of these studies present a critical bottleneck in realizing their promise. Recent advances in scalable machine learning (ML) offer the potential for paradigm-shifting advances in the field of GWAS. However, these concepts have yet to be rigorously explored in the context of GWAS modeling and testing problems, and exploring the intersection of these domains introduces fundamentally new statistical and computational challenges. The team will develop a suite of modeling and testing methods that target massive modern genomics datasets. The techniques that the team will build upon include low-rank matrix approximation, kernel methods, and matrix completion. The team will also provide open-source software tailored to parallel and distributed computing environments to facilitate widespread adoption of the methods.

Exploring GWAS through the lens of scalable machine learning opens several research directions and requires the development of novel algorithms and analyses. First, much scalable ML research has focused on the statistical task of prediction, whereas GWAS inference problems also emphasize hypothesis testing and parameter estimation. Characterizing the behavior of scalable ML methods in these novel settings is a challenging open problem. The team will develop principled GWAS modeling and testing methods, and the results are expected to also be of great interest to the scalable ML community. Second, while scalable ML techniques are designed to be general-purpose and domain-agnostic, the GWAS setting introduces rich, biologically motivated domain knowledge that needs to be leveraged to improve the quality of inference. Statistical models that encode this prior knowledge while still permitting efficient inference will be developed. Ultimately, the team will design efficient parallel and distributed algorithms for these core modeling and testing problems and develop robust open-source implementations that leverage modern computing infrastructure.

The proposed methods will dramatically improve the scalability of current GWAS analyses while enabling the development of increasingly realistic genomic models. Collaborations and open-source artifacts will enable the widespread adoption of these methods by the human genetics community. This project will lead to closer interaction between the genomics and machine learning communities at UCLA and beyond.

Project outputs

Journal papers (23)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Learning Fair Representations for Kernel Models
A Unifying Framework for Imputing Summary Statistics in Genome-Wide Association Studies
  • DOI:
    10.1089/cmb.2019.0449
  • Publication date:
    2020
  • Journal:
  • Impact factor:
    1.7
  • Authors:
    Wu, Yue;Eskin, Eleazar;Sankararaman, Sriram
  • Corresponding author:
    Sankararaman, Sriram
CONTRA: Contrarian statistics for controlled variable selection
  • DOI:
  • Publication date:
    2021-04
  • Journal:
  • Impact factor:
    0
  • Authors:
    Mukund Sudarshan;A. Puli;Lakshminarayanan Subramanian;S. Sankararaman;R. Ranganath
  • Corresponding author:
    Mukund Sudarshan;A. Puli;Lakshminarayanan Subramanian;S. Sankararaman;R. Ranganath
Quantifying the contribution of dominance deviation effects to complex trait variation in biobank-scale data.
  • DOI:
    10.1016/j.ajhg.2021.03.018
  • Publication date:
    2021-05-06
  • Journal:
  • Impact factor:
    9.8
  • Authors:
    Pazokitoroudi A;Chiu AM;Burch KS;Pasaniuc B;Sankararaman S
  • Corresponding author:
    Sankararaman S
STENSL: Microbial Source Tracking with ENvironment SeLection.
  • DOI:
    10.1128/msystems.00995-21
  • Publication date:
    2022-10-26
  • Journal:
  • Impact factor:
    6.4
  • Authors:
  • Corresponding author:

Other publications by Sriram Sankararaman

Characterizing the genetic architecture of drug response using gene-context interaction methods
  • DOI:
    10.1016/j.xgen.2024.100722
  • Publication date:
    2024-12-11
  • Journal:
  • Impact factor:
    9.000
  • Authors:
    Michal Sadowski;Mike Thompson;Joel Mefford;Tanushree Haldar;Akinyemi Oni-Orisan;Richard Border;Ali Pazokitoroudi;Na Cai;Julien F. Ayroles;Sriram Sankararaman;Andy W. Dahl;Noah Zaitlen
  • Corresponding author:
    Noah Zaitlen
dotears: Scalable and consistent directed acyclic graph estimation using observational and interventional data
  • DOI:
    10.1016/j.isci.2024.111673
  • Publication date:
    2025-02-21
  • Journal:
  • Impact factor:
    4.100
  • Authors:
    Albert Xue;Jingyou Rao;Sriram Sankararaman;Harold Pimentel
  • Corresponding author:
    Harold Pimentel
Identifying common disease trajectories of Alzheimer’s disease with electronic health records
  • DOI:
    10.1016/j.ebiom.2025.105831
  • Publication date:
    2025-08-01
  • Journal:
  • Impact factor:
    10.800
  • Authors:
    Mingzhou Fu;Sriram Sankararaman;Bogdan Pasaniuc;Keith Vossel;Timothy S. Chang
  • Corresponding author:
    Timothy S. Chang
OP-CBIO201112 5640..5648
  • DOI:
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    A. Majumdar;Kathryn S. Burch;Tanushree Haldar;Sriram Sankararaman;Bogdan Pasaniuc;W. J. Gauderman;John S. Witte
  • Corresponding author:
    John S. Witte
Investigating the sources of variable impact of pathogenic variants in monogenic metabolic conditions
  • DOI:
    10.1038/s41467-025-60339-7
  • Publication date:
    2025-06-05
  • Journal:
  • Impact factor:
    15.700
  • Authors:
    Angela Wei;Richard Border;Boyang Fu;Sinéad Cullina;Nadav Brandes;Seon-Kyeong Jang;Sriram Sankararaman;Eimear E. Kenny;Miriam S. Udler;Vasilis Ntranos;Noah Zaitlen;Valerie A. Arboleda
  • Corresponding author:
    Valerie A. Arboleda

Other grants awarded to Sriram Sankararaman

CAREER: Flexible and efficient mixed models to infer the genetic architecture of complex phenotypes
  • Award number:
    1943497
  • Fiscal year:
    2020
  • Award amount:
    $975,200
  • Award type:
    Continuing Grant

Similar overseas grants

Collaborative Research: III: Medium: Algorithms for scalable inference and phylodynamic analysis of tumor haplotypes using low-coverage single cell sequencing data
  • Award number:
    2415562
  • Fiscal year:
    2023
  • Award amount:
    $975,200
  • Award type:
    Standard Grant
III: Medium: CARE: Interactive Systems for Scalable, Causal Data Science
  • Award number:
    2312561
  • Fiscal year:
    2023
  • Award amount:
    $975,200
  • Award type:
    Continuing Grant
Collaborative Research: III: Medium: Algorithms for scalable inference and phylodynamic analysis of tumor haplotypes using low-coverage single cell sequencing data
  • Award number:
    2341725
  • Fiscal year:
    2023
  • Award amount:
    $975,200
  • Award type:
    Standard Grant
Collaborative Research: III: Medium: Algorithms for scalable inference and phylodynamic analysis of tumor haplotypes using low-coverage single cell sequencing data
  • Award number:
    2212508
  • Fiscal year:
    2022
  • Award amount:
    $975,200
  • Award type:
    Standard Grant
Collaborative Research: III: Medium: Algorithms for scalable inference and phylodynamic analysis of tumor haplotypes using low-coverage single cell sequencing data
  • Award number:
    2212512
  • Fiscal year:
    2022
  • Award amount:
    $975,200
  • Award type:
    Standard Grant
Collaborative Research: III: Medium: Algorithms for scalable inference and phylodynamic analysis of tumor haplotypes using low-coverage single cell sequencing data
  • Award number:
    2212511
  • Fiscal year:
    2022
  • Award amount:
    $975,200
  • Award type:
    Standard Grant
III: Medium: Scalable Evolutionary Analysis of SNVs and CNAs in Cancer Using Single-Cell DNA Sequencing Data
  • Award number:
    2106837
  • Fiscal year:
    2021
  • Award amount:
    $975,200
  • Award type:
    Continuing Grant
III: Medium: Collaborative Research: Towards Scalable and Interpretable Graph Neural Networks
  • Award number:
    1955285
  • Fiscal year:
    2020
  • Award amount:
    $975,200
  • Award type:
    Standard Grant
III: Medium: Collaborative Research: Towards Scalable and Interpretable Graph Neural Networks
  • Award number:
    1955189
  • Fiscal year:
    2020
  • Award amount:
    $975,200
  • Award type:
    Standard Grant
III: Medium: Collaborative Research: Towards Scalable and Interpretable Graph Neural Networks
  • Award number:
    1955851
  • Fiscal year:
    2020
  • Award amount:
    $975,200
  • Award type:
    Standard Grant