III: Small: Randomized Matrix-Sketching Approaches for the Analysis of Massive Human Genomics Data

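Basic information

  • Award number:
    2006929
  • Principal investigator:
  • Amount:
    $499,500
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Continuing Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Start and end dates:
    2020-09-01 to 2024-08-31
  • Project status:
    Completed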

Project abstract

Researchers in human genetics now have access to unprecedented amounts of genetic information characterizing how truly different we are from one another. From a Computer Science and Applied Mathematics perspective, the resulting datasets can be thought of as matrices, with the rows representing individuals and the columns representing loci in the genome that correspond to common or rare polymorphisms. Analyzing such datasets, Genome Wide Association Studies (GWAS) have reported over 10,000 strong associations between genetic variants and complex traits. However, tools that allow efficient analysis of very large-scale datasets are still missing. Extracting useful information from such datasets promotes the progress of science and, at the same time, advances public health, prosperity, and welfare. This project will bridge the gap between state-of-the-art algorithms developed in the theoretical computer science community and the application of such algorithms to the analysis of the increasingly large volume of data in the human genetics community.

This project will explore how randomized linear algebra, from a theoretical and practical standpoint, can be used to speed up human genetics data analytics. The first research direction will investigate Linear Mixed Models (LMMs), which form a linear model of the genetic effects on the phenotype of interest. Randomized linear algebra tools will be used to speed up the solution of the resulting optimization problem without sacrificing accuracy. The second research direction will investigate Polygenic Risk Scores (PRS), which typically operate by first selecting a large number of genetic markers (often in the tens of thousands) out of all available markers (often in the many millions) using single-marker significance tests. This feature selection stage is followed by building regression models on the selected markers to predict phenotypes. Randomized linear algebra tools will be used to speed up PRS approaches while preserving generalization accuracy. Finally, the third research direction will explore how the particular structure of population genetics datasets can be leveraged to design improved randomized linear algebra tools for the analysis of human genetics datasets. The investigators will disseminate their results to a broad community of applied mathematicians, theoretical computer scientists, and population geneticists. Both investigators participate in population genetics conferences and workshops and publish in high-profile journals in population genetics, as well as in conferences and workshops in Computer Science. The investigators will additionally disseminate this knowledge to graduate and undergraduate students. They will involve under-represented groups in their research activities, leveraging their prior track record of involving such groups in cutting-edge research.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
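
As a rough illustration of what "randomized linear algebra" can mean for the regression problems mentioned in the abstract, the sketch below applies a generic sketch-and-solve least-squares approach to a synthetic genotype matrix (rows are individuals, columns are markers). It is not taken from the project: the function names, the Gaussian sketching matrix, the data, and every parameter value are assumptions chosen only to keep the example small and runnable.

```python
import numpy as np


def sketched_least_squares(X, y, sketch_rows, seed=0):
    """Approximate argmin_b ||X b - y||_2 by solving a smaller sketched problem.

    Hypothetical helper: uses a dense Gaussian sketch for simplicity.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    # Gaussian sketching matrix S (sketch_rows x n), scaled so that E[S^T S] = I.
    S = rng.standard_normal((sketch_rows, n)) / np.sqrt(sketch_rows)
    # Solve the compressed problem min_b ||S X b - S y||_2.
    coef, *_ = np.linalg.lstsq(S @ X, S @ y, rcond=None)
    return coef


def demo(n_individuals=5000, n_markers=20, sketch_rows=400):
    """Compare the sketched solution with the exact one on synthetic data."""
    rng = np.random.default_rng(1)
    # Synthetic genotypes coded as 0/1/2 minor-allele counts (assumed encoding).
    X = rng.integers(0, 3, size=(n_individuals, n_markers)).astype(float)
    X -= X.mean(axis=0)  # center each marker column
    beta = rng.standard_normal(n_markers)
    y = X @ beta + 0.1 * rng.standard_normal(n_individuals)

    exact, *_ = np.linalg.lstsq(X, y, rcond=None)
    approx = sketched_least_squares(X, y, sketch_rows)
    # Relative distance between the sketched and exact coefficient vectors.
    return np.linalg.norm(approx - exact) / np.linalg.norm(exact)


if __name__ == "__main__":
    print(f"relative coefficient error: {demo():.4f}")
```

The sketched problem has 400 rows instead of 5,000, so the solve is cheaper at the cost of a small approximation error. A dense Gaussian sketch is itself expensive to apply at genomic scale; structured or sparse sketches are the usual alternative and are left out here for brevity.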

Project outputs

Journal articles (6)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Investigating Shared Genetic Basis Across Tourette Syndrome and Comorbid Neurodevelopmental Disorders Along the Impulsivity-Compulsivity Spectrum.
  • DOI:
    10.1016/j.biopsych.2020.12.028
  • Publication date:
    2021-09-01
  • Journal:
  • Impact factor:
    10.6
  • Authors:
    Yang, Zhiyu;Wu, Hanrui;Lee, Phil H.;Tsetsos, Fotis;Davis, Lea K.;Yu, Dongmei;Lee, Sang Hong;Dalsgaard, Soren;Haavik, Jan;Barta, Csaba;Zayats, Tetyana;Eapen, Valsamma;Wray, Naomi R.;Devlin, Bernie;Daly, Mark;Neale, Benjamin;Borglum, Anders D.;Crowley, James J.;Scharf, Jeremiah;Mathews, Carol A.;V. Faraone, Stephen;Franke, Barbara;Mattheisen, Manuel;Smoller, Jordan W.;Paschou, Peristera
  • Corresponding author:
    Paschou, Peristera
Synaptic processes and immune-related pathways implicated in Tourette syndrome.
  • DOI:
    10.1038/s41398-020-01082-z
  • Publication date:
    2021-01-18
  • Journal:
  • Impact factor:
    6.8
  • Authors:
    Tsetsos F;Yu D;Sul JH;Huang AY;Illmann C;Osiecki L;Darrow SM;Hirschtritt ME;Greenberg E;Muller-Vahl KR;Stuhrmann M;Dion Y;Rouleau GA;Aschauer H;Stamenkovic M;Schlögelhofer M;Sandor P;Barr CL;Grados MA;Singer HS;Nöthen MM;Hebebrand J;Hinney A;King RA;Fernandez TV;Barta C;Tarnok Z;Nagy P;Depienne C;Worbe Y;Hartmann A;Budman CL;Rizzo R;Lyon GJ;McMahon WM;Batterson JR;Cath DC;Malaty IA;Okun MS;Berlin C;Woods DW;Lee PC;Jankovic J;Robertson MM;Gilbert DL;Brown LW;Coffey BJ;Dietrich A;Hoekstra PJ;Kuperman S;Zinner SH;Wagner M;Knowles JA;Jeremy Willsey A;Tischfield JA;Heiman GA;Cox NJ;Freimer NB;Neale BM;Davis LK;Coppola G;Mathews CA;Scharf JM;Paschou P;Tourette Association of America International Consortium for Genetics;Barr CL;Batterson JR;Berlin C;Budman CL;Cath DC;Coppola G;Cox NJ;Darrow S;Davis LK;Dion Y;Freimer NB;Grados MA;Greenberg E;Hirschtritt ME;Huang AY;Illmann C;King RA;Kurlan R;Leckman JF;Lyon GJ;Malaty IA;Mathews CA;McMahon WM;Neale BM;Okun MS;Osiecki L;Robertson MM;Rouleau GA;Sandor P;Scharf JM;Singer HS;Smit JH;Sul JH;Yu D;Gilles de la Tourette GWAS Replication Initiative;Aschauer HAH;Barta C;Budman CL;Cath DC;Depienne C;Hartmann A;Hebebrand J;Konstantinidis A;Mathews CA;Müller-Vahl K;Nagy P;Nöthen MM;Paschou P;Rizzo R;Rouleau GA;Sandor P;Scharf JM;Schlögelhofer M;Stamenkovic M;Stuhrmann M;Tsetsos F;Tarnok Z;Wolanczyk T;Worbe Y;Tourette International Collaborative Genetics Study;Brown L;Cheon KA;Coffey BJ;Dietrich A;Fernandez TV;Garcia-Delgar B;Gilbert D;Grice DE;Hagstrøm J;Hedderly T;Heiman GA;Heyman I;Hoekstra PJ;Huyser C;Kim YK;Kim YS;King RA;Koh YJ;Kook S;Kuperman S;Leventhal BL;Madruga-Garrido M;Mir P;Morer A;Münchau A;Plessen KJ;Roessner V;Shin EY;Song DH;Song J;Tischfield JA;Willsey AJ;Zinner S;Psychiatric Genomics Consortium Tourette Syndrome Working Group;Aschauer H;Barr CL;Barta C;Batterson JR;Berlin C;Brown L;Budman CL;Cath DC;Coffey BJ;Coppola G;Cox NJ;Darrow S;Davis LK;Depienne C;Dietrich A;Dion Y;Fernandez T;Freimer NB;Gilbert D;Grados MA;Greenberg E;Hartmann A;Hebebrand J;Heiman G;Hirschtritt ME;Hoekstra P;Huang AY;Illmann C;Jankovic J;King RA;Kuperman S;Lee PC;Lyon GJ;Malaty IA;Mathews CA;McMahon WM;Müller-Vahl K;Nagy P;Neale BM;Nöthen MM;Okun MS;Osiecki L;Paschou P;Rizzo R;Robertson MM;Rouleau GA;Sandor P;Scharf JM;Schlögelhofer M;Singer HS;Stamenkovic M;Stuhrmann M;Sul JH;Tarnok Z;Tischfield J;Tsetsos F;Willsey AJ;Woods D;Worbe Y;Yu D;Zinner S
  • Corresponding author:
    Zinner S

Other publications by Peristera Paschou

Neuropathology-based approach reveals novel Alzheimer's Disease genes and highlights female-specific pathways and causal links to disrupted lipid metabolism: insights into a vicious cycle
  • DOI:
    10.1186/s40478-024-01909-6
  • Publication date:
    2025-01-04
  • Journal:
  • Impact factor:
    5.700
  • Authors:
    Yin Jin;Apostolia Topaloudi;Sudhanshu Shekhar;Guangxin Chen;Alicia Nicole Scott;Bryce David Colon;Petros Drineas;Chris Rochet;Peristera Paschou
  • Corresponding author:
    Peristera Paschou
79. DECIPHERING THE GENETIC ARCHITECTURE OF TOURETTE SYNDROME: A LARGE-SCALE GWAS REVEALS NOVEL RISK LOCI AND HIGHLIGHTS BRAIN REGION SPECIFICITY
  • DOI:
    10.1016/j.euroneuro.2024.08.193
  • Publication date:
    2024-10-01
  • Journal:
  • Impact factor:
    6.700
  • Authors:
    Dongmei Yu;Matt Halvorsen;Nora Strom;Sudhanshu Shekhar;Miao Tang;Zachary Gerring;Tyne Miller-Fleming;Kari Stefansson;Lea Davis;Michael Gandal;James Crowley;Manuel Mattheisen;Peristera Paschou;Carol Mathews;Jeremiah Scharf
  • Corresponding author:
    Jeremiah Scharf
The Genetics of Gilles de la Tourette Syndrome: a Common Aetiological Basis with Comorbid Disorders?
  • DOI:
    10.1007/s40473-016-0088-z
  • Publication date:
    2016-07-05
  • Journal:
  • Impact factor:
    1.000
  • Authors:
    Iordanis Karagiannidis;Fotis Tsetsos;Shanmukha Sampath Padmanabhuni;John Alexander;Marianthi Georgitsi;Peristera Paschou
  • Corresponding author:
    Peristera Paschou
Cloud Types and Geometrical Properties Observed above PANGEA Observatory in the Eastern Mediterranean
F50. INTEGRATIVE ANALYSIS OF TRANSCRIPTOME AND PROTEOME-WIDE ASSOCIATION STUDY IDENTIFIES NOVEL GENES IMPLICATED IN TOURETTE'S SYNDROME
  • DOI:
    10.1016/j.euroneuro.2023.08.438
  • Publication date:
    2023-10-01
  • Journal:
  • Impact factor:
    6.700
  • Authors:
    Sudhanshu Shekhar;Apostolia Topaloudi;Dongmei Yu;Paola Giusti-Rodriguez;Matthew Halvorsen;Nora Strom;Pritesh Jain;Tyne Miller-Fleming;Lea Davis;Manuel Mattheisen;James Crowley;Jeremiah Scharf;Carol Mathews;Peristera Paschou
  • Corresponding author:
    Peristera Paschou

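Similar National Natural Science Foundation of China (NSFC) grants

Forensic application of circadian-rhythm small RNAs in estimating the time of bloodstain formation
  • Award number:
  • Year approved:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial/municipal project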
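Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
  • Award number:
    n/a
  • Year approved:
    2022
  • Funding amount:
    CNY 100,000
  • Project type:
    Provincial/municipal project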
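Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
  • Award number:
    32000033
  • Year approved:
    2020
  • Funding amount:
    CNY 240,000
  • Project type:
    Young Scientists Fund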
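Mechanism by which small RNAs regulate the biocontrol function of Bacillus amyloliquefaciens FZB42
  • Award number:
    31972324
  • Year approved:
    2019
  • Funding amount:
    CNY 580,000
  • Project type:
    General Program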
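Mechanism by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
  • Award number:
    81900988
  • Year approved:
    2019
  • Funding amount:
    CNY 210,000
  • Project type:
    Young Scientists Fund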
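Molecular mechanism of pigeon milk secretion, analyzed using small RNA sequencing
  • Award number:
    31802058
  • Year approved:
    2018
  • Funding amount:
    CNY 260,000
  • Project type:
    Young Scientists Fund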
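Function and mechanism of key gut bacterial small RNAs in the onset and progression of Crohn's disease
  • Award number:
    31870821
  • Year approved:
    2018
  • Funding amount:
    CNY 560,000
  • Project type:
    General Program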
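Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
  • Award number:
    31772128
  • Year approved:
    2017
  • Funding amount:
    CNY 600,000
  • Project type:
    General Program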
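Immune regulatory mechanism of acupuncture treatment for Hashimoto's thyroiditis, based on small RNA-seq
  • Award number:
    81704176
  • Year approved:
    2017
  • Funding amount:
    CNY 200,000
  • Project type:
    Young Scientists Fund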
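Regulation of small RNA biogenesis by rice OsSGS3 and OsHEN1 and its effect on disease resistance
  • Award number:
    91640114
  • Year approved:
    2016
  • Funding amount:
    CNY 850,000
  • Project type:
    Major Research Plan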

Similar overseas grants

NSF-BSF: AF: Collaborative Research: Small: Randomized preconditioning of iterative processes: Theory and practice
  • Award number:
    2209510
  • Fiscal year:
    2022
  • Funding amount:
    $499,500
  • Project type:
    Standard Grant
NSF-BSF: AF: Collaborative Research: Small: Randomized preconditioning of iterative processes: Theory and practice
  • Award number:
    2209509
  • Fiscal year:
    2022
  • Funding amount:
    $499,500
  • Project type:
    Standard Grant
RI: Small: Accelerating Machine Learning via Randomized Automatic Differentiation
  • Award number:
    2007278
  • Fiscal year:
    2020
  • Funding amount:
    $499,500
  • Project type:
    Standard Grant
RI: Small: Robust Autonomy for Uncertain Systems using Randomized Trees
  • Award number:
    2008686
  • Fiscal year:
    2020
  • Funding amount:
    $499,500
  • Project type:
    Continuing Grant
Feasibility and Acceptability of The Equus Effect: A Small Randomized Controlled Pilot Study of an Equine-facilitated Therapy
  • Award number:
    10060753
  • Fiscal year:
    2019
  • Funding amount:
    $499,500
  • Project type:
Feasibility and Acceptability of The Equus Effect: A Small Randomized Controlled Pilot Study of an Equine-facilitated Therapy
  • Award number:
    10553614
  • Fiscal year:
    2019
  • Funding amount:
    $499,500
  • Project type:
Feasibility and Acceptability of The Equus Effect: A Small Randomized Controlled Pilot Study of an Equine-facilitated Therapy
  • Award number:
    9889268
  • Fiscal year:
    2019
  • Funding amount:
    $499,500
  • Project type:
Feasibility and Acceptability of The Equus Effect: A Small Randomized Controlled Pilot Study of an Equine-facilitated Therapy
  • Award number:
    10394705
  • Fiscal year:
    2019
  • Funding amount:
    $499,500
  • Project type:
CCF-BSF: AF: Small: New Randomized Approaches in Approximation Algorithms
  • Award number:
    1717947
  • Fiscal year:
    2017
  • Funding amount:
    $499,500
  • Project type:
    Standard Grant
Workplace Wellness climate and Substance Abuse Prevention in Small Businesses: A Cluster Randomized Trial and Psychometric Analysis
  • Award number:
    9544129
  • Fiscal year:
    2017
  • Funding amount:
    $499,500
  • Project type: