Systematic characterization of tandem repeat variants contributing to complex traits
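Basic information
- Grant number: 10265508
- Principal investigator:
- Amount: $705,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-09-17 to 2024-07-31
- Project status: Completed
- Source: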
- Keywords: Automobile Driving; Bioinformatics; Biological; Biological Assay; Bipolar Disorder; Blood; Catalogs; Clustered Regularly Interspaced Short Palindromic Repeats; Complex; Copy Number Polymorphism; DNA; Data Set; Exhibits; Failure; Gene Expression; Gene Expression Regulation; Genes; Genetic Variation; Genome; Genomics; Genotype; Genotype-Tissue Expression Project; Haplotypes; Height; Heritability; Human; Human Genetics; In Situ; Individual; Knowledge; Length; Malaria; Medical; Minisatellite Repeats; Molecular Biology; Mutation; Non-linear Models; Nucleosomes; Phenotype; Population; Positioning Attribute; Publishing; RNA; RNA Splicing; Repetitive Sequence; Reporter; Resistance; Resources; Role; SNP array; Sampling; Schizophrenia; Short Tandem Repeat; Signal Transduction; Single Nucleotide Polymorphism; Source; Structure; Tandem Repeat Sequences; Techniques; Testing; Tissues; Transcript; Variant; base; cancer risk; causal variant; experience; genetic architecture; genetic variant; genome editing; genome wide association study; genome-wide; genomic locus; molecular phenotype; next generation sequencing; novel; predictive test; statistics; technology development; tool; trait; web app
Project summary
SUMMARY ABSTRACT
Genome-wide association studies (GWAS) have identified thousands of genetic loci associated with
complex traits, but determining the causal variants, target genes, and biological mechanisms responsible for
each signal has proven challenging. Furthermore, standard GWAS based on single nucleotide polymorphisms
(SNPs) have been limited by failure to explain the majority of heritability for most traits studied and an inability to
capture multi-allelic variants such as copy number variants (CNVs) and repeats not tagged by SNPs.
We focus on the role of genetic variation at repetitive regions of the genome. Specifically, we consider
two repeat types: short tandem repeats (STRs), consisting of repeated motifs of 1-6bp; and variable number
tandem repeats (VNTRs), with motifs of 7+bp. We collectively refer to STRs and VNTRs as tandem repeats
(TRs). TRs encompass approximately 2 million loci comprising over 3% of the genome. They exhibit rapid
mutation rates and are one of the largest sources of genetic variation. Growing evidence suggests that TRs are
likely to account for part of the “missing heritability” of GWAS. However, due to bioinformatic and experimental
challenges in studying repeats, the genome-wide role of TRs in human traits remains mostly unexplored.
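
The motif-length definitions above (STRs with motifs of 1-6 bp, VNTRs with motifs of 7+ bp) can be illustrated with a minimal sketch. The function name `classify_tandem_repeat` and the example motifs are hypothetical and chosen only for illustration; this is not the project's genotyping software, and real TR catalogs may apply additional criteria.

```python
def classify_tandem_repeat(motif: str) -> str:
    """Classify a tandem repeat by motif length.

    Follows the definition in the summary: motifs of 1-6 bp are short
    tandem repeats (STRs); motifs of 7 bp or more are variable number
    tandem repeats (VNTRs). The function name is hypothetical.
    """
    motif = motif.strip().upper()
    # Reject empty strings and characters outside the DNA alphabet (N allowed).
    if not motif or set(motif) - set("ACGTN"):
        raise ValueError(f"not a DNA motif: {motif!r}")
    return "STR" if len(motif) <= 6 else "VNTR"


# Illustrative motifs only (assumed examples, not taken from the project).
for example in ["A", "CAG", "AAAAAG", "ACGTACG"]:
    print(example, len(example), classify_tandem_repeat(example))
```

Keeping the threshold in one place makes it easy to adjust if a catalog draws the STR/VNTR boundary differently.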
We hypothesize that TR variants are key drivers of complex traits. We recently identified thousands
of STRs predicted to causally regulate gene expression (termed expression STRs, or eSTRs) and revealed that
eSTRs potentially act through a variety of mechanisms including modulating nucleosome positioning and DNA
or RNA secondary structure. We additionally identified specific eSTRs likely underlying published GWAS signals
for height and schizophrenia. Furthermore, other groups have recently discovered TRs as causal drivers of
complex traits including malaria resistance, cancer risk, and bipolar disorder.
While these findings offer intriguing evidence that thousands of TRs contribute to human phenotypes,
they have several limitations. These include: the range of TRs that can be accurately genotyped from
next-generation sequencing (NGS); a lack of sufficiently large NGS datasets for most traits for performing association
analyses; and limited understanding of the potential mechanisms by which TRs participate in gene regulation.
Here, we leverage (i) our recently developed TR genotyping tools and (ii) our published haplotype panel allowing
imputation of TRs into available SNP-array datasets, to systematically evaluate the contribution of TRs to gene
regulation and complex traits in humans. We will first generate a comprehensive catalog of TRs associated with
gene regulation (Aim 1) and establish a framework for validating TR effects using massively parallel reporter
assays and genome editing (Aim 2). We will then impute more than 2 million TRs into large existing GWAS
datasets and perform fine-mapping to identify TRs associated with a range of complex traits and deeply
characterize several TRs predicted to be causal drivers of GWAS signals (Aim 3). This project will fill an
important gap in our knowledge of the genetic architecture of complex traits.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Alon Goren
Novel SETD5-based Molecular Mechanisms and Therapeutic Tools to Understand and Revert Neuronal Dysfunction Associated with Intellectual disability and Autism
- Grant number: 10446957
- Fiscal year: 2022
- Funding amount: $705,000
- Project category:
Novel SETD5-based Molecular Mechanisms and Therapeutic Tools to Understand and Revert Neuronal Dysfunction Associated with Intellectual disability and Autism
- Grant number: 10585929
- Fiscal year: 2022
- Funding amount: $705,000
- Project category:
Systematic characterization of tandem repeat variants contributing to complex traits
- Grant number: 10671075
- Fiscal year: 2020
- Funding amount: $705,000
- Project category:
Systematic characterization of tandem repeat variants contributing to complex traits
- Grant number: 10052847
- Fiscal year: 2020
- Funding amount: $705,000
- Project category:
Systematic characterization of tandem repeat variants contributing to complex traits
- Grant number: 10459499
- Fiscal year: 2020
- Funding amount: $705,000
- Project category:
Development of a novel method to chart genomic localization of protein complexes in vivo
- Grant number: 9511383
- Fiscal year: 2018
- Funding amount: $705,000
- Project category:
Interrogating regulatory variants by multiplexed genome editing
- Grant number: 9761568
- Fiscal year: 2018
- Funding amount: $705,000
- Project category:
Similar overseas grants
Collaborative Research: IIBR: Innovation: Bioinformatics: Linking Chemical and Biological Space: Deep Learning and Experimentation for Property-Controlled Molecule Generation
- Grant number: 2318829
- Fiscal year: 2023
- Funding amount: $705,000
- Project category: Continuing Grant
Analysis of biological small molecule mixtures using multiple modes of mass spectrometric fragmentation coupled with new bioinformatics workflows
- Grant number: BB/X019802/1
- Fiscal year: 2023
- Funding amount: $705,000
- Project category: Research Grant
Collaborative Research: IIBR: Innovation: Bioinformatics: Linking Chemical and Biological Space: Deep Learning and Experimentation for Property-Controlled Molecule Generation
- Grant number: 2318830
- Fiscal year: 2023
- Funding amount: $705,000
- Project category: Continuing Grant
Collaborative Research: IIBR: Innovation: Bioinformatics: Linking Chemical and Biological Space: Deep Learning and Experimentation for Property-Controlled Molecule Generation
- Grant number: 2318831
- Fiscal year: 2023
- Funding amount: $705,000
- Project category: Continuing Grant
Bioinformatics-powered genetic characterization of the impact of biological systems on Alzheimer's disease and neurodegeneration
- Grant number: 484699
- Fiscal year: 2022
- Funding amount: $705,000
- Project category: Operating Grants
REU Site: Bioinformatics Research and Interdisciplinary Training Experience in Analysis and Interpretation of Information-Rich Biological Data Sets (REU-BRITE)
- Grant number: 1949968
- Fiscal year: 2020
- Funding amount: $705,000
- Project category: Standard Grant
REU Site: Bioinformatics Research and Interdisciplinary Training Experience in Analysis and Interpretation of Information-Rich Biological Data Sets (REU-BRITE)
- Grant number: 1559829
- Fiscal year: 2016
- Funding amount: $705,000
- Project category: Continuing Grant
Bioinformatics Tools to Design and Optimize Biological Sensor Systems
- Grant number: 416848-2011
- Fiscal year: 2011
- Funding amount: $705,000
- Project category: University Undergraduate Student Research Awards
ABI Development: bioKepler: A Comprehensive Bioinformatics Scientific Workflow Module for Distributed Analysis of Large-Scale Biological Data
- Grant number: 1062565
- Fiscal year: 2011
- Funding amount: $705,000
- Project category: Continuing Grant
Bioinformatics-based hypothesis generation with biological validation for plant stress biology
- Grant number: 261818-2006
- Fiscal year: 2010
- Funding amount: $705,000
- Project category: Discovery Grants Program - Individual