Predicting 3D physical gene-enhancer interactions through integration of GTEx and 4DN data
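Basic information
- Grant number: 10776871
- Principal investigator:
- Amount: $298,200
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-09-20 to 2024-09-19
- Project status: Completed
- Source: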
- Keywords: 3-Dimensional; ATAC-seq; Address; Affect; Algorithms; Biological Assay; CRISPR interference; Cells; ChIP-seq; Chromatin; Chromatin Modeling; Chromosomes; Computer Analysis; DNA; DNase I hypersensitive sites sequencing; Data; Databases; Elements; Enhancers; Epigenetic Process; Frequencies; Funding; Gene Expression; Gene Expression Regulation; Gene Targeting; Genes; Genome; Genomics; Genotype; Genotype-Tissue Expression Project; Health; Hi-C; Human; Investigation; Link; Location; Machine Learning; Maps; Methods; Modeling; Molecular Conformation; Polymers; Principal Investigator; Quantitative Trait Loci; Regulatory Element; Resources; Structure; Techniques; Testing; Tissues; Training; Trust; Untranslated RNA; Validation; Variant; causal variant; cell type; chromosome conformation capture; computerized tools; cost; data integration; data resource; data standards; deep learning; deep learning model; epigenomics; gene interaction; genetic variant; genome sequencing; genome wide association study; genome-wide; genome-wide analysis; histone modification; improved; innovation; insight; large scale simulation; machine learning prediction; programs; risk variant; simulation; tool; transcriptome sequencing; trustworthiness; whole genome
Project abstract
Program Director/Principal Investigator (Liang, Jie):
PROJECT SUMMARY/ABSTRACT
We will develop computational tools that facilitate investigation of the fundamental relationship between gene expression and genome topology. Specifically, we will develop machine learning tools that link enhancers to their target genes at genome-wide scale. The ability to establish relationships between enhancers and their target genes is critically important, as it will aid our understanding of gene regulation and help connect noncoding risk variants from GWAS to potential causal genes. Our approach will be based on 3D polymer models of chromatin interactions derived from Hi-C data in the Common Fund 4D Nucleome (4DN) database, and will integrate data from the Common Fund supported Genotype-Tissue Expression (GTEx) database as well as data from the ENCODE database. We will 1) construct a trusted, high-quality database of candidate enhancer-gene target pairs. We will then 2) use this database to train a machine learning predictor that can predict enhancer-gene target pairs at genome-wide scale. For 1), we will develop a pipeline to identify a small set of critical specific 3D chromatin interactions through simulation of large-scale folding of 3D chromatin ensembles. This small set of specific interactions will be tested for whether it is sufficient to drive chromatin folding. We will then computationally identify enhancers based on epigenetic histone modifications and chromatin accessibility data from ENCODE as well as the Roadmap Epigenomics Project. We will then select enhancers containing eQTLs from the GTEx database, which are known to affect the expression of the target gene. The end result will be a high-quality and trustworthy database of enhancer-gene pairs, each supported by a predicted critical specific 3D physical chromatin interaction connecting the eQTL-containing enhancer and the target gene. For 2), we will develop a machine learning predictor that predicts enhancer-gene interactions from genomic, epigenomic, and Hi-C data at genome-wide scale. We will combine epigenetic data with genomic information (such as sequence motifs of TFs) as features. We will then train the predictor on the database of enhancer-target gene pairs constructed in 1), using hold-out sets and cross-validation. The efficacy of the predictor will then be assessed against gold-standard CRISPRi-FlowFISH data. We will then carry out large-scale computation and construct databases of predicted enhancer-gene relationships for selected cell types. Overall, we will demonstrate the significant added power of integrating two important Common Fund data resources and will provide tools to facilitate understanding of the relationship between genome topology and gene expression. Our computational tools will lead to new insight into the relationship between genome structure and genome function, which is important for improving human health.
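The selection logic of step 1) can be illustrated with a minimal Python sketch. It is not the project's pipeline: the record types `Enhancer` and `Eqtl`, the function `candidate_pairs`, the 5 kb bin size, the gene name `GENE_A`, and all coordinates are hypothetical, chosen only to show the idea that an enhancer-gene pair is kept when the enhancer contains an eQTL for the gene and a predicted specific 3D interaction connects the enhancer to the gene.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Enhancer:
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


@dataclass(frozen=True)
class Eqtl:
    chrom: str
    pos: int
    gene: str  # gene whose expression the variant is associated with


def bin_index(pos, bin_size):
    """Map a genomic position to the index of its fixed-size bin."""
    return pos // bin_size


def candidate_pairs(enhancers, eqtls, gene_tss, interactions, bin_size=5000):
    """Return sorted (enhancer name, gene) pairs that satisfy both criteria.

    gene_tss maps gene -> (chrom, transcription start site position).
    interactions is a set of (chrom, frozenset({bin_a, bin_b})) entries,
    standing in for the predicted specific 3D interactions (hypothetical format).
    """
    pairs = set()
    for enh in enhancers:
        first = bin_index(enh.start, bin_size)
        last = bin_index(enh.end - 1, bin_size)
        enh_bins = range(first, last + 1)
        for q in eqtls:
            # Criterion 1: the eQTL lies inside the enhancer.
            if q.chrom != enh.chrom or not (enh.start <= q.pos < enh.end):
                continue
            tss = gene_tss.get(q.gene)
            if tss is None or tss[0] != enh.chrom:
                continue
            tss_bin = bin_index(tss[1], bin_size)
            # Criterion 2: a predicted interaction links enhancer and gene bins.
            if any((enh.chrom, frozenset((b, tss_bin))) in interactions
                   for b in enh_bins):
                pairs.add((enh.name, q.gene))
    return sorted(pairs)


# Toy inputs (all values invented for illustration).
enhancers = [Enhancer("enh1", "chr1", 12000, 13000),
             Enhancer("enh2", "chr1", 40000, 41000)]
eqtls = [Eqtl("chr1", 12500, "GENE_A"), Eqtl("chr1", 40500, "GENE_A")]
gene_tss = {"GENE_A": ("chr1", 101000)}
interactions = {("chr1", frozenset((2, 20)))}

EXAMPLE_PAIRS = candidate_pairs(enhancers, eqtls, gene_tss, interactions)
print(EXAMPLE_PAIRS)
```

In this toy input both enhancers contain an eQTL for the same gene, but only `enh1` falls in a bin with a predicted interaction to the gene's bin, so only that pair is retained.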
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Jie Liang
Constructing High-Resolution Ensemble Models of 3D Single-Cell Chromatin Conformations of eQTL Loci from Integrated Analysis of 4DN-GTEx Data towards Structural Basis of Differential Gene Expression
- Grant number: 10357063
- Fiscal year: 2021
- Funding amount: $298,200
- Project category:
Models and Algorithms for Beta-Barrel Membrane Proteins and Stochastic Networks
- Grant number: 9923024
- Fiscal year: 2018
- Funding amount: $298,200
- Project category:
Models and Algorithms for Beta-Barrel Membrane Proteins and Stochastic Networks
- Grant number: 10395949
- Fiscal year: 2018
- Funding amount: $298,200
- Project category:
Constructing Ensembles of 3D Structures of Igh Locus and Predicting Novel Chromosomal Interactions
- Grant number: 9317936
- Fiscal year: 2017
- Funding amount: $298,200
- Project category:
Computational Assembly of Beta Barrel Membrane Protein
- Grant number: 8546506
- Fiscal year: 2007
- Funding amount: $298,200
- Project category:
Computational Assembly of Beta Barrel Membrane Protein
- Grant number: 7586266
- Fiscal year: 2007
- Funding amount: $298,200
- Project category:
Computational Assembly of Beta Barrel Membrane Protein
- Grant number: 7213136
- Fiscal year: 2007
- Funding amount: $298,200
- Project category:
Computational Assembly of Beta Barrel Membrane Protein
- Grant number: 8918774
- Fiscal year: 2007
- Funding amount: $298,200
- Project category:
Computational Assembly of Beta Barrel Membrane Protein
- Grant number: 7356031
- Fiscal year: 2007
- Funding amount: $298,200
- Project category:
Computational Assembly of Beta Barrel Membrane Protein
- Grant number: 8034791
- Fiscal year: 2007
- Funding amount: $298,200
- Project category:
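Similar NSFC (National Natural Science Foundation of China) grants
Minimum-margin single-cell clustering for ATAC-seq motif identification with graph neural networks
- Grant number: 62302218
- Approval year: 2023
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund Project
Mining enhancers that regulate andrographolide biosynthesis in the Andrographis paniculata genome using an ATAC-seq strategy
- Grant number: 82260745
- Approval year: 2022
- Funding amount: CNY 330,000
- Project category: Regional Science Fund Project
Mining enhancers that regulate andrographolide biosynthesis in the Andrographis paniculata genome using an ATAC-seq strategy
- Grant number:
- Approval year: 2022
- Funding amount: CNY 330,000
- Project category: Regional Science Fund Project
Molecular mechanisms of C4 photosynthesis regulation studied with single-cell ATAC-seq
- Grant number: 32100438
- Approval year: 2021
- Funding amount: CNY 240,000
- Project category: Young Scientists Fund Project
Molecular mechanisms of C4 photosynthesis regulation studied with single-cell ATAC-seq
- Grant number:
- Approval year: 2021
- Funding amount: CNY 300,000
- Project category: Young Scientists Fund Project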
Similar overseas grants
Functional Landscape of Glycosylation in Skin Cancer
- Grant number: 10581094
- Fiscal year: 2023
- Funding amount: $298,200
- Project category:
Characterization of Epstein-Barr Virus Subversion of the Host SMC5/6 Restriction Pathway
- Grant number: 10679118
- Fiscal year: 2023
- Funding amount: $298,200
- Project category:
Project 2: Impact of H1/H2 haplotypes on cellular disease-associated phenotypes driven by FTD-causing MAPT mutations
- Grant number: 10834336
- Fiscal year: 2023
- Funding amount: $298,200
- Project category: