Systematic Identification of Core Regulatory Circuitry from ENCODE Data
Basic information
- Grant number: 10238262
- Principal investigator:
- Amount: $572,300
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2017
- Funding country: United States
- Project period: 2017-02-01 to 2023-01-31
- Project status: Completed
- Source:
- Keywords: Base Pairing; Base Sequence; Binding Sites; Biological Assay; Cell Differentiation process; Cells; Chromatin; Collaborations; Communities; Coupled; DNA Sequence; Data; Data Set; Disease; Encapsulated; Enhancers; Genetic; Genomics; Human; Luciferases; Machine Learning; Maps; Methods; Modeling; Mouse Cell Line; Mus; Nucleic Acid Regulatory Sequences; Regulatory Element; Reporter; Resolution; Testing; Tissues; Training; Untranslated RNA; Validation; Variant; Vocabulary; Weight; blind; cell type; computerized tools; design; epigenomics; experimental study; programs
Project abstract
While much progress has been made generating high quality chromatin state and accessibility data from the
ENCODE and Roadmap consortia, accurately identifying cell-type specific enhancers from these data remains a
significant challenge. We have recently developed a computational approach (gkmSVM) to predict regulatory
elements from DNA sequence, and we have shown that when gkmSVM is trained on DHS data from each
of the human and mouse ENCODE and Roadmap cells and tissues, it can predict both cell specific enhancer
activity and the impact of regulatory variants (deltaSVM) with greater precision than alternative approaches.
The gkmSVM model encapsulates a set of cell-type specific weights describing the regulatory binding
site vocabulary controlling chromatin accessibility in each cell type. A striking observation is that the significant
gkmSVM weights are generally identifiable with a small (~20) set of TF binding sites which vary by cell-type,
consistent with the hypothesis that cell-type specific expression programs are controlled by a small set
of core factors tightly coupled in mutually interacting regulatory circuits. Perturbations of these core regulators
enable transitions between stable differentiated cell-type states of this genetic circuit. Here, we will use
gkmSVM to systematically identify the core regulatory circuitry in all existing ENCODE and Roadmap human
and mouse cell lines and tissues, and produce DNA sequence based genomic regulatory maps and fine-scale
predictions of core regulator binding sites within predicted regulatory regions. We will generate binding
site models for core regulators in each cell type and assess the accuracy of our predictions through direct
experimental validation. The value of this map critically depends on its accuracy, so we demonstrate that
gkmSVM predictions consistently outperform alternative methods in massively parallel enhancer reporter and
luciferase validation assays, in blind community assessments of regulatory element predictions (CAGI), and in
predicting validated causal disease associated variants. In contrast, we show that methods using PWM
descriptions of TF binding sites are significantly less accurate. We will produce base-pair resolution predictions
of the cell specific TF binding sites (TFBS) within broader regulatory regions detected by multiple ENCODE
epigenomic mapping datasets, and test these TFBS predictions in collaboration with Functional
Characterization Centers (FCC). Our regulatory maps will help design and inform focused experiments
probing regulatory mechanisms, and aid in the interpretation of disease associated non-coding variants.
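
The abstract describes a sequence model as a set of weights over a binding-site vocabulary, and variant impact (deltaSVM) as a change in the model's score. The toy sketch below illustrates that idea only; it is not the gkmSVM implementation. It assumes an explicit, hypothetical `weights` dictionary over gapped k-mers (the real method trains an SVM with a gapped k-mer kernel), uses tiny values of `l` and `k` for readability, and ignores reverse complements. All function names are illustrative.

```python
from collections import Counter
from itertools import combinations


def gapped_kmer_counts(seq, l=4, k=3):
    """Count gapped k-mers in seq.

    Each window of length l contributes one feature per choice of k
    informative positions; the other l - k positions are masked with '-'.
    """
    counts = Counter()
    for start in range(len(seq) - l + 1):
        window = seq[start:start + l]
        for keep in combinations(range(l), k):
            masked = "".join(
                window[i] if i in keep else "-" for i in range(l)
            )
            counts[masked] += 1
    return counts


def score(seq, weights, l=4, k=3):
    """Linear score: sum of (feature weight x feature count).

    Features missing from the hypothetical weights table contribute 0.
    """
    counts = gapped_kmer_counts(seq, l, k)
    return sum(weights.get(feat, 0.0) * n for feat, n in counts.items())


def delta_score(ref_seq, alt_seq, weights, l=4, k=3):
    """Variant effect as the score change from reference to alternate allele."""
    return score(alt_seq, weights, l, k) - score(ref_seq, weights, l, k)


# Hypothetical weights: two gapped k-mers that favour the reference allele.
toy_weights = {"ACG-": 1.0, "A-GT": 0.5}
ref, alt = "ACGT", "ACCT"  # single-base change G -> C at position 3
effect = delta_score(ref, alt, toy_weights)
```

With these toy weights the alternate allele loses both weighted features, so `effect` is negative, which in this framing would correspond to a variant predicted to reduce regulatory activity.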
Project outcomes
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Loop competition and extrusion model predicts CTCF interaction specificity.
- DOI: 10.1038/s41467-021-21368-0
- Publication date: 2021-02-16
- Journal:
- Impact factor: 16.6
- Authors: Xi W; Beer MA
- Corresponding author: Beer MA
Embryonic loss of human females with partial trisomy 19 identifies region critical for the single active X.
- DOI: 10.1371/journal.pone.0170403
- Publication date: 2017
- Journal:
- Impact factor: 3.7
- Authors: Migeon BR; Beer MA; Bjornsson HT
- Corresponding author: Bjornsson HT
Other publications by Michael A Beer
Machine Learning Sequence Modeling Identifies Gene Regulatory Responses to Bone Marrow Stromal Interactions in Multiple Myeloma
- DOI: 10.1182/blood-2023-186981
- Publication date: 2023-11-02
- Journal:
- Impact factor:
- Authors: Milad Razavi-Mohseni; Dustin Shigaki; Michael A Beer
- Corresponding author: Michael A Beer
Other grants held by Michael A Beer
Sequence-based Machine Learning for Inference of Dynamic Cell State Gene Network Models
- Grant number: 10665735
- Fiscal year: 2022
- Funding amount: $572,300
- Project category:
Genomic control of gene regulatory networks governing early human lineage decisions
- Grant number: 10297375
- Fiscal year: 2021
- Funding amount: $572,300
- Project category:
Genomic control of gene regulatory networks governing early human lineage decisions
- Grant number: 10833813
- Fiscal year: 2021
- Funding amount: $572,300
- Project category:
Genomic control of gene regulatory networks governing early human lineage decisions
- Grant number: 10471939
- Fiscal year: 2021
- Funding amount: $572,300
- Project category:
Genomic control of gene regulatory networks governing early human lineage decisions
- Grant number: 10840531
- Fiscal year: 2021
- Funding amount: $572,300
- Project category:
Genomic control of gene regulatory networks governing early human lineage decisions
- Grant number: 10630157
- Fiscal year: 2021
- Funding amount: $572,300
- Project category:
SVM-based Analysis of the Fine Scale Structure of Regulatory Elements
- Grant number: 9097757
- Fiscal year: 2013
- Funding amount: $572,300
- Project category:
SVM-based Analysis of the Fine Scale Structure of Regulatory Elements
- Grant number: 9304811
- Fiscal year: 2013
- Funding amount: $572,300
- Project category:
SVM-based Analysis of the Fine Scale Structure of Regulatory Elements
- Grant number: 8889287
- Fiscal year: 2013
- Funding amount: $572,300
- Project category:
SVM-based Analysis of the Fine Scale Structure of Regulatory Elements
- Grant number: 8556758
- Fiscal year: 2013
- Funding amount: $572,300
- Project category:
Similar overseas grants
Quantum chemical challenge to elucidate the functional mechanism of base sequence specificity deciding removal of the DNA damage
- Grant number: 19K22903
- Fiscal year: 2019
- Funding amount: $572,300
- Project category: Grant-in-Aid for Challenging Research (Exploratory)
Theoretical Study on Relation of Base sequence and Electronic Structures toward Elucidation of Mechanism of DNA Electric Conductivity.
- Grant number: 16K05666
- Fiscal year: 2016
- Funding amount: $572,300
- Project category: Grant-in-Aid for Scientific Research (C)
Prediction and control of base sequence recognition ability for nucleic acid binding proteins by using computer experiments.
- Grant number: 14598001
- Fiscal year: 2002
- Funding amount: $572,300
- Project category: Grant-in-Aid for Scientific Research (C)
FLANKING BASE SEQUENCE ON MUTAGENICITY OF 8 OXOGUANINE
- Grant number: 6362773
- Fiscal year: 2001
- Funding amount: $572,300
- Project category:
FLANKING BASE SEQUENCE ON MUTAGENICITY OF 8 OXOGUANINE
- Grant number: 6137753
- Fiscal year: 2000
- Funding amount: $572,300
- Project category:
GROWTH HOROMON LOCALIZATION AND ITS BASE SEQUENCE IN BOVINE PANCREATIC
- Grant number: 10460134
- Fiscal year: 1998
- Funding amount: $572,300
- Project category: Grant-in-Aid for Scientific Research (B)
DNA BASE SEQUENCE EFFECTS IN CHEMICAL CARCINOGENESIS
- Grant number: 2488608
- Fiscal year: 1997
- Funding amount: $572,300
- Project category:
DNA BASE SEQUENCE EFFECTS IN CHEMICAL CARCINOGENESIS
- Grant number: 6475917
- Fiscal year: 1997
- Funding amount: $572,300
- Project category:
DNA BASE SEQUENCE EFFECTS IN CHEMICAL CARCINOGENESIS
- Grant number: 6329024
- Fiscal year: 1997
- Funding amount: $572,300
- Project category:
DNA BASE SEQUENCE EFFECTS IN CHEMICAL CARCINOGENESIS
- Grant number: 6124462
- Fiscal year: 1997
- Funding amount: $572,300
- Project category: