New methods for quantitative modeling of protein-DNA interactions
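Basic information
- Grant number: 9546780
- Principal investigator:
- Amount: $348,800
- Host institution:
- Host institution country: United States
- Project type:
- Fiscal year: 2015
- Funding country: United States
- Project period: 2015-09-25 to 2020-08-31
- Project status: Completed
- Source: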
- Keywords: Address; Affect; Affinity; Base Pairing; Binding; Binding Proteins; Binding Sites; Biological Assay; Cell physiology; Cells; ChIP-seq; Chromatin; Complex; Computer Simulation; DNA; DNA Binding; DNA Binding Domain; DNA Structure; DNA-Protein Interaction; Data; Data Quality; Dependence; Development; Disease; Enhancers; Environment; Event; Experimental Designs; Gene Expression; Gene Expression Regulation; Genetic; Genetic Polymorphism; Genetic Transcription; Genetic Variation; Genome; Genomic Segment; Genomics; Goals; Human; Human Genetics; Human Genome; In Vitro; Investigation; Lead; Learning; Link; Measurement; Mechanics; Methods; Modeling; Morphologic artifacts; Mutation; Noise; Nucleotides; Pattern; Phenotype; Positioning Attribute; Proteins; Regulator Genes; Role; Single Nucleotide Polymorphism; Site; Software Tools; Specificity; Statistical Methods; Statistical Models; Testing; Time; Training; Untranslated RNA; Variant; Weight; Work; base; cofactor; cost effective; design; falls; flexibility; genetic variant; human disease; improved; in vitro Assay; in vitro Model; in vivo; novel; prevent; programs; trait; transcription factor; user friendly software; vector
Project abstract
ABSTRACT
Accurate predictions of transcription factor (TF)-DNA interactions across the human genome are critical for
deciphering transcriptional regulatory networks in healthy and diseased cells, as well as for understanding the
phenotypic effects of polymorphisms in non-coding genomic regions. However, the most widely used model of
TF-DNA binding affinity, the position weight matrix (PWM), is known to provide only an approximation of the
true sequence specificity of TFs, because it assumes independence among the base pairs in TF binding sites.
More complex binding models have been proposed, but their improvement over PWMs was marginal, either
because of limitations of the training data (i.e. due to strong biases, noise, artifacts, or confounding factors) or
because the models were not flexible enough to capture complex dependencies in TF binding sites. As a
result, current DNA binding models have a limited ability to predict the effects of non-coding genetic variation
on TF binding, and they cannot be used to resolve functional differences between closely related TFs with
similar DNA binding domains but distinct regulatory roles in the cell. The objective of this application is to
overcome these limitations by generating high quality data that will be used to train flexible statistical models to
generate TF-DNA binding affinity predictions with accuracies similar to experimental in vitro assays. The
central hypothesis, based on preliminary results and previous work, is that both better affinity data and better
statistical models are needed in order to predict TF-DNA interactions in human cells with significantly higher
accuracy than current models. High quality binding affinity data for 40 human TFs will be generated in Aim 1
using a unique combination of in vitro assays carefully designed to minimize bias and noise, thus making the
data ideal for training complex models. Novel TF-DNA binding models will be developed in Aim 2 using state-
of-the-art statistical methods: support vector regression, nonparametric Bayes modeling, and conditional tensor
factorization. The models will be tested experimentally in vitro, and by leveraging in vivo data from the
ENCODE project. In Aim 3, the new binding models will be used in two applications: 1) to predict the
quantitative effects of non-coding single nucleotide polymorphisms on TF binding affinities and TF binding
levels, and 2) to predict differential in vivo DNA binding of closely related TFs with similar DNA binding
domains but distinct regulatory functions in the cell. Such applications are not possible using current models.
Overall, we anticipate that the binding affinity models developed in this project will allow for much more
accurate predictions of regulatory TF-DNA interactions than possible using current models, which is significant
because it will lead to a better understanding of gene regulatory programs and their misregulation during
disease, including understanding the cascade of events that link genetic variation to human disease.
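
The independence assumption that the abstract attributes to the PWM can be made concrete with a minimal sketch. Everything below is illustrative: the count matrix, the pseudocount, the uniform background, and the function names are hypothetical and are not taken from the project. The sketch builds a log-odds PWM, scores a site as a sum of per-position terms, and estimates the effect of a single-nucleotide change as a score difference.

```python
import math

# Hypothetical 4-position count matrix (one dict per position); illustrative only.
COUNTS = [
    {"A": 8, "C": 1, "G": 1, "T": 0},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 8, "T": 1},
    {"A": 0, "C": 1, "G": 1, "T": 8},
]


def build_pwm(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log2-odds scores.

    The pseudocount and uniform background are assumptions of this sketch.
    """
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score_site(pwm, site):
    """Score a site as the sum of independent per-position contributions."""
    if len(site) != len(pwm):
        raise ValueError("site length must match PWM length")
    return sum(column[base] for column, base in zip(pwm, site))


def snp_effect(pwm, ref_site, alt_site):
    """Predicted change in PWM score when ref_site is replaced by alt_site."""
    return score_site(pwm, alt_site) - score_site(pwm, ref_site)


if __name__ == "__main__":
    pwm = build_pwm(COUNTS)
    print("consensus score:", round(score_site(pwm, "ACGT"), 3))
    # The same C>T change at position 2 in two different flanking contexts.
    print("effect in context 1:", round(snp_effect(pwm, "ACGT", "ATGT"), 3))
    print("effect in context 2:", round(snp_effect(pwm, "GCGA", "GTGA"), 3))
```

Because the score is a plain sum over positions, the predicted effect of a given substitution is identical in every flanking context. Dependencies among base pairs, which the more flexible models proposed in Aim 2 are meant to capture, cannot be represented in this form.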
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Raluca Gordan
The role of transcription factor proteins in mutagenesis at regulatory sites
- Grant number: 10552569
- Fiscal year: 2020
- Amount: $348,800
- Project type:
The role of transcription factor proteins in mutagenesis at regulatory sites
- Grant number: 10092203
- Fiscal year: 2020
- Amount: $348,800
- Project type:
The role of transcription factor proteins in mutagenesis at regulatory sites
- Grant number: 10333272
- Fiscal year: 2020
- Amount: $348,800
- Project type:
New methods for quantitative modeling of protein-DNA interactions
- Grant number: 9150688
- Fiscal year: 2015
- Amount: $348,800
- Project type:
Similar overseas grants
How Does Particle Material Properties Insoluble and Partially Soluble Affect Sensory Perception Of Fat based Products
- Grant number: BB/Z514391/1
- Fiscal year: 2024
- Amount: $348,800
- Project type: Training Grant
BRC-BIO: Establishing Astrangia poculata as a study system to understand how multi-partner symbiotic interactions affect pathogen response in cnidarians
- Grant number: 2312555
- Fiscal year: 2024
- Amount: $348,800
- Project type: Standard Grant
RII Track-4:NSF: From the Ground Up to the Air Above Coastal Dunes: How Groundwater and Evaporation Affect the Mechanism of Wind Erosion
- Grant number: 2327346
- Fiscal year: 2024
- Amount: $348,800
- Project type: Standard Grant
Graduating in Austerity: Do Welfare Cuts Affect the Career Path of University Students?
- Grant number: ES/Z502595/1
- Fiscal year: 2024
- Amount: $348,800
- Project type: Fellowship
Building Affect-X, an index of individual differences in affective sensibility, and establishing a foundation for bespoke AI services
- Grant number: 23K24936
- Fiscal year: 2024
- Amount: $348,800
- Project type: Grant-in-Aid for Scientific Research (B)
Insecure lives and the policy disconnect: How multiple insecurities affect Levelling Up and what joined-up policy can do to help
- Grant number: ES/Z000149/1
- Fiscal year: 2024
- Amount: $348,800
- Project type: Research Grant
How does metal binding affect the function of proteins targeted by a devastating pathogen of cereal crops?
- Grant number: 2901648
- Fiscal year: 2024
- Amount: $348,800
- Project type: Studentship
Investigating how double-negative T cells affect anti-leukemic and GvHD-inducing activities of conventional T cells
- Grant number: 488039
- Fiscal year: 2023
- Amount: $348,800
- Project type: Operating Grants
New Tendencies of French Film Theory: Representation, Body, Affect
- Grant number: 23K00129
- Fiscal year: 2023
- Amount: $348,800
- Project type: Grant-in-Aid for Scientific Research (C)
The Protruding Void: Mystical Affect in Samuel Beckett's Prose
- Grant number: 2883985
- Fiscal year: 2023
- Amount: $348,800
- Project type: Studentship