CAREER: Predicting transcription factor binding dynamics across cell types and species
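Basic information
- Award number: 2045500
- Principal investigator:
- Amount: $839,600
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-06-01 to 2026-05-31
- Project status: Ongoing
- Source:
- Keywords:
Project abstract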
The health and function of any given cell type depend on various sets of proteins interacting with one another and the genome’s DNA to regulate gene activities. Experimental approaches can profile where specific proteins attach to the genome, providing insight into regulatory relationships between proteins and genes in a given cell type. Such experiments are expensive and laborious, however, and provide limited insights into gene regulatory activities in the large portion of the genome that is composed of repetitive DNA. This project will develop new machine learning software methods that will predict where regulatory proteins bind to the genome in currently problematic settings. Specifically, these investigators will train neural network methods to recognize features of gene regulatory sites from existing experimental data and transfer that knowledge to predict gene regulatory sites in the genomes of other species, repetitive DNA areas, and in other cell types. Our new software methods will therefore unlock new layers of insight into gene regulation in healthy and diseased cells. All software produced by this project will be made freely available and accessible to the general research community. This project directly supports computationally intensive training and research opportunities in machine learning for graduate and undergraduate students who are working at the interface of computer science and biology. Strong efforts will be made to recruit students from under-represented groups. The education goals of this project will support the development of broader education initiatives in bioinformatics and genomics. This project will develop discovery-oriented bioinformatics research modules for use in teaching genetics and developmental biology concepts in high-school science classes. These research modules will be implemented in collaboration with Pennsylvanian high-school science teachers and students and will offer a new way to engage students in inquiry-based science. The PI will also develop curriculum proposals for a new degree program in bioinformatics at Penn State University.
This project will develop neural network-based transfer learning approaches that predict transcription factor (TF) binding sites across three domains where TF binding activities are difficult to assay. Aim 1 will focus on predicting TF binding sites across species. Neural networks will be trained on observed TF binding data from one species, and used to predict where the same TF binds in the same cell type in other species. A new domain adaptation strategy will be developed that addresses systematic biases resulting from shifts in the genomic makeup of different species. Transferring TF binding information across species will enable the study of regulatory evolution and innovation in many species without the need for expensive TF ChIP-seq experiments. Aim 2 will apply related domain adaptation approaches to predict TF binding sites within transposable elements and other repetitive regions. In this application, neural networks will be trained on observed TF binding data from uniquely mappable portions of the genome and will be applied to impute binding sites from partial signals in low-mappability regions. Predicting TF binding in low-mappability regions will provide a new way to study the regulatory contributions of transposable elements and other currently ignored parts of the genome. Finally, Aim 3 will predict where a TF would bind if it were expressed in a new chromatin environment. This last application differs from approaches that aim to impute unobserved TF binding signals from concurrent chromatin features; the goal is rather to use information from a preexisting chromatin environment to predict the future binding patterns of an induced TF. Developing the first principled approach for predicting where a TF would bind in new chromatin environments will be the first step towards predicting which regulatory perturbations can be used to transform cellular phenotypes. Predictions from all three aims will be tested in ongoing collaborations focused on understanding TF-driven cell identity specification in hematopoiesis and neuronal differentiation, thus providing new insights into how TFs select their regulatory targets during development. The results of the project will be available from http://mahonylab.org.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
The ENCODE Imputation Challenge: a critical assessment of methods for cross-cell type imputation of epigenomic profiles.
- DOI: 10.1186/s13059-023-02915-y
- Published: 2023-04-18
- Journal:
- Impact factor: 12.3
- Authors:
- Corresponding author:
Other publications by Shaun Mahony
Transcription factor binding site identification using the Self-Organizing Map
- DOI:
- Published: 2017
- Journal:
- Impact factor: 0
- Authors: Shaun Mahony; D. Hendrix; A. Golden; Terry J. Smith; D. Rokhsar
- Corresponding author: D. Rokhsar
Intragenomic conflict underlies extreme phenotypic plasticity in queen-worker caste determination in honey bees (Apis mellifera)
- DOI:
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: S. Bresnahan; Shaun Mahony; Kate Anton; B. Harpur; C. M. Grozinger
- Corresponding author: C. M. Grozinger
New Life for Savannah River Site
- DOI:
- Published: 1996
- Journal:
- Impact factor: 10.4
- Authors: Shaun Mahony; P. Benos
- Corresponding author: P. Benos
Intragenomic conflict associated with extreme phenotypic plasticity in queen-worker caste determination in honey bees (Apis mellifera)
- DOI: 10.1186/s13059-025-03628-0
- Published: 2025-06-18
- Journal:
- Impact factor: 9.400
- Authors: Sean T. Bresnahan; Shaun Mahony; Kate Anton; Brock Harpur; Christina M. Grozinger
- Corresponding author: Christina M. Grozinger
Systematic integration of GATA transcription factors and epigenomes via IDEAS paints the regulatory landscape of mouse hematopoietic cells
- DOI: 10.1101/730358
- Published: 2019
- Journal:
- Impact factor: 0
- Authors: R. Hardison; Yu Zhang; C. Keller; Guanjue Xiang; Elisabeth F. Heuston; Lin An; J. Lichtenberg; B. Giardine; D. Bodine; Shaun Mahony; Qunhua Li; Feng Yue; M. Weiss; G. Blobel; James Taylor; J. Hughes; D. Higgs; Berthold Gottgens
- Corresponding author: Berthold Gottgens
Other grants held by Shaun Mahony
ABI INNOVATION: Characterizing protein-DNA interactions from high-resolution assays
- Award number: 1564466
- Fiscal year: 2016
- Funding amount: $839,600
- Project type: Standard Grant
Similar overseas grants
Integrative analysis of the stochasticity of single-cell omics data for predicting pioneerness of transcription factors
- Award number: 23K14165
- Fiscal year: 2023
- Funding amount: $839,600
- Project type: Grant-in-Aid for Early-Career Scientists
Predicting Cardiovascular Outcomes Using Diabetes-Induced Transcriptomic Networks
- Award number: 10679593
- Fiscal year: 2023
- Funding amount: $839,600
- Project type:
Dissecting and Predicting Lethal Prostate Cancer using Biologically Informed Artificial Intelligence
- Award number: 10628274
- Fiscal year: 2023
- Funding amount: $839,600
- Project type:
Evaluating prostate cancer phenotype and genotype classification from circulating tumor DNA as biomarkers for predicting treatment outcomes
- Award number: 10804464
- Fiscal year: 2023
- Funding amount: $839,600
- Project type:
Identifying diagnostic biomarkers for Delirium and predicting cognitive Outcomes in hospitalized older adults using automated Speech Analysis (IDOSA)
- Award number: 10806491
- Fiscal year: 2023
- Funding amount: $839,600
- Project type:
Predicting Tissue Specific Gli3 Regulatory Activity Using Hand2
- Award number: 10647737
- Fiscal year: 2022
- Funding amount: $839,600
- Project type:
A systems immunology approach for predicting poor responses to Hepatitis B vaccination
- Award number: 10365479
- Fiscal year: 2021
- Funding amount: $839,600
- Project type:
Predicting context-specific molecular and phenotypic effects of genetic variation through the lens of the cis-regulatory code
- Award number: 10659170
- Fiscal year: 2021
- Funding amount: $839,600
- Project type:
Predicting the Impact of Genomic Variation on Cellular States
- Award number: 10294338
- Fiscal year: 2021
- Funding amount: $839,600
- Project type:
Predicting the impact of genetic variants, genes and pathways on human Disease
- Award number: 10296867
- Fiscal year: 2021
- Funding amount: $839,600
- Project type: