A Comprehensive Genomic Community Resource of Transcriptional Regulation

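Basic Information

  • Grant number:
    10842047
  • Principal investigator:
  • Amount:
    $202,800
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Project period:
    2022-06-01 to 2027-03-31
  • Project status:
    Ongoing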

Project Abstract

The Human Genome Project (HGP) completed the first draft human genome sequence two decades ago. The HGP revealed that human complexity arises from only approximately 20,000 coding genes, roughly the same number as much simpler organisms such as nematodes. Intricate patterns of transcriptional regulation mediated by non-coding regulatory elements specify the myriad cell types and states required for human complexity. Genome-wide association studies have subsequently identified thousands of disease-associated variants, many of which interrupt the function of these non-coding elements to disrupt transcriptional regulation. Thus, in order to better understand human physiology and pathophysiology, comprehensive atlases of regulatory elements are essential. Many previous efforts, including the International Human Epigenome Consortium (IHEC), the FANTOM Consortium, the Roadmap Epigenomics Project, and the ENCODE Project, have aimed to build comprehensive collections of regulatory elements, as well as computational models to better predict regulatory activity and understand the sequence features underlying regulatory function. ENCODE (2003-2022) is a large-scale consortium effort that aims to annotate every functional non-coding element of the human genome; during our work on the project, we built a Registry of approximately 1 million human candidate cis-regulatory elements (cCREs). We further developed deep-learning approaches that model the transcription factor motif syntax underlying element function at base-pair resolution, and built two web-based resources, SCREEN and Factorbook, to make our results accessible to the scientific community.
Here, we propose to extend this framework to build the Community Resource for Transcriptional Regulation (CRTR), a comprehensive atlas of non-coding regulatory elements and machine-learning models which will encompass community and consortium deep-sequencing data, both bulk and single cell, across a broad array of cell types and states. Our project has five aims. First, we aim to curate community and consortium data for inclusion in CRTR and perform uniform processing and quality control. Second, we aim to train deep-learning sequence models on bulk epigenetic datasets to identify transcription factor motif syntax driving regulatory element activity in distinct tissues and cell types. Third, we aim to train sequence models on single cell datasets to identify transcription factor motif syntax driving transcriptional regulation in high-resolution cell states and during cell state transitions. Fourth, we aim to use the aforementioned results to build comprehensive benchmark datasets and machine-learning model collections, which will aid future analysts in designing new models to predict regulatory readouts. Fifth, we aim to build a state-of-the-art web-based user interface to enable users to perform integrative analyses and in silico experimentation with CRTR, and hold workshops and other outreach to maximize the impact of the resource and its accessibility to the broader scientific community.

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other Grants by Anshul Kundaje

Multi-Omics DACC: The Data Analysis and Coordination Center for the collaborative multi-omics for health and disease initiative
  • Grant number:
    10744561
  • Fiscal year:
    2023
  • Funding amount:
    $202,800
  • Project category:
A Comprehensive Genomic Community Resource of Transcriptional Regulation
  • Grant number:
    10411262
  • Fiscal year:
    2022
  • Funding amount:
    $202,800
  • Project category:
A Comprehensive Genomic Community Resource of Transcriptional Regulation
  • Grant number:
    10625529
  • Fiscal year:
    2022
  • Funding amount:
    $202,800
  • Project category:
Identifying causal genetic variants and molecular mechanisms impacting mental health
  • Grant number:
    10571911
  • Fiscal year:
    2021
  • Funding amount:
    $202,800
  • Project category:
Identifying causal genetic variants and molecular mechanisms impacting mental health
  • Grant number:
    10380573
  • Fiscal year:
    2021
  • Funding amount:
    $202,800
  • Project category:
Predicting context-specific molecular and phenotypic effects of genetic variation through the lens of the cis-regulatory code
  • Grant number:
    10659170
  • Fiscal year:
    2021
  • Funding amount:
    $202,800
  • Project category:
Predicting context-specific molecular and phenotypic effects of genetic variation through the lens of the cis-regulatory code
  • Grant number:
    10297562
  • Fiscal year:
    2021
  • Funding amount:
    $202,800
  • Project category:
Predicting context-specific molecular and phenotypic effects of genetic variation through the lens of the cis-regulatory code
  • Grant number:
    10474459
  • Fiscal year:
    2021
  • Funding amount:
    $202,800
  • Project category:
Multi-omic functional assessment of novel AD variants using high-throughput and single-cell technologies
  • Grant number:
    10684210
  • Fiscal year:
    2021
  • Funding amount:
    $202,800
  • Project category:
Multi-omic functional assessment of novel AD variants using high-throughput and single-cell technologies
  • Grant number:
    10217784
  • Fiscal year:
    2021
  • Funding amount:
    $202,800
  • Project category:

Similar Overseas Grants

DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
  • Grant number:
    EP/Y029089/1
  • Fiscal year:
    2024
  • Funding amount:
    $202,800
  • Project category:
    Research Grant
CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Grant number:
    2337776
  • Fiscal year:
    2024
  • Funding amount:
    $202,800
  • Project category:
    Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
  • Grant number:
    2338816
  • Fiscal year:
    2024
  • Funding amount:
    $202,800
  • Project category:
    Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
  • Grant number:
    2338846
  • Fiscal year:
    2024
  • Funding amount:
    $202,800
  • Project category:
    Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
  • Grant number:
    2348261
  • Fiscal year:
    2024
  • Funding amount:
    $202,800
  • Project category:
    Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
  • Grant number:
    2348346
  • Fiscal year:
    2024
  • Funding amount:
    $202,800
  • Project category:
    Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
  • Grant number:
    2348457
  • Fiscal year:
    2024
  • Funding amount:
    $202,800
  • Project category:
    Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
  • Grant number:
    2404989
  • Fiscal year:
    2024
  • Funding amount:
    $202,800
  • Project category:
    Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
  • Grant number:
    2339310
  • Fiscal year:
    2024
  • Funding amount:
    $202,800
  • Project category:
    Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
  • Grant number:
    2339669
  • Fiscal year:
    2024
  • Funding amount:
    $202,800
  • Project category:
    Continuing Grant