Reliable post hoc interpretations of deep learning in genomics
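Basic information
- Grant number: 10638753
- Principal investigator:
- Amount: $384,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Period: 2023-08-01 to 2027-04-30
- Project status: Ongoing
- Source: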
- Keywords: ATAC-seq; Acceleration; Address; Affect; Benchmarking; Binding; Biological; Biological Assay; Biological Sciences; Biology; Chromatin; Communities; Complex; Computer software; Computing Methodologies; DNA; DNA Sequence; Data; Development; Disease; Ensure; Future; Genetic Transcription; Genomics; Goals; Individual; Knowledge; Learning; Machine Learning; Maps; Methods; Modeling; Noise; Nucleotides; Performance; Population; Positioning Attribute; Recurrence; Regulation; Regulatory Element; Resolution; Series; Signal Transduction; Single Nucleotide Polymorphism; Specific qualifier value; Structure; TensorFlow; Time; Training; Transcriptional Regulation; Translating; Untranslated RNA; base; bioinformatics tool; biological systems; cell type; computerized tools; deep learning; deep neural network; direct application; functional genomics; genomic data; geometric structure; human disease; improved; innovation; insight; machine learning method; new technology; open source; performance tests; predictive modeling; syntax; technology research and development; tool; transcription factor; user-friendly
Project summary
Understanding how transcription factors coordinately bind to non-coding DNA provides mechanistic
insights into transcriptional regulation. Recent developments in deep neural networks (DNNs) have
revolutionized our ability to study regulatory genomics. While they have demonstrated improved predictions
compared to previous methods based on traditional computational genomics, their low interpretability has earned
them a reputation as a black box. To address this gap, post hoc model interpretability methods have emerged
to interrogate important features that the network has learned. Of these, attribution maps have demonstrated
promise, providing importance scores for each nucleotide in a given sequence; these have a natural
interpretation as single-nucleotide variant effects. In principle, attribution maps should contain information to
identify motifs that are important for cell-type-specific regulatory functions and annotate their positions at
base resolution. However, attribution maps are often noisy in practice; in addition to motifs, they contain spurious
importance scores for arbitrary nucleotides, for reasons that are not well understood. Despite this promise,
interpreting a DNN through attribution maps therefore remains challenging. Here we propose three complementary aims
that serve to maximize the biological insights that we can achieve from attribution maps for genomic DNNs. In
Aim 1, we will develop a model selection framework to identify the optimal DNN from a set of candidate DNNs
that yields high generalization performance and interpretable attribution maps. In Aim 2, we will develop robust
training strategies based on regularization and data augmentations tailored for genomics, with the broader aim
of ensuring that DNNs yield high-quality attribution maps and high generalization. In Aim 3, we will develop and
employ interpretable computational methods to directly analyze attribution maps to facilitate discovery of
functional motifs and annotate their positions. Each aim will be implemented as open-source software in
TensorFlow and PyTorch. As the number of deep learning applications in genomics is rising quickly, the
biomedical community will greatly benefit from these user-friendly computational tools, which will enable the
deployment of robust training and interpretability analysis for any DNN trained on functional genomics assays.
This, in turn, will drive new discoveries in cis-regulatory biology across the many biological systems to which deep
learning has already been applied and in the new applications that will continue to emerge.
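To make the idea of an attribution map concrete, the following is a minimal sketch of one common way to obtain per-nucleotide importance scores: in silico mutagenesis, where every single-nucleotide variant of a sequence is scored against the reference. The summary does not name a specific attribution method, so this choice is an assumption. `toy_model`, `MOTIF`, and the example sequence are hypothetical stand-ins for a trained DNN and real data, not part of the project.

```python
# Toy illustration of an attribution map built by in silico mutagenesis.
# Everything here is hypothetical: toy_model stands in for a trained DNN,
# and MOTIF is an arbitrary motif chosen for the example.
NUCLEOTIDES = "ACGT"
MOTIF = "TGACTCA"


def toy_model(seq: str) -> float:
    """Score a sequence by its best match count to MOTIF over all windows."""
    best = 0
    for start in range(len(seq) - len(MOTIF) + 1):
        window = seq[start:start + len(MOTIF)]
        matches = sum(a == b for a, b in zip(window, MOTIF))
        best = max(best, matches)
    return float(best)


def in_silico_mutagenesis(seq, model):
    """Return, per position, the effect of each nucleotide substitution.

    Each entry maps a nucleotide to model(variant) - model(reference),
    i.e. a predicted single-nucleotide variant effect.
    """
    ref_score = model(seq)
    effects = []
    for pos in range(len(seq)):
        row = {}
        for nt in NUCLEOTIDES:
            variant = seq[:pos] + nt + seq[pos + 1:]
            row[nt] = model(variant) - ref_score
        effects.append(row)
    return effects


def importance_per_position(effects):
    """Summarize each position by its largest absolute variant effect."""
    return [max(abs(v) for v in row.values()) for row in effects]


sequence = "CCGGTGACTCAGGCC"  # motif embedded at positions 4-10 (0-based)
effects = in_silico_mutagenesis(sequence, toy_model)
importance = importance_per_position(effects)

for pos, (nt, score) in enumerate(zip(sequence, importance)):
    print(pos, nt, score)
```

In this toy case only the seven motif positions receive nonzero importance. Attribution maps from real DNNs are typically not this clean, which is the noise problem the aims above set out to address.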
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Peter K Koo
Other grants by Peter K Koo
Interpretable Computational Models of Functional Genomics Data
- Grant number: 10698090
- Fiscal year: 2022
- Funding amount: $384,000
- Project category:
Interpretable Computational Models of Functional Genomics Data
- Grant number: 10453055
- Fiscal year: 2022
- Funding amount: $384,000
- Project category:
Similar overseas grants
EXCESS: The role of excess topography and peak ground acceleration on earthquake-preconditioning of landslides
- Grant number: NE/Y000080/1
- Fiscal year: 2024
- Funding amount: $384,000
- Project category: Research Grant
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Grant number: 2328975
- Fiscal year: 2024
- Funding amount: $384,000
- Project category: Continuing Grant
SHINE: Origin and Evolution of Compressible Fluctuations in the Solar Wind and Their Role in Solar Wind Heating and Acceleration
- Grant number: 2400967
- Fiscal year: 2024
- Funding amount: $384,000
- Project category: Standard Grant
Market Entry Acceleration of the Murb Wind Turbine into Remote Telecoms Power
- Grant number: 10112700
- Fiscal year: 2024
- Funding amount: $384,000
- Project category: Collaborative R&D
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Grant number: 2328973
- Fiscal year: 2024
- Funding amount: $384,000
- Project category: Continuing Grant
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Grant number: 2328972
- Fiscal year: 2024
- Funding amount: $384,000
- Project category: Continuing Grant
Collaborative Research: FuSe: R3AP: Retunable, Reconfigurable, Racetrack-Memory Acceleration Platform
- Grant number: 2328974
- Fiscal year: 2024
- Funding amount: $384,000
- Project category: Continuing Grant
Collaborative Research: A new understanding of droplet breakup: hydrodynamic instability under complex acceleration
- Grant number: 2332916
- Fiscal year: 2024
- Funding amount: $384,000
- Project category: Standard Grant
Collaborative Research: A new understanding of droplet breakup: hydrodynamic instability under complex acceleration
- Grant number: 2332917
- Fiscal year: 2024
- Funding amount: $384,000
- Project category: Standard Grant
Study of the Particle Acceleration and Transport in PWN through X-ray Spectro-polarimetry and GeV Gamma-ray Observtions
- Grant number: 23H01186
- Fiscal year: 2023
- Funding amount: $384,000
- Project category: Grant-in-Aid for Scientific Research (B)













