Implicit generative modeling for computational genomics

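Basic information

  • Grant number:
    RGPIN-2020-06770
  • Principal investigator:
  • Amount:
    $29,100
  • Host institution:
  • Host institution country:
    Canada
  • Program type:
    Discovery Grants Program - Individual
  • Fiscal year:
    2020
  • Funding country:
    Canada
  • Project period:
    2020-01-01 to 2021-12-31
  • Project status:
    Completed

Project abstract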

Genomics is the study of DNA sequences in our cells and how they drive function or dysfunction. Genomics data also reveals the flow of information within cells. By comparing genomics data captured under different conditions (mutations or drug treatments) we can observe how they affect cell function. Machine learning can do more than merely observe: it can learn the complex "rules" by which DNA drives function, directly from data. Once learned, these rules can predict which mutations are likely to cause a gene to stop functioning and cause disease. Genomics data can increasingly be generated by automated laboratories, including remote cloud laboratories, which offer better accuracy and consistency than traditional laboratory work. This combination of information-rich data and automated experimentation is finally setting the stage for "self-driven" laboratories. In a self-driven laboratory, experiments are designed by machine learning systems to systematically interrogate an outcome of interest, such as the sequence-function relationships in human cells. Data and models are improved in a 24/7 feedback loop. The proposed research aims to make tractable progress towards this vision.

This vision is important because it represents an opportunity for drug discovery. Today, 90% of all drug candidates fail at the clinical stage, after years of development and tens or hundreds of millions of dollars have been spent. Better predictive models of how cells work will help to identify and design compounds for genetic disorders up front, with a higher final success rate.

The proposed research focuses on four objectives. The first is to make it easier for machine learning scientists to participate in advancing this vision. Genomics data is notoriously difficult to understand and to work with, so the goal is to build a code library and data repository that enables the machine learning community to contribute to genomics research. The second objective is to improve the accuracy with which genomics data is interpreted, specifically data from RNA sequencing, or RNA-seq. RNA-seq captures a "snapshot" of the RNA molecules in a sample, and is the penultimate step in countless protocols for interrogating cells. However, the snapshot is fragmented. Algorithms are needed to infer (to reconstruct) what RNA molecules were actually present in the sample. The proposed research will investigate a new, more accurate approach to inference based on deep learning. The third objective is to build more accurate models of how RNA is processed in cells by new deep learning techniques. Models of RNA processing are important for predicting the effects of mutations and of potential therapies. (Such models are also based on RNA-seq data, and may benefit from advances in the second objective.) The fourth and final objective is to create algorithms that can automatically design experiments, generating the most valuable data possible from which to build models of RNA biology.

Project outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Delong, Andrew

Interactive Segmentation with Super-Labels.
Predicting the sequence specificities of DNA- and RNA-binding proteins by deep learning
  • DOI:
    10.1038/nbt.3300
  • Published:
    2015-08-01
  • Journal:
  • Impact factor:
    46.9
  • Authors:
    Alipanahi, Babak; Delong, Andrew; Frey, Brendan J.
  • Corresponding author:
    Frey, Brendan J.
An integral solution to surface evolution PDEs via geo-cuts
Fast Approximate Energy Minimization with Label Costs
  • DOI:
    10.1007/s11263-011-0437-z
  • Published:
    2012-01-01
  • Journal:
  • Impact factor:
    19.5
  • Authors:
    Delong, Andrew; Osokin, Anton; Boykov, Yuri
  • Corresponding author:
    Boykov, Yuri

Other grants held by Delong, Andrew

Implicit generative modeling for computational genomics
  • Grant number:
    RGPIN-2020-06770
  • Fiscal year:
    2022
  • Funding amount:
    $29,100
  • Program type:
    Discovery Grants Program - Individual
Implicit generative modeling for computational genomics
  • Grant number:
    RGPIN-2020-06770
  • Fiscal year:
    2021
  • Funding amount:
    $29,100
  • Program type:
    Discovery Grants Program - Individual
Implicit generative modeling for computational genomics
  • Grant number:
    DGECR-2020-00323
  • Fiscal year:
    2020
  • Funding amount:
    $29,100
  • Program type:
    Discovery Launch Supplement
Learning and Inference for Hierarchical Models in Computer Vision
  • Grant number:
    421338-2012
  • Fiscal year:
    2013
  • Funding amount:
    $29,100
  • Program type:
    Postdoctoral Fellowships
Learning and Inference for Hierarchical Models in Computer Vision
  • Grant number:
    421338-2012
  • Fiscal year:
    2012
  • Funding amount:
    $29,100
  • Program type:
    Postdoctoral Fellowships
Discrete optimisation methods in vision and biomedical imaging
  • Grant number:
    348322-2007
  • Fiscal year:
    2009
  • Funding amount:
    $29,100
  • Program type:
    Postgraduate Scholarships - Doctoral
Discrete optimisation methods in vision and biomedical imaging
  • Grant number:
    348322-2007
  • Fiscal year:
    2008
  • Funding amount:
    $29,100
  • Program type:
    Postgraduate Scholarships - Doctoral
Discrete optimisation methods in vision and biomedical imaging
  • Grant number:
    348322-2007
  • Fiscal year:
    2007
  • Funding amount:
    $29,100
  • Program type:
    Postgraduate Scholarships - Doctoral

Similar overseas grants

CIF: Small: Towards a Control Framework for Neural Generative Modeling
  • Grant number:
    2348624
  • Fiscal year:
    2024
  • Funding amount:
    $29,100
  • Program type:
    Standard Grant
SG: Species Distribution Modeling on the A.I. frontier: Deep generative models for powerful, general and accessible SDM
  • Grant number:
    2329701
  • Fiscal year:
    2024
  • Funding amount:
    $29,100
  • Program type:
    Standard Grant
CAREER: Generative Physical Modeling for Computational Imaging Systems
  • Grant number:
    2239687
  • Fiscal year:
    2023
  • Funding amount:
    $29,100
  • Program type:
    Continuing Grant
Information-Theoretic Surprise-Driven Approach to Enhance Decision Making in Healthcare
  • Grant number:
    10575550
  • Fiscal year:
    2023
  • Funding amount:
    $29,100
  • Program type:
Proteasomal recruiters of PAX3-FOXO1 Designed via Sequence-Based Generative Models
  • Grant number:
    10826068
  • Fiscal year:
    2023
  • Funding amount:
    $29,100
  • Program type:
Characterizing the generative mechanisms underlying the cortical tracking of natural speech
  • Grant number:
    10710717
  • Fiscal year:
    2023
  • Funding amount:
    $29,100
  • Program type:
Geles: A Novel Imaging Informatics System for Generalizable Lesion Identification in Neuroendocrine Tumors
  • Grant number:
    10740578
  • Fiscal year:
    2023
  • Funding amount:
    $29,100
  • Program type:
AI-based Cardiac CT
  • Grant number:
    10654259
  • Fiscal year:
    2023
  • Funding amount:
    $29,100
  • Program type:
Neural Operator Learning to Predict Aneurysmal Growth and Outcomes
  • Grant number:
    10636358
  • Fiscal year:
    2023
  • Funding amount:
    $29,100
  • Program type:
Cardiac MRI for Reperfusion Spatial Mapping to Improve Heart Failure Outcomes
  • Grant number:
    10717212
  • Fiscal year:
    2023
  • Funding amount:
    $29,100
  • Program type: