A predictive model of mRNA stability and translation for variant interpretation and mRNA therapeutics

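Basic information

  • Grant number:
    9894822
  • Principal investigator:
  • Funding amount:
    $473,100
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2018
  • Funding country:
    United States
  • Project period:
    2018-06-05 to 2021-03-31
  • Project status:
    Completed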

Project abstract

The leading and trailing untranslated regions (UTRs) of an mRNA, along with the coding sequence (CDS), control protein production by modulating translation and mRNA stability. However, although we have identified a vast number of regulatory features in these regions, we are still far from being able to predict, for example, whether and how a sequence variant affects the levels of protein being made. Here, we propose to combine high-throughput experimental characterization of protein expression in synthetic libraries with machine learning to create predictive models of translation and mRNA stability, addressing an urgent need. Recent progress in machine vision, voice recognition and other fields of computer science has been driven by the availability of enormous data sets on which to train models. Machine learning approaches have also had remarkable impact in biology, but biological data sets often are comparatively small, limiting the quality of models that can be learned. For example, there are only around 20,000 genes in the human genome, a restrictively small set of examples for training a predictive model that captures the full extent of the genome’s “regulatory code.” In this proposal, we aim to overcome this data size limitation by training predictive models of protein expression on data from millions of synthetic constructs -- a data set several orders of magnitude larger than the number of genes in the genome. Specifically, we will create libraries of in vitro transcribed mRNA with targeted variation in the UTRs and CDS and will assay protein expression of each library member by performing high-throughput polysome profiling, ribosome profiling, and mRNA stability assays. We will then use neural network approaches to learn predictive models of the relationship between mRNA sequence and levels of protein production. 
We will apply our models to three applications of practical importance: first, we expect to uncover novel biology, for example identifying regulatory sequence elements and interactions between them. Second, we will validate our models through the de novo design and experimental testing of sequences that result in higher levels of protein production than any of the millions of randomly generated members of the original library or than the endogenous UTR sequences currently used in biotechnology. Such stable and highly translating mRNA constructs would be of particular value for the field of mRNA therapeutics. Third, we will predict the functional consequences of genetic variation in UTRs on protein production and we will validate these predictions experimentally. We are far from understanding which genetic variants compromise gene regulatory function in ways that may contribute to disease, making such a comprehensive and quantitative analysis of variants valuable.

Project outcomes

Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants by Georg Seelig

Engineering cell type-specific splicing regulation
  • Grant number:
    10633765
  • Fiscal year:
    2023
  • Funding amount:
    $473,100
  • Project category:
Joint receptor and protein expression immunophenotyping through split-pool barcoding
  • Grant number:
    10625987
  • Fiscal year:
    2021
  • Funding amount:
    $473,100
  • Project category:
Joint receptor and protein expression immunophenotyping through split-pool barcoding
  • Grant number:
    10375354
  • Fiscal year:
    2021
  • Funding amount:
    $473,100
  • Project category:
High-resolution spatial transcriptomics through light patterning
  • Grant number:
    9886581
  • Fiscal year:
    2020
  • Funding amount:
    $473,100
  • Project category:
High-resolution spatial transcriptomics through light patterning
  • Grant number:
    10341212
  • Fiscal year:
    2020
  • Funding amount:
    $473,100
  • Project category:
A massively parallel reporter assay for measuring chromatin effects on alternative splicing
  • Grant number:
    10161803
  • Fiscal year:
    2020
  • Funding amount:
    $473,100
  • Project category:
A massively parallel reporter assay for measuring chromatin effects on alternative splicing
  • Grant number:
    9977420
  • Fiscal year:
    2020
  • Funding amount:
    $473,100
  • Project category:
High-resolution spatial transcriptomics through light patterning
  • Grant number:
    10112854
  • Fiscal year:
    2020
  • Funding amount:
    $473,100
  • Project category:
Predictive Modeling of Alternative Splicing and Polyadenylation from Millions of Random Sequences
  • Grant number:
    9306648
  • Fiscal year:
    2017
  • Funding amount:
    $473,100
  • Project category:

Similar overseas grants

Impact of alternative polyadenylation of 3'-untranslated regions in the PI3K/AKT cascade on microRNA
  • Grant number:
    573541-2022
  • Fiscal year:
    2022
  • Funding amount:
    $473,100
  • Project category:
    University Undergraduate Student Research Awards
How do untranslated regions of cannabinoid receptor type 1 mRNA determine receptor subcellular localisation and function?
  • Grant number:
    2744317
  • Fiscal year:
    2022
  • Funding amount:
    $473,100
  • Project category:
    Studentship
MICA: Synthetic untranslated regions for direct delivery of therapeutic mRNAs
  • Grant number:
    MR/V010948/1
  • Fiscal year:
    2021
  • Funding amount:
    $473,100
  • Project category:
    Research Grant
Translational Control by 5'-untranslated regions
  • Grant number:
    10019570
  • Fiscal year:
    2019
  • Funding amount:
    $473,100
  • Project category:
Translational Control by 5'-untranslated regions
  • Grant number:
    10223370
  • Fiscal year:
    2019
  • Funding amount:
    $473,100
  • Project category:
Translational Control by 5'-untranslated regions
  • Grant number:
    10455108
  • Fiscal year:
    2019
  • Funding amount:
    $473,100
  • Project category:
Synergistic microRNA-binding sites, and 3' untranslated regions: a dialogue of silence
  • Grant number:
    255762
  • Fiscal year:
    2012
  • Funding amount:
    $473,100
  • Project category:
    Operating Grants
Analysis of long untranslated regions in Nipah virus genome
  • Grant number:
    20790351
  • Fiscal year:
    2008
  • Funding amount:
    $473,100
  • Project category:
    Grant-in-Aid for Young Scientists (B)
Search for mRNA elements involved in the compatibility between 5' untranslated regions and coding regions in chloroplast translation
  • Grant number:
    19370021
  • Fiscal year:
    2007
  • Funding amount:
    $473,100
  • Project category:
    Grant-in-Aid for Scientific Research (B)
Post-transcriptional Regulation of PPAR-g Expression by 5'-Untranslated Regions
  • Grant number:
    7131841
  • Fiscal year:
    2006
  • Funding amount:
    $473,100
  • Project category: