Imputing single cell RNA sequencing data: Mathematical, statistical and computational challenges


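Basic information

  • Grant number:
    9902859
  • Principal investigator:
  • Amount:
    $250,000
  • Host institution:
  • Host institution country:
    United States
  • Project category:
  • Fiscal year:
    2019
  • Funding country:
    United States
  • Project period:
    2019-09-23 to 2022-08-31
  • Project status:
    Completed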

Project abstract

Novel single-cell RNA sequencing (scRNA-seq) technologies can simultaneously measure the expression levels of all 30,000 genes over thousands to millions of individual cells. The analysis of scRNA-seq data has already led to fundamental advances in biology, including discovery of new cell types, detection of subtle differences between similar cells, and reconstruction of cellular developmental trajectories. Single-cell measurements involve amplification of tiny amounts of RNA and result in extremely sparse data matrices with many zeros. While some of these zeros are due to missing data (dropouts), others represent true biological inactivity. Yet many scRNA-seq imputation methods treat all observed zero entries identically, leading to imputed matrices that often overestimate transcriptional activity. Other methods that do attempt to distinguish biological zeros from dropouts lack rigorous theoretical guarantees. The goals of this proposal are to develop models, supporting mathematical theory, and computational tools that explicitly take the existence of true biological zeros into account. Matrix imputation under this constraint involves both computational challenges and theoretical questions in random matrix theory and high-dimensional statistics. These include rank estimation and low-rank sparse matrix recovery from partially observed data, and biclustering in the presence of dropouts and zeros. We plan to develop novel approaches based on non-smooth continuous optimization and to derive accompanying statistical guarantees. We also plan to develop ensemble learning approaches that combine the outputs of multiple imputation algorithms. Finally, we hope to gain important insights regarding recovery from such data via a study of minimax rates and information lower bounds.
To address these challenges, we will build on our promising preliminary results and the joint expertise of the investigators in spectral methods, high-dimensional statistics, matrix analysis, numerical optimization, and genomics.
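
To make the recovery problem in the abstract concrete, the following is a minimal sketch of low-rank imputation on a toy matrix. It uses Soft-Impute-style iterative singular value thresholding, a standard matrix completion technique swapped in for illustration; the abstract does not say this is the investigators' method. The function names, the threshold `lam`, the cutoff `zero_tol`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np


def soft_impute(X, observed, lam=1.0, n_iter=100):
    """Fill untrusted entries of X with a low-rank estimate.

    X        : 2-D array of expression values (cells x genes).
    observed : boolean mask, True where the entry of X is trusted.
    lam      : singular-value soft-threshold (hypothetical tuning parameter).
    """
    Z = np.zeros_like(X, dtype=float)
    for _ in range(n_iter):
        # Keep trusted entries; use the current estimate everywhere else.
        filled = np.where(observed, X, Z)
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        # Soft-threshold the singular values to encourage low rank.
        s = np.maximum(s - lam, 0.0)
        Z = (U * s) @ Vt
    return Z


def impute_with_zeros(X, lam=1.0, zero_tol=0.5, n_iter=100):
    """Treat every zero as possibly missing, then restore likely true zeros.

    zero_tol is a hypothetical cutoff: imputed values below it are reset to 0.
    """
    observed = X > 0
    Z = soft_impute(X, observed, lam=lam, n_iter=n_iter)
    out = np.where(observed, X, Z)
    out[(~observed) & (out < zero_tol)] = 0.0
    return out


# Toy data (assumed for illustration): a rank-1 positive matrix in which
# the first three genes are truly silent, with 30% of entries dropped.
rng = np.random.default_rng(0)
truth = np.outer(rng.uniform(1, 3, 20), rng.uniform(1, 3, 10))
truth[:, :3] = 0.0
dropout = rng.random(truth.shape) < 0.3
X = np.where(dropout, 0.0, truth)

imputed = impute_with_zeros(X)
```

In this sketch the silent genes contribute nothing to the low-rank fit, so their imputed values fall below `zero_tol` and are reset to zero, while dropped entries of active genes are filled from the low-rank structure. A fixed cutoff is the crudest possible way to separate the two kinds of zeros and comes with no statistical guarantee.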

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants of Eric C Chi

Imputing Single Cell RNA Sequencing Data: Mathematical, Statistical And Computational Challenges
  • Grant number:
    10577202
  • Fiscal year:
    2019
  • Amount:
    $250,000
  • Project category:
Imputing single cell RNA sequencing data: Mathematical, statistical and computational challenges
  • Grant number:
    10021696
  • Fiscal year:
    2019
  • Amount:
    $250,000
  • Project category:
Imputing single cell RNA sequencing data: Mathematical, statistical and computational challenges
  • Grant number:
    10242066
  • Fiscal year:
    2019
  • Amount:
    $250,000
  • Project category:

Similar overseas grants

CAREER: Transferring biological networks emergent principles to drone swarm collaborative algorithms
  • Grant number:
    2339373
  • Fiscal year:
    2024
  • Amount:
    $250,000
  • Project category:
    Continuing Grant
Point-of-care optical spectroscopy platform and novel ratio-metric algorithms for rapid and systematic functional characterization of biological models in vivo
  • Grant number:
    10655174
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project category:
Statistical Inference from Multiscale Biological Data: theory, algorithms, applications
  • Grant number:
    EP/Y037375/1
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project category:
    Research Grant
Analysis of words: algorithms for biological sequences, music and texts
  • Grant number:
    RGPIN-2016-03661
  • Fiscal year:
    2021
  • Amount:
    $250,000
  • Project category:
    Discovery Grants Program - Individual
Analysis of words: algorithms for biological sequences, music and texts
  • Grant number:
    RGPIN-2016-03661
  • Fiscal year:
    2019
  • Amount:
    $250,000
  • Project category:
    Discovery Grants Program - Individual
Building flexible biological particle detection algorithms for emerging real-time instrumentation
  • Grant number:
    2278799
  • Fiscal year:
    2019
  • Amount:
    $250,000
  • Project category:
    Studentship
CAREER: Microscopy Image Analysis to Aid Biological Discovery: Optics, Algorithms, and Community
  • Grant number:
    2019967
  • Fiscal year:
    2019
  • Amount:
    $250,000
  • Project category:
    Standard Grant
Analysis of words: algorithms for biological sequences, music and texts
  • Grant number:
    RGPIN-2016-03661
  • Fiscal year:
    2018
  • Amount:
    $250,000
  • Project category:
    Discovery Grants Program - Individual
Analysis of words: algorithms for biological sequences, music and texts
  • Grant number:
    RGPIN-2016-03661
  • Fiscal year:
    2017
  • Amount:
    $250,000
  • Project category:
    Discovery Grants Program - Individual
Analysis of words: algorithms for biological sequences, music and texts
  • Grant number:
    RGPIN-2016-03661
  • Fiscal year:
    2016
  • Amount:
    $250,000
  • Project category:
    Discovery Grants Program - Individual