Geometric structures guided learning model and algorithms for bulk RNAseq data analysis


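Basic information

  • Award number:
    10710214
  • Principal investigator:
  • Amount:
    $188,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Project period:
    2022-09-28 to 2025-07-31
  • Project status:
    Not yet concluded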

Project abstract

Discovering potential drugs and treatments for many diseases depends heavily on identifying differentially expressed (DE) genes under disease conditions within individual cell types. While cells of individual types can be sorted experimentally for DE analysis, computationally leveraging bulk tissue data has the advantages of greater availability, lower cost, and less human handling. A critical step in this line of research is to (completely) deconvolute cell-type-specific gene expression from heterogeneous bulk tissue. Complete deconvolution can be viewed as a nonnegative matrix factorization (NMF) problem. However, NMF is strongly ill-posed, and its non-separable solutions pose great challenges for data interpretability. These challenges vary across applications, so without special treatment, the results of complete deconvolution of gene expression data make accurate DE analysis almost impossible. In this proposal, a mathematical model and associated computational algorithms will be established for fundamental research on bulk tissue RNAseq analysis, aiming at better data interpretability, reliability, and efficiency. To tackle this challenge, the geometric structure of the given bulk tissue data set will first be explored to identify marker genes for the constituent cell types. The model is then established by (1) enforcing the weak solvability condition of NMF (weak because of noise) and (2) imposing geometric constraints on the data space of knowns. This work is motivated by a common characteristic of many biological data sets, in which expression levels across sample tissues exhibit strong correlations among certain genes. For massive amounts of biological data, fast stochastic computational algorithms will be developed. After validation and benchmarking, the proposed model will be applied to DE analysis of various datasets. The proposed model is important for deciphering cellular transcriptional alterations in many diseases.
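To make the NMF view of complete deconvolution concrete, the following is a minimal sketch, not the model proposed in this project. It assumes a bulk expression matrix `X` (genes by samples) and factors it as `S @ P`, where `S` (genes by cell types) holds cell-type-specific expression and `P` (cell types by samples) holds mixing weights. The function name `nmf_deconvolve`, the synthetic data, and the use of the classical Lee-Seung multiplicative updates are all illustrative assumptions.

```python
import numpy as np

def nmf_deconvolve(X, k, n_iter=500, eps=1e-12, seed=0):
    """Approximate X (genes x samples) by S @ P with S, P >= 0.

    Uses the classical multiplicative updates for the Frobenius-norm
    objective ||X - S P||_F^2. Hypothetical helper for illustration only.
    """
    rng = np.random.default_rng(seed)
    n_genes, n_samples = X.shape
    S = rng.random((n_genes, k)) + eps    # cell-type-specific expression
    P = rng.random((k, n_samples)) + eps  # mixing weights per sample
    for _ in range(n_iter):
        # Each update keeps the factor nonnegative and does not increase the objective.
        P *= (S.T @ X) / (S.T @ S @ P + eps)
        S *= (X @ P.T) / (S @ (P @ P.T) + eps)
    return S, P

# Synthetic bulk data: 50 genes, 20 samples, 3 cell types (assumed sizes).
rng = np.random.default_rng(1)
S_true = rng.random((50, 3))
P_true = rng.random((3, 20))
X = S_true @ P_true

S, P = nmf_deconvolve(X, k=3)
rel_err = np.linalg.norm(X - S @ P) / np.linalg.norm(X)
print(f"relative reconstruction error: {rel_err:.3e}")

# Ill-posedness in its simplest form: rescaling the factors by any positive
# diagonal matrix gives a different (S, P) pair with the same product.
d = np.array([2.0, 0.5, 3.0])
S_alt, P_alt = S / d, d[:, None] * P
print("same fit after rescaling:", np.allclose(S @ P, S_alt @ P_alt))
```

The rescaling check shows only the mildest ambiguity. More general invertible mixings of the factors can also preserve nonnegativity, which is why extra structure, such as marker genes or geometric constraints, is needed before the factors can be read as cell types.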
In terms of modeling strategy, this research provides a new perspective: observing the topological/geometric structures of data and enforcing the corresponding constraints to enhance problem solvability and data interpretability. In terms of computation, this research develops nonlinear graph Laplacian regularized optimization combined with stochastic compression algorithms, which can process massive data with low storage requirements and low complexity, and which adapt to modern computer hardware architectures. As
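As a rough illustration of graph Laplacian regularization in this setting, the sketch below uses the standard linear graph-regularized NMF updates rather than the nonlinear formulation the abstract refers to, which is not specified here. It assumes the objective `||X - S P||_F^2 + lam * trace(S.T @ L @ S)` with `L = D - W`, where `W` is a gene-gene graph built from thresholded correlations. The function names, the threshold, and the value of `lam` are hypothetical choices.

```python
import numpy as np

def correlation_graph(X, threshold=0.5):
    """Build a gene-gene weight matrix W and degree matrix D from row correlations."""
    C = np.nan_to_num(np.corrcoef(X))     # correlations between gene profiles
    W = np.where(C >= threshold, C, 0.0)  # keep only strong positive correlations
    np.fill_diagonal(W, 0.0)              # no self-loops
    D = np.diag(W.sum(axis=1))
    return W, D

def objective(X, S, P, W, D, lam):
    """Data fit plus the graph smoothness penalty trace(S^T (D - W) S)."""
    L = D - W
    return np.linalg.norm(X - S @ P) ** 2 + lam * np.trace(S.T @ L @ S)

def graph_regularized_nmf(X, k, W, D, lam=1.0, n_iter=300, eps=1e-12, seed=0):
    """Multiplicative updates for NMF with a graph penalty on the rows of S."""
    rng = np.random.default_rng(seed)
    S = rng.random((X.shape[0], k)) + eps
    P = rng.random((k, X.shape[1])) + eps
    for _ in range(n_iter):
        P *= (S.T @ X) / (S.T @ S @ P + eps)
        # W enters the numerator and D the denominator, so S stays nonnegative.
        S *= (X @ P.T + lam * (W @ S)) / (S @ (P @ P.T) + lam * (D @ S) + eps)
    return S, P

# Synthetic example with assumed sizes: 40 genes, 15 samples, 3 cell types.
rng = np.random.default_rng(2)
X = rng.random((40, 3)) @ rng.random((3, 15))
W, D = correlation_graph(X)
S, P = graph_regularized_nmf(X, k=3, W=W, D=D, lam=0.1)
print("objective:", objective(X, S, P, W, D, lam=0.1))
```

The penalty pulls together the rows of `S` that belong to strongly correlated genes, which is one simple way to encode the correlation structure the abstract describes.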
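The idea of stochastic compression with low storage can be illustrated with a generic randomized low-rank approximation of a kernel matrix. This is a textbook randomized range finder, not the hybrid interpolation and compression method developed in this project. The Gaussian kernel, the bandwidth `h`, the target rank, and the function names are assumptions made for the example. For simplicity the sketch forms the full matrix `K`, which a genuinely low-storage method would avoid.

```python
import numpy as np

def gaussian_kernel(x, y, h):
    """Gaussian kernel matrix between 1-D point sets x and y with bandwidth h."""
    d = x[:, None] - y[None, :]
    return np.exp(-d ** 2 / (2.0 * h ** 2))

def randomized_low_rank(K, rank, oversample=10, seed=0):
    """Return Q, B with K approximately equal to Q @ B.

    Q has rank + oversample orthonormal columns spanning a random sample
    of the range of K; B = Q^T K holds the coordinates in that basis.
    """
    rng = np.random.default_rng(seed)
    omega = rng.standard_normal((K.shape[1], rank + oversample))
    Y = K @ omega           # random sketch of the column space of K
    Q, _ = np.linalg.qr(Y)  # orthonormal basis for the sketch
    B = Q.T @ K
    return Q, B

# 200 points on [0, 1]; a smooth kernel has rapidly decaying singular values.
x = np.linspace(0.0, 1.0, 200)
K = gaussian_kernel(x, x, h=0.5)
Q, B = randomized_low_rank(K, rank=20)

rel_err = np.linalg.norm(K - Q @ B) / np.linalg.norm(K)
stored = Q.size + B.size
print(f"relative error: {rel_err:.2e}")
print(f"stored entries: {stored} instead of {K.size}")
```

Storing `Q` and `B` takes about `2 * n * (rank + oversample)` numbers instead of `n * n`, which is where the storage saving comes from when the kernel is numerically low rank.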

Project outcomes

Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
GEOMETRIC STRUCTURE GUIDED MODEL AND ALGORITHMS FOR COMPLETE DECONVOLUTION OF GENE EXPRESSION DATA.
  • DOI:
    10.3934/fods.2022013
  • Published:
    2022-02
  • Journal:
  • Impact factor:
    2.3
  • Authors:
    Duan Chen;Shaoyu Li;Xue Wang
  • Corresponding authors:
    Duan Chen;Shaoyu Li;Xue Wang
A hybrid stochastic interpolation and compression method for kernel matrices.

Other grants by Duan Chen

Geometric structures guided learning model and algorithms for bulk RNAseq data analysis
  • Award number:
    10592460
  • Fiscal year:
    2022
  • Award amount:
    $188,000
  • Project type:

Similar overseas grants

CAREER: Blessing of Nonconvexity in Machine Learning - Landscape Analysis and Efficient Algorithms
  • Award number:
    2337776
  • Fiscal year:
    2024
  • Award amount:
    $188,000
  • Project type:
    Continuing Grant
CAREER: From Dynamic Algorithms to Fast Optimization and Back
  • Award number:
    2338816
  • Fiscal year:
    2024
  • Award amount:
    $188,000
  • Project type:
    Continuing Grant
CAREER: Structured Minimax Optimization: Theory, Algorithms, and Applications in Robust Learning
  • Award number:
    2338846
  • Fiscal year:
    2024
  • Award amount:
    $188,000
  • Project type:
    Continuing Grant
CRII: SaTC: Reliable Hardware Architectures Against Side-Channel Attacks for Post-Quantum Cryptographic Algorithms
  • Award number:
    2348261
  • Fiscal year:
    2024
  • Award amount:
    $188,000
  • Project type:
    Standard Grant
CRII: AF: The Impact of Knowledge on the Performance of Distributed Algorithms
  • Award number:
    2348346
  • Fiscal year:
    2024
  • Award amount:
    $188,000
  • Project type:
    Standard Grant
CRII: CSR: From Bloom Filters to Noise Reduction Streaming Algorithms
  • Award number:
    2348457
  • Fiscal year:
    2024
  • Award amount:
    $188,000
  • Project type:
    Standard Grant
EAGER: Search-Accelerated Markov Chain Monte Carlo Algorithms for Bayesian Neural Networks and Trillion-Dimensional Problems
  • Award number:
    2404989
  • Fiscal year:
    2024
  • Award amount:
    $188,000
  • Project type:
    Standard Grant
CAREER: Efficient Algorithms for Modern Computer Architecture
  • Award number:
    2339310
  • Fiscal year:
    2024
  • Award amount:
    $188,000
  • Project type:
    Continuing Grant
CAREER: Improving Real-world Performance of AI Biosignal Algorithms
  • Award number:
    2339669
  • Fiscal year:
    2024
  • Award amount:
    $188,000
  • Project type:
    Continuing Grant
DMS-EPSRC: Asymptotic Analysis of Online Training Algorithms in Machine Learning: Recurrent, Graphical, and Deep Neural Networks
  • Award number:
    EP/Y029089/1
  • Fiscal year:
    2024
  • Award amount:
    $188,000
  • Project type:
    Research Grant