Bayesian Differential Causal Network and Clustering Methods for Single-Cell Data
Basic Information
- Grant number: 10592720
- Principal investigator:
- Amount: $304,400
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2022
- Funding country: United States
- Project period: 2022-09-21 to 2026-08-31
- Project status: Ongoing
- Source:
- Keywords: Address; Bayesian Modeling; Biological; Cell Differentiation process; Cells; Comparative Study; Data; Dependence; Development; Etiology; Feedback; Gene Expression Regulation; Genes; Intervention; Joints; Knowledge; Measures; Methods; Modeling; Molecular; National Institute of General Medical Sciences; Nature; Neighborhoods; Pathologic; Prevention; Procedures; Regulator Genes; Research; Sample Size; Sampling; Technology; Testing; Time; Tissues; Translating; Uncertainty; Work; base; causal variant; cell type; differential expression; disease diagnosis; experimental group; gene regulatory network; improved; network models; non-Gaussian model; single-cell RNA sequencing; tool; transcriptome sequencing; treatment group
Project Abstract
Project Description
DMS/NIGMS 2: Bayesian Differential Causal Network and Clustering Methods for Single-Cell Data
A Significance
A.1 Importance of the Problem to Be Addressed
Single-cell RNA-sequencing (scRNA-seq) technologies have facilitated biological discoveries that were impossible with bulk RNA-seq, such as the discovery of new gene regulatory activities and cell types at the single-cell level. However, translating the fundamental biological knowledge advanced by scRNA-seq into improved disease diagnosis, treatment, and prevention requires new methods for comparing the molecular differences between normal and pathological cells/tissues, and between control and case/treatment groups. Although the identification of differentially expressed genes across two sample groups has been extensively studied, the vast majority of existing methods for identifying gene regulatory networks (GRNs) and cell types have so far focused on scRNA-seq data generated under a single experimental condition. In principle, these methods can be applied to one experimental condition at a time, and post hoc comparisons can then be made to find the differences caused by experimental interventions. However, compared to joint modeling approaches, this two-step procedure is deemed less efficient and more susceptible to false discoveries because uncertainty is not properly propagated from the first step to the second. Moreover, most scRNA-seq network models are correlative in nature and do not infer causal gene regulatory relationships. There is, therefore, a critical need to develop new models for identifying the effects of experimental interventions on causal gene regulation and cell composition by jointly modeling scRNA-seq data across experimental groups. In the absence of such tools, mechanistically understanding gene regulation and cell differentiation, and fully realizing the translational value of scRNA-seq studies, will likely remain difficult.
A.2 Rigor of Prior Research
Aim 1. Many existing scRNA-seq network approaches adapt standard association measures, e.g., Pearson correlation [1] and mutual information [2], to zero-inflated scRNA-seq data. A common limitation of these methods is that they only quantify marginal dependencies, which are susceptible to spurious indirect associations [3]. Graphical models, which deal with conditional associations, are powerful alternatives to marginal association measures. Numerous methods have been proposed for general purposes [4, 5], including developments for non-Gaussian data [6–9]. Specifically for scRNA-seq data, two undirected graphical models, including Co-I Cai's work [10, 11], were recently proposed based on neighborhood selection; however, they do not infer causal gene regulation. To identify causal relationships, several alternative methods [12, 13] were developed. However, these methods either ignore the count nature of scRNA-seq data, require a known pseudotime (which is rarely known in real scRNA-seq data), or do not theoretically investigate causal identifiability for cross-sectional observations. For differential networks, many approaches [14–18], including the PI's prior work [19], have been developed for bulk RNA-seq data and showed great advantages of joint analyses over independent analyses. However, far fewer differential network methods exist for scRNA-seq data, e.g., PT [20] and scdNet [21]. The common limitation of PT and scdNet is that they only consider marginal dependence (and are hence susceptible to false discoveries) and do not discover causality. Our preliminary results (§C.1) demonstrate that the proposed Bayesian network model is capable of identifying causal gene regulatory relationships in cross-sectional scRNA-seq data and often outperforms state-of-the-art alternative methods.
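The contrast between marginal and conditional dependence drawn above can be illustrated with a small toy simulation. This is only a sketch under stated assumptions, not the proposed Bayesian model: the three "genes" (`g1`, `g2`, `g3`), the chain structure g1 -> g2 -> g3, the coefficient 0.8, and the Gaussian noise are all hypothetical choices made for illustration, and real scRNA-seq data are zero-inflated counts rather than Gaussian values. The helper names `pearson` and `partial_corr` are likewise illustrative.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation between two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(a, b, given):
    """First-order partial correlation of a and b, conditioning on `given`."""
    r_ab = pearson(a, b)
    r_ag = pearson(a, given)
    r_bg = pearson(b, given)
    return (r_ab - r_ag * r_bg) / math.sqrt((1 - r_ag ** 2) * (1 - r_bg ** 2))


# Hypothetical toy chain: g1 regulates g2, and g2 regulates g3.
# There is no direct link between g1 and g3.
rng = random.Random(0)
n_cells = 5000
g1 = [rng.gauss(0, 1) for _ in range(n_cells)]
g2 = [0.8 * v + rng.gauss(0, 1) for v in g1]
g3 = [0.8 * v + rng.gauss(0, 1) for v in g2]

marginal_13 = pearson(g1, g3)              # picks up the indirect path via g2
conditional_13 = partial_corr(g1, g3, g2)  # close to zero once g2 is accounted for

print(f"marginal corr(g1, g3):         {marginal_13:.3f}")
print(f"partial corr(g1, g3 | g2):     {conditional_13:.3f}")
```

In this toy setting the marginal correlation between `g1` and `g3` is clearly nonzero even though they are linked only through `g2`, while the partial correlation given `g2` is near zero. Neither quantity says anything about the direction of regulation, which is the gap the causal models discussed in this section aim to fill.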
Aim 2. Very few methods are available to construct cell-specific networks because it is difficult to estimate networks with, in essence, a sample size of one. Recently, a hypothesis testing approach [22] was developed to estimate cell-specific networks. The method makes approximate network inference for each cell based on its neighbors. However, it only considers symmetric (undirected) marginal dependence, and therefore cannot infer causal regulatory relationships and is susceptible to spurious associations. The PI's prior work [23] addressed the "sample-size-one" problem in bulk RNA-seq data by assuming that the causal networks are smooth functions of additional covariates. However, the method is not applicable without covariates and does not allow feedback loops, a common motif in GRNs. Existing work [24, 25] including the PI's [19] has
Project Outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Yang Ni
Bayesian Differential Causal Network and Clustering Methods for Single-Cell Data
- Grant number: 10707494
- Fiscal year: 2022
- Funding amount: $304,400
- Project category:
Similar overseas grants
Bayesian Modeling and Inference for High-Dimensional Disease Mapping and Boundary Detection
- Grant number: 10568797
- Fiscal year: 2023
- Funding amount: $304,400
- Project category:
Bayesian modeling of multivariate mixed longitudinal responses with scale mixtures of multivariate normal distributions
- Grant number: 10730714
- Fiscal year: 2023
- Funding amount: $304,400
- Project category:
Bayesian Modeling and Scalable Inference for Big Data Streams
- Grant number: RGPIN-2019-03962
- Fiscal year: 2022
- Funding amount: $304,400
- Project category: Discovery Grants Program - Individual
Bayesian modeling on ethical consumption and its empirical application for behavior modification
- Grant number: 21K18559
- Fiscal year: 2021
- Funding amount: $304,400
- Project category: Grant-in-Aid for Challenging Research (Exploratory)
Utilizing Bayesian modeling to improve mutational signature inference in large-scale datasets
- Grant number: 10684720
- Fiscal year: 2021
- Funding amount: $304,400
- Project category:
Bayesian Modeling and Scalable Inference for Big Data Streams
- Grant number: RGPIN-2019-03962
- Fiscal year: 2021
- Funding amount: $304,400
- Project category: Discovery Grants Program - Individual
Bayesian Modeling of Mass-Spec Proteomics Data to Advance Studies of the Genetic Regulation of Proteins
- Grant number: 10391171
- Fiscal year: 2021
- Funding amount: $304,400
- Project category:
Utilizing Bayesian modeling to improve mutational signature inference in large-scale datasets
- Grant number: 10490301
- Fiscal year: 2021
- Funding amount: $304,400
- Project category:
Utilizing Bayesian modeling to improve mutational signature inference in large-scale datasets
- Grant number: 10305242
- Fiscal year: 2021
- Funding amount: $304,400
- Project category:
Bayesian Modeling and Scalable Inference for Big Data Streams
- Grant number: RGPIN-2019-03962
- Fiscal year: 2020
- Funding amount: $304,400
- Project category: Discovery Grants Program - Individual


















