Unsupervised and Semisupervised Heterogeneity Analysis Based on Gaussian Graphical Models
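Basic information
- Award number: 2209685
- Principal investigator:
- Amount: $199,700
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2022
- Funding country: United States
- Period: 2022-09-01 to 2026-08-31
- Project status: Ongoing
- Source:
- Keywords:
Project abstract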
Many complex diseases such as cancer are heterogeneous, with seemingly similar patients having different clinical behaviors and varying responses to treatment. To better understand disease biology and more effectively describe and treat diseases, it is of essential importance to accurately model disease heterogeneity, which has been made possible by the fast accumulation of omics data. The existing studies are limited by analyzing simple data distributional properties. This project will advance the paradigm of disease heterogeneity analysis by accommodating how omics measurements are connected. Additionally, the investigator will comprehensively study multiple data scenarios, including when disease outcome (for example, survival) is completely unknown or known for some patients, and when additional data (for instance, on demographics and clinical history) is also available. The investigator will develop a set of leading-edge statistical methods and conduct rigorous theoretical and numerical investigations to compare with existing approaches. This project will fundamentally advance multiple subfields of statistics, including heterogeneity analysis, analysis of high-dimensional data, model selection, and optimization with high-dimensional data. Equally importantly, applications of the developed methods will lead to more accurate identification of heterogeneous patient groups and their omics characteristics for multiple cancer types. This will facilitate the identification of disease subtypes, treatment selection, and prediction of disease paths, having a direct and profound impact on clinical decision-making. Taking advantage of TCGA (The Cancer Genome Atlas) data, the investigator will deliver important heterogeneity models for lung and skin cancer, valuable to basic science and clinical researchers. 
Additionally, this project will benefit the education and training of undergraduate and graduate students at Yale University, and foster additional collaborations.
Heterogeneity analysis plays an important role in statistics and biomedicine. The development of high-throughput profiling has made it possible to conduct more informative analysis but has also brought numerous statistical challenges. Many commonly used methods are limited to marginal measures, especially the mean and variance. In this project, building on a recent successful GGM (Gaussian Graphical Model)-based heterogeneity analysis, the investigator will systematically develop GGM-based unsupervised and semisupervised heterogeneity analysis. In particular, the investigator will examine complicated scenarios involving latent effects and regulating effects, as well as heterogeneity analysis under a hierarchy. A series of leading-edge methods built on the penalized fusion technique will be developed. The consistency properties of the developed methods will be established under ultrahigh-dimensional settings. The project will also develop efficient computational algorithms and conduct extensive simulations and comparisons. The investigator plans to analyze the TCGA (The Cancer Genome Atlas) data on lung and skin cancer and deliver heterogeneity models along with variable selection and model estimation results. Statistical investigations under this project will provide broad insight into high-dimensional statistics, heterogeneity modeling, penalization, and network-based analysis. Data analysis will significantly advance the field of cancer omics.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Shuangge Ma
Book review: Tsiatis, A.A. 2006: Semiparametric Theory and Missing Data. Springer
- DOI: 10.1177/09622802080170051002
- Published: 2008-10
- Journal:
- Impact factor: 2.3
- Authors: Shuangge Ma
- Corresponding author: Shuangge Ma
In Regard to Vaidya et al.
- DOI: 10.1016/j.ijrobp.2016.06.2460
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: Henry S. Park; Shuangge Ma; L. Wilson; M. Moran
- Corresponding author: M. Moran
Collective versus Individual Effects in Survival Analysis of Multiple Failures
- DOI:
- Published: 2016
- Journal:
- Impact factor: 0
- Authors: Jialiang Li; Zhipeng Huang; Shuangge Ma; Mei
- Corresponding author: Mei
SOCIOECONOMIC STATUS MODIFIES THE EFFECT OF RACE ON LIFE EXPECTANCY AFTER ACUTE MYOCARDIAL INFARCTION
- DOI: 10.1016/s0735-1097(15)62161-1
- Published: 2015-03-17
- Journal:
- Impact factor:
- Authors: Emily Marie Bucholz; Yun Wang; Shuangge Ma; Sharon-Lise Normand; Harlan Krumholz
- Corresponding author: Harlan Krumholz
Subgroup Analysis of Differential Networks with Latent Variables
- DOI: 10.1007/s11222-025-10681-z
- Published: 2025-07-02
- Journal:
- Impact factor: 1.600
- Authors: Linxi Li; Shuangge Ma; Qingzhao Zhang
- Corresponding author: Qingzhao Zhang
Other grants held by Shuangge Ma
Collaborative Research: Integrating Multi-Dimensional Omics Data for Quantifying Disease Heterogeneity
- Award number: 1916251
- Fiscal year: 2019
- Funding amount: $199,700
- Project type: Standard Grant
Collaborative Research: Novel methods for pharmacogenomic data analysis using gene clusters
- Award number: 0904181
- Fiscal year: 2009
- Funding amount: $199,700
- Project type: Standard Grant
Collaborative Proposal: Novel Semiparametric Two-part Models: New Theories and Applications
- Award number: 0805984
- Fiscal year: 2008
- Funding amount: $199,700
- Project type: Standard Grant
Similar overseas grants
Semisupervised transductive classification of remote sensing images with restricted training data
- Award number: 19K12043
- Fiscal year: 2019
- Funding amount: $199,700
- Project type: Grant-in-Aid for Scientific Research (C)
semisupervised coreference resolution
- Award number: 104076539
- Fiscal year: 2009
- Funding amount: $199,700
- Project type: Research Grants