Collaborative Research: Statistical Methods for Integrated Analysis of High-Throughput Biomedical Data
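Basic information
- Award number: 1264058
- Principal investigator:
- Amount: $490,000
- Host institution:
- Host institution country: United States
- Project type: Continuing Grant
- Fiscal year: 2013
- Funding country: United States
- Project period: 2013-09-15 to 2018-08-31
- Project status: Completed
- Source:
- Keywords: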
Project abstract
Technological advances have led to a rapid proliferation of high-throughput "omics" data in medicine that hold the key to clinically effective personalized medicine. To realize this goal, statistical and computational tools are urgently needed to mine these data and discover biomarkers, drug targets, disrupted disease networks, and disease sub-types. Two primary factors, however, make the development of such statistical tools challenging. First, many high-throughput genomic technologies produce heterogeneous data, including continuous data (microarrays, methylation arrays), count data (RNA-sequencing), and binary/categorical data (SNPs, CNV). These varied data sets do not always satisfy the distributional assumptions imposed by standard high-dimensional statistical models. Second, for scientists to leverage all of their data and understand the complete molecular basis of disease, these varied omics data sets need to be combined into a single multivariate statistical model. This proposal seeks to address these two issues with a new statistical framework for integrated analysis of multiple sets of high-dimensional data measured on the same group of subjects. The key statistical approach uses the theory of exponential family distributions to generalize two foundational high-dimensional statistical frameworks, principal components analysis (PCA) and graphical models, so as to jointly analyze transcriptional, epigenomic, and functional genomics data. This research will be applied to high-throughput cancer genomics data and lead to new methods to (a) discover molecular cancer sub-types along with their genomic signatures and (b) build a holistic network model of disease. By leveraging information across all available types of genomic biomarkers, the proposed methods have the potential to yield scientific discoveries critical for personalized medicine.
The proposed work will also be broadly applicable to integrating multiple sets of "omics" data, including genomics, proteomics, metabolomics, and imaging. Beyond medicine, the theoretical framework and statistical methods will make significant advances in the theory of exponential families, statistical learning, and the emerging field of integrative analysis as well as have broad applicability in other disciplines such as engineering and security. All results will be disseminated through publications, conferences, and open-source software; this research will also provide training and educational opportunities for doctoral and postdoctoral scholars.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Genevera Allen
Breathe Easy, an automated respiratory data pipeline for waveform characteristic analysis
- DOI:
- Publication year: 2023
- Journal:
- Impact factor: 8.4
- Authors: Savannah J. Lusk; Christopher Ward; Andersen Chang; Avery Twitchell‐Heyne; Shaun Fattig; Genevera Allen; Joanna Jankowsky; Russell Ray
- Corresponding author: Russell Ray
Extreme Graphical Models with Applications to Functional Neuronal Connectivity
- DOI:
- Publication year: 2021
- Journal:
- Impact factor: 0
- Authors: Andersen Chang; Genevera Allen
- Corresponding author: Genevera Allen
Other grants held by Genevera Allen
Minipatch Learning for Selection, Stability, Inference, and Scalability
- Award number: 2210837
- Fiscal year: 2022
- Amount: $490,000
- Project type: Standard Grant
CAREER: New Techniques for Statistical Learning and Multivariate Analysis
- Award number: 1554821
- Fiscal year: 2016
- Amount: $490,000
- Project type: Continuing Grant
Multivariate Methods for High-Dimensional Transposable Data
- Award number: 1209017
- Fiscal year: 2012
- Amount: $490,000
- Project type: Standard Grant
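Similar grants from the National Natural Science Foundation of China (NSFC)
Research on Quantum Field Theory without a Lagrangian Description
- Award number: 24ZR1403900
- Approval year: 2024
- Amount: CNY 0
- Project type: Provincial/municipal project
Cell Research
- Award number: 31224802
- Approval year: 2012
- Amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 31024804
- Approval year: 2010
- Amount: CNY 240,000
- Project type: Special fund project
Cell Research
- Award number: 30824808
- Approval year: 2008
- Amount: CNY 240,000
- Project type: Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
- Award number: 10774081
- Approval year: 2007
- Amount: CNY 450,000
- Project type: General Program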
Similar overseas grants
Collaborative Research: Urban Vector-Borne Disease Transmission Demands Advances in Spatiotemporal Statistical Inference
- Award number: 2414688
- Fiscal year: 2024
- Amount: $490,000
- Project type: Continuing Grant
Collaborative Research: IMR: MM-1A: Scalable Statistical Methodology for Performance Monitoring, Anomaly Identification, and Mapping Network Accessibility from Active Measurements
- Award number: 2319592
- Fiscal year: 2023
- Amount: $490,000
- Project type: Continuing Grant
Collaborative Research: Enabling Hybrid Methods in the NIMBLE Hierarchical Statistical Modeling Platform
- Award number: 2332442
- Fiscal year: 2023
- Amount: $490,000
- Project type: Standard Grant
Collaborative Research: SaTC: CORE: Small: Differentially Private Data Synthesis: Practical Algorithms and Statistical Foundations
- Award number: 2247795
- Fiscal year: 2023
- Amount: $490,000
- Project type: Continuing Grant
Collaborative Research: SaTC: CORE: Small: Differentially Private Data Synthesis: Practical Algorithms and Statistical Foundations
- Award number: 2247794
- Fiscal year: 2023
- Amount: $490,000
- Project type: Continuing Grant
Collaborative Research: CIF: Medium: Statistical and Algorithmic Foundations of Distributionally Robust Policy Learning
- Award number: 2312205
- Fiscal year: 2023
- Amount: $490,000
- Project type: Continuing Grant
Collaborative Research: The computational and neural basis of statistical learning during musical enculturation
- Award number: 2242084
- Fiscal year: 2023
- Amount: $490,000
- Project type: Standard Grant
Collaborative Research: Conference: International Indian Statistical Association annual conference
- Award number: 2327625
- Fiscal year: 2023
- Amount: $490,000
- Project type: Standard Grant
NSF-BSF: Collaborative Research: CIF: Small: Neural Estimation of Statistical Divergences: Theoretical Foundations and Applications to Communication Systems
- Award number: 2308445
- Fiscal year: 2023
- Amount: $490,000
- Project type: Standard Grant
Collaborative Research: CAS-Climate: Risk Analysis for Extreme Climate Events by Combining Numerical and Statistical Extreme Value Models
- Award number: 2308680
- Fiscal year: 2023
- Amount: $490,000
- Project type: Continuing Grant