Novel Statistical Inference for Biomedical Big Data
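Basic Information
- Grant number: 10252023
- Principal investigator:
- Amount: $415,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-09-05 to 2024-08-31
- Project status: Completed
- Source: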
- Keywords: Address; Adoption; Behavioral; Big Data Methods; Biological; Biological Assay; Biological Markers; Code; Collection; Communities; Computer software; Data; Data Sources; Development; Dimensions; Disease; Electronic Health Record; Evaluation; Fostering; Galaxy; Genetic study; Goals; Heart; Imaging technology; Individual; Linear Models; Measurement; Measures; Medical Imaging; Methods; Modeling; Molecular; Multiomic Data; Outcome; Phenotype; Procedures; R programming language; Research Personnel; Sample Size; Scientist; Screening procedure; Software Tools; Structure; System; Testing; Trans-Omics for Precision Medicine; Uncertainty; Work; base; big biomedical data; computational pipelines; data integration; design; diverse data; effective therapy; experimental study; heterogenous data; high dimensionality; interest; machine learning method; member; novel; open source; public health relevance; screening; simulation; statistical and machine learning; structured data; tool; treatment strategy; user friendly software
Project Summary
This project develops novel statistical inference procedures for biomedical big data (BBD), including data from diverse omics platforms, various medical imaging technologies and electronic health records. Statistical inference, i.e., assessing uncertainty, statistical significance and confidence, is a key step in computational pipelines that aim to discover new disease mechanisms and develop effective treatments using BBD. However, the development of statistical inference procedures for BBD has lagged behind technological advances. In fact, while point estimation and variable selection procedures for BBD have matured over the past two decades, existing inference procedures are limited to simple methods for marginal inference, lack the ability to integrate biomedical data across multiple studies and platforms, or both. This paucity is, in large part, due to the challenges of statistical inference in high-dimensional models, where the number of features is considerably larger than the number of subjects in the study. Motivated by our team's extensive and complementary expertise in analyzing multi-omics data from heterogeneous studies, including the TOPMed project on which multiple team members currently collaborate, the current proposal aims to address these challenges. The first aim of the project develops a novel inference procedure for conditional parameters in high-dimensional models based on dimension reduction, which facilitates seamless integration of external biological information, as well as biomedical data across multiple studies and platforms. To expand the application of this method to very high-dimensional models that arise in BBD applications, the second aim develops a data-adaptive screening procedure for selecting an optimal subset of relevant variables. The third aim develops a novel inference procedure for high-dimensional mixed linear models. This method expands the application domain of high-dimensional inference procedures to studies with longitudinal data and repeated measures, which arise commonly in biomedical applications. The fourth aim develops a novel data-driven procedure for controlling the false discovery rate (FDR), which facilitates the integration of evidence from multiple BBD sources, while minimizing the false negative rate (FNR) for optimal discovery. Upon evaluation using extensive simulation experiments and application to multi-omics data from the TOPMed project, the last aim implements the proposed methods in easy-to-use open-source software tools leveraging the R programming language and the capabilities of the Galaxy workflow system, thus providing an expandable platform for further development of BBD methods and tools.
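For readers unfamiliar with the FDR control mentioned in the fourth aim, the following is a minimal sketch of the classical Benjamini-Hochberg step-up procedure. It is not the data-driven procedure proposed in this project, whose details the summary does not give. The function name `benjamini_hochberg` and the example p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Classical Benjamini-Hochberg step-up procedure (illustrative only).

    Returns a list of booleans, True where the corresponding hypothesis
    is rejected at target false discovery rate `alpha`.
    """
    m = len(p_values)
    if m == 0:
        return []

    # Indices of the hypotheses, ordered by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Find the largest 1-based rank k such that p_(k) <= k * alpha / m.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            max_k = rank

    # Reject the hypotheses with the max_k smallest p-values.
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values, e.g. from per-feature marginal tests.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042,
             0.060, 0.074, 0.205, 0.212, 0.216]
example_rejected = benjamini_hochberg(example_p, alpha=0.05)
```

With these hypothetical inputs, only the two smallest p-values fall under their rank-specific thresholds (0.005 and 0.01), so only those two hypotheses are rejected.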
Project Outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by ALI SHOJAIE
Novel Statistical Inference for Biomedical Big Data
- Grant number: 10701041
- Fiscal year: 2020
- Funding amount: $415,000
- Project category:
Machine Learning Tools for Discovery and Analysis of Active Metabolic Pathways
- Grant number: 9899255
- Fiscal year: 2016
- Funding amount: $415,000
- Project category:
Statistical Methods for Network-Based Integrative Analysis of CVD Epigenetic Data
- Grant number: 9032704
- Fiscal year: 2015
- Funding amount: $415,000
- Project category:
Similar overseas grants
Behavioral Economic and Staffing Strategies To Increase Adoption of the ABCDEF Bundle in the ICU (BEST-ICU)
- Grant number: 10650089
- Fiscal year: 2023
- Funding amount: $415,000
- Project category:
Changes in emergency department utilization associated with behavioral health crisis care adoption
- Grant number: 10773819
- Fiscal year: 2023
- Funding amount: $415,000
- Project category:
Impact Evaluation of Behavioral Interventions to Encourage Long-Term Adoption of Eco-Friendly Agricultural Technologies for Small Scale Farmers in Developing Countries
- Grant number: 22K01478
- Fiscal year: 2022
- Funding amount: $415,000
- Project category: Grant-in-Aid for Scientific Research (C)
Motion Sequencing for All: Pipelining, Distribution and Training to Enable Broad Adoption of a Next-Generation Platform for Behavioral and Neurobehavioral Analysis
- Grant number: 10616517
- Fiscal year: 2019
- Funding amount: $415,000
- Project category:
Motion Sequencing for All: pipelining, distribution and training to enable broad adoption of a next-generation platform for behavioral and neurobehavioral analysis
- Grant number: 9902565
- Fiscal year: 2019
- Funding amount: $415,000
- Project category:
Motion Sequencing for All: pipelining, distribution and training to enable broad adoption of a next-generation platform for behavioral and neurobehavioral analysis
- Grant number: 10402238
- Fiscal year: 2019
- Funding amount: $415,000
- Project category:
Colorado Adoption/Twin Study of Lifespan behavioral development & cognitive aging (CATSLife)
- Grant number: 9530326
- Fiscal year: 2015
- Funding amount: $415,000
- Project category:
Colorado Adoption/Twin Study of Lifespan behavioral development & cognitive aging (CATSLife2)
- Grant number: 10432073
- Fiscal year: 2015
- Funding amount: $415,000
- Project category:
Colorado Adoption/Twin Study of Lifespan behavioral development & cognitive aging (CATSLife2)
- Grant number: 10260608
- Fiscal year: 2015
- Funding amount: $415,000
- Project category:
Colorado Adoption Project/Twin Study of Lifespan behavioral development & cognitive aging [CATSLife2]
- Grant number: 10856816
- Fiscal year: 2015
- Funding amount: $415,000
- Project category:













