Novel Statistical Inference for Biomedical Big Data
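Basic information
- Grant number: 10701041
- Principal investigator:
- Amount: $415,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-09-05 to 2024-08-31
- Project status: Completed
- Source: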
- Keywords: Address; Adoption; Behavioral; Big Data Methods; Biological; Biological Assay; Biological Markers; Code; Collaborations; Collection; Communities; Computer software; Data; Data Sources; Development; Dimensions; Disease; Electronic Health Record; Evaluation; Fostering; Galaxy; Genetic study; Goals; Heart; Imaging technology; Individual; Linear Models; Measurement; Measures; Medical Imaging; Methods; Modeling; Molecular; Multiomic Data; Outcome; Phenotype; Procedures; R programming language; Research Personnel; Sample Size; Scientist; Screening procedure; Software Tools; Structure; System; Technology; Testing; Trans-Omics for Precision Medicine; Uncertainty; Work; big biomedical data; computational pipelines; data integration; design; diverse data; effective therapy; experimental study; heterogenous data; high dimensionality; improved; interest; machine learning method; member; novel; open source; public health relevance; screening; simulation; statistical and machine learning; structured data; tool; treatment strategy; user friendly software
Project Summary
This project develops novel statistical inference procedures for biomedical big data (BBD), including data from diverse
omics platforms, various medical imaging technologies and electronic health records. Statistical inference, i.e., assessing
uncertainty, statistical significance and confidence, is a key step in computational pipelines that aim to discover new
disease mechanisms and develop effective treatments using BBD. However, the development of statistical inference
procedures for BBD has lagged behind technological advances. In fact, while point estimation and variable selection
procedures for BBD have matured over the past two decades, existing inference procedures are either limited to simple
methods for marginal inference and/or lack the ability to integrate biomedical data across multiple studies and platforms.
This paucity is, in large part, due to the challenges of statistical inference in high-dimensional models, where the
number of features is considerably larger than the number of subjects in the study. Motivated by our team's extensive
and complementary expertise in analyzing multi-omics data from heterogeneous studies, including the TOPMed project
on which multiple team members currently collaborate, the current proposal aims to address these challenges. The first
aim of the project develops a novel inference procedure for conditional parameters in high-dimensional models based
on dimension reduction, which facilitates seamless integration of external biological information, as well as biomedical
data across multiple studies and platforms. To expand the application of this method to very high-dimensional models
that arise in BBD applications, the second aim develops a data-adaptive screening procedure for selecting an optimal
subset of relevant variables. The third aim develops a novel inference procedure for high-dimensional mixed linear
models. This method expands the application domain of high-dimensional inference procedures to studies with longitudinal
data and repeated measures, which arise commonly in biomedical applications. The fourth aim develops a novel
data-driven procedure for controlling the false discovery rate (FDR), which facilitates the integration of evidence from
multiple BBD sources, while minimizing the false negative rate (FNR) for optimal discovery. Upon evaluation using extensive
simulation experiments and application to multi-omics data from the TOPMed project, the last aim implements
the proposed methods into easy-to-use open-source software tools leveraging the R programming language and the
capabilities of the Galaxy workflow system, thus providing an expandable platform for further developments for BBD
methods and tools.
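
For readers unfamiliar with false discovery rate control, the sketch below shows the classical Benjamini-Hochberg step-up rule. It is only a textbook baseline for what "controlling the FDR" means and is not the data-driven procedure proposed in the fourth aim, which the summary does not specify. The function name `benjamini_hochberg` and the toy p-values are illustrative assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return sorted indices of hypotheses rejected by the BH step-up rule.

    Illustrative baseline only; not the project's proposed procedure.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Rank hypotheses by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff = rank
    # Reject the hypotheses holding the `cutoff` smallest p-values.
    return sorted(order[:cutoff])


if __name__ == "__main__":
    # Hypothetical p-values, made up purely for illustration.
    toy_p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
    print(benjamini_hochberg(toy_p_values, alpha=0.05))
```

The rule is step-up: it rejects everything up to the largest rank that passes its threshold, even if smaller ranks fail theirs. Under independent (or positively dependent) tests this keeps the expected proportion of false discoveries at or below `alpha`.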
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by ALI SHOJAIE
Novel Statistical Inference for Biomedical Big Data
- Grant number: 10252023
- Fiscal year: 2020
- Funding amount: $415,000
- Project category:
Machine Learning Tools for Discovery and Analysis of Active Metabolic Pathways
- Grant number: 9899255
- Fiscal year: 2016
- Funding amount: $415,000
- Project category:
Statistical Methods for Network-Based Integrative Analysis of CVD Epigenetic Data
- Grant number: 9032704
- Fiscal year: 2015
- Funding amount: $415,000
- Project category:
Similar overseas grants
Behavioral Economic and Staffing Strategies To Increase Adoption of the ABCDEF Bundle in the ICU (BEST-ICU)
- Grant number: 10650089
- Fiscal year: 2023
- Funding amount: $415,000
- Project category:
Changes in emergency department utilization associated with behavioral health crisis care adoption
- Grant number: 10773819
- Fiscal year: 2023
- Funding amount: $415,000
- Project category:
Impact Evaluation of Behavioral Interventions to Encourage Long-Term Adoption of Eco-Friendly Agricultural Technologies for Small Scale Farmers in Developing Countries
- Grant number: 22K01478
- Fiscal year: 2022
- Funding amount: $415,000
- Project category: Grant-in-Aid for Scientific Research (C)
Motion Sequencing for All: Pipelining, Distribution and Training to Enable Broad Adoption of a Next-Generation Platform for Behavioral and Neurobehavioral Analysis
- Grant number: 10616517
- Fiscal year: 2019
- Funding amount: $415,000
- Project category:
Motion Sequencing for All: pipelining, distribution and training to enable broad adoption of a next-generation platform for behavioral and neurobehavioral analysis
- Grant number: 9902565
- Fiscal year: 2019
- Funding amount: $415,000
- Project category:
Motion Sequencing for All: pipelining, distribution and training to enable broad adoption of a next-generation platform for behavioral and neurobehavioral analysis
- Grant number: 10402238
- Fiscal year: 2019
- Funding amount: $415,000
- Project category:
Colorado Adoption/Twin Study of Lifespan behavioral development & cognitive aging (CATSLife)
- Grant number: 9530326
- Fiscal year: 2015
- Funding amount: $415,000
- Project category:
Colorado Adoption/Twin Study of Lifespan behavioral development & cognitive aging (CATSLife2)
- Grant number: 10432073
- Fiscal year: 2015
- Funding amount: $415,000
- Project category:
Colorado Adoption/Twin Study of Lifespan behavioral development & cognitive aging (CATSLife2)
- Grant number: 10260608
- Fiscal year: 2015
- Funding amount: $415,000
- Project category:
Colorado Adoption Project/Twin Study of Lifespan behavioral development & cognitive aging [CATSLife2]
- Grant number: 10856816
- Fiscal year: 2015
- Funding amount: $415,000
- Project category: