Data science tools to identify robust exposure-phenotype associations for precision medicine
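Basic information
- Grant number: 10487388
- Principal investigator:
- Amount: $652,000
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2021
- Funding country: United States
- Project period: 2021-09-10 to 2026-06-30
- Project status: Not yet concluded
- Source: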
- Keywords: Address; All of Us Research Program; Big Data; Biological; Biological Factors; Biological Markers; Cardiology; Catalogs; Centers for Disease Control and Prevention (U.S.); Cohort Studies; Communities; Complex; Country; Data; Data Science; Data Set; Demographic Factors; Deposition; Diabetes Mellitus; Diet and Nutrition; Disadvantaged; Disease; Environment; Environmental Exposure; Environmental Risk Factor; Epidemiology; Etiology; Exhibits; Goals; Health; Heart Diseases; Human; Incidence; Lead; Libraries; Link; Literature; Machine Learning; Malignant Neoplasms; Measurement; Measures; Meta-Analysis; Metadata; Methods; Modeling; National Health and Nutrition Examination Survey; National Institute of Environmental Health Sciences; Observational Study; Phenotype; Pollution; Population; Population Heterogeneity; Process; Reproducibility; Research Design; Research Personnel; Resources; Risk Factors; Role; Sample Size; Sampling; Testing; Time; Translations; United States National Institutes of Health; Variant; analytical method; base; biobank; cohort; data resource; deep learning; disease disparity; disease phenotype; disorder risk; environmental health disparity; feature selection; genetic risk factor; health difference; health disparity; hypercholesterolemia; machine learning method; novel; phenome; precision medicine; scale up; tool; vibration
Project Summary/Abstract
Phenotypic variability across demographically diverse populations is driven by environmental factors. The
overall goal of this proposal is to deploy data science approaches to drive discovery of associations between
exposures (E) and phenotypes (P) in demographically diverse populations. We lack data science methods to
associate, replicate, and prioritize exposure variables of the exposome (E) in phenotypes (P) and disease
incidence (D), required for the delivery of precision medicine. Observational studies are fraught with 4 unsolved
data science challenges. First, (1) E-based studies are limited to associating a few hypothesized exposure-
phenotype pairs (E-P) at a time, leading to a fragmented literature of environmental associations. Second, machine
learning (ML) approaches for feature selection and prediction hold promise; however, (2) most extant E-based
cohorts contain missing data, challenging the use of ML to detect complex E-P associations. Third, (3) biases,
such as confounding and study design, influence associations and hinder translation. Fourth, (4) there are few
well-powered data resources that systematically document longitudinal E-P and E-D associations across
massive precision medicine. It is a challenge to systematically associate a number of exposures in multiple
phenotypes and replicate these associations across cohorts (Aim 1). The “vibration of effects”, or the degree
to which associations change as a function of study design (e.g., analytic method, sample size) and model
choice, is a hidden bias in observational studies (Aim 2). Third, an outstanding question is the degree to which
environmental differences lead to health disparities. To address these challenges and gaps, we propose three aims. Aim
1: Develop and test machine learning methods to associate multiple environmental exposure indicators with
multiple phenotypes: EP-WAS. We hypothesize that exposures will explain a significant amount of variation in
phenotype in populations and will deposit all data and models in a novel EP-WAS Catalog. Aim 2: Quantitate
how study design influences associations between exposure biomarkers and phenotype. We will scale up,
extend, and test a method called “vibration of effects” (VoE) to measure how study criteria influence the
stability of associations (how reproducible associations are as a function of analytic choice). Aim 3: Leverage
EP-WAS and VoE to disentangle biological, demographic, and environmental influences of phenotypic
disparities in hypercholesterolemia. We will deploy EP-WAS and VoE packaged libraries in the largest cohort
study to partition phenotypic variation across demographic groups in factors for hypercholesterolemia. We will
equip the biomedical community with data science approaches for robust data-driven discovery and
interpretation of exposure-phenotype factors in observational datasets, required for the identification of
environmental health disparities. For the first time, investigators will ascertain the collective role of the
environment in heart disease at scale just in time for the All of Us program.
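
As an illustration of the "vibration of effects" idea described in the abstract (how an exposure-phenotype association changes with analytic choices), the following is a minimal sketch and not the investigators' VoE library. It assumes a simple linear model, uses synthetic data, and varies only the set of adjustment covariates; the function names (`vibration_of_effects`, `summarize`) and the variables `age` and `income` are hypothetical.

```python
import itertools
import numpy as np


def ols_coef(y, X):
    """Return ordinary least squares coefficients for y ~ X.

    X must already contain an intercept column.
    """
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta


def vibration_of_effects(y, exposure, covariates):
    """Estimate the exposure coefficient under every adjustment set.

    covariates: dict mapping a (hypothetical) covariate name to a 1-D array.
    Returns a list of (adjustment_set, exposure_coefficient) tuples.
    """
    names = sorted(covariates)
    n = len(y)
    results = []
    for k in range(len(names) + 1):
        for subset in itertools.combinations(names, k):
            # Design matrix: intercept, exposure, then the chosen covariates.
            cols = [np.ones(n), exposure] + [covariates[c] for c in subset]
            beta = ols_coef(y, np.column_stack(cols))
            results.append((subset, float(beta[1])))
    return results


def summarize(results):
    """Summarize how much the exposure estimate moves across models."""
    coefs = np.array([b for _, b in results])
    return {
        "n_models": len(coefs),
        "min": float(coefs.min()),
        "max": float(coefs.max()),
        "range": float(coefs.max() - coefs.min()),
        "sign_flip": bool(coefs.min() < 0 < coefs.max()),
    }


# Synthetic example (assumed data, for illustration only):
# age confounds the exposure-phenotype relationship; income is unrelated.
rng = np.random.default_rng(0)
n = 2000
age = rng.normal(size=n)
income = rng.normal(size=n)
exposure = 0.8 * age + rng.normal(size=n)
phenotype = 0.3 * exposure + 0.5 * age + rng.normal(size=n)

results = vibration_of_effects(phenotype, exposure, {"age": age, "income": income})
summary = summarize(results)
for adjustment_set, coef in results:
    print(adjustment_set, round(coef, 3))
print(summary)
```

In this toy setup the exposure estimate is inflated when age is left out of the model and moves toward the simulated value of 0.3 once age is included; the spread across adjustment sets is one simple way to express how much an association "vibrates" with model choice.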
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by ARJUN KUMAR MANRAI
Data science tools to identify robust exposure-phenotype associations for precision medicine
- Grant number: 10705899
- Fiscal year: 2022
- Funding amount: $652,000
- Project category:
Precision Cardiovascular Medicine for Multi-Ethnic Populations
- Grant number: 10582991
- Fiscal year: 2022
- Funding amount: $652,000
- Project category:
Data science tools to identify robust exposure-phenotype associations for precision medicine
- Grant number: 10653214
- Fiscal year: 2021
- Funding amount: $652,000
- Project category:
Data science tools to identify robust exposure-phenotype associations for precision medicine
- Grant number: 10874056
- Fiscal year: 2021
- Funding amount: $652,000
- Project category:
Data science tools to identify robust exposure-phenotype associations for precision medicine
- Grant number: 10095924
- Fiscal year: 2021
- Funding amount: $652,000
- Project category:
Precision Cardiovascular Medicine for Multi-Ethnic Populations
- Grant number: 9917879
- Fiscal year: 2018
- Funding amount: $652,000
- Project category:
Similar overseas grants
The Illinois Precision Medicine Consortium (IPMC) All of Us Research Program Site
- Grant number: 10872859
- Fiscal year: 2023
- Funding amount: $652,000
- Project category:
Nutrition for Precision Health, powered by the All of Us Research Program: Research Coordinating Center
- Grant number: 10874354
- Fiscal year: 2023
- Funding amount: $652,000
- Project category:
All of Us Research Program Trans-America Consortium of the HCSRN
- Grant number: 10871074
- Fiscal year: 2023
- Funding amount: $652,000
- Project category:
All of Us Research Program Heartland Consortium (AoURP-HC)
- Grant number: 10871732
- Fiscal year: 2023
- Funding amount: $652,000
- Project category:
DARSaW: Developing, Assessing, and Refining Synthetic Sampling Weights to Improve Generalizability of the All of Us Research Program Data
- Grant number: 10796237
- Fiscal year: 2023
- Funding amount: $652,000
- Project category:
Engaging Diverse Stakeholders in Genomic/Precision Medicine Research: The All of Us Research Program Engagement Core
- Grant number: 10789515
- Fiscal year: 2023
- Funding amount: $652,000
- Project category:
Investigation of the social context and physical environment on cardiovascular disease disparities in the All of Us Research Program
- Grant number: 10798725
- Fiscal year: 2023
- Funding amount: $652,000
- Project category:
The Participant Center: Empowering All of Us Research Program participation across the United States
- Grant number: 10774158
- Fiscal year: 2023
- Funding amount: $652,000
- Project category:
Nutrition for Precision Health, powered by the All of Us Research Program: Research Coordinating Center
- Grant number: 10757488
- Fiscal year: 2023
- Funding amount: $652,000
- Project category:
Multilevel analyses of oral health conditions among older adults in the All of Us Research Program
- Grant number: 10658463
- Fiscal year: 2022
- Funding amount: $652,000
- Project category:













