DARSaW: Developing, Assessing, and Refining Synthetic Sampling Weights to Improve Generalizability of the All of Us Research Program Data
Basic information
- Grant number: 10796237
- Principal investigator:
- Amount: $224,500
- Host institution:
- Host institution country: United States
- Project category:
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-09-17 to 2025-03-31
- Project status: Ongoing
- Source:
- Keywords: Affect; All of Us Research Program; American; Baseline Surveys; Biomedical Research; Calibration; Case Study; Censuses; Cohort Studies; Collaborations; Community Surveys; Compensation; Complex; Data; Data Set; Disclosure; Disease; Disparity; Effectiveness; Ethnic Origin; Gender; Geographic Locations; Geography; Goals; Housing; Hypertension; Individual; Literature; Longitudinal cohort; Masks; Methodology; Methods; National Health and Nutrition Examination Survey; Obesity; Participant; Phenotype; Population; Prevalence; Probability; Publishing; Race; Research Personnel; Risk; Sample Size; Sampling; Statistical Methods; Surveys; Target Populations; Testing; Underrepresented Populations; United States; Weight; Work; cohort; data resource; design; disability; effectiveness evaluation; improved; machine learning method; multimodal data; recruit; response; statistical learning; statistics
Project Summary
The All of Us Research Program (All of Us) is a large-scale initiative to collect and study
multimodal data from over one million participants living in the United States (U.S.). Studies
have shown significant disparities in disease prevalence between All of Us and the broader U.S.
population, potentially due to the overrepresentation of traditionally underrepresented groups.
The main obstacle to the representativeness of All of Us for the target U.S. population is that
the data are collected through a non-probability sample design. This proposal aims to leverage
two types of external data resources from the U.S. population to construct reliable Synthetic
sampling Weights (SaW) for All of Us to mimic a probabilistic sample design and improve
generalizability. The first external data resource, the National Health and Nutrition Examination
Survey (NHANES), provides a nationally representative dataset with validated sampling weights
and publicly available individual-level data. However, the NHANES sample size is relatively
small, which can result in under-coverage. The second external data resource, the U.S. Census
and the American Community Survey (ACS), consists of large-scale nationwide surveys that provide
richer but aggregated demographic and housing information about the U.S. population,
compensating for the limitation of NHANES. However, individual-level data are not available.
Utilizing the external data resources available in NHANES, the U.S. Census, and ACS, this
project will develop, assess, and refine Synthetic sampling Weights (DARSaW) to improve the
generalizability of All of Us to the target U.S. population. In Aim 1, we will develop the SaW for
All of Us by leveraging the individual-level data from the NHANES and rich but aggregated
summary statistics from the U.S. Census and the American Community Survey. In Aim 2, the
effectiveness of the SaW will be assessed through case studies, comparing unweighted and
SaW-weighted estimates of obesity, hypertension, and disability. We will iterate between Aims 1
and 2, refining the SaW where discrepancies appear by post-calibrating to broader and deeper
aggregated statistics from the target population. The goal of this proposal is to demonstrate the
ability of the SaW to improve the generalizability of the All of Us data, enabling researchers to
draw valid conclusions about the target U.S. population.
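
The post-calibration step and the unweighted versus weighted comparison described above can be illustrated with a small sketch. The summary does not say which calibration algorithm the project uses; the example below substitutes raking (iterative proportional fitting), a standard survey technique, purely for illustration. The function names, the toy cohort, and the target shares are all hypothetical and are not drawn from All of Us, NHANES, the Census, or the ACS.

```python
import numpy as np

def rake(weights, groups, targets, n_iter=100, tol=1e-10):
    """Raking (iterative proportional fitting), used here as a stand-in
    for the project's unspecified calibration method.

    weights : initial weights, shape (n,)
    groups  : dict of variable name -> integer category codes, shape (n,)
    targets : dict of variable name -> target population shares (sum to 1)
    """
    w = np.asarray(weights, dtype=float).copy()
    for _ in range(n_iter):
        max_shift = 0.0
        for name, codes in groups.items():
            codes = np.asarray(codes)
            target = np.asarray(targets[name], dtype=float)
            # Current weighted share of each category of this variable.
            current = np.bincount(codes, weights=w, minlength=len(target)) / w.sum()
            # Scale every unit so the weighted shares hit the target margin.
            factor = target / current
            w *= factor[codes]
            max_shift = max(max_shift, float(np.abs(factor - 1.0).max()))
        if max_shift < tol:  # all margins already match; stop early
            break
    return w

def weighted_prevalence(outcome, weights):
    """Weighted mean of a 0/1 outcome."""
    return float(np.average(outcome, weights=weights))

# Hypothetical toy cohort of 10 participants (not real data).
sex = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])    # 60% in category 0
age = np.array([0, 0, 1, 1, 1, 1, 0, 1, 1, 1])    # 30% in category 0
obese = np.array([1, 0, 1, 1, 0, 0, 0, 1, 0, 0])  # 0/1 outcome

# Hypothetical aggregated population shares to calibrate against.
targets = {"sex": [0.5, 0.5], "age": [0.4, 0.6]}

saw = rake(np.ones(len(sex)), {"sex": sex, "age": age}, targets)
unweighted = weighted_prevalence(obese, np.ones(len(obese)))
weighted = weighted_prevalence(obese, saw)
print(f"unweighted: {unweighted:.3f}, weighted: {weighted:.3f}")
```

Raking needs only marginal shares, which suits aggregated sources such as census tables, but it assumes every category is present in the sample; an empty category would cause a division by zero in this sketch.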
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Qingxia Chen
Time-dependent and bidirectional effect of oxidative stress - a missing piece of the free radical theory of cancer and its potential implications
- Grant number: 10520027
- Fiscal year: 2019
- Funding amount: $224,500
- Project category:
Time-dependent and bidirectional effect of oxidative stress - a missing piece of the free radical theory of cancer and its potential implications
- Grant number: 9887609
- Fiscal year: 2019
- Funding amount: $224,500
- Project category:
Time-dependent and bidirectional effect of oxidative stress - a missing piece of the free radical theory of cancer and its potential implications
- Grant number: 10312770
- Fiscal year: 2019
- Funding amount: $224,500
- Project category:
Time-dependent and bidirectional effect of oxidative stress - a missing piece of the free radical theory of cancer and its potential implications
- Grant number: 10063979
- Fiscal year: 2019
- Funding amount: $224,500
- Project category:
A New Approach to Correct Verification Bias Using Auxiliary Information
- Grant number: 8048932
- Fiscal year: 2010
- Funding amount: $224,500
- Project category:
A New Approach to Correct Verification Bias Using Auxiliary Information
- Grant number: 8207859
- Fiscal year: 2010
- Funding amount: $224,500
- Project category:
Similar overseas grants
Creating an advanced multi-ancestral resource and tools for short tandem repeat analysis in the AOURP researcher workbench
- Grant number: 10798717
- Fiscal year: 2023
- Funding amount: $224,500
- Project category:
Leveraging pleiotropy to develop polygenic risk scores for cardiometabolic diseases
- Grant number: 10797389
- Fiscal year: 2023
- Funding amount: $224,500
- Project category:
Genetic & Social Determinants of Health: Center for Admixture Science and Technology
- Grant number: 10818088
- Fiscal year: 2023
- Funding amount: $224,500
- Project category:
Improving cross ancestry polygenic prediction of tobacco and alcohol use
- Grant number: 10739557
- Fiscal year: 2023
- Funding amount: $224,500
- Project category:
Identifying and Addressing Social Determinants of Health to Reduce the National Burden of and Inequities in Dementia
- Grant number: 10597433
- Fiscal year: 2023
- Funding amount: $224,500
- Project category: