Sampling designs and statistical methods for incomplete data analysis

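Basic information

  • Grant number:
    RGPIN-2014-04904
  • Principal investigator:
  • Amount:
    $10,900
  • Host institution:
  • Host institution country:
    Canada
  • Project category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2016
  • Funding country:
    Canada
  • Duration:
    2016-01-01 to 2017-12-31
  • Project status:
    Completed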

Project abstract

Many studies involve incomplete data, which may arise from the study design or from missingness by happenstance. For example, budgetary constraints may prevent measuring expensive covariates for all individuals in a cohort, which leads us to consider appropriate sampling designs and yields incomplete data as a result of the design. The main goal of this research proposal is to identify efficient sampling designs for such situations across different data type settings, and to develop, evaluate and apply statistical methods for incomplete data analysis. To limit the sample size needed to obtain values of expensive covariates while retaining adequate power for the corresponding association tests, the best solution is to obtain them under a cost-efficient sampling design and to use efficient statistical methods that lead to efficient estimates and powerful association tests. It is known that selecting an informative subset of individuals from an existing cohort based on response-dependent sampling can improve the cost-efficiency of studies. We will consider outcome-dependent two-phase sampling designs: in phase one, we have easily measured variables for all individuals in the cohort or in a large random sample from the population, and in phase two, we obtain expensive variables for a subset of individuals selected according to their response variable obtained in phase one. In response-dependent sampling designs, inference based on standard statistical methods that ignore the selection may be misleading. There have been many studies in the literature on developing methods to efficiently analyze response-dependent multi-phase sampling designs. However, there has not been sufficient work on identifying efficient outcome-dependent multi-phase sampling designs.
The objective of this proposal is to develop analytic and simulation-based approaches to compare various sampling designs under each proposed method according to the allocation of the phase two samples, the distribution of the expensive covariate and the associated effect size, and to check the robustness of the methods under misspecification of model assumptions. We will consider response-dependent sampling designs under different data type settings. For example, the response variable can be a time-to-event variable, which may be censored rather than completely observed for some individuals; or there can be multiple continuous uncensored or time-to-event response variables, and the sampling in the second phase may depend on multiple response variables. This study will help address how to optimally sample subjects to obtain the best power to identify the associations between response variable(s) and expensive covariate(s) for a given sample size, and which method of analysis leads to more powerful association tests under specified modeling assumptions. It will also help identify sampling designs and statistical methods that are less than optimal but may be more robust to model misspecification. Hence, it is anticipated that we will gain a better understanding of cost-efficient sampling designs, which will help reduce the costs of many research studies in Natural Sciences, Social Sciences and Health Sciences and will therefore benefit the economy of Canada. In addition, statistical methods will be developed under complex modeling assumptions for response-selective problems, which have not been considered in depth in the literature.
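
The outcome-dependent two-phase design described in the abstract can be illustrated with a small simulation. The sketch below is not taken from the proposal: the linear model, the cut-offs, the selection probabilities and the function names (`simulate_cohort`, `phase_two_sample`, `fit_slope`) are all assumptions chosen for illustration. Inverse-probability weighting is used only as one familiar way to account for the selection; the abstract does not say which analysis methods the project adopts.

```python
import random


def simulate_cohort(n, beta0, beta1, rng):
    """Phase one: draw (x, y) for every cohort member, y = beta0 + beta1*x + noise."""
    cohort = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # "expensive" covariate, assumed standard normal
        y = beta0 + beta1 * x + rng.gauss(0.0, 1.0)  # cheap response seen for all
        cohort.append((x, y))
    return cohort


def phase_two_sample(cohort, low, high, p_tail, p_middle, rng):
    """Phase two: keep each individual with a probability that depends only on y."""
    sample = []
    for x, y in cohort:
        # Assumed design: oversample the tails of the response distribution.
        p = p_tail if (y < low or y > high) else p_middle
        if rng.random() < p:
            sample.append((x, y, 1.0 / p))  # keep the inverse selection probability
    return sample


def fit_slope(sample, use_weights):
    """Least-squares slope of y on x, optionally weighted by 1 / selection probability."""
    weights = [w if use_weights else 1.0 for _, _, w in sample]
    total = sum(weights)
    x_bar = sum(w * x for w, (x, _, _) in zip(weights, sample)) / total
    y_bar = sum(w * y for w, (_, y, _) in zip(weights, sample)) / total
    s_xy = sum(w * (x - x_bar) * (y - y_bar) for w, (x, y, _) in zip(weights, sample))
    s_xx = sum(w * (x - x_bar) ** 2 for w, (x, _, _) in zip(weights, sample))
    return s_xy / s_xx


if __name__ == "__main__":
    rng = random.Random(2016)  # fixed seed so the run is repeatable
    cohort = simulate_cohort(20000, 0.0, 0.5, rng)  # true slope assumed to be 0.5
    sample = phase_two_sample(cohort, -1.0, 1.0, 0.9, 0.1, rng)
    print("phase-two sample size:", len(sample))
    print("naive slope:", round(fit_slope(sample, use_weights=False), 3))
    print("weighted slope:", round(fit_slope(sample, use_weights=True), 3))
```

Under these assumed settings the unweighted fit tends to overstate the slope, because individuals with extreme responses are oversampled, while the weighted fit should land close to the simulated value of 0.5. Comparing designs, as the abstract proposes, would amount to repeating such a simulation over different allocations of the phase two sample.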

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other grants held by Yilmaz, Yildiz

Sampling Designs and Statistical Methods for the Analysis of Complex Life History and Genetic Data
  • Grant number:
    RGPIN-2020-05528
  • Fiscal year:
    2022
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling Designs and Statistical Methods for the Analysis of Complex Life History and Genetic Data
  • Grant number:
    RGPIN-2020-05528
  • Fiscal year:
    2021
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling Designs and Statistical Methods for the Analysis of Complex Life History and Genetic Data
  • Grant number:
    RGPIN-2020-05528
  • Fiscal year:
    2020
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling designs and statistical methods for incomplete data analysis
  • Grant number:
    RGPIN-2014-04904
  • Fiscal year:
    2019
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling designs and statistical methods for incomplete data analysis
  • Grant number:
    RGPIN-2014-04904
  • Fiscal year:
    2018
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling designs and statistical methods for incomplete data analysis
  • Grant number:
    RGPIN-2014-04904
  • Fiscal year:
    2017
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling designs and statistical methods for incomplete data analysis
  • Grant number:
    RGPIN-2014-04904
  • Fiscal year:
    2015
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling designs and statistical methods for incomplete data analysis
  • Grant number:
    RGPIN-2014-04904
  • Fiscal year:
    2014
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual

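Similar National Natural Science Foundation of China (NSFC) grants

Regularity of graphs and cellular algebras
  • Grant number:
    10871027
  • Approval year:
    2008
  • Funding amount:
    230,000 yuan
  • Project category:
    General Program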

Similar overseas grants

Sampling Designs and Statistical Methods for the Analysis of Complex Life History and Genetic Data
  • Grant number:
    RGPIN-2020-05528
  • Fiscal year:
    2022
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling Designs and Statistical Methods for the Analysis of Complex Life History and Genetic Data
  • Grant number:
    RGPIN-2020-05528
  • Fiscal year:
    2021
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling Designs and Statistical Methods for the Analysis of Complex Life History and Genetic Data
  • Grant number:
    RGPIN-2020-05528
  • Fiscal year:
    2020
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling designs and statistical methods for incomplete data analysis
  • Grant number:
    RGPIN-2014-04904
  • Fiscal year:
    2019
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling designs and statistical methods for incomplete data analysis
  • Grant number:
    RGPIN-2014-04904
  • Fiscal year:
    2018
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling designs and statistical methods for incomplete data analysis
  • Grant number:
    RGPIN-2014-04904
  • Fiscal year:
    2017
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling designs and statistical methods for incomplete data analysis
  • Grant number:
    RGPIN-2014-04904
  • Fiscal year:
    2015
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Sampling designs and statistical methods for incomplete data analysis
  • Grant number:
    RGPIN-2014-04904
  • Fiscal year:
    2014
  • Funding amount:
    $10,900
  • Project category:
    Discovery Grants Program - Individual
Statistical Designs and Methods for Double-Sampling for HIV/AIDS
  • Grant number:
    8604137
  • Fiscal year:
    2013
  • Funding amount:
    $10,900
  • Project category:
Statistical Designs and Methods for Double-Sampling for HIV/AIDS
  • Grant number:
    8541216
  • Fiscal year:
    2013
  • Funding amount:
    $10,900
  • Project category: