Topics in statistical analysis with missing data
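Basic information
- Grant number: RGPIN-2014-04051
- Principal investigator:
- Amount: $8,000
- Host institution:
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2018
- Funding country: Canada
- Period: 2018-01-01 to 2019-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract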
Regression models are widely used to study the association between a response variable and a set of covariates. Standard inference methods require that there is no selection bias and that the variables of selected subjects are fully observed without measurement error. In practice, however, many studies involve some combination of selection bias and missing or mismeasured data, arising either by design or by happenstance. Complex sampling designs, including the two-phase or multi-phase design (TMD) and the partial questionnaire design (PQD), are used to reduce the cost of data collection while improving data quality; such designs produce data with certain variables missing at random in a monotone pattern (e.g., TMD) or in nonmonotone patterns (e.g., PQD). Missingness by happenstance may occur in any study, including designed experiments; it is often not at random and follows nonmonotone patterns. The two objectives of the proposed research program are: (i) development and application of statistical methods for regression models with data missing (not) at random and data missing in arbitrary patterns, and (ii) investigation and application of optimal sampling strategies for two-phase and multi-phase studies and the PQD that maximize the information in the observed data.

Models explored under (i) will include generalized linear models, the Cox proportional hazards model and partial linear models in which certain variables (response or covariates) are missing and auxiliary information is sometimes available. Our recent research in this area, based on parametric and semiparametric models, has produced several promising results for both missing-response and missing-covariate problems. Parametric working regression models (Chen and Chen, 2000; Lawless and Kalbfleisch, 2011; Zhao et al., 2013) can efficiently use the information from incomplete observations and auxiliary variables to improve estimation efficiency and to produce unbiased estimates in the missing-at-random case and in certain missing-not-at-random cases. Both parametric and semiparametric procedures (Chen et al., 2011) can be developed to improve the consistency of the commonly used multiple imputation methods and to broaden their application. Semiparametric maximum likelihood estimation can be developed to deal with general missing data problems through a piece-wise nonparametric model (Zhao, 2009) for the joint distribution of the variables with missing values and auxiliary variables, either observed or constructed (Breslow et al., 2009).

Topics under (ii) will be investigated jointly with the estimation methods considered under (i). Our theoretical and numerical studies (Zhao et al., 2012) compared the balanced design, which is optimal in certain cases (Breslow and Cain, 1988), with sampling designs that minimize the variance of the fully parametric maximum likelihood estimator of a regression parameter of interest under normal models. They indicate that more efficient sampling designs for two-phase studies are possible and can be extended to multi-phase studies and the PQD, and that the optimal design strategies for normal models can be extended to more general settings, where they generally produce better results. In the proposed research we will investigate design strategies for multi-phase studies and the PQD in more general settings through both analytical and numerical methods. If general optimal design strategies can be successfully developed and implemented, they will be widely applicable in all fields of the natural sciences and engineering that involve data collection and analysis.
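The distinction the abstract draws between monotone patterns (as in a two-phase design) and nonmonotone patterns (as in a partial questionnaire design) can be illustrated with a small sketch. It is not taken from the project itself: the function names and the toy data are hypothetical, and the check relies on the usual definition that a pattern is monotone when the sets of observed variables are nested across subjects.

```python
from typing import Optional, Sequence

Row = Sequence[Optional[float]]  # None marks a missing value


def observed_sets(rows: Sequence[Row]) -> list[frozenset[int]]:
    """Return, for each subject, the set of column indices that are observed."""
    return [frozenset(j for j, v in enumerate(row) if v is not None) for row in rows]


def is_monotone(rows: Sequence[Row]) -> bool:
    """True if the distinct observed-column sets are nested (a monotone pattern)."""
    patterns = sorted(set(observed_sets(rows)), key=len)
    # Nested sets form a chain once sorted by size, so checking neighbours suffices.
    return all(a <= b for a, b in zip(patterns, patterns[1:]))


# Hypothetical two-phase data: columns are (response y, cheap covariate z,
# expensive covariate x); x is measured only for the phase-two subsample.
two_phase = [
    (1.2, 0.0, 3.1),
    (0.7, 1.0, None),
    (2.4, 1.0, 2.2),
    (1.9, 0.0, None),
]

# Hypothetical partial questionnaire data: subjects answer different blocks.
partial_questionnaire = [
    (1.2, 0.0, None),
    (0.7, None, 2.5),
    (2.4, 1.0, 2.2),
]

print(is_monotone(two_phase))              # True
print(is_monotone(partial_questionnaire))  # False
```

In the two-phase example every subject observes either (y, z) or (y, z, x), so the observed sets are nested; in the questionnaire example one subject lacks x and another lacks z, so no ordering of the variables makes the pattern monotone.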
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Zhao, Yang
Dinuclear Dysprosium and Ytterbium Complexes Incorporating N,N-Bis(pyrrolyl-alpha-methyl)-N-methylamine Ligand: Syntheses and Structures
- DOI:
- Published:
- Journal:
- Impact factor: 1.4
- Authors: Zhang, Chongguang; Cheng, Xiaojuan; Li, Yahong; Zhao, Yang; Zheng, Lina; Zhang, Yong; Zhang, Suyun; Zhou, Fengying
- Corresponding author: Zhou, Fengying
Measuring Micro/Meso Deformation Field of Geo-Materials with SEM and Digital Image Correlation Method
- DOI: 10.1166/asl.2011.1613
- Published: 2011-04
- Journal:
- Impact factor: 0
- Authors: Zuo, Jian-Ping; Zhao, Yang; Chai, Neng-Bin; Wang, Huai-Wen
- Corresponding author: Wang, Huai-Wen
Geological Characteristics of Low-Yield and Low-Efficiency CBM Wells and Practical Measures for Production Increase in the Qinshui Basin
- DOI: 10.1021/acsomega.3c05358
- Published: 2023-12-19
- Journal:
- Impact factor: 4.1
- Authors: Zhang, Zhou; Ren, Junshan; Zhao, Yang; Wang, Meizhu; Yang, Jiaosheng; Zhang, Cong
- Corresponding author: Zhang, Cong
Energy density response prediction of structures with uncertainty using interval analysis method
- DOI:
- Published: 2012
- Journal:
- Impact factor: 0
- Authors: Zhao, Yang; Wang, Kun
- Corresponding author: Wang, Kun
Measuring the Non-Linear Relationship between Three-Dimensional Built Environment and Urban Vitality Based on a Random Forest Model
- DOI: 10.3390/ijerph20010734
- Published: 2022-12-30
- Journal:
- Impact factor: 0
- Authors: Lin, Jinyao; Zhuang, Yaye; Zhao, Yang; Li, Hua; He, Xiaoyu; Lu, Siyan
- Corresponding author: Lu, Siyan
Other grants by Zhao, Yang
Development of High-Performance and Safe Next-Generation All-Solid-State Na Batteries
- Grant number: RGPIN-2021-03392
- Fiscal year: 2022
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Methods for statistical analysis with nonmonotone missing at random data
- Grant number: DDG-2022-00021
- Fiscal year: 2022
- Amount: $8,000
- Program: Discovery Development Grant
Ultrasonic Scanner for Solid-state Battery Interface Stability Study
- Grant number: RTI-2022-00509
- Fiscal year: 2021
- Amount: $8,000
- Program: Research Tools and Instruments
Development of High-Performance and Safe Next-Generation All-Solid-State Na Batteries
- Grant number: RGPIN-2021-03392
- Fiscal year: 2021
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Development of High-Performance and Safe Next-Generation All-Solid-State Na Batteries
- Grant number: DGECR-2021-00339
- Fiscal year: 2021
- Amount: $8,000
- Program: Discovery Launch Supplement
Topics in statistical analysis with missing data
- Grant number: RGPIN-2014-04051
- Fiscal year: 2017
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in statistical analysis with missing data
- Grant number: RGPIN-2014-04051
- Fiscal year: 2016
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in statistical analysis with missing data
- Grant number: RGPIN-2014-04051
- Fiscal year: 2015
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in statistical analysis with missing data
- Grant number: RGPIN-2014-04051
- Fiscal year: 2014
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in statistical analysis with incomplete data
- Grant number: 327037-2009
- Fiscal year: 2013
- Amount: $8,000
- Program: Discovery Grants Program - Individual
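Similar NSFC grants
Research on wireless opportunistic scheduling algorithms based on stochastic network calculus
- Grant number: 60702009
- Approval year: 2007
- Amount: 240,000 yuan
- Program: Young Scientists Fund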
Similar overseas grants
Topics in statistical analysis with missing data
- Grant number: RGPIN-2014-04051
- Fiscal year: 2017
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in statistical analysis with missing data
- Grant number: RGPIN-2014-04051
- Fiscal year: 2016
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in statistical analysis with missing data
- Grant number: RGPIN-2014-04051
- Fiscal year: 2015
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in statistical analysis with missing data
- Grant number: RGPIN-2014-04051
- Fiscal year: 2014
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in statistical analysis with incomplete data
- Grant number: 327037-2009
- Fiscal year: 2013
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in statistical analysis with incomplete data
- Grant number: 327037-2009
- Fiscal year: 2012
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in statistical analysis with incomplete data
- Grant number: 327037-2009
- Fiscal year: 2011
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in statistical analysis with incomplete data
- Grant number: 327037-2009
- Fiscal year: 2010
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in statistical analysis with incomplete data
- Grant number: 327037-2009
- Fiscal year: 2009
- Amount: $8,000
- Program: Discovery Grants Program - Individual
Topics in life history analysis and statistical inference for incomplete data
- Grant number: 8597-1998
- Fiscal year: 2001
- Amount: $8,000
- Program: Discovery Grants Program - Individual