Statistical Models and Diagnostics Tools for Spatially Correlated Skewed and Heterogeneous Data
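Basic information
- Grant number: RGPIN-2019-07212
- Principal investigator:
- Amount: $11.7K
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2021
- Funding country: Canada
- Duration: 2021-01-01 to 2022-12-31
- Project status: Completed
- Source:
- Keywords: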
Project abstract
Skewed and heterogeneous data are common in applied research, for example the abundance of a species in relation to habitat suitability in ecological studies, or the number of hospitalizations in health services research. Modelling such data is further complicated when the data are geographically clustered because of unmeasured regional characteristics, or are collected repeatedly over time. Logarithmic or square-root transformations are often used to achieve approximate normality so that linear regression can be applied to the transformed data. However, in some contexts the data contain both an abundance of zeros and extremely high values, so normality may not be achievable by any transformation. Moreover, the transformed response variable is analysed on a different scale, which can mask important information about the original response variable. Survival data are also often highly skewed and can involve a cure fraction (i.e., a substantial proportion of subjects who never fail) as well as complex event types (e.g., multi-state or competing-risk events). In recent years, there has been growing interest in capturing spatial patterns in survival times to identify the factors that contribute to such variability. Traditional parametric survival models cannot account for a cure fraction, multiple event types, and spatial correlation at the same time. To fill the gap in analytical methods for handling such disparate data, one key focus of my research program is to develop models for spatially correlated skewed outcomes that do not require transformation of the data but can instead be applied to the data on their original scale. The proposed modelling methods will improve model prediction and the accuracy of parameter estimates.
Model diagnostics are an essential step in ensuring the validity of a model, but diagnosing models for skewed and heterogeneous data is very challenging when the data are discrete or incomplete because of censoring, partly because traditional residuals have complicated reference distributions that depend on the model parameters. To fill this gap, we recently extended randomized quantile residuals to diagnose zero-inflated mixed-effects models and parametric survival models. This method will be developed further to diagnose spatial and spatio-temporal models and spatial survival models. The extended diagnostic methods will guide researchers in developing better models and drawing more reliable conclusions from their data. The proposed theoretical work is driven by real-life projects and is applicable to various areas of applied research. The research outcomes are anticipated to contribute to statistical theory and practice by providing feasible, efficient, and robust approaches, and the associated training will produce highly qualified statisticians. R packages will be made available to help disseminate and implement the proposed models for a wide audience.
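As a rough illustration of the randomized quantile residual idea mentioned in the abstract, the sketch below computes such residuals for a simple zero-inflated Poisson model. It is not the project's implementation: the model, the assumption that the parameters are known rather than estimated, the use of Python instead of R, and every function name (`zip_cdf`, `randomized_quantile_residual`, `simulate_zip`, `residual_summary`) are assumptions made for this example. For a discrete response y with fitted distribution function F, a value u is drawn uniformly between F(y - 1) and F(y) and mapped through the standard normal quantile function, so that the residuals are approximately standard normal when the model is correct.

```python
import math
import random
from statistics import NormalDist, mean, pstdev


def poisson_cdf(y, lam):
    """Return P(Y <= y) for Y ~ Poisson(lam); 0 when y < 0."""
    if y < 0:
        return 0.0
    term = math.exp(-lam)  # P(Y = 0)
    total = term
    for k in range(1, y + 1):
        term *= lam / k  # P(Y = k) computed from P(Y = k - 1)
        total += term
    return min(total, 1.0)


def zip_cdf(y, pi_zero, lam):
    """CDF of a zero-inflated Poisson with extra-zero probability pi_zero."""
    if y < 0:
        return 0.0
    return pi_zero + (1.0 - pi_zero) * poisson_cdf(y, lam)


def randomized_quantile_residual(y, pi_zero, lam, rng):
    """Draw u uniformly between F(y - 1) and F(y), then map to a normal quantile."""
    lower = zip_cdf(y - 1, pi_zero, lam)
    upper = zip_cdf(y, pi_zero, lam)
    u = lower + rng.random() * (upper - lower)
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep inv_cdf strictly inside (0, 1)
    return NormalDist().inv_cdf(u)


def simulate_zip(n, pi_zero, lam, rng):
    """Simulate n zero-inflated Poisson counts (Knuth's method for the Poisson part)."""
    threshold = math.exp(-lam)
    counts = []
    for _ in range(n):
        if rng.random() < pi_zero:
            counts.append(0)  # structural zero
            continue
        k, p = 0, rng.random()
        while p > threshold:
            k += 1
            p *= rng.random()
        counts.append(k)
    return counts


def residual_summary(counts, pi_zero, lam, rng):
    """Mean and standard deviation of the residuals under the assumed model."""
    res = [randomized_quantile_residual(y, pi_zero, lam, rng) for y in counts]
    return mean(res), pstdev(res)


if __name__ == "__main__":
    rng = random.Random(2021)
    counts = simulate_zip(5000, pi_zero=0.4, lam=3.0, rng=rng)
    print("true ZIP model:", residual_summary(counts, 0.4, 3.0, rng))
    print("plain Poisson :", residual_summary(counts, 0.0, mean(counts), rng))
```

Under the data-generating model the residual mean and standard deviation should be close to 0 and 1. Under the plain Poisson fit, which ignores the excess zeros, the standard deviation should be noticeably larger than 1 in this setup, and that is the kind of departure a diagnostic plot of these residuals is meant to reveal.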
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by Feng, Cindy
Statistical Models and Diagnostics Tools for Spatially Correlated Skewed and Heterogeneous Data
- Grant number: RGPIN-2019-07212
- Fiscal year: 2022
- Funding amount: $11.7K
- Project category: Discovery Grants Program - Individual
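Similar grants from the National Natural Science Foundation of China
Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
- Grant number:
- Approval year: 2024
- Funding amount:
- Project category: Collaborative Innovation Research Team
Synthesis and Biochemical Simulation of Novel Chiral NAD(P)H Models
- Grant number: 20472090
- Approval year: 2004
- Funding amount: CNY 230K
- Project category: General Program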
Similar overseas grants
Development of datasets, inverse models, and methods for adaptive fault detection and diagnostics in commercial buildings
- Grant number: RGPIN-2017-06317
- Fiscal year: 2022
- Funding amount: $11.7K
- Project category: Discovery Grants Program - Individual
Statistical Models and Diagnostics Tools for Spatially Correlated Skewed and Heterogeneous Data
- Grant number: RGPIN-2019-07212
- Fiscal year: 2022
- Funding amount: $11.7K
- Project category: Discovery Grants Program - Individual
Development of datasets, inverse models, and methods for adaptive fault detection and diagnostics in commercial buildings
- Grant number: RGPIN-2017-06317
- Fiscal year: 2021
- Funding amount: $11.7K
- Project category: Discovery Grants Program - Individual
Machinery Diagnostics Using Mechanistic and Data-Driven Models
- Grant number: RGPIN-2017-04788
- Fiscal year: 2021
- Funding amount: $11.7K
- Project category: Discovery Grants Program - Individual
Changes in the ocean's biological pump: innovative models and diagnostics
- Grant number: DP210101650
- Fiscal year: 2021
- Funding amount: $11.7K
- Project category: Discovery Projects
Diagnostics and advanced models for the reduction of unplanned underground conductor failures
- Grant number: 543705-2019
- Fiscal year: 2021
- Funding amount: $11.7K
- Project category: Collaborative Research and Development Grants
Development of datasets, inverse models, and methods for adaptive fault detection and diagnostics in commercial buildings
- Grant number: RGPIN-2017-06317
- Fiscal year: 2020
- Funding amount: $11.7K
- Project category: Discovery Grants Program - Individual
Diagnostics and advanced models for the reduction of unplanned underground conductor failures
- Grant number: 543705-2019
- Fiscal year: 2020
- Funding amount: $11.7K
- Project category: Collaborative Research and Development Grants
Machinery Diagnostics Using Mechanistic and Data-Driven Models
- Grant number: RGPIN-2017-04788
- Fiscal year: 2020
- Funding amount: $11.7K
- Project category: Discovery Grants Program - Individual
Statistical Models and Diagnostics Tools for Spatially Correlated Skewed and Heterogeneous Data
- Grant number: RGPIN-2019-07212
- Fiscal year: 2020
- Funding amount: $11.7K
- Project category: Discovery Grants Program - Individual