Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
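Basic information
- Grant number: RGPIN-2018-04558
- Principal investigator:
- Amount: $13,100
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2020
- Funding country: Canada
- Period: 2020-01-01 to 2021-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract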
Discrete data in the form of counts or proportions arise in many fields of study, such as epidemiology, biostatistics, medical and public health sciences, environmental studies and social sciences. These data often exhibit over-dispersion (the variance is larger than a simple model, such as the binomial or the Poisson model, can predict) and zero-inflation (there are more zero counts than a simple model can predict).
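As a rough illustration of the two phenomena just described, the sketch below simulates plain Poisson counts and zero-inflated Poisson counts and computes two simple diagnostics: the variance-to-mean ratio (about 1 under a Poisson model) and the observed share of zeros minus the share a Poisson model with the same mean would predict. The function names, the rate of 3.0 and the 30% zero-inflation probability are assumptions chosen for illustration; they are not taken from the project.

```python
import math
import random
import statistics


def poisson_draw(rng, lam):
    """Draw one Poisson(lam) count with Knuth's multiplication method (fine for small lam)."""
    limit = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def dispersion_index(counts):
    """Sample variance divided by sample mean; values well above 1 suggest over-dispersion."""
    return statistics.variance(counts) / statistics.mean(counts)


def excess_zero_fraction(counts):
    """Observed share of zeros minus the share predicted by a Poisson model with the same mean."""
    observed = sum(1 for c in counts if c == 0) / len(counts)
    expected = math.exp(-statistics.mean(counts))
    return observed - expected


rng = random.Random(0)  # fixed seed so the run is reproducible
plain = [poisson_draw(rng, 3.0) for _ in range(5000)]
# Zero-inflated Poisson: with probability 0.3 the count is a structural zero.
zero_inflated = [0 if rng.random() < 0.3 else poisson_draw(rng, 3.0) for _ in range(5000)]

print("Poisson:       index =", round(dispersion_index(plain), 2),
      " excess zeros =", round(excess_zero_fraction(plain), 3))
print("Zero-inflated: index =", round(dispersion_index(zero_inflated), 2),
      " excess zeros =", round(excess_zero_fraction(zero_inflated), 3))
```

In this toy setup zero-inflation alone also produces over-dispersion, which is one reason the two features are usually modelled together.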
Regression analysis of discrete data can be further complicated by missing values in the response variable and/or in the explanatory variables (covariates). If the missingness depends on neither the observed nor the unobserved data, the data are called missing completely at random (MCAR). If the missing data mechanism depends only on observed data, the data are missing at random (MAR). MAR is also known as ignorable missingness; that is, in this case the missing data mechanism can be ignored. If the missing data mechanism depends on both observed and unobserved data, that is, failure to observe a value depends on the value that would have been observed, the data are called missing not at random (MNAR), in which case the missingness is nonignorable.
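The three mechanisms can be made concrete with a small simulation. The sketch below is a toy setup, not the project's model: a covariate x is always observed, the response y = x + noise has true mean 0, and y is deleted with a probability that is constant (MCAR), depends on x only (MAR), or depends on y itself (MNAR). The function name, the logistic form of the missingness probability and all constants are assumptions made for illustration.

```python
import math
import random
import statistics


def complete_case_mean(mechanism, n=20000, seed=1):
    """Mean of the observed responses y under a given missingness mechanism."""
    rng = random.Random(seed)
    observed_y = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)          # covariate, always observed
        y = x + rng.gauss(0.0, 1.0)      # response, true mean 0
        if mechanism == "MCAR":
            p_missing = 0.4                              # unrelated to any data
        elif mechanism == "MAR":
            p_missing = 1.0 / (1.0 + math.exp(-2.0 * x))  # depends on observed x only
        elif mechanism == "MNAR":
            p_missing = 1.0 / (1.0 + math.exp(-2.0 * y))  # depends on the value that may go missing
        else:
            raise ValueError(f"unknown mechanism: {mechanism}")
        if rng.random() >= p_missing:
            observed_y.append(y)
    return statistics.mean(observed_y)


for mechanism in ("MCAR", "MAR", "MNAR"):
    print(mechanism, round(complete_case_mean(mechanism), 3))
```

In this sketch the naive complete-case mean stays near 0 under MCAR but is pulled below 0 under both MAR and MNAR. Under MAR the bias can in principle be removed by using the observed x, whereas under MNAR it cannot be removed without modelling the missingness mechanism.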
Longitudinal data (count/binary/continuous/survival) are frequently encountered in many subject-matter areas such as epidemiology, biostatistics, medical and public health sciences, environmental studies and social sciences. Longitudinal studies are characterized by observing the same variables repeatedly over a period of time. Usually the subjects are assumed to be independent, while the collected observations of the same subject are correlated.
Further, model selection (selecting the regression variables that contribute most) in large (big) data sets with many explanatory variables is important, as in practice interpreting results from a simple model is much easier.
In this research I will develop estimation procedures for discrete data regression models (involving over-dispersion, zero-inflation, missing responses and measurement errors in covariates), model selection procedures, and small-sample bias corrections of parameter estimates, in the longitudinal setup or otherwise.
In many applied fields it is sometimes necessary to compare the effectiveness of one procedure with that of another (two drugs, two teaching methods, two fertilizers, etc.). For example, under two biologically different conditions we are often interested in identifying differentially expressed genes. When a large number of genes must be filtered or ranked, the assumption of equal variances in the two groups is often violated for many of them. In these cases exact tests are unavailable. In this research I plan to develop approximate procedures and compare them with existing procedures under different assumptions regarding the data distribution (normal, negative binomial, beta-binomial, Weibull, gamma, etc.).
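For context, one well-known existing approximate procedure for the normal case with unequal variances is Welch's t-test, which replaces the pooled variance with separate variance estimates and uses the Welch-Satterthwaite approximation for the degrees of freedom. The sketch below computes only the statistic and the approximate degrees of freedom; it is not one of the procedures this project proposes, and the function name and sample values are made up for illustration.

```python
import math
import statistics


def welch_statistic(sample_a, sample_b):
    """Return Welch's t statistic and the Welch-Satterthwaite degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    # Squared standard errors of the two sample means, estimated separately.
    se2_a = statistics.variance(sample_a) / n_a
    se2_b = statistics.variance(sample_b) / n_b
    t = (statistics.mean(sample_a) - statistics.mean(sample_b)) / math.sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical measurements from two groups with visibly different spread.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]
t_stat, approx_df = welch_statistic(group_a, group_b)
print(f"t = {t_stat:.3f}, approximate df = {approx_df:.2f}")
```

The statistic would then be referred to a t distribution with the (generally non-integer) approximate degrees of freedom, which always lie between the smaller of the two group sizes minus one and the combined size minus two.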
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Paul, Sudhir
Constitutive Production of Catalytic Antibodies to a Staphylococcus aureus Virulence Factor and Effect of Infection
- DOI: 10.1074/jbc.m111.330043
- Publication date: 2012-03-23
- Journal:
- Impact factor: 4.8
- Authors: Brown, Eric L.; Nishiyama, Yasuhiro; Paul, Sudhir
- Corresponding author: Paul, Sudhir
Catalytic immunoglobulin gene delivery in a mouse model of Alzheimer's disease: prophylactic and therapeutic applications.
- DOI: 10.1007/s12035-014-8691-z
- Publication date: 2015-02
- Journal:
- Impact factor: 5.1
- Authors: Kou, Jinghong; Yang, Junling; Lim, Jeong-Eun; Pattanayak, Abhinandan; Song, Min; Planque, Stephanie; Paul, Sudhir; Fukuchi, Ken-ichiro
- Corresponding author: Fukuchi, Ken-ichiro
Estimation for zero-inflated beta-binomial regression model with missing response data
- DOI: 10.1002/sim.7845
- Publication date: 2018-11-20
- Journal:
- Impact factor: 2
- Authors: Luo, Rong; Paul, Sudhir
- Corresponding author: Paul, Sudhir
Neutralization of genetically diverse HIV-1 strains by IgA antibodies to the gp120-CD4-binding site from long-term survivors of HIV infection
- DOI: 10.1097/qad.0b013e3283376e88
- Publication date: 2010-03-27
- Journal:
- Impact factor: 3.8
- Authors: Planque, Stephanie; Salas, Maria; Paul, Sudhir
- Corresponding author: Paul, Sudhir
Catalytic antibodies to amyloid β peptide in defense against Alzheimer disease
- DOI: 10.1016/j.autrev.2008.03.004
- Publication date: 2008-05-01
- Journal:
- Impact factor: 13.6
- Authors: Taguchi, Hiroaki; Planque, Stephanie; Paul, Sudhir
- Corresponding author: Paul, Sudhir
Other grants of Paul, Sudhir
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2022
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2021
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2019
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2018
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2017
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2016
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2015
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2014
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
GLM, GLMM, GEE for Correlated Discrete Data with Over-dispersion, Zero-inflation, Measurement Error and Misspecification
- Grant number: 8593-2013
- Fiscal year: 2013
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Generalized linear models with zero-inflation and/or over-dispersion with covariate measurement errors, methods for longitudinal and clustered data and finite mixture models
- Grant number: 8593-2008
- Fiscal year: 2012
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Similar overseas grants
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2022
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2021
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Parametric and/or Semi-parametric Dynamic Mixed Models for Discrete Spatial and/or Longitudinal Data
- Grant number: RGPIN-2015-04503
- Fiscal year: 2019
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2019
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Parametric and/or Semi-parametric Dynamic Mixed Models for Discrete Spatial and/or Longitudinal Data
- Grant number: RGPIN-2015-04503
- Fiscal year: 2018
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Discrete and/or Longitudinal Data (small/big) analysis and The Behrens-Fisher problem
- Grant number: RGPIN-2018-04558
- Fiscal year: 2018
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Parametric and/or Semi-parametric Dynamic Mixed Models for Discrete Spatial and/or Longitudinal Data
- Grant number: RGPIN-2015-04503
- Fiscal year: 2017
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Parametric and/or Semi-parametric Dynamic Mixed Models for Discrete Spatial and/or Longitudinal Data
- Grant number: RGPIN-2015-04503
- Fiscal year: 2016
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual
Exploring Joint Modeling Approaches for Longitudinal Data: Parsimonious Correlation Modeling and Discrete Observations
- Grant number: 1533956
- Fiscal year: 2015
- Funding amount: $13,100
- Project category: Standard Grant
Parametric and/or Semi-parametric Dynamic Mixed Models for Discrete Spatial and/or Longitudinal Data
- Grant number: RGPIN-2015-04503
- Fiscal year: 2015
- Funding amount: $13,100
- Project category: Discovery Grants Program - Individual