Inference and computational methods for mixed models with large or complex data
Basic information
- Grant number: RGPIN-2016-05883
- Amount: $23,800
- Host institution country: Canada
- Program: Discovery Grants Program - Individual
- Fiscal year: 2020
- Funding country: Canada
- Period: 2020-01-01 to 2021-12-31
- Status: Completed
Project abstract
Mixed effects models are useful in many fields where statistics is applied. By taking the within-experimental-unit correlation into account, they yield inferences and predictions that are more accurate and efficient than those of methods that do not exploit this structure in the data. They can also protect estimates against bias induced by unobserved variables. These properties make them attractive for many modern uses. For instance, they can be used to make inferences about the determinants of animal movement from data generated by GPS collars, to compute insurance premiums that take into account the near-continuous-time data gathered by automobile sensors, to develop personalized medicine strategies using administrative health databases, or to understand the associations between "events" in social media. Even though mixed effects models have been investigated for decades, they are still the focus of much ongoing research because further developments are required to make them better suited to modern tasks of the type described above. Indeed, as part of my previous NSERC Discovery Grant, we developed a Two-Step method to fit mixed models to complex response-dependent binary data; the method is highly efficient when the data consist of a moderate number of very large clusters.
My research program over the next few years will build upon these recent developments and consists of developing new tools to fit mixed effects models to large datasets and/or to datasets obtained through response-dependent sampling schemes. This research program is innovative in several respects. In the short term, model selection criteria that are easy to use and to compute on multi-core machines will be derived and added to our R package TwoStepCLogit. This will enable the numerous end users in biology, ecology and environmental sciences to use the new methods with their ever-growing GIS/GPS databases that record animal movement data. In the medium term, I will adapt our Two-Step method so that it can fit generalized mixed regression models to massive databases in which a large number of clients or patients are followed longitudinally (e.g., insurance, marketing, pharmacoepidemiology, Twitter and other social media data). Industrial partners will likely get involved at this stage and should provide data and internship opportunities for students and postdocs. The advantage of this new method is that it will be readily amenable to highly parallelized computing. In the longer term, we will try to capitalize on the fact that the Two-Step method is based on the EM algorithm to derive an online implementation of the methods (i.e., one that updates the model fit as soon as a new data point comes in). Again, R packages implementing the methods will be made publicly available.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Duchesne, Thierry
Inference methods for the conditional logistic regression model with longitudinal data
- DOI: 10.1002/bimj.200610379
- Published: 2008-02-01
- Impact factor: 1.7
- Authors: Craiu, Radu V.; Duchesne, Thierry; Fortin, Daniel
- Corresponding author: Fortin, Daniel
A general angular regression model for the analysis of data on animal movement in ecology
- DOI: 10.1111/rssc.12124
- Published: 2016-04-01
- Impact factor: 1.6
- Authors: Rivest, Louis-Paul; Duchesne, Thierry; Fortin, Daniel
- Corresponding author: Fortin, Daniel
Confounding adjustment methods for multi-level treatment comparisons under lack of positivity and unknown model specification.
- DOI: 10.1080/02664763.2021.1911966
- Published: 2022
- Impact factor: 1.5
- Authors: Diop, S. Arona; Duchesne, Thierry; G. Cumming, Steven; Diop, Awa; Talbot, Denis
- Corresponding author: Talbot, Denis
Redefining the maximum sustainable yield for the Schaefer population model including multiplicative environmental noise
- DOI: 10.1016/j.jtbi.2008.04.025
- Published: 2008-09-07
- Impact factor: 2
- Authors: Bousquet, Nicolas; Duchesne, Thierry; Rivest, Louis-Paul
- Corresponding author: Rivest, Louis-Paul
On the performance of some non-parametric estimators of the conditional survival function with interval-censored data
- DOI: 10.1016/j.csda.2011.06.027
- Published: 2011-12-01
- Impact factor: 1.8
- Authors: Dehghan, Mohammad Hossein; Duchesne, Thierry
- Corresponding author: Duchesne, Thierry
Other grants held by Duchesne, Thierry
Inference and computational methods for mixed models with large or complex data
- Grant number: RGPIN-2016-05883
- Fiscal year: 2021
- Amount: $23,800
- Program: Discovery Grants Program - Individual
Development of new methods for the joint modeling of longitudinal and survival data with applications in finance and insurance
- Grant number: 557209-2020
- Fiscal year: 2021
- Amount: $23,800
- Program: Alliance Grants
Development of new methods for the joint modeling of longitudinal and survival data with applications in finance and insurance
- Grant number: 557209-2020
- Fiscal year: 2020
- Amount: $23,800
- Program: Alliance Grants
Inference and computational methods for mixed models with large or complex data
- Grant number: RGPIN-2016-05883
- Fiscal year: 2019
- Amount: $23,800
- Program: Discovery Grants Program - Individual
Inference and computational methods for mixed models with large or complex data
- Grant number: RGPIN-2016-05883
- Fiscal year: 2018
- Amount: $23,800
- Program: Discovery Grants Program - Individual
Inference and computational methods for mixed models with large or complex data
- Grant number: RGPIN-2016-05883
- Fiscal year: 2017
- Amount: $23,800
- Program: Discovery Grants Program - Individual
Atelier de maillage en analyse de données, modélisation et aide à la décision
(Networking workshop on data analysis, modelling and decision support)
- Grant number: 505442-2016
- Fiscal year: 2016
- Amount: $23,800
- Program: Connect Grants Level 2
Inference and computational methods for mixed models with large or complex data
- Grant number: RGPIN-2016-05883
- Fiscal year: 2016
- Amount: $23,800
- Program: Discovery Grants Program - Individual
Statistical methods for longitudinal and censored or missing data
- Grant number: 227119-2010
- Fiscal year: 2015
- Amount: $23,800
- Program: Discovery Grants Program - Individual
Modélisation de l'incertitude prévisionnelle des processus hydrologiques via une modélisation des processus intrants au processus hydrologique
(Modelling the predictive uncertainty of hydrological processes by modelling the input processes of the hydrological process)
- Grant number: 479534-2015
- Fiscal year: 2015
- Amount: $23,800
- Program: Engage Grants Program
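Similar grants from the National Natural Science Foundation of China
Mathematical modelling of the disturbance of a flow field by a moving object
- Grant number: 51072241
- Approval year: 2010
- Amount: 100,000 yuan
- Program: Special Fund Project
Computational Methods for Analyzing Toponome Data
- Grant number: 60601030
- Approval year: 2006
- Amount: 170,000 yuan
- Program: Young Scientists Fund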
Similar overseas grants
Can one size fit all? - High-Resolution 3D Genome Spatial Organization Inference with Generalizable Models
- Grant number: 10707587
- Fiscal year: 2023
- Amount: $23,800
Scalable Computational Methods for Genealogical Inference: from species level to single cells
- Grant number: 10889303
- Fiscal year: 2023
- Amount: $23,800
Inference and computational methods for regression models in the presence of partially observed network data or high-dimensional capture-recapture data
- Grant number: RGPIN-2022-03309
- Fiscal year: 2022
- Amount: $23,800
- Program: Discovery Grants Program - Individual
Inference and computational methods for regression models in the presence of partially observed network data or high-dimensional capture-recapture data
- Grant number: DGECR-2022-00441
- Fiscal year: 2022
- Amount: $23,800
- Program: Discovery Launch Supplement
Computational methods involving differential equations in computer graphics, machine learning and inference problems
- Grant number: RGPIN-2022-03327
- Fiscal year: 2022
- Amount: $23,800
- Program: Discovery Grants Program - Individual
Inference and computational methods for mixed models with large or complex data
- Grant number: RGPIN-2016-05883
- Fiscal year: 2021
- Amount: $23,800
- Program: Discovery Grants Program - Individual
Statistical Methods and Algorithms for Population Genomic Inference
- Grant number: 9886109
- Fiscal year: 2020
- Amount: $23,800
Novel Statistical Inference for Biomedical Big Data
- Grant number: 10701041
- Fiscal year: 2020
- Amount: $23,800
Statistical Methods and Algorithms for Population Genomic Inference
- Grant number: 10087945
- Fiscal year: 2020
- Amount: $23,800
Novel Statistical Inference for Biomedical Big Data
- Grant number: 10252023
- Fiscal year: 2020
- Amount: $23,800