Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets


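Basic Information

  • Grant number:
    RGPIN-2014-05193
  • Principal investigator:
  • Amount:
    $8,000
  • Host institution:
  • Host institution country:
    Canada
  • Project category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2018
  • Funding country:
    Canada
  • Project period:
    2018-01-01 to 2019-12-31
  • Project status:
    Completed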

Project Abstract

High-throughput data, such as those arising in biostatistics (especially in connection with gene expression studies) and in the environmental sciences (for instance, meteorological data transmitted from satellites), must nowadays be analyzed rapidly. Innovative techniques such as those based on sample moments, which the applicant has previously advocated, or those relying on the Bayesian approach, which are discussed in several of his papers, shall be further developed and adapted to data mine such massive sets of observations. Since complex data frequently involve several variables, I also plan to extend the semi-parametric univariate moment-based density estimation techniques that I have introduced to the multivariate context. Novel multivariate data visualization techniques suited to certain types of large data sets shall be proposed as well. Extant distributional results on singular quadratic forms in Gaussian and elliptically contoured vectors shall be extended to the Hermitian case and to generalized quadratic expressions, which involve random matrices in lieu of random vectors. The bivariate density estimation technique introduced by the applicant at the last annual meeting of The International Environmetrics Society, which consists in expressing joint density estimates in terms of the product of the density estimates of the marginal distributions and a polynomial adjustment whose coefficients are determined from a moment-matching technique, will be extended to multivariate settings. Once evaluated at the inverse distribution functions of the marginals, such a polynomial turns out to be a copula density. This approach arguably gives rise to the most flexible type of copulae one could devise. This methodology shall be applied to colossal data sets arising from various fields of scientific investigation such as environmetrics, financial modeling, econometrics and genomic studies.
Being merely based on a finite number of joint sample moments, such techniques should prove more suitable than, for instance, kernel density estimates for modeling series of observations that can be construed as "big data", as they readily produce density estimates in a functional form that lends itself to algebraic manipulations. Given their computational simplicity, moment-based data mining methods ought to efficiently assist researchers in detecting anomalies, patterns and dependencies in large and complex data sets. I also intend to develop software documentation and source code to facilitate the implementation of the aforementioned distributional methodologies. Additionally, monographs on the evaluation of the distribution of various types of quadratic forms and on moment-based density estimation and approximation techniques are planned.
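To make the idea of a moment-based, polynomially adjusted density estimate more concrete, the following is a minimal univariate sketch. It is not taken from the applicant's work: it assumes a standard normal base density applied to the standardized sample, and the function names `normal_moment` and `fit_poly_adjusted_density` are hypothetical. The polynomial coefficients are obtained by solving a linear system that forces the first few moments of the estimate to equal the sample moments.

```python
import numpy as np

def normal_moment(k):
    """k-th raw moment of the standard normal: 0 for odd k, (k-1)!! for even k."""
    if k % 2 == 1:
        return 0.0
    result = 1.0
    for i in range(k - 1, 0, -2):
        result *= i
    return result

def fit_poly_adjusted_density(sample, degree=4):
    """Return (density, xi): a base density times a moment-matched polynomial.

    Assumption for this sketch: the base density is the standard normal,
    applied to the standardized observations z = (x - mean) / std.
    """
    sample = np.asarray(sample, dtype=float)
    loc, scale = sample.mean(), sample.std()
    z = (sample - loc) / scale

    # Sample moments of the standardized data, orders 0..degree.
    mu = np.array([np.mean(z ** j) for j in range(degree + 1)])

    # M[j, k] is the (j + k)-th moment of the base density, so that
    # M @ xi = mu matches moments 0..degree of the estimate to the sample.
    M = np.array([[normal_moment(j + k) for k in range(degree + 1)]
                  for j in range(degree + 1)])
    xi = np.linalg.solve(M, mu)

    def density(x):
        u = (np.asarray(x, dtype=float) - loc) / scale
        base = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
        poly = np.polynomial.polynomial.polyval(u, xi)  # xi in ascending order
        return base * poly / scale

    return density, xi

# Usage on a skewed synthetic sample (illustrative data only).
rng = np.random.default_rng(0)
sample = rng.gamma(shape=5.0, scale=1.0, size=5000)
density, xi = fit_poly_adjusted_density(sample, degree=4)
```

Only the `degree + 1` sample moments are needed once they have been computed, which is what makes this kind of estimate cheap for large samples, and the result is an explicit function rather than a sum over all observations. One caveat of this simple form is that the polynomial factor can make the estimate slightly negative in the tails.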

Project Outputs

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other Grants Held by Provost, Serge

Big Data Modeling via Moment-Based Methodologies and the Statistical Analysis of Spatio-Temporal Measurements
  • Grant number:
    RGPIN-2019-06323
  • Fiscal year:
    2022
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Big Data Modeling via Moment-Based Methodologies and the Statistical Analysis of Spatio-Temporal Measurements
  • Grant number:
    RGPIN-2019-06323
  • Fiscal year:
    2021
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Big Data Modeling via Moment-Based Methodologies and the Statistical Analysis of Spatio-Temporal Measurements
  • Grant number:
    RGPIN-2019-06323
  • Fiscal year:
    2020
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Big Data Modeling via Moment-Based Methodologies and the Statistical Analysis of Spatio-Temporal Measurements
  • Grant number:
    RGPIN-2019-06323
  • Fiscal year:
    2019
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
  • Grant number:
    RGPIN-2014-05193
  • Fiscal year:
    2017
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
  • Grant number:
    RGPIN-2014-05193
  • Fiscal year:
    2016
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
  • Grant number:
    RGPIN-2014-05193
  • Fiscal year:
    2015
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Methodologies for Modeling and Analyzing Massive Environmental and Biomedical Data Sets
  • Grant number:
    RGPIN-2014-05193
  • Fiscal year:
    2014
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Advances in distribution theory with applications to transportation logistics and statistical genesis
  • Grant number:
    8666-2009
  • Fiscal year:
    2013
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Advances in distribution theory with applications to transportation logistics and statistical genesis
  • Grant number:
    8666-2009
  • Fiscal year:
    2012
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual

Similar NSFC Grants

Galaxy Analytical Modeling Evolution (GAME) and cosmological hydrodynamic simulations.
  • Grant number:
  • Approval year:
    2025
  • Amount:
    CNY 100,000
  • Project category:
    Provincial/municipal project

Similar Overseas Grants

Collaborative Research: SaTC: CORE: Small: Privately Collecting and Analyzing V2X Data for Urban Traffic Modeling
  • Grant number:
    2302689
  • Fiscal year:
    2022
  • Amount:
    $8,000
  • Project category:
    Standard Grant
Modeling, Analyzing and Managing Insurance Risks
  • Grant number:
    RGPIN-2019-05640
  • Fiscal year:
    2022
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Adaptive Data Processing, Modeling, and Quantification Methods for Analyzing Cardiac Fibrillation
  • Grant number:
    RGPIN-2020-04933
  • Fiscal year:
    2022
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Statistical methods for analyzing messy microbiome data: detection of hidden artifacts and robust modeling approaches
  • Grant number:
    10708908
  • Fiscal year:
    2022
  • Amount:
    $8,000
  • Project category:
Statistical methods for analyzing messy microbiome data: detection of hidden artifacts and robust modeling approaches
  • Grant number:
    10503637
  • Fiscal year:
    2022
  • Amount:
    $8,000
  • Project category:
Modeling, Analyzing and Managing Insurance Risks
  • Grant number:
    RGPIN-2019-05640
  • Fiscal year:
    2021
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Adaptive Data Processing, Modeling, and Quantification Methods for Analyzing Cardiac Fibrillation
  • Grant number:
    RGPIN-2020-04933
  • Fiscal year:
    2021
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual
Collaborative Research: SaTC: CORE: Small: Privately Collecting and Analyzing V2X Data for Urban Traffic Modeling
  • Grant number:
    2034615
  • Fiscal year:
    2021
  • Amount:
    $8,000
  • Project category:
    Standard Grant
Collaborative Research: SaTC: CORE: Small: Privately Collecting and Analyzing V2X Data for Urban Traffic Modeling
  • Grant number:
    2034870
  • Fiscal year:
    2021
  • Amount:
    $8,000
  • Project category:
    Standard Grant
Modeling, Analyzing and Managing Insurance Risks
  • Grant number:
    RGPIN-2019-05640
  • Fiscal year:
    2020
  • Amount:
    $8,000
  • Project category:
    Discovery Grants Program - Individual