Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values

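Basic information

  • Grant number:
    RGPIN-2018-03819
  • Principal investigator:
  • Amount:
    $32,800
  • Host institution:
  • Host institution country:
    Canada
  • Project category:
    Discovery Grants Program - Individual
  • Fiscal year:
    2022
  • Funding country:
    Canada
  • Project period:
    2022-01-01 to 2023-12-31
  • Project status:
    Completed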

Project abstract

With advances in modern data acquisition technology, data with diverse features are more accessible than ever before. The increasing complexity of data structures and the large dimension of data create an urgent need for novel and flexible modeling and analysis tools. While many complex features may arise in different applications, this research focuses on two issues that are common in modern data: the quality and the dimensionality of data. I plan to explore important problems in the following areas.

(1) High dimensional data with measurement error and missing values

In the era of Big Data, large scale data are often available in which the dimension of the variables is much larger than the number of subjects in the study. This presents a great challenge to traditional statistical methods, which normally require the sample size to be larger than the dimension of the variables. In addition, we face challenges related to data quality: measurement imprecision and missing observations. This research aims to investigate problems concerning high dimensionality, measurement error, and missing observations. The plan is to examine how measurement error and missing values interplay in the analysis of high dimensional data. The objective is to develop valid inference methods for data in which all these features are present. Applications of the developed methods to survival data, image data and longitudinal data are planned.

(2) Causal inference with complex featured data

As opposed to association studies, causal inference is often the focus of empirical research. While many methods are available for various settings, they are vulnerable to poor quality data. Most existing methods require that the data be “perfect” in the sense that neither missing observations nor measurement error is present, but these assumptions are often violated in practice. Measurement error and missing observations have been a long-standing concern in many fields, including epidemiological, nutrition and environmental studies. However, research on causal inference with these features is limited, and much remains unexplored. I plan to explore this exciting area and develop new methods to address the complex effects of measurement error and/or missing observations on causal inference. Furthermore, I intend to investigate these problems in the presence of large scale data where the dimension of potential confounders is high.

My primary goals are to develop original and innovative methodology that advances foundational work and to facilitate applications. This research is anticipated to provide valuable insights into making the best use of available large scale data and to broaden the scope of existing strategies and research. It is expected to have significant impact on the statistical community as well as other fields including public health, medical studies and data science.

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Yi, Grace

Assessing trauma and related distress in refugee youth and their caregivers: should we be concerned about iatrogenic effects?
  • DOI:
    10.1007/s00787-020-01635-z
  • Publication date:
    2021-09
  • Journal:
  • Impact factor:
    6.4
  • Authors:
    Greene, M. Claire; Kane, Jeremy C.; Bolton, Paul; Murray, Laura K.; Wainberg, Milton L.; Yi, Grace; Sim, Amanda; Puffer, Eve; Ismael, Abdulkadir; Hall, Brian J.
  • Corresponding author:
    Hall, Brian J.
The Effect of Intimate Partner Violence and Probable Traumatic Brain Injury on Mental Health Outcomes for Black Women
  • DOI:
    10.1080/10926771.2019.1587657
  • Publication date:
    2019-01-01
  • Journal:
  • Impact factor:
    1.8
  • Authors:
    Cimino, Andrea N.; Yi, Grace; Stockman, Jamila K.
  • Corresponding author:
    Stockman, Jamila K.

Other grants held by Yi, Grace

Data Science
  • Grant number:
    CRC-2019-00427
  • Fiscal year:
    2022
  • Funding amount:
    $32,800
  • Project category:
    Canada Research Chairs
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
  • Grant number:
    RGPIN-2018-03819
  • Fiscal year:
    2021
  • Funding amount:
    $32,800
  • Project category:
    Discovery Grants Program - Individual
Data Science
  • Grant number:
    CRC-2019-00427
  • Fiscal year:
    2021
  • Funding amount:
    $32,800
  • Project category:
    Canada Research Chairs
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
  • Grant number:
    RGPIN-2018-03819
  • Fiscal year:
    2020
  • Funding amount:
    $32,800
  • Project category:
    Discovery Grants Program - Individual
Data Science
  • Grant number:
    CRC-2019-00427
  • Fiscal year:
    2020
  • Funding amount:
    $32,800
  • Project category:
    Canada Research Chairs
Data Science
  • Grant number:
    CRC-2019-00427
  • Fiscal year:
    2019
  • Funding amount:
    $32,800
  • Project category:
    Canada Research Chairs
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
  • Grant number:
    RGPIN-2018-03819
  • Fiscal year:
    2019
  • Funding amount:
    $32,800
  • Project category:
    Discovery Grants Program - Individual
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
  • Grant number:
    RGPIN-2018-03819
  • Fiscal year:
    2018
  • Funding amount:
    $32,800
  • Project category:
    Discovery Grants Program - Individual
Statistical Methods on Challenging Issues of Biosciences
  • Grant number:
    239733-2013
  • Fiscal year:
    2017
  • Funding amount:
    $32,800
  • Project category:
    Discovery Grants Program - Individual

Similar grants from the National Natural Science Foundation of China (NSFC)

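Scalable Learning and Optimization: High-dimensional Models and Online Decision-Making Strategies for Big Data Analysis
  • Grant number:
  • Approval year:
    2024
  • Funding amount:
    CNY (amount not listed)
  • Project category:
    Collaborative Innovation Research Team
Intelligent Patent Analysis for Optimized Technology Stack Selection: Blockchain Business Registry Case Demonstration
  • Grant number:
  • Approval year:
    2024
  • Funding amount:
    CNY (amount not listed)
  • Project category:
    Research Fund for International Scientists
A meta-analysis-based model of cotton yield increase under irrigation in Xinjiang
  • Grant number:
    41601604
  • Approval year:
    2016
  • Funding amount:
    CNY 220,000
  • Project category:
    Young Scientists Fund
Meta-analysis methods for large-scale microarray data sets
  • Grant number:
    31100958
  • Approval year:
    2011
  • Funding amount:
    CNY 200,000
  • Project category:
    Young Scientists Fund
Elucidating the artemisinin biosynthetic pathway using retrobiosynthetic NMR analysis
  • Grant number:
    30470153
  • Approval year:
    2004
  • Funding amount:
    CNY 220,000
  • Project category:
    General Program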

Similar overseas grants

REU Site: University of North Carolina at Greensboro - Complex Data Analysis using Statistical and Machine Learning Tools
  • Grant number:
    2244160
  • Fiscal year:
    2023
  • Funding amount:
    $32,800
  • Project category:
    Standard Grant
Statistical models for the integrative analysis of complex biomedical images with manifold structure
  • Grant number:
    10590469
  • Fiscal year:
    2023
  • Funding amount:
    $32,800
  • Project category:
Statistical Challenges and Methods in the Analysis of High Dimensional and Complex Structured Data
  • Grant number:
    RGPIN-2018-05475
  • Fiscal year:
    2022
  • Funding amount:
    $32,800
  • Project category:
    Discovery Grants Program - Individual
Sampling Designs and Statistical Methods for the Analysis of Complex Life History and Genetic Data
  • Grant number:
    RGPIN-2020-05528
  • Fiscal year:
    2022
  • Funding amount:
    $32,800
  • Project category:
    Discovery Grants Program - Individual
Statistical methods of multivariate analysis for large and complex data
  • Grant number:
    RGPIN-2016-05880
  • Fiscal year:
    2022
  • Funding amount:
    $32,800
  • Project category:
    Discovery Grants Program - Individual
Statistical Analysis of Complex Featured Data: High Dimensionality, Measurement Error and Missing Values
  • Grant number:
    RGPIN-2018-03819
  • Fiscal year:
    2021
  • Funding amount:
    $32,800
  • Project category:
    Discovery Grants Program - Individual
Statistical methods of multivariate analysis for large and complex data
  • Grant number:
    RGPIN-2016-05880
  • Fiscal year:
    2021
  • Funding amount:
    $32,800
  • Project category:
    Discovery Grants Program - Individual
Sampling Designs and Statistical Methods for the Analysis of Complex Life History and Genetic Data
  • Grant number:
    RGPIN-2020-05528
  • Fiscal year:
    2021
  • Funding amount:
    $32,800
  • Project category:
    Discovery Grants Program - Individual
Statistical Challenges and Methods in the Analysis of High Dimensional and Complex Structured Data
  • Grant number:
    RGPIN-2018-05475
  • Fiscal year:
    2021
  • Funding amount:
    $32,800
  • Project category:
    Discovery Grants Program - Individual
REU Site: University of North Carolina at Greensboro in Complex Data Analysis using Statistical and Machine Learning Tools
  • Grant number:
    1950549
  • Fiscal year:
    2020
  • Funding amount:
    $32,800
  • Project category:
    Standard Grant