CAREER: Next-Generation Methods for Statistical Integration of High-Dimensional Disparate Data Sources

Basic information

Project abstract

Multi-view data (collected on the same samples from multiple sources) are increasingly common with advances in multi-omics, neuroimaging and wearable technologies. For example, wearable devices such as physical activity trackers, continuous glucose monitors and ambulatory blood pressure monitors are worn concurrently to measure distinct characteristics of each subject. There is enormous potential in integrating this concurrent information from distinct vantage points to better understand between-view associations and improve prediction of health outcomes. Existing tools for data integration are sensitive to outliers and are not designed for mixed data types (e.g., skewed continuous glucose measurements, zero-inflated activity counts, binary indicators of sleep/wake). The PI will develop a more robust framework for multi-view data integration that better accounts for outliers, better matches the mixed types of data actually collected, and more accurately separates common from view-specific signals. The new methods will be implemented in open-source software accompanied by reproducible workflow examples, providing immediate and easy access for other researchers. The educational component centers on the development of structured research experiences (SRE) for students. SRE enhance students' written communication, software development and reproducible research skills, all of which are lacking in the traditional curriculum. This will improve students' preparation for conducting research and widen their STEM employment opportunities. The involvement of students from traditionally underrepresented groups will positively impact their retention rate and will broaden the participation of underrepresented groups in STEM.

Popular dimension reduction methods, such as principal component analysis and discriminant analysis, are tailored for single-view data, and thus fail to discover coordinated multi-view signals on a global level. On the other hand, existing multi-view dimension reduction methods suffer from reliance on the Gaussianity assumption, an inability to capture joint functional signals, and a lack of theoretical guarantees. The PI will address these drawbacks by developing (i) a joint dimension reduction framework for skewed continuous, binary and zero-inflated view types; (ii) a joint dimension reduction framework for mixed functional multi-view data; and (iii) a new paradigm for simultaneous extraction of signals across views based on hierarchical low-rank constraints. This work will lead to critically needed new statistical methods for data integration, with direct relevance for researchers working with wearable monitors, microbiome and multi-omics data through the PI's interdisciplinary collaborations. The proposed structured research experiences will center on the design and reproducibility of simulation studies and will align with the computational components of the proposed research, including direct involvement of students in multiple simulation studies.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
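
As a rough illustration of the single-view limitation described in the abstract, the following sketch simulates two views that share one latent signal but are each dominated by a stronger view-specific signal. It then compares the first principal component of one view with the first canonical variate from classical canonical correlation analysis (CCA). CCA is used here only as a familiar stand-in for a joint method; it is not the framework proposed in this award, and the simulation design, function names and parameter values are all assumptions made for illustration.

```python
import numpy as np


def simulate_two_views(n=2000, p1=6, p2=5, seed=0):
    """Hypothetical design: both views carry one shared latent signal in
    coordinate 0 and a stronger view-specific signal in coordinate 1."""
    rng = np.random.default_rng(seed)
    shared = rng.standard_normal(n)
    views = []
    for p in (p1, p2):
        specific = rng.standard_normal(n)
        x = 0.5 * rng.standard_normal((n, p))  # independent noise
        x[:, 0] += shared                      # weak signal common to both views
        x[:, 1] += 3.0 * specific              # strong signal unique to this view
        views.append(x)
    return views[0], views[1], shared


def first_pc_scores(x):
    """Scores on the first principal component of a single view."""
    xc = x - x.mean(axis=0)
    _, _, vt = np.linalg.svd(xc, full_matrices=False)
    return xc @ vt[0]


def first_canonical_pair(x1, x2):
    """First canonical correlation and the view-1 canonical variate,
    computed from QR factors of the centered views (classical CCA)."""
    q1, _ = np.linalg.qr(x1 - x1.mean(axis=0))
    q2, _ = np.linalg.qr(x2 - x2.mean(axis=0))
    u, s, _ = np.linalg.svd(q1.T @ q2)
    return s[0], q1 @ u[:, 0]


def abs_corr(a, b):
    """Absolute Pearson correlation between two vectors."""
    return abs(np.corrcoef(a, b)[0, 1])


x1, x2, shared = simulate_two_views()
rho, cca_scores = first_canonical_pair(x1, x2)
pca_vs_shared = abs_corr(first_pc_scores(x1), shared)
cca_vs_shared = abs_corr(cca_scores, shared)

print(f"first canonical correlation:        {rho:.2f}")
print(f"|corr(PC1 of view 1, shared)|:      {pca_vs_shared:.2f}")
print(f"|corr(canonical variate, shared)|:  {cca_vs_shared:.2f}")
```

In this construction, PCA on a single view tends to recover the view-specific direction because it has the largest variance, while the joint analysis targets the direction that is correlated across views. Classical CCA works with linear correlations of continuous measurements, so it does not by itself address the skewed, binary and zero-inflated view types discussed in the abstract.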

Project outcomes

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Gluformer: Transformer-Based Personalized Glucose Forecasting with Uncertainty Quantification
  • DOI:
    10.1109/icassp49357.2023.10096419
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sergazinov, Renat;Armandpour, Mohammadreza;Gaynanova, Irina
  • Corresponding author:
    Gaynanova, Irina
A Case Study of Glucose Levels During Sleep Using Multilevel Fast Function on Scalar Regression Inference
  • DOI:
    10.1111/biom.13878
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    1.9
  • Authors:
    Sergazinov, Renat;Leroux, Andrew;Cui, Erjia;Crainiceanu, Ciprian;Aurora, R. Nisha;Punjabi, Naresh M.;Gaynanova, Irina
  • Corresponding author:
    Gaynanova, Irina
Sensing the impact of extreme heat on physical activity and sleep
  • DOI:
    10.1177/20552076241241509
  • Publication date:
    2024-01
  • Journal:
  • Impact factor:
    3.9
  • Authors:
    S. Cheong;Irina Gaynanova
  • Corresponding author:
    S. Cheong;Irina Gaynanova
Pre- Versus Postmeal Sedentary Duration—Impact on Postprandial Glucose in Older Adults With Overweight or Obesity

Other publications by Irina Gaynanova

Sparse semiparametric discriminant analysis for high-dimensional zero-inflated data
  • DOI:
  • Publication date:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hee Cheol Chung;Yang Ni;Irina Gaynanova
  • Corresponding author:
    Irina Gaynanova
Optimal variable selection in multi-group sparse discriminant analysis
Corrections of Equations on Glycemic Variability and Quality of Glycemic Control.
Prediction error bounds for linear regression with the TREX
  • DOI:
    10.1007/s11749-018-0584-4
  • Publication date:
    2018
  • Journal:
  • Impact factor:
    1.3
  • Authors:
    J. Bien;Irina Gaynanova;Johannes Lederer;Christian L. Müller
  • Corresponding author:
    Christian L. Müller
Prediction and estimation consistency of sparse multi-class penalized optimal scoring
  • DOI:
    10.3150/19-bej1126
  • Publication date:
    2018-09
  • Journal:
  • Impact factor:
    1.5
  • Authors:
    Irina Gaynanova
  • Corresponding author:
    Irina Gaynanova

Other grants by Irina Gaynanova

CAREER: Next-Generation Methods for Statistical Integration of High-Dimensional Disparate Data Sources
  • Award number:
    2044823
  • Fiscal year:
    2021
  • Award amount:
    $400,000
  • Award type:
    Continuing Grant
Scalable Methods for Classification of Heterogeneous High-Dimensional Data
  • Award number:
    1712943
  • Fiscal year:
    2017
  • Award amount:
    $400,000
  • Award type:
    Standard Grant

Similar National Natural Science Foundation of China (NSFC) grants

Next Generation Majorana Nanowire Hybrids
  • Award number:
  • Approval year:
    2020
  • Award amount:
    CNY 200,000
  • Award type:

Similar overseas grants

CAREER: Next-generation Logic, Memory, and Agile Microwave Devices Enabled by Spin Phenomena in Emergent Quantum Materials
  • Award number:
    2339723
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Award type:
    Continuing Grant
CAREER: Securing Next-Generation Transportation Infrastructure: A Traffic Engineering Perspective
  • Award number:
    2339753
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Award type:
    Standard Grant
CAREER: LoRa Enabled Space-air-ground Integrated Networks for Next-Generation Agricultural IoT
  • Award number:
    2338976
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Award type:
    Continuing Grant
CAREER: Next-generation protease inhibitor discovery with chemically diversified antibodies
  • Award number:
    2339201
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Award type:
    Continuing Grant
CAREER: Next Generation Online Resource Allocation
  • Award number:
    2340306
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Award type:
    Standard Grant
CAREER: Next-Generation Flow Cytometry - A New Approach to Cell Heterogeneity
  • Award number:
    2422750
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Award type:
    Standard Grant
CAREER: Non-Local Metamaterials and Metasurfaces for Next Generation Non-Reciprocal Acoustic Devices
  • Award number:
    2340782
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Award type:
    Standard Grant
CAREER: Next Generation of High-Level Synthesis for Agile Architectural Design (ArchHLS)
  • Award number:
    2338365
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Award type:
    Continuing Grant
CAREER: Engineering next-generation adrenal gland organoids
  • Award number:
    2335133
  • Fiscal year:
    2024
  • Award amount:
    $400,000
  • Award type:
    Continuing Grant
CAREER: Next-generation Rhizosphere Monitoring - Non-invasive Plant Phenotyping and Health Monitoring Using the Light-piping Properties of Plant Stems
  • Award number:
    2238365
  • Fiscal year:
    2023
  • Award amount:
    $400,000
  • Award type:
    Continuing Grant