CAREER: Next-Generation Methods for Statistical Integration of High-Dimensional Disparate Data Sources

Basic Information

Project Abstract

Multi-view data (collected on the same samples from multiple sources) are increasingly common with advances in multi-omics, neuroimaging and wearable technologies. For example, wearable devices such as physical activity trackers, continuous glucose monitors and ambulatory blood pressure monitors are worn concurrently to provide measurements of distinct subject characteristics. There is enormous potential in integrating this concurrent information from distinct vantage points to better understand between-view associations and improve prediction of health outcomes. Existing tools for data integration are sensitive to outliers and are not designed for mixed data types (e.g., continuous skewed glucose measurements, zero-inflated activity counts, binary indicators of sleep/wake). The PI will develop a more robust framework for multi-view data integration that better accounts for outliers, better matches the mixed types of data actually collected, and more accurately separates common from view-specific signals. The new methods will be implemented in open-source software accompanied by reproducible workflow examples, providing immediate and easy access for other researchers. The educational component centers on the development of structured research experiences (SREs) for students. SREs enhance students' written communication, software development and reproducible research skills, all of which are lacking in the traditional curriculum. This will improve students' preparation for conducting research and widen their STEM employment opportunities. The involvement of students from traditionally underrepresented groups will positively impact their retention rate and will broaden the participation of underrepresented groups in STEM.
Popular dimension reduction methods, such as principal component analysis and discriminant analysis, are tailored for single-view data, and thus fail to discover coordinated multi-view signals on a global level. On the other hand, existing multi-view dimension reduction methods suffer from reliance on the Gaussianity assumption, an inability to capture joint functional signals, and a lack of theoretical guarantees. The PI will address these drawbacks by developing (i) a joint dimension reduction framework for skewed continuous, binary and zero-inflated view types; (ii) a joint dimension reduction framework for mixed functional multi-view data; and (iii) a new paradigm for simultaneous extraction of signals across views based on hierarchical low-rank constraints. This work will lead to critically needed new statistical methods for data integration, with direct relevance for researchers working with wearable monitors, microbiome and multi-omics data through the PI's interdisciplinary collaborations. The proposed structured research experiences will center on the design and reproducibility of simulation studies and align with the computational components of the proposed research, including direct student involvement in multiple simulation studies.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (5)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Gluformer: Transformer-based Personalized glucose Forecasting with uncertainty quantification
  • DOI:
    10.1109/icassp49357.2023.10096419
  • Published:
    2023
  • Journal:
  • Impact factor:
    0
  • Authors:
    Sergazinov, Renat; Armandpour, Mohammadreza; Gaynanova, Irina
  • Corresponding author:
    Gaynanova, Irina
A Case Study of Glucose Levels During Sleep Using Multilevel Fast Function on Scalar Regression Inference
  • DOI:
    10.1111/biom.13878
  • Published:
    2023
  • Journal:
  • Impact factor:
    1.9
  • Authors:
    Sergazinov, Renat; Leroux, Andrew; Cui, Erjia; Crainiceanu, Ciprian; Aurora, R. Nisha; Punjabi, Naresh M.; Gaynanova, Irina
  • Corresponding author:
    Gaynanova, Irina
Sensing the impact of extreme heat on physical activity and sleep
  • DOI:
    10.1177/20552076241241509
  • Published:
    2024-01
  • Journal:
  • Impact factor:
    3.9
  • Authors:
    S. Cheong; Irina Gaynanova
  • Corresponding authors:
    S. Cheong; Irina Gaynanova
Pre- Versus Postmeal Sedentary Duration—Impact on Postprandial Glucose in Older Adults With Overweight or Obesity

Other publications by Irina Gaynanova

Sparse semiparametric discriminant analysis for high-dimensional zero-inflated data
  • DOI:
  • Published:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Hee Cheol Chung; Yang Ni; Irina Gaynanova
  • Corresponding author:
    Irina Gaynanova
Optimal variable selection in multi-group sparse discriminant analysis
Corrections of Equations on Glycemic Variability and Quality of Glycemic Control.
Prediction error bounds for linear regression with the TREX
  • DOI:
    10.1007/s11749-018-0584-4
  • Published:
    2018
  • Journal:
  • Impact factor:
    1.3
  • Authors:
    J. Bien; Irina Gaynanova; Johannes Lederer; Christian L. Müller
  • Corresponding author:
    Christian L. Müller
Prediction and estimation consistency of sparse multi-class penalized optimal scoring
  • DOI:
    10.3150/19-bej1126
  • Published:
    2018-09
  • Journal:
  • Impact factor:
    1.5
  • Authors:
    Irina Gaynanova
  • Corresponding author:
    Irina Gaynanova

Other grants by Irina Gaynanova

CAREER: Next-Generation Methods for Statistical Integration of High-Dimensional Disparate Data Sources
  • Award number:
    2044823
  • Fiscal year:
    2021
  • Award amount:
    $400K
  • Award type:
    Continuing Grant
Scalable Methods for Classification of Heterogeneous High-Dimensional Data
  • Award number:
    1712943
  • Fiscal year:
    2017
  • Award amount:
    $400K
  • Award type:
    Standard Grant

Similar NSFC (National Natural Science Foundation of China) grants

Next Generation Majorana Nanowire Hybrids
  • Award number:
  • Approval year:
    2020
  • Award amount:
    200,000 CNY
  • Award type:

Similar overseas grants

CAREER: Next-generation Logic, Memory, and Agile Microwave Devices Enabled by Spin Phenomena in Emergent Quantum Materials
  • Award number:
    2339723
  • Fiscal year:
    2024
  • Award amount:
    $400K
  • Award type:
    Continuing Grant
CAREER: Securing Next-Generation Transportation Infrastructure: A Traffic Engineering Perspective
  • Award number:
    2339753
  • Fiscal year:
    2024
  • Award amount:
    $400K
  • Award type:
    Standard Grant
CAREER: LoRa Enabled Space-air-ground Integrated Networks for Next-Generation Agricultural IoT
  • Award number:
    2338976
  • Fiscal year:
    2024
  • Award amount:
    $400K
  • Award type:
    Continuing Grant
CAREER: Next-generation protease inhibitor discovery with chemically diversified antibodies
  • Award number:
    2339201
  • Fiscal year:
    2024
  • Award amount:
    $400K
  • Award type:
    Continuing Grant
CAREER: Next Generation Online Resource Allocation
  • Award number:
    2340306
  • Fiscal year:
    2024
  • Award amount:
    $400K
  • Award type:
    Standard Grant
CAREER: Next-Generation Flow Cytometry - A New Approach to Cell Heterogeneity
  • Award number:
    2422750
  • Fiscal year:
    2024
  • Award amount:
    $400K
  • Award type:
    Standard Grant
CAREER: Non-Local Metamaterials and Metasurfaces for Next Generation Non-Reciprocal Acoustic Devices
  • Award number:
    2340782
  • Fiscal year:
    2024
  • Award amount:
    $400K
  • Award type:
    Standard Grant
CAREER: Next Generation of High-Level Synthesis for Agile Architectural Design (ArchHLS)
  • Award number:
    2338365
  • Fiscal year:
    2024
  • Award amount:
    $400K
  • Award type:
    Continuing Grant
CAREER: Engineering next-generation adrenal gland organoids
  • Award number:
    2335133
  • Fiscal year:
    2024
  • Award amount:
    $400K
  • Award type:
    Continuing Grant
CAREER: Next-generation Rhizosphere Monitoring - Non-invasive Plant Phenotyping and Health Monitoring Using the Light-piping Properties of Plant Stems
  • Award number:
    2238365
  • Fiscal year:
    2023
  • Award amount:
    $400K
  • Award type:
    Continuing Grant