III: Small: Predictive Modeling from High-Dimensional, Sparsely and Irregularly Sampled, Longitudinal Data

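Basic Information

  • Award number:
    2226025
  • Amount:
    $599,900
  • Country of host institution:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2022
  • Funding country:
    United States
  • Project period:
    2022-10-01 to 2025-09-30
  • Project status:
    Ongoing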

Project Abstract

Longitudinal data resulting from repeated observations of a set of individuals over time are commonplace in many applications, including health sciences, learning sciences, social sciences, life sciences, and economics. Such data present unprecedented opportunities to uncover the relationship between the time-varying patterns of certain measured variables (features or covariates) and outcomes of interest (e.g., economic meltdown, societal unrest, disease onset, or health risk). In real-world settings, the number of variables is often very large, and often only a small subset of variables is recorded at any given time, resulting in sparse data with a high proportion of missing observations. Furthermore, such data exhibit complex correlations which, if not properly accounted for, can lead to misleading statistical inferences. Additional complications arise from the fact that the data exhibit abrupt discontinuities that are often driven by transitions between states that are not directly observable (e.g., from "healthy" to "infected"). Large data sets demand methods that are scalable. And in high-stakes applications, e.g., healthcare, human interpretability of the predictive models is of paramount importance.
The project will yield substantial advances over the current state of the art in scalable machine learning methods for predictive modeling of longitudinal outcomes from high-dimensional, irregularly sampled, sparse, longitudinal health data. The open-source implementations of the predictive modeling tools will find applications in many domains, including behavioral, social, environmental, economic, learning, and health sciences. The project will enhance the research-based training of a diverse group of graduate and undergraduate students in Data Sciences and Computer Science (especially Artificial Intelligence), areas of great national importance.
The educational activities associated with the project will help equip a diverse cadre of Data Scientists, AI experts, and researchers in health sciences, social sciences, learning sciences, and related areas with state-of-the-art machine learning tools for predictive modeling from longitudinal data. The project will produce a new graduate course, course modules, sample projects, etc. on predictive modeling from longitudinal data, to be integrated into Data Sciences curricula. The project will help introduce students from diverse backgrounds, including women and underrepresented minorities, to a broad range of educational, research, and career opportunities in Data Sciences. The broader impacts of the project will be further enhanced by broad dissemination of all research results (publications, software, data sets, course materials).
The project will develop a family of scalable deep kernel Gaussian process regression algorithms for interpretable predictive modeling from high-dimensional, sparsely and irregularly time-sampled, longitudinal data with complex, a priori unknown correlation structure. The resulting methods will be able to discover the patterns of transitions between unobserved or hidden states and account for abrupt discontinuities in outcomes. They will be able to explain their predictions by learning the underlying complex correlation structure exhibited by the data and by identifying not only the variables that drive the predictions, but also the temporal context in which they do so.
The project will rigorously evaluate the resulting methods empirically with simulated longitudinal data (with different correlation structures, different missingness mechanisms, and different time-dependent variable importance), several benchmark longitudinal data sets, and, most importantly, deidentified longitudinal electronic health records data and socio-demographic data from real-world healthcare applications (in collaboration with clinical experts).
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (1)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
A Simple, Fast Algorithm for Continual Learning from High-Dimensional Data

Other publications by Vasant Honavar

Neural network design and the complexity of learning, by J. Stephen Judd. Cambridge, MA: MIT Press, 1990
  • DOI:
    10.1007/bf00993255
  • Publication date:
    1992-06-01
  • Impact factor:
    2.900
  • Authors:
    Vasant Honavar
  • Corresponding author:
    Vasant Honavar
Machine-learning guided biophysical model development: application to ribosome catalysis
  • DOI:
    10.1016/j.bpj.2021.11.2053
  • Publication date:
    2022-02-11
  • Authors:
    Yang Jiang;Justin Petucci;Nishant Soni;Vasant Honavar;Edward O'Brien
  • Corresponding author:
    Edward O'Brien
Book Review: Neural Network Design and the Complexity of Learning, by J. Stephen Judd. Cambridge, MA: MIT Press, 1990
  • DOI:
    10.1023/a:1022680813848
  • Publication date:
    1992-06-01
  • Impact factor:
    2.900
  • Authors:
    Vasant Honavar
  • Corresponding author:
    Vasant Honavar
Exploring inconsistencies in genome-wide protein function annotations: a machine learning approach
  • DOI:
    10.1186/1471-2105-8-284
  • Publication date:
    2007-08-03
  • Impact factor:
    3.300
  • Authors:
    Carson Andorf;Drena Dobbs;Vasant Honavar
  • Corresponding author:
    Vasant Honavar
A practical guide to machine learning interatomic potentials – Status and future
  • DOI:
    10.1016/j.cossms.2025.101214
  • Publication date:
    2025-03-01
  • Impact factor:
    13.400
  • Authors:
    Ryan Jacobs;Dane Morgan;Siamak Attarian;Jun Meng;Chen Shen;Zhenghao Wu;Clare Yijia Xie;Julia H. Yang;Nongnuch Artrith;Ben Blaiszik;Gerbrand Ceder;Kamal Choudhary;Gabor Csanyi;Ekin Dogus Cubuk;Bowen Deng;Ralf Drautz;Xiang Fu;Jonathan Godwin;Vasant Honavar;Olexandr Isayev;Brandon M. Wood
  • Corresponding author:
    Brandon M. Wood

Other grants by Vasant Honavar

Collaborative Research: RI: III: SHF: Small: Multi-Stakeholder Decision Making: Qualitative Preference Languages, Interactive Reasoning, and Explanation
  • Award number:
    2225824
  • Fiscal year:
    2022
  • Amount:
    $599,900
  • Project type:
    Standard Grant
AI Institute: Planning: Institute for AI-Enabled Materials Discovery, Design, and Synthesis
  • Award number:
    2020243
  • Fiscal year:
    2020
  • Amount:
    $599,900
  • Project type:
    Standard Grant
EAGER: Interpreting Black-Box Predictive Models Through Causal Attribution
  • Award number:
    2041759
  • Fiscal year:
    2020
  • Amount:
    $599,900
  • Project type:
    Standard Grant
BD Spokes: SPOKE: NORTHEAST: Collaborative Research: Integration of Environmental Factors and Causal Reasoning Approaches for Large-Scale Observational Health Research
  • Award number:
    1636795
  • Fiscal year:
    2017
  • Amount:
    $599,900
  • Project type:
    Standard Grant
EAGER: Towards a Computational Infrastructure for Analysis of Sensitive Data
  • Award number:
    1551843
  • Fiscal year:
    2015
  • Amount:
    $599,900
  • Project type:
    Standard Grant
SHF: Large: Collaborative Research: Inferring Software Specifications from Open Source Repositories by Leveraging Data and Collective Community Expertise
  • Award number:
    1518732
  • Fiscal year:
    2015
  • Amount:
    $599,900
  • Project type:
    Standard Grant
SGER: Exploratory Investigation of Modular Ontology Languages
  • Award number:
    0639230
  • Fiscal year:
    2006
  • Amount:
    $599,900
  • Project type:
    Standard Grant
ITR: Algorithms and Software for Knowledge Acquisition from Heterogeneous Distributed Data
  • Award number:
    0219699
  • Fiscal year:
    2002
  • Amount:
    $599,900
  • Project type:
    Continuing Grant
RIA: Constructive Neural Network Learning Algorithms for Pattern Classification
  • Award number:
    9409580
  • Fiscal year:
    1994
  • Amount:
    $599,900
  • Project type:
    Continuing Grant

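Similar grants from the National Natural Science Foundation of China (NSFC)

Forensic application of circadian-rhythmic small RNAs in estimating the time of bloodstain formation
  • Approval year:
    2024
  • Amount:
    CNY 0
  • Project type:
    Provincial/municipal project
Mechanism by which tRNA-derived small RNA upregulates the YBX1/CCL5 pathway in bortezomib-induced chronic pain
  • Award number:
    n/a
  • Approval year:
    2022
  • Amount:
    CNY 100,000
  • Project type:
    Provincial/municipal project
Small RNA regulation of the type I-F CRISPR-Cas adaptive immune response and its molecular mechanism
  • Award number:
    32000033
  • Approval year:
    2020
  • Amount:
    CNY 240,000
  • Project type:
    Young Scientists Fund
Mechanisms by which small RNAs regulate the biocontrol function of Bacillus amyloliquefaciens FZB42
  • Award number:
    31972324
  • Approval year:
    2019
  • Amount:
    CNY 580,000
  • Project type:
    General Program
Mechanisms by which Streptococcus mutans small RNAs link LuxS quorum sensing to biofilm formation
  • Award number:
    81900988
  • Approval year:
    2019
  • Amount:
    CNY 210,000
  • Project type:
    Young Scientists Fund
Molecular mechanism of pigeon milk secretion, analyzed by small RNA sequencing
  • Award number:
    31802058
  • Approval year:
    2018
  • Amount:
    CNY 260,000
  • Project type:
    Young Scientists Fund
Functions and mechanisms of key gut bacterial small RNAs in the onset and progression of Crohn's disease
  • Award number:
    31870821
  • Approval year:
    2018
  • Amount:
    CNY 560,000
  • Project type:
    General Program
Pathogenic mechanism of rice grassy stunt virus regulated by small RNA-mediated DNA methylation
  • Award number:
    31772128
  • Approval year:
    2017
  • Amount:
    CNY 600,000
  • Project type:
    General Program
Immunoregulatory mechanism of acupuncture treatment for Hashimoto's thyroiditis, based on small RNA-seq
  • Award number:
    81704176
  • Approval year:
    2017
  • Amount:
    CNY 200,000
  • Project type:
    Young Scientists Fund
Regulation of small RNA biogenesis by rice OsSGS3 and OsHEN1 and its effect on disease resistance
  • Award number:
    91640114
  • Approval year:
    2016
  • Amount:
    CNY 850,000
  • Project type:
    Major Research Plan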

Similar overseas grants

III: Small: RUI: A Fairness Auditing Framework for Predictive Mobility Models
  • Award number:
    2304213
  • Fiscal year:
    2023
  • Amount:
    $599,900
  • Project type:
    Standard Grant
Predictive biomarkers for adjuvant immune checkpoint inhibitor therapy in non-small cell lung cancer
  • Award number:
    469156
  • Fiscal year:
    2022
  • Amount:
    $599,900
  • Project type:
    Operating Grants
Predictive and Diagnostic Radiomic Signatures in Non-Small Cell Lung Cancer (NSCLC) on Immunotherapy
  • Award number:
    10418808
  • Fiscal year:
    2021
  • Amount:
    $599,900
Big data for small patients - Building "child-size" individual predictive models for life after childhood cancer
  • Award number:
    EP/T028017/1
  • Fiscal year:
    2021
  • Amount:
    $599,900
  • Project type:
    Fellowship
Predictive and Diagnostic Radiomic Signatures in Non-Small Cell Lung Cancer (NSCLC) on Immunotherapy
  • Award number:
    10652449
  • Fiscal year:
    2021
  • Amount:
    $599,900
Predictive and Diagnostic Radiomic Signatures in Non-Small Cell Lung Cancer (NSCLC) on Immunotherapy
  • Award number:
    10316572
  • Fiscal year:
    2021
  • Amount:
    $599,900
A combined statistical and high-throughput experimental approach for the predictive crystallization of small molecules.
  • Award number:
    2595838
  • Fiscal year:
    2021
  • Amount:
    $599,900
  • Project type:
    Studentship
Identifying predictive biomarkers for neoadjuvant anti-PD-1 therapy in early-stage non-small cell lung carcinoma through multiplex immunofluorescence
  • Award number:
    466742
  • Fiscal year:
    2021
  • Amount:
    $599,900
  • Project type:
    Studentship Programs
CNS Core: Small: PilotPC: Proactive Inverse Learning of Network Topology for Predictive Communication among Unmanned Vehicles
  • Award number:
    2204721
  • Fiscal year:
    2021
  • Amount:
    $599,900
  • Project type:
    Standard Grant
Predictive Modelling of Small Crack Formation in Superalloys
  • Award number:
    2436900
  • Fiscal year:
    2020
  • Amount:
    $599,900
  • Project type:
    Studentship