FRG: Collaborative Research: Generative Learning on Unstructured Data with Applications to Natural Language Processing and Hyperlink Prediction

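Basic Information

  • Award number:
    1952406
  • Principal investigator:
  • Amount:
    $250,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2020
  • Funding country:
    United States
  • Period:
    2020-07-01 to 2024-06-30
  • Project status:
    Completed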

Project Abstract

This project addresses the pressing needs of analyzing “big” unstructured data and tackles some artificial intelligence questions from the statistical perspective, which requires the focused and synergistic efforts of a collaborative team. Specifically, the project develops generative models for statistical learning and leverages dependence relations modeled by graphical models in hyperlink prediction, which are applicable to topic sentence generation and protein structure identification. It will lead to a substantial improvement in the accuracy of generative learning based on numerical embeddings, particularly in topic sentence generation and hyperlink prediction. The integrated program of research and education will have significant impacts on machine learning and data science, social and political sciences, and biomedical and genomic research, among others. The project requires extensive algorithm and software development for natural language processing and multimedia data integration. The PIs, their postdocs, and students will develop innovative computational algorithms and software for the analysis of large-scale unstructured complex data. The advanced computational tools will be disseminated to facilitate technology transfer.

The project will address some fundamental issues in two important areas of unstructured data analysis in machine learning and intelligence. In particular, the proposed research will develop a statistical framework for generative learning, which is primarily motivated by applications for unstructured data, namely topic sentence generation and high-order hyperlink prediction. The research will develop powerful generative methods for generating instances or examples to describe and interpret the corresponding learning model. Moreover, it will develop network models for modeling high-order interactions and relations of units by identifying hidden structures in networks. It will proceed in two areas: (1) instance generation and topic sentence generation; (2) hyperlink prediction for multiway relations in hypergraphs. In the first area, instance generation, particularly sentence generation, will be performed collaboratively with numerical embeddings in categorization and regression. In the second area, hyperlinks will be predicted based on observed pairwise as well as unobserved high-order relations, characterized by graphical models with hidden structures. Special effort will be devoted to inverse learning, the integration of data from multiple sources, and extracting latent structures of networks. Finally, the research will develop computational tools and design practical methods that have desirable statistical properties.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outcomes

Journal articles (8)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Topic Modeling on Triage Notes With Semiorthogonal Nonnegative Matrix Factorization
  • DOI:
    10.1080/01621459.2020.1862667
  • Publication date:
    2020-12
  • Journal:
  • Impact factor:
    3.7
  • Authors:
    Yutong Li;Ruoqing Zhu;A. Qu;Han Ye;Zhankun Sun
  • Corresponding authors:
    Yutong Li;Ruoqing Zhu;A. Qu;Han Ye;Zhankun Sun
Tensors in Statistics
  • DOI:
    10.1146/annurev-statistics-042720-020816
  • Publication date:
    2021-03
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xuan Bi;Xiwei Tang;Yubai Yuan;Yanqing Zhang;A. Qu
  • Corresponding authors:
    Xuan Bi;Xiwei Tang;Yubai Yuan;Yanqing Zhang;A. Qu
Semi-Standard Partial Covariance Variable Selection When Irrepresentable Conditions Fail
  • DOI:
    10.5705/ss.202020.0495
  • Publication date:
    2023
  • Journal:
  • Impact factor:
    1.4
  • Authors:
    Xue, Fei;Qu, Annie
  • Corresponding author:
    Qu, Annie
Deep learning from a statistical perspective
  • DOI:
    10.1002/sta4.294
  • Publication date:
    2020-01
  • Journal:
  • Impact factor:
    1.7
  • Authors:
    Yubai Yuan;Yujia Deng;Yanqing Zhang;A. Qu
  • Corresponding authors:
    Yubai Yuan;Yujia Deng;Yanqing Zhang;A. Qu
Community detection with dependent connectivity
  • DOI:
    10.1214/20-aos2042
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yuan, Yubai;Qu, Annie
  • Corresponding author:
    Qu, Annie

Other Publications by Annie Qu

At-harvest prediction of grey mould risk in pear fruit in long-term cold storage
  • DOI:
    10.1016/j.cropro.2009.01.001
  • Publication date:
    2009-05-01
  • Journal:
  • Impact factor:
  • Authors:
    Robert A. Spotts;Maryna Serdani;Kelly M. Wallis;Monika Walter;Trish Harris-Virgin;Kim Spotts;David Sugar;Chang Lin Xiao;Annie Qu
  • Corresponding author:
    Annie Qu
Dynamic Tensor Recommender Systems
  • DOI:
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Yanqing Zhang;Xuan Bi;Niansheng Tang;Annie Qu
  • Corresponding author:
    Annie Qu
Dynamic Tensor Recommender System
  • DOI:
    10.11159/icsta19.09
  • Publication date:
    2019-08
  • Journal:
  • Impact factor:
    6
  • Authors:
    Yanqing Zhang;Xuan Bi;Niansheng Tang;Annie Qu
  • Corresponding author:
    Annie Qu
Imputed Factor Regression for High-dimensional Block-wise Missing Data
  • DOI:
    10.5705/ss.202018.0008
  • Publication date:
    2020
  • Journal:
  • Impact factor:
    1.4
  • Authors:
    Yanqing Zhang;Niansheng Tang;Annie Qu
  • Corresponding author:
    Annie Qu
Discussion of Fan et al.’s paper “Gaining efficiency via weighted estimators for multivariate failure time data”
  • DOI:
    10.1007/s11425-009-0135-2
  • Publication date:
    2009-06-01
  • Journal:
  • Impact factor:
    1.500
  • Authors:
    Annie Qu;Lan Xue
  • Corresponding author:
    Lan Xue

Other Grants by Annie Qu

Collaborative Research: Integrative Heterogeneous Learning for Intensive Complex Longitudinal Data
  • Award number:
    2210640
  • Fiscal year:
    2022
  • Amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: New Statistical Learning for Complex Heterogeneous Data
  • Award number:
    2019461
  • Fiscal year:
    2020
  • Amount:
    $250,000
  • Project type:
    Standard Grant
Conference on Statistical Learning and Data Science
  • Award number:
    1818546
  • Fiscal year:
    2018
  • Amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: New Statistical Learning for Complex Heterogeneous Data
  • Award number:
    1821198
  • Fiscal year:
    2018
  • Amount:
    $250,000
  • Project type:
    Standard Grant
Collaborative Research: New Statistical Learning and Scalable Computing for Large Unstructured Data
  • Award number:
    1415308
  • Fiscal year:
    2014
  • Amount:
    $250,000
  • Project type:
    Standard Grant
Personalized classification, moment selection, and time-varying networks for large-scale longitudinal data
  • Award number:
    1308227
  • Fiscal year:
    2013
  • Amount:
    $250,000
  • Project type:
    Standard Grant
Model selection and efficient learning for high dimensional clustered data
  • Award number:
    0906660
  • Fiscal year:
    2009
  • Amount:
    $250,000
  • Project type:
    Standard Grant
CAREER: Semiparametric and Non-Parametric Models for Correlated Data
  • Award number:
    0902232
  • Fiscal year:
    2008
  • Amount:
    $250,000
  • Project type:
    Continuing Grant
CAREER: Semiparametric and Non-Parametric Models for Correlated Data
  • Award number:
    0348764
  • Fiscal year:
    2004
  • Amount:
    $250,000
  • Project type:
    Continuing Grant
Semiparametric Models for Correlated Data: The Quadratic Inference Function Approach
  • Award number:
    0103513
  • Fiscal year:
    2001
  • Amount:
    $250,000
  • Project type:
    Standard Grant

Similar Overseas Grants

FRG: Collaborative Research: New birational invariants
  • Award number:
    2244978
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project type:
    Continuing Grant
FRG: Collaborative Research: Singularities in Incompressible Flows: Computer Assisted Proofs and Physics-Informed Neural Networks
  • Award number:
    2245017
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project type:
    Standard Grant
FRG: Collaborative Research: Variationally Stable Neural Networks for Simulation, Learning, and Experimental Design of Complex Physical Systems
  • Award number:
    2245111
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project type:
    Continuing Grant
FRG: Collaborative Research: Variationally Stable Neural Networks for Simulation, Learning, and Experimental Design of Complex Physical Systems
  • Award number:
    2245077
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project type:
    Continuing Grant
FRG: Collaborative Research: Singularities in Incompressible Flows: Computer Assisted Proofs and Physics-Informed Neural Networks
  • Award number:
    2244879
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project type:
    Standard Grant
FRG: Collaborative Research: New Birational Invariants
  • Award number:
    2245171
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project type:
    Continuing Grant
FRG: Collaborative Research: Singularities in Incompressible Flows: Computer Assisted Proofs and Physics-Informed Neural Networks
  • Award number:
    2403764
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project type:
    Standard Grant
FRG: Collaborative Research: Singularities in Incompressible Flows: Computer Assisted Proofs and Physics-Informed Neural Networks
  • Award number:
    2245021
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project type:
    Standard Grant
FRG: Collaborative Research: Variationally Stable Neural Networks for Simulation, Learning, and Experimental Design of Complex Physical Systems
  • Award number:
    2245097
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project type:
    Continuing Grant
FRG: Collaborative Research: Variationally Stable Neural Networks for Simulation, Learning, and Experimental Design of Complex Physical Systems
  • Award number:
    2245147
  • Fiscal year:
    2023
  • Amount:
    $250,000
  • Project type:
    Continuing Grant