FRG: Collaborative Research: Generative Learning on Unstructured Data with Applications to Natural Language Processing and Hyperlink Prediction
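Basic information
- Award number: 1952386
- Principal investigator:
- Amount: $250,000
- Institution:
- Institution country: United States
- Award type: Standard Grant
- Fiscal year: 2020
- Funding country: United States
- Project period: 2020-07-01 to 2024-06-30
- Project status: Completed
- Source:
- Keywords: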
Project abstract
This project addresses the pressing needs of analyzing “big” unstructured data and tackles some artificial intelligence questions from the statistical perspective, which requires the focused and synergistic efforts of a collaborative team. Specifically, the project develops generative models for statistical learning and leverages dependence relations modeled by graphical models in hyperlink prediction, which are applicable to topic sentence generation and protein structure identification. It will lead to a substantial improvement in the accuracy of generative learning based on numerical embeddings, particularly in topic sentence generation and hyperlink prediction. The integrated program of research and education will have significant impacts on machine learning and data science, social and political sciences, and biomedical and genomic research, among others. The project requires extensive algorithm and software development for natural language processing and multimedia data integration. The PIs, their postdocs, and students will develop innovative computational algorithms and software for the analysis of large-scale unstructured complex data. The advanced computational tools will be disseminated to facilitate technology transfer.
The project will address some fundamental issues in two important areas of unstructured data analysis in machine learning and intelligence. In particular, the proposed research will develop a statistical framework for generative learning, which is primarily motivated by applications for unstructured data, namely topic sentence generation and high-order hyperlink prediction. The research will develop powerful generative methods for generating instances or examples to describe and interpret the corresponding learning model. Moreover, it will develop network models for modeling high-order interactions and relations of units by identifying hidden structures in networks. It will proceed in two areas: (1) instance generation and topic sentence generation; (2) hyperlink prediction for multiway relations in hypergraphs. In the first area, instance generation, particularly sentence generation, will be performed collaboratively with numerical embeddings in categorization and regression. In the second area, hyperlinks will be predicted based on observed pairwise as well as unobserved high-order relations, characterized by graphical models with hidden structures. Special effort will be devoted to inverse learning, the integration of data from multiple sources, and extracting latent structures of networks. Finally, the research will develop computational tools and design practical methods that have desirable statistical properties.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outputs
Journal articles (3)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
An Equation for the Identification of Average Causal Effect in Nonlinear Models
- DOI: 10.5705/ss.202021.0191
- Publication date: 2023
- Journal:
- Impact factor: 1.4
- Authors: Wong, Wing Hung
- Corresponding author: Wong, Wing Hung
Comprehensive tissue deconvolution of cell-free DNA by deep learning for disease diagnosis and monitoring.
- DOI: 10.1073/pnas.2305236120
- Publication date: 2023-07-11
- Journal:
- Impact factor: 11.1
- Authors: Li, Shuo; Zeng, Weihua; Ni, Xiaohui; Liu, Qiao; Li, Wenyuan; Stackpole, Mary L.; Zhou, Yonggang; Gower, Arjan; Krysan, Kostyantyn; Ahuja, Preeti; Lu, David S.; Raman, Steven S.; Hsu, William; Aberle, Denise R.; Magyar, Clara E.; French, Samuel W.; Han, Steven-Huy B.; Garon, Edward B.; Agopian, Vatche G.; Wong, Wing Hung; Dubinett, Steven M.; Zhou, Xianghong Jasmine
- Corresponding author: Zhou, Xianghong Jasmine
Convergence Rates of a Class of Multivariate Density Estimation Methods Based on Adaptive Partitioning
- DOI:
- Publication date: 2023
- Journal:
- Impact factor: 6
- Authors: Liu, Linxi; Li, Dangna; Wong, Wing Hung
- Corresponding author: Wong, Wing Hung
Other publications by Wing Hung Wong
Modeling combinatorial regulation from single-cell multi-omics provides regulatory units underpinning cell type landscape using cRegulon
- DOI: 10.1186/s13059-025-03680-w
- Publication date: 2025-07-24
- Journal:
- Impact factor: 9.400
- Authors: Zhanying Feng; Xi Chen; Zhana Duren; Jingxue Xin; Hao Miao; Qiuyue Yuan; Yong Wang; Wing Hung Wong
- Corresponding author: Wing Hung Wong
Time course regulatory analysis based on paired expression and chromatin accessibility data
- DOI: http://www.genome.org/cgi/doi/10.1101/gr.257063.119
- Publication date: 2020
- Journal:
- Impact factor:
- Authors: Zhana Duren; Xi Chen; Jingxue Xin; Yong Wang; Wing Hung Wong
- Corresponding author: Wing Hung Wong
EpiGePT: a pretrained transformer-based language model for context-specific human epigenomics
- DOI: 10.1186/s13059-024-03449-7
- Publication date: 2024-12-18
- Journal:
- Impact factor: 9.400
- Authors: Zijing Gao; Qiao Liu; Wanwen Zeng; Rui Jiang; Wing Hung Wong
- Corresponding author: Wing Hung Wong
Simultaneous deep generative modelling and clustering of single-cell genomic data
- DOI: 10.1038/s42256-021-00333-y
- Publication date: 2021-05-10
- Journal:
- Impact factor: 23.900
- Authors: Qiao Liu; Shengquan Chen; Rui Jiang; Wing Hung Wong
- Corresponding author: Wing Hung Wong
Author Correction: Regulatory analysis of single cell multiome gene expression and chromatin accessibility data with scREG
- DOI: 10.1186/s13059-022-02786-9
- Publication date: 2022-10-13
- Journal:
- Impact factor: 9.400
- Authors: Zhana Duren; Fengge Chang; Fnu Naqing; Jingxue Xin; Qiao Liu; Wing Hung Wong
- Corresponding author: Wing Hung Wong
Other grants awarded to Wing Hung Wong
Efficient Monte Carlo Algorithms for Bayesian Inference
- Award number: 1811920
- Fiscal year: 2018
- Amount: $250,000
- Award type: Continuing Grant
Collaborative Research: Automatic Video Interpretation and Description
- Award number: 1721550
- Fiscal year: 2017
- Amount: $250,000
- Award type: Standard Grant
Statistical learning via multivariate density estimation
- Award number: 1407557
- Fiscal year: 2014
- Amount: $250,000
- Award type: Continuing Grant
EAGER: Algorithm-Hardware Co-Design for Multivariate Data Analysis
- Award number: 1330132
- Fiscal year: 2013
- Amount: $250,000
- Award type: Continuing Grant
Monte Carlo and reconfigurable computing in Bayesian inference
- Award number: 0906044
- Fiscal year: 2009
- Amount: $250,000
- Award type: Continuing Grant
Infrastructure for computing with massive datasets in modern statistics
- Award number: 0821823
- Fiscal year: 2008
- Amount: $250,000
- Award type: Standard Grant
Evolutionary and energy-domain Monte Carlo algorithms and their applications
- Award number: 0505732
- Fiscal year: 2005
- Amount: $250,000
- Award type: Continuing Grant
Computational Inference, Monte Carlo, and Scientific Applications
- Award number: 0090166
- Fiscal year: 2001
- Amount: $250,000
- Award type: Continuing Grant
Protein Fold Modeling and Recognition From Multiple Structures
- Award number: 0196176
- Fiscal year: 2000
- Amount: $250,000
- Award type: Standard Grant
Similar international grants
FRG: Collaborative Research: New birational invariants
- Award number: 2244978
- Fiscal year: 2023
- Amount: $250,000
- Award type: Continuing Grant
FRG: Collaborative Research: Singularities in Incompressible Flows: Computer Assisted Proofs and Physics-Informed Neural Networks
- Award number: 2245017
- Fiscal year: 2023
- Amount: $250,000
- Award type: Standard Grant
FRG: Collaborative Research: Variationally Stable Neural Networks for Simulation, Learning, and Experimental Design of Complex Physical Systems
- Award number: 2245111
- Fiscal year: 2023
- Amount: $250,000
- Award type: Continuing Grant
FRG: Collaborative Research: Variationally Stable Neural Networks for Simulation, Learning, and Experimental Design of Complex Physical Systems
- Award number: 2245077
- Fiscal year: 2023
- Amount: $250,000
- Award type: Continuing Grant
FRG: Collaborative Research: Singularities in Incompressible Flows: Computer Assisted Proofs and Physics-Informed Neural Networks
- Award number: 2244879
- Fiscal year: 2023
- Amount: $250,000
- Award type: Standard Grant
FRG: Collaborative Research: New Birational Invariants
- Award number: 2245171
- Fiscal year: 2023
- Amount: $250,000
- Award type: Continuing Grant
FRG: Collaborative Research: Singularities in Incompressible Flows: Computer Assisted Proofs and Physics-Informed Neural Networks
- Award number: 2403764
- Fiscal year: 2023
- Amount: $250,000
- Award type: Standard Grant
FRG: Collaborative Research: Singularities in Incompressible Flows: Computer Assisted Proofs and Physics-Informed Neural Networks
- Award number: 2245021
- Fiscal year: 2023
- Amount: $250,000
- Award type: Standard Grant
FRG: Collaborative Research: Variationally Stable Neural Networks for Simulation, Learning, and Experimental Design of Complex Physical Systems
- Award number: 2245097
- Fiscal year: 2023
- Amount: $250,000
- Award type: Continuing Grant
FRG: Collaborative Research: Variationally Stable Neural Networks for Simulation, Learning, and Experimental Design of Complex Physical Systems
- Award number: 2245147
- Fiscal year: 2023
- Amount: $250,000
- Award type: Continuing Grant