RI: Medium: Collaborative Research: Algorithmic High-Dimensional Statistics: Optimality, Computational Barriers, and High-Dimensional Corrections
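Basic information
- Award number: 1900140
- Principal investigator:
- Amount: $385,000
- Host institution:
- Host institution country: United States
- Award type: Standard Grant
- Fiscal year: 2019
- Funding country: United States
- Period: 2019-08-01 to 2022-04-30
- Status: Completed
- Source:
- Keywords:
Project abstract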
This research addresses the pressing challenges of learning and inference from high-dimensional data. Contemporary sensing and data acquisition technologies produce data at an unprecedented rate. A ubiquitous challenge in modern data applications is thus to extract relevant information and insights from a deluge of data efficiently and reliably. This challenge is exacerbated by the rapid growth in the number of relevant features one needs to reason about, which often outpaces the growth in the number of data samples. Classical statistical inference paradigms either work only when an enormous number of samples is available or ignore the computational cost of the estimators altogether, so they become insufficient, or even unreliable, for many emerging applications of machine learning and big-data analytics. Addressing these issues in high dimensions requires novel theoretical tools that provide a comprehensive understanding of the performance limits of various algorithms and tasks. The goal of this project is four-fold. First, to develop a modern theory that precisely characterizes the performance of classical statistical algorithms in high dimensions. Second, to suggest proper corrections to classical statistical inference procedures to accommodate the sample-starved regime. Third, to develop computationally efficient algorithms that provably attain the fundamental statistical limits, where possible. Fourth, to identify potential computational barriers where the fundamental statistical limits cannot be met. The transformative potential of the proposed research program lies in developing foundational theory for statistical data analytics through a novel combination of statistics, approximation theory, statistical physics, mathematical optimization, and information theory, yielding scalable statistical inference and learning algorithms.
The theory and algorithms developed within this project will have direct impact on various engineering and science applications such as large-scale machine learning, DNA sequencing, genetic disease analysis, and natural language processing. This collaborative program provides cross-university opportunities for student training, and we are committed to engaging and helping underrepresented and women students in STEM through long-term mentorships and outreach activities.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
Project outcomes
Journal articles (18)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Inference and uncertainty quantification for noisy matrix completion
- DOI: 10.1073/pnas.1910053116
- Published: 2019-11-12
- Journal:
- Impact factor: 11.1
- Authors: Chen, Yuxin; Fan, Jianqing; Yan, Yuling
- Corresponding author: Yan, Yuling
Nonconvex Low-Rank Symmetric Tensor Completion from Noisy Data
- DOI:
- Published: 2019-11
- Journal:
- Impact factor: 0
- Authors: Changxiao Cai; Gen Li; H. Poor; Yuxin Chen
- Corresponding author: Changxiao Cai; Gen Li; H. Poor; Yuxin Chen
Nonconvex Matrix Factorization From Rank-One Measurements
- DOI: 10.1109/tit.2021.3050427
- Published: 2018-02
- Journal:
- Impact factor: 2.5
- Authors: Yuanxin Li; Cong Ma; Yuxin Chen; Yuejie Chi
- Corresponding author: Yuanxin Li; Cong Ma; Yuxin Chen; Yuejie Chi
Uncertainty Quantification for Nonconvex Tensor Completion: Confidence Intervals, Heteroscedasticity and Optimality
- DOI: 10.1109/tit.2022.3205781
- Published: 2020-06
- Journal:
- Impact factor: 2.5
- Authors: Changxiao Cai; H. Poor; Yuxin Chen
- Corresponding author: Changxiao Cai; H. Poor; Yuxin Chen
Communication-Efficient Distributed Optimization in Networks with Gradient Tracking and Variance Reduction
- DOI:
- Published: 2019-09
- Journal:
- Impact factor: 0
- Authors: Boyue Li; Shicong Cen; Yuxin Chen; Yuejie Chi
- Corresponding author: Boyue Li; Shicong Cen; Yuxin Chen; Yuejie Chi
Other publications by Yuxin Chen
Settling the Sample Complexity of Model-Based Offline Reinforcement Learning
- DOI: 10.48550/arxiv.2204.05275
- Published: 2022
- Journal:
- Impact factor: 0
- Authors: Gen Li; Laixi Shi; Yuxin Chen; Yuejie Chi; Yuting Wei
- Corresponding author: Yuting Wei
Simultaneous determination of polycyclic musks in blood and urine by solid supported liquid–liquid extraction and gas chromatography–tandem mass spectrometry
- DOI: 10.1016/j.jchromb.2015.04.028
- Published: 2015
- Journal:
- Impact factor: 3
- Authors: Hongtao Liu; Liping Huang; Yuxin Chen; Liman Guo; Limin Li; Haiyun Zhou; Tiangang Luan
- Corresponding author: Tiangang Luan
Intelligent GP fusion from multiple sources for text classification
- DOI: 10.1145/1099554.1099688
- Published: 2005
- Journal:
- Impact factor: 0
- Authors: Baoping Zhang; Yuxin Chen; Weiguo Fan; E. Fox; Marcos André Gonçalves; Marco Cristo; P. Calado
- Corresponding author: P. Calado
Understanding User Intentions in Vertical Image Search
- DOI:
- Published: 2011
- Journal:
- Impact factor: 0
- Authors: Yuxin Chen
- Corresponding author: Yuxin Chen
Coptis chinensis inflorescence extract protection against ultraviolet-B-induced phototoxicity, and HPLC-MS analysis of its chemical composition.
- DOI:
- Published: 2012
- Journal:
- Impact factor: 4.3
- Authors: Lingxin Zhu; Bo Huang; Xiaoquan Ban; Jingsheng He; Yuxin Chen; Li Han; Youwei Wang
- Corresponding author: Youwei Wang
Other grants by Yuxin Chen
Collaborative Research: RI: Small: Foundations of Few-Round Active Learning
- Award number: 2313131
- Fiscal year: 2023
- Amount: $385,000
- Award type: Standard Grant
Collaborative Research: CIF: Medium: Statistical and Algorithmic Foundations of Efficient Reinforcement Learning
- Award number: 2221009
- Fiscal year: 2022
- Amount: $385,000
- Award type: Continuing Grant
RI: Medium: Collaborative Research: Algorithmic High-Dimensional Statistics: Optimality, Computational Barriers, and High-Dimensional Corrections
- Award number: 2218713
- Fiscal year: 2022
- Amount: $385,000
- Award type: Standard Grant
RI: Small: Uncertainty Quantification for Nonconvex Low-Complexity Models
- Award number: 2218773
- Fiscal year: 2022
- Amount: $385,000
- Award type: Standard Grant
Collaborative Research: CIF: Medium: Statistical and Algorithmic Foundations of Efficient Reinforcement Learning
- Award number: 2106739
- Fiscal year: 2021
- Amount: $385,000
- Award type: Continuing Grant
RI: Small: Uncertainty Quantification for Nonconvex Low-Complexity Models
- Award number: 2100158
- Fiscal year: 2021
- Amount: $385,000
- Award type: Standard Grant
Collaborative Research: Fine-Grained Statistical Inference in High Dimension: Actionable Information, Bias Reduction, and Optimality
- Award number: 2014279
- Fiscal year: 2020
- Amount: $385,000
- Award type: Standard Grant
CIF: Small: Taming Nonconvexity in High-Dimensional Statistical Estimation
- Award number: 1907661
- Fiscal year: 2019
- Amount: $385,000
- Award type: Standard Grant
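Similar NSFC (National Natural Science Foundation of China) grants
Development and application of key technologies for screening and assessing typical emerging contaminants in water-soil-solid waste multimedia and for multi-scenario coordinated remediation
- Award number:
- Approval year: 2025
- Amount: CNY 0
- Award type: Provincial/municipal project
Data-driven selective removal of emerging contaminants by carbon-nanotube-supported transition metal compounds with multimedia synergy
- Award number:
- Approval year: 2025
- Amount: CNY 100,000
- Award type: Provincial/municipal project
Mechanisms of co-transport of the radionuclide Sr and colloids in fractured media
- Award number: 42302274
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Hydrogen permeation mechanisms and hydrogen damage criteria for pipeline steel under multi-medium synergy in hydrogen-blended natural gas transmission environments
- Award number: 52301075
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Mechanisms for jointly improving short-term breakdown strength and long-term endurance of electrophile-grafted energy-storage dielectrics under high temperature and strong electric fields
- Award number: 52307022
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Dual-frequency harmonic control of atmospheric-pressure dielectric barrier discharge based on coordinated multi-objective parameter optimization
- Award number: 52377141
- Approval year: 2023
- Amount: CNY 500,000
- Award type: General Program
Principles and methods for co-design of material distribution and tendon routing in heterogeneous soft robots
- Award number: 52305014
- Approval year: 2023
- Amount: CNY 300,000
- Award type: Young Scientists Fund
Fluidized catalysts for enhancing the synergy between dielectric barrier discharge and catalysts, and biomass tar conversion
- Award number: 52377147
- Approval year: 2023
- Amount: CNY 520,000
- Award type: General Program
Synergistic enhancement mechanisms of multiphase reaction and transport in preparing potassium manganate by low-oxygen-pressure alkaline leaching of pyrolusite in sub-molten salt media
- Award number: 52364045
- Approval year: 2023
- Amount: CNY 330,000
- Award type: Regional Science Fund
Grain boundary diffusion and migration behavior in Nd-Fe-B induced by media and defects, and mechanisms for their coordinated control
- Award number: 52361033
- Approval year: 2023
- Amount: CNY 320,000
- Award type: Regional Science Fund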
Similar overseas grants
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312841
- Fiscal year: 2023
- Amount: $385,000
- Award type: Standard Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312842
- Fiscal year: 2023
- Amount: $385,000
- Award type: Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313151
- Fiscal year: 2023
- Amount: $385,000
- Award type: Continuing Grant
Collaborative Research: RI: Medium: Principles for Optimization, Generalization, and Transferability via Deep Neural Collapse
- Award number: 2312840
- Fiscal year: 2023
- Amount: $385,000
- Award type: Standard Grant
Collaborative Research: CompCog: RI: Medium: Understanding human planning through AI-assisted analysis of a massive chess dataset
- Award number: 2312374
- Fiscal year: 2023
- Amount: $385,000
- Award type: Standard Grant
Collaborative Research: CompCog: RI: Medium: Understanding human planning through AI-assisted analysis of a massive chess dataset
- Award number: 2312373
- Fiscal year: 2023
- Amount: $385,000
- Award type: Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313149
- Fiscal year: 2023
- Amount: $385,000
- Award type: Continuing Grant
Collaborative Research: RI: Medium: Superhuman Imitation Learning from Heterogeneous Demonstrations
- Award number: 2312955
- Fiscal year: 2023
- Amount: $385,000
- Award type: Standard Grant
Collaborative Research: RI: Medium: Informed, Fair, Efficient, and Incentive-Aware Group Decision Making
- Award number: 2313137
- Fiscal year: 2023
- Amount: $385,000
- Award type: Standard Grant
Collaborative Research: RI: Medium: Lie group representation learning for vision
- Award number: 2313150
- Fiscal year: 2023
- Amount: $385,000
- Award type: Continuing Grant