
RI: Medium: Collaborative Research: Algorithmic High-Dimensional Statistics: Statistical Optimality, Computational Barriers, and High-Dimensional Corrections

Basic Information

Award number: 1901252
Principal investigator: Michael Jordan
Amount: approximately $755,000
Institution: University of California-Berkeley
Institution country: United States
Award type: Standard Grant
Fiscal year: 2019
Funding country: United States
Period: 2019-08-01 to 2024-07-31
Status: Completed

Source: https://www.nsf.gov/awardsearch/showAward?AWD_ID=1901252&HistoricalAwards=false
Keywords: RI Medium Collaborative Research Algorithmic

Project Abstract

This research aims to address the pressing challenges of learning and inference from high-dimensional data. Contemporary sensing and data acquisition technologies produce data at an unprecedented rate. A ubiquitous challenge in modern data applications is thus to efficiently and reliably extract relevant information and associated insights from a deluge of data. At the same time, this challenge is exacerbated by the unprecedented growth in the number of relevant features one needs to reason about, which often outpaces the growth in data samples. Classical statistical inference paradigms, which either work only in the presence of an enormous number of data samples or ignore the computational cost of the estimators altogether, become highly insufficient, or even unreliable, for many emerging applications of machine learning and big-data analytics. To address these issues in high dimensions, novel theoretical tools need to be brought into the picture in order to provide a comprehensive understanding of the performance limits of various algorithms and tasks. The goal of this project is four-fold. First, to develop a modern theory that precisely characterizes the performance of classical statistical algorithms in high dimensions. Second, to suggest proper corrections of classical statistical inference procedures to accommodate the sample-starved regime. Third, to develop computationally efficient algorithms that provably attain the fundamental statistical limits, if possible. Fourth, to identify potential computational barriers if the fundamental statistical limits cannot be met.

The transformative potential of the proposed research program lies in the development of foundational theory for statistical data analytics through a novel combination of statistics, approximation theory, statistical physics, mathematical optimization, and information theory, offering scalable statistical inference and learning algorithms. The theory and algorithms developed within this project will have direct impact on various engineering and science applications such as large-scale machine learning, DNA sequencing, genetic disease analysis, and natural language processing. This collaborative program provides cross-university opportunities for student training, and we are committed to engaging and helping underrepresented students and women in STEM through long-term mentorships and outreach activities.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.

Project Outputs

Journal articles (17)

Monographs (0)

Research awards (0)

Conference papers (0)

Patents (0)

Robust Estimation for Nonparametric Families via Generative Adversarial Networks

DOI: (not listed)
Published: 2022
Journal: IEEE International Symposium on Information Theory 2022
Impact factor: 0
Authors: Zhu, Banghua; Jiao, Jiantao; Jordan, Michael
Corresponding author: Jordan, Michael

SLIP: Learning to Predict in Unknown Dynamical Systems with Long-Term Memory

DOI: (not listed)
Published: 2020
Journal: Canada
Impact factor: 0
Authors: Rashidinejad, Paria; Jiao, Jiantao; Russell, Stuart
Corresponding author: Russell, Stuart

On estimation of $L_r$-norms in Gaussian white noise models

DOI: 10.1007/s00440-020-00982-x
Published: 2020
Journal: Probability Theory and Related Fields
Impact factor: 2
Authors: Han, Yanjun; Jiao, Jiantao; Mukherjee, Rajarshi
Corresponding author: Mukherjee, Rajarshi

Private Prediction Sets

DOI: 10.1162/99608f92.16c71dad
Published: 2021-02
Journal: ArXiv
Impact factor: 0
Authors: Anastasios Nikolas Angelopoulos; Stephen Bates; Tijana Zrnic; Michael I. Jordan
Corresponding authors: Anastasios Nikolas Angelopoulos; Stephen Bates; Tijana Zrnic; Michael I. Jordan

On the Value of Interaction and Function Approximation in Imitation Learning

DOI: (not listed)
Published: 2021
Journal: (not listed)
Impact factor: 0
Authors: Nived Rajaraman; Yanjun Han; L. Yang; Jingbo Liu; Jiantao Jiao; K. Ramchandran
Corresponding authors: Nived Rajaraman; Yanjun Han; L. Yang; Jingbo Liu; Jiantao Jiao; K. Ramchandran

Other publications by Michael Jordan

Age Impacts Risk of Mixed Chimerism Following RIC HCT for Non-SCID Inborn Errors of Immunity.

DOI: 10.1016/j.jtct.2023.09.024
Published: 2023
Journal: Transplantation and Cellular Therapy
Impact factor: 3.2
Authors: Taylor Fitch; Adam Lane; John C McDonnell; J. Bleesing; Michael Jordan; Ashish Kumar; P. Khandelwal; Ruby Khoury; R. Marsh; S. Chandra
Corresponding author: S. Chandra

Lymphopenia in Patients with Hemophagocytic Lymphohistiocytosis: Are B Cells Suppressed in These Patients?

DOI: 10.1016/j.bbmt.2013.12.270
Published: 2014-02-01
Journal: Conference abstract
Impact factor: (not listed)
Authors: Sharat Chandra; Alexandra Filipovich; Michael Jordan
Corresponding author: Michael Jordan

Different monitoring patterns in treated and untreated patients with Fabry disease: Analysis of a United States claims database

DOI: 10.1016/j.ymgme.2023.107947
Published: 2024-02-01
Journal: Molecular Genetics and Metabolism (conference abstract)
Impact factor: 3.500
Authors: Irina Maksimova; Alexandra Dumitriu; Ana Crespo; Gandarvaka Miles; Andrea Ocampo; Queeny Ip; Michael Jordan; Natalia Petruski-Ivleva; Roberto Araujo
Corresponding author: Roberto Araujo