Random Matrices in Multivariate Statistics: Theoretical Developments and Applications

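Basic information

  • Award number:
    0605169
  • Principal investigator:
  • Amount:
    $240,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2006
  • Funding country:
    United States
  • Project period:
    2006-07-01 to 2010-06-30
  • Project status:
    Completed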

Project abstract

This research program focuses on developing data analysis methods for the new paradigm of high-dimensional problems. The associated theoretical problems concern the eigenvalues of large-dimensional random matrices. Three related directions are of particular interest: 1) furthering our understanding of the spectral properties of the relevant random matrices; 2) making practical use of the results obtained, combined with more classical results from random matrix theory; 3) finding and contributing to areas of application where this framework is relevant. More specifically, statisticians now very often face "n times p" data matrices X for which p is of the same order of magnitude as n, and both p and n are large. The sample covariance matrix computed from such data is of great importance in a number of applications, as it underlies widely used methods such as principal component analysis. However, the theoretical results that underlie these methods fail to apply in the "large n, large p" setting just described. Hence, a thorough study of sample covariance matrices in this setting is needed. The eigenvalues of such large-dimensional matrices are of particular interest, and the largest and smallest eigenvalues matter most from the point of view of applications. The aim of the study is to obtain central-limit-type theorems for these extreme eigenvalues and to use them in statistics, for instance for hypothesis testing and for obtaining a notion of power. A more applied part of this work concerns the efficient use of results from random matrix theory, new and old, to better estimate the eigenvalues of the population covariance matrix, with the ultimate aim of better estimating the whole covariance matrix when p and n are both large.
Technological progress allows us to store and use massive amounts of data about many aspects of our daily lives. An interesting problem is to use this data to understand how certain traits depend on each other. In the stock market, we might be interested in how the behavior of one stock affects the behavior of another; understanding all these interrelationships yields a measure of the risk taken by investing in portfolios that hold the corresponding stocks. Statisticians have a number of tools to deal with these interrelationships. We can discover ways to look at the data so that, even if all interrelationships are small or weak, so that no single trait "should" teach us much about any other, we might find combinations of traits that carry enormous amounts of information. We also know the typical values of these combinations, so we might be able to detect unusual features in the data by looking at it the right way. These statistical techniques have very wide applications in various fields of science, ranging from climatology to genetics and image recognition. Thousands of research papers using these techniques are published each year.
However, the theory that underlies these statistical techniques was created in an era when massive datasets did not exist, because they could not be stored. This research project focuses on theories, and their applications, that are better suited to our current massive datasets. The applications should allow us to see structure where the classical tools fail to see any, and tell us when there is no structure even though the classical tools say there is. There is also increasing evidence that our standard tools often give very inaccurate results about standard measures of risk or about the amount of information carried by combinations of traits. It seems that risks might be underestimated and the amount of information overestimated. Part of this research program will be dedicated to measuring how inaccurate the classical results are for large datasets and how a more relevant theory can be used to correct these inaccuracies.
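
The breakdown of classical theory in the "large n, large p" setting can be seen in a small simulation. The following is a minimal sketch and is not part of the award abstract. It assumes Gaussian data with identity population covariance, so every population eigenvalue equals 1, and it compares the sample covariance eigenvalues with the Marchenko-Pastur support edges (1 ± sqrt(p/n))^2, a classical result from random matrix theory. The function names and the choices of n and p are illustrative assumptions.

```python
import numpy as np


def sample_covariance_eigenvalues(n, p, seed=0):
    """Eigenvalues of the sample covariance of n i.i.d. N(0, I_p) observations."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))   # n x p data matrix; population covariance is the identity
    S = np.cov(X, rowvar=False)       # p x p sample covariance matrix (columns are variables)
    return np.linalg.eigvalsh(S)      # real eigenvalues, sorted in ascending order


def marchenko_pastur_edges(n, p):
    """Asymptotic edges of the sample eigenvalue spectrum for identity covariance, p/n <= 1."""
    gamma = p / n
    return (1 - np.sqrt(gamma)) ** 2, (1 + np.sqrt(gamma)) ** 2


if __name__ == "__main__":
    # First case: classical regime (p much smaller than n).
    # Second case: p of the same order of magnitude as n.
    for n, p in [(5000, 10), (500, 250)]:
        eig = sample_covariance_eigenvalues(n, p)
        lower, upper = marchenko_pastur_edges(n, p)
        print(f"n={n}, p={p}: smallest={eig[0]:.3f}, largest={eig[-1]:.3f}, "
              f"Marchenko-Pastur edges=({lower:.3f}, {upper:.3f})")
```

When p is small relative to n, the sample eigenvalues cluster near the true value 1. When p is comparable to n, they spread over a wide interval even though all population eigenvalues are equal, which is the kind of distortion the abstract says needs correcting.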

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Noureddine El Karoui

Kernel density estimation with Berkson error
  • DOI:
    10.1002/cjs.11281
  • Year published:
    2014
  • Journal:
  • Impact factor:
    0
  • Authors:
    J. P. Long; Noureddine El Karoui; J. Rice
  • Corresponding author:
    J. Rice
Revenue-Maximizing Auctions: A Bidder’s Standpoint
  • DOI:
    10.2139/ssrn.3827136
  • Year published:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Thomas Nedelec; Clément Calauzènes; Vianney Perchet; Noureddine El Karoui
  • Corresponding author:
    Noureddine El Karoui

Other grants held by Noureddine El Karoui

High-dimensional M-estimation: Understanding risk, improving performance and assessing resampling
  • Award number:
    1510172
  • Fiscal year:
    2015
  • Award amount:
    $240,000
  • Project type:
    Continuing Grant
CAREER: Random matrices and High-dimensional statistics
  • Award number:
    0847647
  • Fiscal year:
    2009
  • Award amount:
    $240,000
  • Project type:
    Continuing Grant

Similar overseas grants

Reconfigurable Intelligent Surfaces 2.0 for 6G: Beyond Diagonal Phase Shift Matrices
  • Award number:
    EP/Y004086/1
  • Fiscal year:
    2024
  • Award amount:
    $240,000
  • Project type:
    Research Grant
Designing synthetic matrices for enhanced organoid development: A step towards better disease understanding
  • Award number:
    MR/Y033760/1
  • Fiscal year:
    2024
  • Award amount:
    $240,000
  • Project type:
    Research Grant
2024 Signal Transduction in Engineered Extracellular Matrices Gordon Research Conference and Seminar; Southern New Hampshire University, Manchester, New Hampshire; 20-26 July 2024
  • Award number:
    2414497
  • Fiscal year:
    2024
  • Award amount:
    $240,000
  • Project type:
    Standard Grant
Electrospun mucoadhesive matrices for polymersome-mediated mRNA vaccine delivery
  • Award number:
    BB/Y007514/1
  • Fiscal year:
    2024
  • Award amount:
    $240,000
  • Project type:
    Research Grant
Random Matrices and Functional Inequalities on Spaces of Graphs
  • Award number:
    2331037
  • Fiscal year:
    2023
  • Award amount:
    $240,000
  • Project type:
    Continuing Grant
Collaborative Research: Random Matrices and Algorithms in High Dimension
  • Award number:
    2306438
  • Fiscal year:
    2023
  • Award amount:
    $240,000
  • Project type:
    Continuing Grant
Some topics in Analysis and Probability in Metric Measure Spaces, Random Matrices, and Diffusions
  • Award number:
    2247117
  • Fiscal year:
    2023
  • Award amount:
    $240,000
  • Project type:
    Standard Grant
Novel Bioprinted Neural Stem Cell-Embedded Hydrogel Matrices for Enhanced Treatment of Glioblastoma
  • Award number:
    10749330
  • Fiscal year:
    2023
  • Award amount:
    $240,000
  • Project type:
Random Matrices, Random Graphs, and Deep Neural Networks
  • Award number:
    2331096
  • Fiscal year:
    2023
  • Award amount:
    $240,000
  • Project type:
    Standard Grant
Asymptotics of Toeplitz determinants, soft Riemann-Hilbert problems and generalised Hilbert matrices (HilbertToeplitz)
  • Award number:
    EP/X024555/1
  • Fiscal year:
    2023
  • Award amount:
    $240,000
  • Project type:
    Fellowship