Sparse Principal Component Analysis via the Sparsest Element in a Subspace

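Basic information

  • Award number:
    1418971
  • Principal investigator:
  • Amount:
    $133,800
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2014
  • Funding country:
    United States
  • Project period:
    2014-09-01 to 2014-11-30
  • Project status:
    Completed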

Project abstract

Sparse principal component analysis (PCA) is a technique that allows biologists and other scientists to interpret experimental data in terms of very few variables. For example, it can help identify which among thousands of genes are important in distinguishing different types of cancer. In order for scientists and engineers to select the best algorithm for finding sparse principal components, it is important to have a theoretical understanding of the performance of many algorithms under a realistic data model. Most existing theoretical understanding focuses on the simple case where there is a single component that happens to be sparse. The proposed work will introduce a new model in which there are multiple components, of which one is sparse. For a special case of this more realistic model, the proposed work attempts to understand whether there are any conditions under which sophisticated convex programs are provably better than very simple algorithms. Either outcome would be informative in helping researchers decide between the many algorithms for sparse PCA.

In this project, sparse PCA will be studied from the perspective of finding the sparsest element in a subspace. This perspective is motivated by a multispike data model, which the PI calls a sparse-dense model. Under this model, the infinite-data limit of sparse PCA becomes the sparsest element problem, which is nontrivial. The objective of this research is to understand the computational-statistical tradeoff in finding the sparsest element in a subspace under the sparse-dense model. The PI would like to determine whether there is a scaling gap between the information-theoretic limit and the best performance achievable by a computationally efficient algorithm. Ultimately, the PI would like to understand when sophisticated convex methods are provably better than simple thresholding methods. This objective will be explored through semidefinite relaxations, polynomial optimization, and reductions to the planted clique problem.
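To make the "sparsest element in a subspace" problem from the abstract concrete, here is a minimal toy sketch. It is not one of the methods named in the proposal (semidefinite relaxation, thresholding, and so on); it is an exhaustive search that works only for a two-dimensional subspace, and the function name `sparsest_in_2d_subspace` and the data are hypothetical, chosen for illustration. The idea is that in a 2-D subspace, each coordinate can be zeroed by exactly one direction (up to scale), so trying one candidate per coordinate is enough.

```python
def sparsest_in_2d_subspace(b1, b2, tol=1e-9):
    """Return (nnz, coefficients, vector) for a sparsest nonzero element
    of span{b1, b2}. Assumes b1 and b2 are linearly independent."""
    best = None
    for u, v in zip(b1, b2):
        if abs(u) <= tol and abs(v) <= tol:
            continue  # this coordinate is zero for every element of the subspace
        a, c = -v, u  # the only direction (up to scale) that zeroes this coordinate
        x = [a * p + c * q for p, q in zip(b1, b2)]
        nnz = sum(abs(t) > tol for t in x)
        if best is None or nnz < best[0]:
            best = (nnz, (a, c), x)
    return best


# Hypothetical toy data: one sparse and one dense direction in R^6.
sparse = [1, -1, 0, 0, 0, 0]
dense = [1, 2, 3, 4, 5, 6]

# An arbitrary basis of the same subspace that hides the sparse direction.
b1 = [s + d for s, d in zip(sparse, dense)]
b2 = [2 * s - d for s, d in zip(sparse, dense)]

nnz, coeffs, x = sparsest_in_2d_subspace(b1, b2)
print(nnz, coeffs, x)
```

The search recovers a multiple of the planted sparse vector even though neither basis vector is sparse. In higher-dimensional subspaces this kind of enumeration is no longer feasible, which is why the abstract asks when convex relaxations or simple thresholding can succeed instead.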

Project outcomes

Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)

Other publications by Paul Hand

Simultaneous Phase Retrieval and Blind Deconvolution via Convex Programming
PhaseLift is robust to a constant fraction of arbitrary errors
ShapeFit: Exact Location Recovery from Corrupted Pairwise Directions
Analysis of Catastrophic Forgetting for Random Orthogonal Transformation Tasks in the Overparameterized Regime
  • DOI:
    10.48550/arxiv.2207.06475
  • Year published:
    2022
  • Journal:
  • Impact factor:
    0
  • Authors:
    Daniel Goldfarb; Paul Hand
  • Corresponding author:
    Paul Hand
Photoperiod effect on bud burst in Prunus is phase dependent: significance for early photosynthetic development.
  • DOI:
    10.1093/treephys/16.5.491
  • Year published:
    1996
  • Journal:
  • Impact factor:
    4
  • Authors:
    R. Besford; Paul Hand; Christine M. Richardson; S. D. Peppitt
  • Corresponding author:
    S. D. Peppitt

Other grants by Paul Hand

Collaborative Research: CDS&E-MSS: Deep Network Compression and Continual Learning: Theory and Application
  • Award number:
    2053448
  • Fiscal year:
    2021
  • Award amount:
    $133,800
  • Project type:
    Continuing Grant
Foundations of Data Science Institute
  • Award number:
    2022205
  • Fiscal year:
    2020
  • Award amount:
    $133,800
  • Project type:
    Continuing Grant
CAREER: Signal Recovery from Generative Priors
  • Award number:
    1848087
  • Fiscal year:
    2019
  • Award amount:
    $133,800
  • Project type:
    Continuing Grant
A Systems Approach to Disease Resistance Against Necrotrophic Fungal Pathogens
  • Award number:
    BB/M017729/1
  • Fiscal year:
    2015
  • Award amount:
    $133,800
  • Project type:
    Research Grant
Sparse Principal Component Analysis via the Sparsest Element in a Subspace
  • Award number:
    1464525
  • Fiscal year:
    2014
  • Award amount:
    $133,800
  • Project type:
    Standard Grant
PostDoctoral Research Fellowship
  • Award number:
    1104000
  • Fiscal year:
    2011
  • Award amount:
    $133,800
  • Project type:
    Fellowship Award
Accelerated breeding of black rot resistant brassicas for the benefit of east African smallholders
  • Award number:
    BB/F004338/2
  • Fiscal year:
    2010
  • Award amount:
    $133,800
  • Project type:
    Research Grant
Bacterial and plant factors that influence adhesion of enterohaemorrhagic E. coli and Salmonella enterica to salad leaves
  • Award number:
    BB/G014175/2
  • Fiscal year:
    2010
  • Award amount:
    $133,800
  • Project type:
    Research Grant
Bacterial and plant factors that influence adhesion of enterohaemorrhagic E. coli and Salmonella enterica to salad leaves
  • Award number:
    BB/G014175/1
  • Fiscal year:
    2009
  • Award amount:
    $133,800
  • Project type:
    Research Grant
Accelerated breeding of black rot resistant brassicas for the benefit of east African smallholders
  • Award number:
    BB/F004338/1
  • Fiscal year:
    2008
  • Award amount:
    $133,800
  • Project type:
    Research Grant

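Similar NSFC (National Natural Science Foundation of China) grants

Causal inference using propensity scores and principal stratification
  • Award number:
    10401003
  • Approval year:
    2004
  • Award amount:
    CNY 110,000
  • Project type:
    Young Scientists Fund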

Similar overseas grants

Collaborative Research: Randomized Numerical Linear Algebra for Large Scale Inversion, Sparse Principal Component Analysis, and Applications
  • Award number:
    2152661
  • Fiscal year:
    2022
  • Award amount:
    $133,800
  • Project type:
    Standard Grant
Collaborative Research: Randomized Numerical Linear Algebra for Large Scale Inversion, Sparse Principal Component Analysis, and Applications
  • Award number:
    2152704
  • Fiscal year:
    2022
  • Award amount:
    $133,800
  • Project type:
    Standard Grant
Solid Solution Enhanced Synthesis of Multi-Principal Component Alloys via Oxide Reduction
  • Award number:
    2217692
  • Fiscal year:
    2022
  • Award amount:
    $133,800
  • Project type:
    Standard Grant
Collaborative Research: Randomized Numerical Linear Algebra for Large Scale Inversion, Sparse Principal Component Analysis, and Applications
  • Award number:
    2152687
  • Fiscal year:
    2022
  • Award amount:
    $133,800
  • Project type:
    Standard Grant
Development of an Injury Prediction Tool using Principal Component Analysis
  • Award number:
    535113-2019
  • Fiscal year:
    2021
  • Award amount:
    $133,800
  • Project type:
    Postgraduate Scholarships - Doctoral
Establishment of molecular design guidelines for enzyme mutants by principal component analysis aiming at improving enzyme electrochemical reaction
  • Award number:
    21K14782
  • Fiscal year:
    2021
  • Award amount:
    $133,800
  • Project type:
    Grant-in-Aid for Early-Career Scientists
Development of an Injury Prediction Tool using Principal Component Analysis
  • Award number:
    535113-2019
  • Fiscal year:
    2020
  • Award amount:
    $133,800
  • Project type:
    Postgraduate Scholarships - Doctoral
Development of Orthonormal principal component analysis for categorical data and its applications
  • Award number:
    20K03303
  • Fiscal year:
    2020
  • Award amount:
    $133,800
  • Project type:
    Grant-in-Aid for Scientific Research (C)
Development of an Injury Prediction Tool using Principal Component Analysis
  • Award number:
    535113-2019
  • Fiscal year:
    2019
  • Award amount:
    $133,800
  • Project type:
    Postgraduate Scholarships - Doctoral
Kernel principal component analysis in high dimension, low sample size and its applications
  • Award number:
    19J10175
  • Fiscal year:
    2019
  • Award amount:
    $133,800
  • Project type:
    Grant-in-Aid for JSPS Fellows