Nonparametric classification, tuning parameter selection, and asymptotic stability for high-dimensional data
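Basic information
- Award number: 1308566
- Principal investigator:
- Amount: $130,000
- Host institution:
- Host institution country: United States
- Award type: Continuing Grant
- Fiscal year: 2013
- Funding country: United States
- Period: 2013-07-01 to 2016-06-30
- Status: Completed
- Source:
- Keywords:
Project abstract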
Technological innovations have been a primary force in the advancement of scientific research and in social progress. High-throughput data of unprecedented size and complexity are frequently seen in diverse fields of science and humanity, ranging from computational biology and health studies to financial engineering and risk management. Such high-dimensional data have given rise to many important problems in contemporary statistics in which feature selection plays a pivotal role. The proposed project has the following three interrelated objectives on the theme of high-dimensional data, with applications in classification and variable selection. (1) To introduce a nonparametric classification framework for high-dimensional data. The target of this research is to integrate a nonparametric component into the classical parametric methods for classification (e.g., penalized logistic regression, linear discriminant analysis) under high-dimensional settings without incurring much computational burden. Asymptotic properties are investigated regarding the excess risk. (2) To investigate the asymptotic properties of cross-validation for tuning parameter selection in high-dimensional variable selection. The goal here is to perform a systematic study of the asymptotic behavior of major cross-validation methods for choosing the tuning parameter when various penalty functions (LASSO, SCAD, MCP, etc.) are used. By delineating the properties of classical cross-validation, a new modified cross-validation method that achieves model selection consistency is developed for choosing the optimal tuning parameter in the solution path. (3) To introduce the notion of asymptotic stability for maximum penalized likelihood estimators. Despite the extensive literature on maximum penalized likelihood estimators in high-dimensional settings, research on the stability of the estimators has been very limited. The investigators aim to introduce the notion of asymptotic stability for a general class of maximum penalized likelihood estimators, study the behavior of these estimators, and evaluate their performance when different penalty functions are applied.

The analysis of "big data," now pervasive across many scientific disciplines, poses challenges as well as opportunities for the field of statistics. A major goal of this proposal is to make methodological and theoretical contributions to the important and challenging topic of high-dimensional classification and variable selection. The proposed research will have broad impacts on many disciplines of science, including health/life sciences, economics, finance, astronomy, and sociology, among others. In these fields, variable selection, feature extraction, and sparsity exploration are crucial for knowledge discovery. The investigators have been interacting with researchers at the New York State Psychiatric Institute at the Columbia University Medical Center, the Computational Biology Center of the Memorial Sloan-Kettering Cancer Center, and the Center for Computational Learning Systems at Columbia University. The results of the proposed investigations will be used for understanding mental health issues, for identifying risk factors for cancer, and for predicting failures in complex engineering systems. On the educational side, the proposed work will be incorporated into new courses on state-of-the-art high-dimensional statistical learning. It will also be integrated into the training of undergraduate and graduate students, especially those from under-represented groups, through Ph.D. dissertations and undergraduate research projects.
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Yang Feng
New remote sensing image fusion for exploring spatiotemporal evolution of urban land use and land cover
- DOI: 10.1117/1.jrs.16.034527
- Published: 2022-07
- Journal:
- Impact factor: 1.7
- Authors: Liu Linfeng; Zhang Chengcai; Luo Weiran; Chen Shaodan; Yang Feng; Liu Jisheng
- Corresponding author: Liu Jisheng
The effect of personal and microclimatic variables on outdoor thermal comfort: A field study in cold season in Lujiazui CBD, Shanghai
- DOI: 10.1016/j.scs.2018.02.025
- Published: 2018
- Journal:
- Impact factor: 11.7
- Authors: Yao JiaWei; Yang Feng; Zhuang Zhi; Shao YuHan; Yuan Feng
- Corresponding author: Yuan Feng
A Novel Encrypted Computing-in-Memory (eCIM) by Implementing Random Telegraph Noise (RTN) as Keys Based on 55 nm NOR Flash Technology
- DOI: 10.1109/led.2022.3190267
- Published: 2022-09
- Journal:
- Impact factor: 4.9
- Authors: Yang Feng; Jixuan Wu; Xuepeng Zhan; Jing Liu; Zhaohui Sun; Junyu Zhang; Masaharu Kobayashi; Jiezhi Chen
- Corresponding author: Jiezhi Chen
Bad Seed or Good Seed? A Content Analysis of the Main Antagonists in Walt Disney- and Studio Ghibli-Animated Films
- DOI: 10.1080/17482798.2015.1058279
- Published: 2015
- Journal:
- Impact factor: 3
- Authors: Yang Feng; Jiwoo Park
- Corresponding author: Jiwoo Park
Detection of antimicrobial resistance and virulence-related genes in Streptococcus uberis and Streptococcus parauberis isolated from clinical bovine mastitis cases in northwestern China
- DOI: 10.1016/s2095-3119(20)63185-9
- Published: 2020-01
- Journal:
- Impact factor: 4.8
- Authors: Zhang Hang; Yang Feng; Li Xin-pu; Luo Jin-yin; Wang Ling; Zhou Yu-long; Yan Yong; Wang Xu-rong; Li Hong-sheng
- Corresponding author: Li Hong-sheng
Other grants by Yang Feng
Collaborative Research: New Theory and Methods for High-Dimensional Multi-Task and Transfer Learning Inference
- Award number: 2324489
- Fiscal year: 2023
- Amount: $130,000
- Award type: Continuing Grant
CAREER: Statistical inference of network and relational data
- Award number: 2013789
- Fiscal year: 2019
- Amount: $130,000
- Award type: Continuing Grant
CAREER: Statistical inference of network and relational data
- Award number: 1554804
- Fiscal year: 2016
- Amount: $130,000
- Award type: Continuing Grant
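Similar grants from the National Natural Science Foundation of China
A revision of moss systematics based on spore dispersal types
- Award number: 30970188
- Year approved: 2009
- Amount: CNY 260,000
- Award type: General Program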
Similar overseas grants
Experience-dependent tuning of socially selective neural circuits
- Award number: 10723708
- Fiscal year: 2023
- Amount: $130,000
- Award type:
Tuning Up Memory-related Brain Potentials using Real-time Neurofeedback in Older Veterans
- Award number: 10223461
- Fiscal year: 2020
- Amount: $130,000
- Award type:
Tuning Up Memory-related Brain Potentials using Real-time Neurofeedback in Older Veterans
- Award number: 10699951
- Fiscal year: 2020
- Amount: $130,000
- Award type:
Tuning Up Memory-related Brain Potentials using Real-time Neurofeedback in Older Veterans
- Award number: 10015496
- Fiscal year: 2020
- Amount: $130,000
- Award type:
Tuning Up Memory-related Brain Potentials using Real-time Neurofeedback in Older Veterans
- Award number: 10847383
- Fiscal year: 2020
- Amount: $130,000
- Award type:
Moving With and Tuning In: A participatory mixed methods study to foster social inclusion of individuals with dementia and their carers
- Award number: 359069
- Fiscal year: 2016
- Amount: $130,000
- Award type: Operating Grants
Classification: methodology for variable selection and efficient tuning and comparison of models
- Award number: 36462-2008
- Fiscal year: 2012
- Amount: $130,000
- Award type: Discovery Grants Program - Individual
Classification: methodology for variable selection and efficient tuning and comparison of models
- Award number: 36462-2008
- Fiscal year: 2011
- Amount: $130,000
- Award type: Discovery Grants Program - Individual
Classification: methodology for variable selection and efficient tuning and comparison of models
- Award number: 36462-2008
- Fiscal year: 2010
- Amount: $130,000
- Award type: Discovery Grants Program - Individual
Classification: methodology for variable selection and efficient tuning and comparison of models
- Award number: 36462-2008
- Fiscal year: 2009
- Amount: $130,000
- Award type: Discovery Grants Program - Individual