Collaborative Research: Development of Classification Theory and Methods for Objective Asymmetry, Sample Size Limitation, Labeling Ambiguity, and Feature Importance

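Basic Information

  • Award number:
    2113500
  • Principal investigator:
  • Amount:
    $120,000
  • Host institution:
  • Host institution country:
    United States
  • Project type:
    Standard Grant
  • Fiscal year:
    2021
  • Funding country:
    United States
  • Project period:
    2021-07-01 to 2024-06-30
  • Project status:
    Completed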

Project Abstract

Classification is a popular data analytical technique in disciplines ranging from biomedical sciences to information technologies. This project will develop theory-backed statistical methods and algorithms to address pressing challenges in the application of classification. These challenges are related to imperfect aspects of training data, which are widespread in high-stakes applications such as disease diagnosis and cybersecurity. In particular, this project will focus on the so-called asymmetric classification problems, where a particular class is of greater importance than other classes, and the methods and algorithms will aim to control the classification error of missing the most important class in the population, not just in a particular dataset. This property will make the methods and algorithms powerful for medical diagnosis, for which the primary goal is diagnosis accuracy in the population. Moreover, this project will provide a suite of projects, ranging from theory to applications, that are suitable for training graduate and undergraduate students. The interdisciplinary nature of this project is expected to attract students from diverse backgrounds to join the PIs’ efforts.

The PIs will develop a suite of application-driven, theory-backed methods and algorithms to address pressing data challenges including sample size limitations, sampling biases, and ambiguous class labels. The development will be primarily under the Neyman-Pearson (NP) classification paradigm, which was designed to control the population-level false-negative rate (p-FNR) under a desired level while minimizing the population-level false-positive rate (p-FPR). This project will integrate the NP classification into cutting-edge statistical learning tasks and enable it to address the aforementioned real-world data challenges. Specifically, this project will include the following four overarching goals. First, the PIs will use random matrix theory to address a long-standing problem in the NP classification methodology: whether NP classifiers can be constructed without a sample-splitting step to improve data efficiency. Second, because the NP paradigm has an invariance property to sampling bias, the PIs will develop NP classifiers to address the sampling bias issue in biomedical applications. These classifiers can be trained on biased samples but still achieve the p-FNR control. Third, the PIs will develop a model-free feature ranking framework to incorporate multiple classification paradigms including the NP paradigm and to reflect prediction objectives. Fourth, the PIs will develop the first NP umbrella algorithm under the label noise setting and the first information-theoretic criteria that combine ambiguous classes in multi-class classification. To disseminate the project outcomes, the PIs will give research talks, organize conference sessions, share open-source software packages with tutorials, and reach out to practitioners of classification methods.

This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
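To make the p-FNR control described in the abstract concrete, the following is a minimal sketch of order-statistic thresholding on a held-out sample, which is the general idea behind sample-splitting NP classifiers. It is an illustration under stated assumptions, not the project's own method: the function names `np_order_index` and `np_threshold` are hypothetical, the parameters `alpha` (target p-FNR level) and `delta` (tolerated probability of exceeding that level) are introduced here for illustration, and the scores are assumed to be i.i.d. and continuous, with larger values indicating the important class.

```python
from math import comb


def np_order_index(n, alpha, delta):
    """Return the smallest k with P(Binomial(n, 1 - alpha) >= k) <= delta.

    Returns None when even k = n fails, i.e. the held-out sample is too small.
    (Hypothetical helper; alpha and delta are illustrative parameters.)
    """
    tail = 0.0
    best = None
    # Accumulate the binomial upper tail, starting from j = n and moving down.
    for k in range(n, 0, -1):
        tail += comb(n, k) * (1 - alpha) ** k * alpha ** (n - k)
        if tail > delta:
            break
        best = k
    return best


def np_threshold(important_scores, alpha, delta):
    """Pick a score threshold from held-out scores of the important class.

    A new observation is assigned to the important class when its score is
    at least the returned threshold, so a miss (false negative) happens when
    the score falls below it. Assumes i.i.d. continuous scores.
    """
    n = len(important_scores)
    k = np_order_index(n, alpha, delta)
    if k is None:
        raise ValueError("too few held-out scores for this (alpha, delta)")
    # The k-th largest held-out score serves as the threshold.
    return sorted(important_scores, reverse=True)[k - 1]


if __name__ == "__main__":
    # Illustrative scores only; real scores would come from a fitted model.
    held_out = [i / 100 for i in range(1, 101)]
    print(np_threshold(held_out, alpha=0.1, delta=0.05))
```

Under these assumptions, the probability that the population-level miss rate exceeds `alpha` is at most `delta`. The held-out scores must not have been used to fit the scoring model, which is the sample-splitting cost that the first goal above seeks to avoid.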

Project Outcomes

Journal articles (2)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Information-theoretic Classification Accuracy: A Criterion that Guides Data-driven Combination of Ambiguous Outcome Labels in Multi-class Classification
  • DOI:
  • Publication date:
    2021-09
  • Journal:
  • Impact factor:
    0
  • Authors:
    Chihao Zhang;Y. Chen;Shihua Zhang;Jingyi Jessica Li
  • Corresponding authors:
    Chihao Zhang;Y. Chen;Shihua Zhang;Jingyi Jessica Li

Other publications by Xin Tong

A Gradient-Based Adaptive Interpolation Filter for Multiple View Synthesis
  • DOI:
    10.1007/978-3-642-10467-1_49
  • Publication date:
    2009
  • Journal:
  • Impact factor:
    0
  • Authors:
    Ping Yang;Xin Tong;Xiaozhen Zheng;Jianhua Zheng;Yun He
  • Corresponding author:
    Yun He
Acupuncture for the Treatment of Hiccups following Stroke: A Systematic Review and Meta-Analysis
  • DOI:
    10.1136/acupmed-2015-011024
  • Publication date:
    2017
  • Journal:
  • Impact factor:
    2.5
  • Authors:
    J. Yue;Ming Liu;Jun Li;Yuming Wang;E. Hung;Xin Tong;Zhong;Qin;B. Golianu
  • Corresponding author:
    B. Golianu
Pd-doped La0.6Sr0.4Co0.2Fe0.8O3−δ perovskite oxides as cathodes for intermediate temperature solid oxide fuel cells
  • DOI:
    10.1016/j.ssi.2018.01.044
  • Publication date:
    2018-06
  • Journal:
  • Impact factor:
    3.2
  • Authors:
    Feng Zhou;Lihong Zhou;Mingyao Hu;Xin Tong;Yihui Liu;Haizhao Li;Shengbing Yang;Mingrui Wei
  • Corresponding author:
    Mingrui Wei
Detail-Preserving Controllable Deformation from Sparse Examples
Topologization and Functional Analytification II: $\infty$-Categorical Motivic Constructions for Homotopical Contexts
  • DOI:
  • Publication date:
    2021
  • Journal:
  • Impact factor:
    0
  • Authors:
    Xin Tong
  • Corresponding author:
    Xin Tong

Other grants held by Xin Tong

Collaborative Research: Transfer Learning for Large-Scale Inference: General Framework and Data-Driven Algorithms
  • Award number:
    2015339
  • Fiscal year:
    2020
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant
Robust and Interpretable Bayesian Quantile Longitudinal Analysis in Social and Behavioral Sciences
  • Award number:
    1951038
  • Fiscal year:
    2020
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant
Development of a general classification framework under the Neyman-Pearson Paradigm, with biomedical and social applications
  • Award number:
    1613338
  • Fiscal year:
    2016
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant

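Similar grants from the National Natural Science Foundation of China

Research on Quantum Field Theory without a Lagrangian Description
  • Grant number:
    24ZR1403900
  • Year approved:
    2024
  • Funding amount:
    CNY 0
  • Project type:
    Provincial or municipal project
Cell Research
  • Grant number:
    31224802
  • Year approved:
    2012
  • Funding amount:
    CNY 240,000
  • Project type:
    Special fund project
Cell Research
  • Grant number:
    31024804
  • Year approved:
    2010
  • Funding amount:
    CNY 240,000
  • Project type:
    Special fund project
Cell Research
  • Grant number:
    30824808
  • Year approved:
    2008
  • Funding amount:
    CNY 240,000
  • Project type:
    Special fund project
Research on the Rapid Growth Mechanism of KDP Crystal
  • Grant number:
    10774081
  • Year approved:
    2007
  • Funding amount:
    CNY 450,000
  • Project type:
    General Program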

Similar overseas grants

Collaborative Research: RESEARCH-PGR: Development of epigenetic editing for crop improvement
  • Award number:
    2331437
  • Fiscal year:
    2024
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant
Collaborative Research: Broadening Instructional Innovation in the Chemistry Laboratory through Excellence in Curriculum Development
  • Award number:
    2337028
  • Fiscal year:
    2024
  • Funding amount:
    $120,000
  • Project type:
    Continuing Grant
Collaborative Research: CAS: Exploration and Development of High Performance Thiazolothiazole Photocatalysts for Innovating Light-Driven Organic Transformations
  • Award number:
    2400166
  • Fiscal year:
    2024
  • Funding amount:
    $120,000
  • Project type:
    Continuing Grant
Collaborative Research: Broadening Instructional Innovation in the Chemistry Laboratory through Excellence in Curriculum Development
  • Award number:
    2337027
  • Fiscal year:
    2024
  • Funding amount:
    $120,000
  • Project type:
    Continuing Grant
Collaborative Research: RESEARCH-PGR: Development of epigenetic editing for crop improvement
  • Award number:
    2331438
  • Fiscal year:
    2024
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant
Collaborative Research: A Multi-Lab Investigation of the Conceptual Foundations of Early Number Development
  • Award number:
    2405548
  • Fiscal year:
    2024
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant
Collaborative Research: CAS: Exploration and Development of High Performance Thiazolothiazole Photocatalysts for Innovating Light-Driven Organic Transformations
  • Award number:
    2400165
  • Fiscal year:
    2024
  • Funding amount:
    $120,000
  • Project type:
    Continuing Grant
Collaborative Research: HNDS-I. Mobility Data for Communities (MD4C): Uncovering Segregation, Climate Resilience, and Economic Development from Cell-Phone Records
  • Award number:
    2420945
  • Fiscal year:
    2024
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant
SBP: Collaborative Research: Improving Engagement with Professional Development Programs by Attending to Teachers' Psychosocial Experiences
  • Award number:
    2314254
  • Fiscal year:
    2023
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant
Collaborative Research: Frameworks: FZ: A fine-tunable cyberinfrastructure framework to streamline specialized lossy compression development
  • Award number:
    2311878
  • Fiscal year:
    2023
  • Funding amount:
    $120,000
  • Project type:
    Standard Grant