Knockoff Feature Selection Techniques for Robust Inference in Supervised and Unsupervised Learning
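Basic information
- Award number: 2310955
- Principal investigator:
- Amount: $200,000
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2023
- Funding country: United States
- Project period: 2023-08-01 to 2026-07-31
- Project status: Ongoing
- Source:
- Keywords:
Project abstract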
This project aims to develop a new methodology for selecting key features among a large pool of potential variables that are predictive of the final outcomes. When applied to the biomedical field, these methods will enable the discovery of determinants of patient health, thus improving the prevention, treatment, and management of diseases. When used in fields such as engineering, psychology, sociology, economics, and environmental sciences, these methods can improve manufacturing processes, social programs that focus on diversity and equity, the care and management of mental health, and the preservation of the environment and natural resources. Additionally, the new methods will also help to generate high-quality synthetic data while maintaining the confidentiality of the original information, thereby spurring new scientific discoveries and providing a valuable educational tool. The project will offer a number of unique interdisciplinary training initiatives for the future cohorts of data scientists at the interface of statistics, machine learning, and biomedical sciences.
The research agenda is based on the 'knockoff method' for identifying key features predictive of the outcomes while maintaining false discovery control. The methods incorporate the microbiome phylogenetic structure in feature selection, accommodate missing values, incorporate multiple knockoffs to increase robustness, employ nonparametric Bayesian models for complex data structures, and introduce a new knockoff statistic based on conditional prediction function. The proposed statistics can be paired with state-of-the-art machine learning models to detect nonlinear relationships while accounting for feature correlation. Furthermore, by applying knockoff filtering with unsupervised learning models, this research can identify determinants of the feature space and provide insights into unsupervised clustering and learning.
This award reflects NSF's statutory mission and has been deemed worthy of support through evaluation using the Foundation's intellectual merit and broader impacts review criteria.
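For readers unfamiliar with the knockoff method named in the abstract, the selection step of the standard knockoff filter can be sketched as follows. This is a minimal illustration of the generic "knockoff+" thresholding rule, not the project's proposed statistic or code. It assumes the feature statistics W (one per feature, where a large positive value means the real feature looks more important than its knockoff copy) have already been computed by some model. The function names `knockoff_threshold` and `knockoff_select` and the example values are hypothetical.

```python
import numpy as np


def knockoff_threshold(W, q=0.1, offset=1):
    """Return the smallest t > 0 whose estimated false discovery
    proportion (offset + #{W_j <= -t}) / max(1, #{W_j >= t}) is at most q.
    offset=1 gives the conservative "knockoff+" rule. Returns inf if no
    threshold qualifies."""
    W = np.asarray(W, dtype=float)
    # Candidate thresholds are the distinct nonzero magnitudes of W.
    candidates = np.unique(np.abs(W[W != 0]))
    for t in candidates:  # np.unique returns values in ascending order
        fdp_hat = (offset + np.sum(W <= -t)) / max(1, np.sum(W >= t))
        if fdp_hat <= q:
            return float(t)
    return float("inf")


def knockoff_select(W, q=0.1, offset=1):
    """Indices of features whose statistic reaches the threshold."""
    W = np.asarray(W, dtype=float)
    t = knockoff_threshold(W, q=q, offset=offset)
    return np.flatnonzero(W >= t)


# Hypothetical statistics for ten features (illustrative values only).
W_example = [6.0, 5.0, 4.0, 3.0, 2.5, 2.0, -1.0, 0.5, -0.3, 0.1]
selected = knockoff_select(W_example, q=0.2)
```

Negative statistics act as a stand-in count of false positives, which is why the rule compares the number of large negative values with the number of large positive ones. With a stricter target such as q=0.1, this small example selects nothing, because at least ten positive statistics would be needed to satisfy the bound.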
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Yushu Shi
MRI radiomics model for predicting tumor immune microenvironment types and efficacy of anti-PD-1/PD-L1 therapy in hepatocellular carcinoma
- DOI: 10.1186/s12880-025-01751-9
- Publication date: 2025-07-01
- Journal:
- Impact factor: 3.200
- Authors: Rui Zhang; Wei Peng; Yao Wang; Yunping Jiang; Junli Wang; Siying Zhang; Zhi Li; Yushu Shi; Feng Chen; Zhan Feng; Wenbo Xiao
- Corresponding author: Wenbo Xiao
Combined Beta Metric for Unsupervised Clustering of Microbiome Data
- DOI:
- Publication date: 2020
- Journal:
- Impact factor: 0
- Authors: Yushu Shi; Liangliang Zhang; Christine B. Peterson; K. Do; R. Jenq
- Corresponding author: R. Jenq
Clinical Observation of 242 Cases of Polycystic Ovary Syndrome
- DOI:
- Publication date: 2018
- Journal:
- Impact factor: 0
- Authors: Yushu Shi; Cunjian Yi
- Corresponding author: Cunjian Yi
A dependent Dirichlet process model for survival data with competing risks
- DOI: 10.1007/s10985-020-09506-0
- Publication date: 2020
- Journal:
- Impact factor: 1.3
- Authors: Yushu Shi; Purushottam W. Laud; J. Neuner
- Corresponding author: J. Neuner
Photoelectrochemical determination of Hg(II) via dual signal amplification involving SPR enhancement and a folding-based DNA probe
- DOI: 10.1007/s00604-017-2141-3
- Publication date: 2017-02
- Journal:
- Impact factor: 0
- Authors: Yushu Shi; Guoqing Zhang; Jiaojiao Li; Yong Zhang; Yanbo Yu; Qin Wei
- Corresponding author: Qin Wei
Similar overseas grants
Development of Integrated Quantum Inspired Algorithms for Shapley Value based Fast and Interpretable Feature Subset Selection
- Award number: 24K15089
- Fiscal year: 2024
- Funding amount: $200,000
- Project type: Grant-in-Aid for Scientific Research (C)
Feature selection in several challenging directions
- Award number: 2310668
- Fiscal year: 2023
- Funding amount: $200,000
- Project type: Standard Grant
III: Small: Deep Interactive Reinforcement Learning for Self-optimizing Feature Selection
- Award number: 2152030
- Fiscal year: 2022
- Funding amount: $200,000
- Project type: Standard Grant
Modelling and Feature Selection with Applications to Big Data Problems
- Award number: RGPIN-2019-05963
- Fiscal year: 2022
- Funding amount: $200,000
- Project type: Discovery Grants Program - Individual
Modelling and Feature Selection with Applications to Big Data Problems
- Award number: RGPIN-2019-05963
- Fiscal year: 2021
- Funding amount: $200,000
- Project type: Discovery Grants Program - Individual
Comparison of Feature Selection Methods and Machine Learning Classifiers with Computed Tomography Radiomics-based Features for Predicting Chronic Obstructive Pulmonary Disease
- Award number: 466971
- Fiscal year: 2021
- Funding amount: $200,000
- Project type: Studentship Programs
Matrix Completion with Non-uniform Missing Patterns, a New Measure of Conditional Dependence, and Applications to Feature Selection
- Award number: 2113242
- Fiscal year: 2021
- Funding amount: $200,000
- Project type: Standard Grant
Fast flexible feature selection for high dimensional challenging data
- Award number: DP210100521
- Fiscal year: 2021
- Funding amount: $200,000
- Project type: Discovery Projects
Modelling and Feature Selection with Applications to Big Data Problems
- Award number: RGPIN-2019-05963
- Fiscal year: 2020
- Funding amount: $200,000
- Project type: Discovery Grants Program - Individual
High-dimension, low-sample-size asymptotic theory for nonlinear feature selection
- Award number: 20K22305
- Fiscal year: 2020
- Funding amount: $200,000
- Project type: Grant-in-Aid for Research Activity Start-up


















