Model selection and efficient learning for high dimensional clustered data
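Basic information
- Award number: 0906660
- Principal investigator:
- Amount: $210,100
- Host institution:
- Host institution country: United States
- Project type: Standard Grant
- Fiscal year: 2009
- Funding country: United States
- Project period: 2009-09-01 to 2013-08-31
- Project status: Completed
- Source:
- Keywords:
Project abstract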
This research project aims to develop statistical theory and practical methodology for complex high-dimensional clustered data in which the number of variables exceeds the sample size. This problem is especially relevant to microarray data, which involve thousands of genes. The research will focus on how to extract information efficiently and accurately from large volumes of often noisy high-dimensional data, so as to identify and select significant variables of scientific interest. The PI and her collaborators will develop estimation procedures, statistical inference functions, and model selection and classification procedures that incorporate correlation into the models. The specific goals of this research plan are: (1) to propose flexible estimation procedures for the link function and the marginal variance function in generalized linear models when their forms are unknown; (2) to develop semiparametric classification for time-course gene expression data; (3) to propose model selection criteria for choosing informative correlation structures; (4) to develop efficient and consistent model selection procedures for generalized additive models where the likelihood is unspecified; (5) to develop a sufficient dimension reduction method for correlated data that retains the full regression information without imposing parametric models. The research project will help to tackle fundamental questions in statistical science and will stimulate interest from a large group of scientists in the fields of longitudinal and clustered data analysis. It will also enhance the development of, and make connections between, theory and method in statistics, biostatistics and computer science.
This research will have significant impact and many applications in biomedical studies, genome research, econometrics, environmental studies, oceanography, social science and public health where correlated data often arise. The PI will integrate the proposed research areas substantially into educational activities through the development of new university courses, and through presenting short courses at major statistical meetings. The research will advance undergraduate and graduate students' learning and training for handling high-dimensional correlated data.
Project outputs
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other publications by Annie Qu
At-harvest prediction of grey mould risk in pear fruit in long-term cold storage
- DOI: 10.1016/j.cropro.2009.01.001
- Published: 2009-05-01
- Journal:
- Impact factor:
- Authors: Robert A. Spotts; Maryna Serdani; Kelly M. Wallis; Monika Walter; Trish Harris-Virgin; Kim Spotts; David Sugar; Chang Lin Xiao; Annie Qu
- Corresponding author: Annie Qu
Dynamic Tensor Recommender Systems
- DOI:
- Published: 2021
- Journal:
- Impact factor: 0
- Authors: Yanqing Zhang; Xuan Bi; Niansheng Tang; Annie Qu
- Corresponding author: Annie Qu
Dynamic Tensor Recommender System
- DOI: 10.11159/icsta19.09
- Published: 2019-08
- Journal:
- Impact factor: 6
- Authors: Yanqing Zhang; Xuan Bi; Niansheng Tang; Annie Qu
- Corresponding author: Annie Qu
Imputed Factor Regression for High-dimensional Block-wise Missing Data
- DOI: 10.5705/ss.202018.0008
- Published: 2020
- Journal:
- Impact factor: 1.4
- Authors: Yanqing Zhang; Niansheng Tang; Annie Qu
- Corresponding author: Annie Qu
Discussion of Fan et al.’s paper “Gaining efficiency via weighted estimators for multivariate failure time data”
- DOI: 10.1007/s11425-009-0135-2
- Published: 2009-06-01
- Journal:
- Impact factor: 1.500
- Authors: Annie Qu; Lan Xue
- Corresponding author: Lan Xue
Other grants held by Annie Qu
Collaborative Research: Integrative Heterogeneous Learning for Intensive Complex Longitudinal Data
- Award number: 2210640
- Fiscal year: 2022
- Funding amount: $210,100
- Project type: Standard Grant
Collaborative Research: New Statistical Learning for Complex Heterogeneous Data
- Award number: 2019461
- Fiscal year: 2020
- Funding amount: $210,100
- Project type: Standard Grant
FRG: Collaborative Research: Generative Learning on Unstructured Data with Applications to Natural Language Processing and Hyperlink Prediction
- Award number: 1952406
- Fiscal year: 2020
- Funding amount: $210,100
- Project type: Standard Grant
Conference on Statistical Learning and Data Science
- Award number: 1818546
- Fiscal year: 2018
- Funding amount: $210,100
- Project type: Standard Grant
Collaborative Research: New Statistical Learning for Complex Heterogeneous Data
- Award number: 1821198
- Fiscal year: 2018
- Funding amount: $210,100
- Project type: Standard Grant
Collaborative Research: New Statistical Learning and Scalable Computing for Large Unstructured Data
- Award number: 1415308
- Fiscal year: 2014
- Funding amount: $210,100
- Project type: Standard Grant
Personalized classification, moment selection, and time-varying networks for large-scale longitudinal data
- Award number: 1308227
- Fiscal year: 2013
- Funding amount: $210,100
- Project type: Standard Grant
CAREER: Semiparametric and Non-Parametric Models for Correlated Data
- Award number: 0902232
- Fiscal year: 2008
- Funding amount: $210,100
- Project type: Continuing Grant
CAREER: Semiparametric and Non-Parametric Models for Correlated Data
- Award number: 0348764
- Fiscal year: 2004
- Funding amount: $210,100
- Project type: Continuing Grant
Semiparametric Models for Correlated Data: The Quadratic Inference Function Approach
- Award number: 0103513
- Fiscal year: 2001
- Funding amount: $210,100
- Project type: Standard Grant
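Similar grants from the National Natural Science Foundation of China (NSFC)
Intelligent Patent Analysis for Optimized Technology Stack Selection: Blockchain Business Registry Case Demonstration
- Award number:
- Approval year: 2024
- Funding amount:
- Project type: Research Fund for International Scientists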
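A study of microRNA evolution based on the properties of microRNA precursors
- Award number: 31100951
- Approval year: 2011
- Funding amount: CNY 200,000
- Project type: Young Scientists Fund
Optimal security design and path selection for improving China's capital market
- Award number: 70873012
- Approval year: 2008
- Funding amount: CNY 270,000
- Project type: General Program
A study of the effectiveness of shrinkage estimation as a model selection method
- Award number: 10771006
- Approval year: 2007
- Funding amount: CNY 210,000
- Project type: General Program
Application of Linkage Group Selection to the study of phenotype-related genes in Eimeria tenella
- Award number: 30700601
- Approval year: 2007
- Funding amount: CNY 170,000
- Project type: Young Scientists Fund
Fine mapping of gene "A" controlling flower sex type in thick-skinned melon, and marker-assisted breeding
- Award number: 30471113
- Approval year: 2004
- Funding amount: CNY 210,000
- Project type: General Program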
Similar overseas grants
CRII: CIF: Sequential Decision-Making Algorithms for Efficient Subset Selection in Multi-Armed Bandits and Optimization of Black-Box Functions
- Award number: 2246187
- Fiscal year: 2023
- Funding amount: $210,100
- Project type: Standard Grant
Rapid Acute Leukemia Genomic Profiling with CRISPR enrichment and Real-time long-read sequencing
- Award number: 10651543
- Fiscal year: 2023
- Funding amount: $210,100
- Project type:
Identifying the longitudinal outcomes of suicide loss in a population-based cohort
- Award number: 10716673
- Fiscal year: 2023
- Funding amount: $210,100
- Project type:
Development and validation of efficient cognitive composite scores of digital tools for the detection of early pathophysiological changes in Alzheimer's disease
- Award number: 10642370
- Fiscal year: 2023
- Funding amount: $210,100
- Project type:
Development and implementation of a small-scale and highly efficient genomic selection method using "look-ahead" based on reinforcement learning
- Award number: 22H02306
- Fiscal year: 2022
- Funding amount: $210,100
- Project type: Grant-in-Aid for Scientific Research (B)
Synthesizability-constrained expansion and multi-objective evolution of antitubercular compounds
- Award number: 10430402
- Fiscal year: 2022
- Funding amount: $210,100
- Project type:
Identifying Optimal Antibiotic Regimens to Treat Urinary Tract Infections During Pregnancy
- Award number: 10522361
- Fiscal year: 2022
- Funding amount: $210,100
- Project type:
Novel strategies for efficient selection of lines with synchronised development by using properties of the circadian clock
- Award number: BB/V006665/1
- Fiscal year: 2022
- Funding amount: $210,100
- Project type: Research Grant
Model Selection and Efficient Estimation in Semiparametric Regression Models with Complex and High-Dimensional Data
- Award number: RGPIN-2018-06466
- Fiscal year: 2022
- Funding amount: $210,100
- Project type: Discovery Grants Program - Individual
Synthesizability-constrained expansion and multi-objective evolution of antitubercular compounds
- Award number: 10594577
- Fiscal year: 2022
- Funding amount: $210,100
- Project type: