Theoretical guarantees on functional Lloyd's algorithm and functional k-means clustering
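Basic information
- Grant number: EP/W003716/1
- Principal investigator:
- Amount: $38,500
- Host institution:
- Host institution country: United Kingdom
- Project category: Research Grant
- Fiscal year: 2022
- Funding country: United Kingdom
- Start and end dates: 2022 to (no data)
- Project status: Completed
- Source:
- Keywords:
Project abstract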
Functional data analysis is the area of statistics that analyses data in the form of functions, images, shapes or even more general random objects. Thanks to advances in modern technology, more and more such data are being routinely collected; and thanks to rapid improvements in computational machine learning, these data are being analysed and are influencing many aspects of our lives. For instance, when you unlock a smartphone using cameras that capture your facial characteristics or sensors that read your fingerprints, the phone is collecting images, detecting signals and comparing them with pre-stored information. As another example, one way to understand the subtypes of attention deficit hyperactivity disorder (ADHD) is to study the shape of the corpus callosum, which often serves as a guide for diagnosis. Despite the vitality and importance of analysing functional data, handy statistical methods often lack theoretical guarantees when they are applied to functional data. Without such guarantees, interpretations of the analysis results can be misleading and even dangerous.

The state-of-the-art theoretical developments in functional data analysis, especially for functional clustering methods, suffer from the following issues.

1. The majority of the existing literature relies on functional principal component analysis, which maps the infinite-dimensional covariance operator to a low-dimensional space, so that analysis in the infinite-dimensional function space is transferred to a manageable space. However, the success of this transformation relies on the assumption that there is an upper bound on the number of non-zero eigenvalues of the covariance operator. This is a strong condition, since it excludes many standard function spaces, e.g. Sobolev spaces.
2. The majority of the existing theoretical results are asymptotic, in the sense that they state the asymptotic performance of statistical procedures without detailing how fast these procedures reach a desirable rate, or how large the sample size needs to be in order to reach a given accuracy level. The lack of fixed-sample results also hinders the analysis of high-dimensional data.

In this research proposal, I will start with a specific problem: providing theoretical guarantees for the convergence of the functional Lloyd's algorithm, which is the default algorithm for k-means clustering. With this in hand, I will then provide fixed-sample error controls for functional k-means clustering methods. The success of these two steps will shed light on providing theoretical guarantees for clustering more general objects in manifold learning, which will be the starting point of a further programme. The agenda may seem standard, because k-means clustering is a standard and handy method in many application areas. But in function spaces there is no theoretical guarantee of convergence and no theoretical understanding of when the algorithm should converge, let alone of how good the final clustering estimators are, how many iterations are needed, how many samples are needed, or which function sampling scheme is best. This work will provide answers to these questions.
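For readers unfamiliar with the algorithm named above, the following is a minimal illustrative sketch of Lloyd's iterations applied to curves. It is not taken from the project: it assumes that every curve is observed without missing values on a common uniform grid, that distances are squared L2 distances approximated by the trapezoidal rule, and that the initial centres are simply two of the observed curves. The names `l2_sq` and `functional_lloyd` and the simulated sine and cosine data are hypothetical.

```python
import math
import random


def l2_sq(f, g, dt):
    """Squared L2 distance between two curves on a common uniform grid
    (trapezoidal rule; an assumed discretisation, not the project's)."""
    d = [(a - b) ** 2 for a, b in zip(f, g)]
    return dt * (sum(d) - 0.5 * (d[0] + d[-1]))


def functional_lloyd(curves, init_centres, dt, max_iter=100):
    """Alternate assignment and centre-update steps until labels stop changing."""
    centres = [list(c) for c in init_centres]  # copy so the inputs are not mutated
    labels = None
    for _ in range(max_iter):
        # Assignment step: each curve goes to its nearest centre in L2.
        new_labels = [
            min(range(len(centres)), key=lambda j: l2_sq(c, centres[j], dt))
            for c in curves
        ]
        if new_labels == labels:
            break
        labels = new_labels
        # Update step: each centre becomes the pointwise mean of its members.
        for j in range(len(centres)):
            members = [c for c, lab in zip(curves, labels) if lab == j]
            if members:  # an empty cluster keeps its previous centre
                centres[j] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centres


# Hypothetical simulated data: noisy sine and cosine curves on [0, 1].
random.seed(0)
m = 50
grid = [i / (m - 1) for i in range(m)]
dt = 1.0 / (m - 1)
curves = [[math.sin(2 * math.pi * t) + random.gauss(0, 0.1) for t in grid]
          for _ in range(10)]
curves += [[math.cos(2 * math.pi * t) + random.gauss(0, 0.1) for t in grid]
           for _ in range(10)]

labels, centres = functional_lloyd(curves, [curves[0], curves[10]], dt)
print(labels)
```

With well-separated groups and one initial centre drawn from each, the iterations stabilise after a couple of passes. The questions raised in the abstract concern what can be guaranteed when the initialisation, sample size and sampling scheme are less favourable than in this toy setting.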
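Project outputs
Number of journal papers (3)
Number of monographs (0)
Number of research awards (0)
Number of conference papers (0)
Number of patents (0)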
Functional Linear Regression with Mixed Predictors
- DOI:
- Published: 2020-12
- Journal:
- Impact factor: 0
- Authors: Daren Wang; Zifeng Zhao; Yi Yu; R. Willett
- Corresponding authors: Daren Wang; Zifeng Zhao; Yi Yu; R. Willett
Change-point Detection for Sparse and Dense Functional Data in General Dimensions
- DOI:
- Published: 2022-05
- Journal:
- Impact factor: 0
- Authors: Carlos Misael Madrid Padilla; Daren Wang; Zifeng Zhao; Yi Yu
- Corresponding authors: Carlos Misael Madrid Padilla; Daren Wang; Zifeng Zhao; Yi Yu
Other publications by Yi Yu
Simple and fast synthesis of polyaniline nanofibers/carbon paper composites as supercapacitor electrodes
- DOI: 10.1016/j.est.2016.05.011
- Published: 2016-08
- Journal:
- Impact factor: 9.4
- Authors: Guangjin Wang; Yue Zhang; Fen Zhou; Zixu Sun; Fei Huang; Yi Yu; Lei Chen; Mu Pan
- Corresponding author: Mu Pan
Semantic Deep Hiding for Robust Unlearnable Examples
- DOI: 10.1109/tifs.2024.3421273
- Published: 2024
- Journal:
- Impact factor: 0
- Authors: Ruohan Meng; Chenyu Yi; Yi Yu; Siyuan Yang; Bingquan Shen; Alex C. Kot
- Corresponding author: Alex C. Kot
Study on Vibration Characteristics and Human Riding Comfort of a Special Equipment Cab
- DOI:
- Published: 2018
- Journal:
- Impact factor: 0
- Authors: Wu Ren; Bo Peng; Jiefen Shen; Yang Li; Yi Yu
- Corresponding author: Yi Yu
Thermal analysis and 1.38 μm CW laser performances based on a new tungstate crystal Nd3+:Na2La4(WO4)7
- DOI: 10.1016/j.jlumin.2019.116928
- Published: 2020-04
- Journal:
- Impact factor: 3.6
- Authors: Yi Yu; Xiurong Zhu; Xianke Zhang; Jvjun Yuan; Huajun Yu; Na Xu; Yu Dong; Jiegang Duan; Yeqing Wang
- Corresponding author: Yeqing Wang
The ocean-atmosphere interaction over a summer upwelling system in the South China Sea
- DOI: 10.1016/j.jmarsys.2020.103360
- Published: 2020-04
- Journal:
- Impact factor: 2.8
- Authors: Yi Yu; Yuntao Wang; Lu Cao; Rui Tang; Fei Chai
- Corresponding author: Fei Chai
Other grants held by Yi Yu
DMS-EPSRC: Change Point Detection and Localization in High-Dimensions: Theory and Methods
- Grant number: EP/V013432/1
- Fiscal year: 2021
- Funding amount: $38,500
- Project category: Research Grant
Similar overseas grants
CAREER: Theoretical and Computational Advances for Enabling Robust Numerical Guarantees in Linear and Mixed Integer Programming Solvers
- Grant number: 2340527
- Fiscal year: 2024
- Funding amount: $38,500
- Project category: Continuing Grant
CAREER: Formal Guarantees for Neurosymbolic Programs via Conformal Prediction
- Grant number: 2338777
- Fiscal year: 2024
- Funding amount: $38,500
- Project category: Continuing Grant
CAREER: Dual Reinforcement Learning: A Unifying Framework with Guarantees
- Grant number: 2340651
- Fiscal year: 2024
- Funding amount: $38,500
- Project category: Continuing Grant
Collaborative Research: FMitF: Track I: DeepSmith: Scheduling with Quality Guarantees for Efficient DNN Model Execution
- Grant number: 2349461
- Fiscal year: 2023
- Funding amount: $38,500
- Project category: Standard Grant
Theoretical Guarantees of Machine Learning Methods for High Dimensional Partial Differential Equations: Numerical Analysis and Uncertainty Quantification
- Grant number: 2343135
- Fiscal year: 2023
- Funding amount: $38,500
- Project category: Standard Grant
Collaborative Research: Coordinating Offline Resource Allocation Decisions and Real-Time Operational Policies in Online Retail with Performance Guarantees
- Grant number: 2226901
- Fiscal year: 2023
- Funding amount: $38,500
- Project category: Standard Grant
Collaborative Research: Coordinating Offline Resource Allocation Decisions and Real-Time Operational Policies in Online Retail with Performance Guarantees
- Grant number: 2226900
- Fiscal year: 2023
- Funding amount: $38,500
- Project category: Standard Grant
CAREER: Towards Tight Guarantees of Markov Chain Sampling Algorithms in High Dimensional Statistical Inference
- Grant number: 2237322
- Fiscal year: 2023
- Funding amount: $38,500
- Project category: Continuing Grant
Performance Guarantees for Electric Vehicle Fast Charging Station Management
- Grant number: 2312196
- Fiscal year: 2023
- Funding amount: $38,500
- Project category: Standard Grant
RI: Medium: Techniques for Massive-Scale Strategic Reasoning: Imperfect-Information Subgame Solving and Offering Guarantees in Simulation-Based Games
- Grant number: 2312342
- Fiscal year: 2023
- Funding amount: $38,500
- Project category: Standard Grant