Parameter Estimation for Non-Gaussian Model-Based Clustering with High-Dimensional Data
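Basic information
- Grant number: RGPIN-2017-05258
- Principal investigator:
- Amount: $20,400
- Host institution:
- Host institution country: Canada
- Project category: Discovery Grants Program - Individual
- Fiscal year: 2022
- Funding country: Canada
- Duration: 2022-01-01 to 2023-12-31
- Project status: Completed
- Source:
- Keywords:
Project abstract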
Big data is an important issue for modern data and statistical analysis. Computers can store huge amounts of data; however, methods to accurately and quickly analyze the data have not kept pace with improvements to modern storage technology. In some cases, data are discarded without being analyzed. Improved statistical analysis of big data will benefit any field dealing with massive amounts of data, such as biological sciences (e.g., genomics), finance and informatics, astronomy, cosmology, and climate science.
My proposed research will utilize a form of computer programming called Evolutionary Computation (EC). EC uses techniques copied from the biological theory of evolution by natural selection. In biology, the goal usually is to produce as many fit offspring as possible, who go on to produce their own fit offspring. Random mutations to the genome will make some children more fit, or less fit, than their parents. The fitter children are more likely to produce healthy offspring, so their genes get passed on. For my research, the measure of "fitness" used is how well the algorithm searches for optimum solutions with regard to clustering big data. (Clustering involves accounting for the underlying structure that links data points, so that they can be put into correct groups, or labelled correctly, e.g., linking gene expression to types of cancer.) Techniques such as cross-over and mutation are copied from biology, and are used to "evolve" the algorithm and make it fitter each time it runs.
Under the proposed research, evolutionary algorithms (EAs) will be developed, as alternatives to the almost ubiquitous expectation-maximization (EM) algorithm and its variants, for Gaussian and non-Gaussian mixture model-based approaches to clustering. EAs will be developed for the mixture of factor analyzers model, the mixture of variance-gamma distributions, and the mixture of variance-gamma factor analyzers models.
Other short-term objectives include the development of a mixture of multiple scaled variance-gamma distributions. This will bring a phenomenal level of modelling flexibility, while also guaranteeing cluster convexity: the resulting components are hypercuboids, so the rate of decay can differ in each dimension. The mixture of multiple scaled variance-gamma distributions model will be extended to the mixture of multiple scaled variance-gamma factor analyzers model, for application to high-dimensional data. EAs will then be developed for the mixture of multiple scaled variance-gamma distributions and mixture of multiple scaled variance-gamma factor analyzers models, and investigated as alternatives to alternating expectation-conditional maximization algorithms.
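The abstract describes evolutionary algorithms for clustering only in general terms. As a rough illustration, and not the method proposed in the grant, the following minimal Python sketch evolves hard cluster labels for one-dimensional data. Everything in it is an assumption made for the example: the function names (`log_likelihood`, `mutate`, `crossover`, `evolve`), the use of a Gaussian classification log-likelihood as the fitness, truncation selection that keeps the fitter half of the population, single-point crossover, and the parameter defaults.

```python
import math
import random


def log_likelihood(data, labels, k):
    """Fitness: classification log-likelihood of 1-D data under a
    k-component Gaussian mixture whose parameters are estimated
    from the hard labels (an assumed fitness, chosen for illustration)."""
    n = len(data)
    total = 0.0
    for g in range(k):
        members = [x for x, z in zip(data, labels) if z == g]
        if len(members) < 2:
            return float("-inf")  # reject degenerate clusters
        mean = sum(members) / len(members)
        var = sum((x - mean) ** 2 for x in members) / len(members)
        var = max(var, 1e-6)  # floor guards against zero variance
        weight = len(members) / n
        for x in members:
            total += (math.log(weight)
                      - 0.5 * math.log(2 * math.pi * var)
                      - (x - mean) ** 2 / (2 * var))
    return total


def mutate(labels, k, rate, rng):
    """Reassign each label to a random cluster with probability `rate`."""
    return [rng.randrange(k) if rng.random() < rate else z for z in labels]


def crossover(a, b, rng):
    """Single-point crossover: head of parent a, tail of parent b."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def evolve(data, k=2, pop_size=30, generations=200, rate=0.05, seed=0):
    """Evolve label vectors; the fitter half survives each generation."""
    rng = random.Random(seed)
    population = [[rng.randrange(k) for _ in data] for _ in range(pop_size)]
    for _ in range(generations):
        population.sort(key=lambda z: log_likelihood(data, z, k), reverse=True)
        parents = population[: pop_size // 2]  # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), k, rate, rng))
        population = parents + children
    return max(population, key=lambda z: log_likelihood(data, z, k))
```

Calling `evolve(data, k=2)` on a list of floats returns one cluster label per data point. Because the surviving parents are carried over unchanged, the best fitness in the population never decreases from one generation to the next. A realistic implementation for the models named in the abstract would need multivariate, non-Gaussian component densities and some handling of label switching during crossover, neither of which this sketch attempts.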
Project outcomes
Journal articles (0)
Monographs (0)
Research awards (0)
Conference papers (0)
Patents (0)
Other grants by McNicholas, Sharon
Parameter Estimation for Non-Gaussian Model-Based Clustering with High-Dimensional Data
- Grant number: RGPIN-2017-05258
- Fiscal year: 2021
- Funding amount: $20,400
- Project category: Discovery Grants Program - Individual
Parameter Estimation for Non-Gaussian Model-Based Clustering with High-Dimensional Data
- Grant number: RGPIN-2017-05258
- Fiscal year: 2020
- Funding amount: $20,400
- Project category: Discovery Grants Program - Individual
Parameter Estimation for Non-Gaussian Model-Based Clustering with High-Dimensional Data
- Grant number: RGPIN-2017-05258
- Fiscal year: 2018
- Funding amount: $20,400
- Project category: Discovery Grants Program - Individual
Parameter Estimation for Non-Gaussian Model-Based Clustering with High-Dimensional Data
- Grant number: RGPIN-2017-05258
- Fiscal year: 2017
- Funding amount: $20,400
- Project category: Discovery Grants Program - Individual
A Design Theoretic Approach to Multi-Objective Evolutionary Optimization
- Grant number: 425548-2012
- Fiscal year: 2014
- Funding amount: $20,400
- Project category: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
A Design Theoretic Approach to Multi-Objective Evolutionary Optimization
- Grant number: 425548-2012
- Fiscal year: 2013
- Funding amount: $20,400
- Project category: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
A Design Theoretic Approach to Multi-Objective Evolutionary Optimization
- Grant number: 425548-2012
- Fiscal year: 2012
- Funding amount: $20,400
- Project category: Alexander Graham Bell Canada Graduate Scholarships - Doctoral
Similar overseas grants
A shape-constrained approach for non-parametric variance estimation for Markov Chains
- Grant number: 2311141
- Fiscal year: 2023
- Funding amount: $20,400
- Project category: Continuing Grant
Construction of a receptivity estimation model for risky utterance strategies in non-task-oriented conversational systems
- Grant number: 23K16923
- Fiscal year: 2023
- Funding amount: $20,400
- Project category: Grant-in-Aid for Early-Career Scientists
Progenitor Star Estimation of Non-thermal Dominated Core-Collapse Supernova Remnant Probed by High-resolution X-ray Spectroscopy
- Grant number: 23KJ0296
- Fiscal year: 2023
- Funding amount: $20,400
- Project category: Grant-in-Aid for JSPS Fellows
Non-Contact Sleep Stage Estimation: Machine Learning in Multi-Imbalance Data for Improvements in Accuracy and Interpretability
- Grant number: 22KJ1367
- Fiscal year: 2023
- Funding amount: $20,400
- Project category: Grant-in-Aid for JSPS Fellows
PFI-TT: Novel Non-Contacting Position Estimation System for Long-Stroke Actuators
- Grant number: 2329798
- Fiscal year: 2023
- Funding amount: $20,400
- Project category: Standard Grant
Non-parametric estimation under covariate shift: From fundamental bounds to efficient algorithms
- Grant number: 2311072
- Fiscal year: 2023
- Funding amount: $20,400
- Project category: Standard Grant
Non-parametric identification, estimation and inference: generalized functions approach
- Grant number: RGPIN-2020-05444
- Fiscal year: 2022
- Funding amount: $20,400
- Project category: Discovery Grants Program - Individual
Continuous estimation of pulmonary artery pressure fluctuations based on non-invasive measurement using microwave radar
- Grant number: 22K12917
- Fiscal year: 2022
- Funding amount: $20,400
- Project category: Grant-in-Aid for Scientific Research (C)
Non-destructive estimation of plant biomass using LiDAR data
- Grant number: 574895-2022
- Fiscal year: 2022
- Funding amount: $20,400
- Project category: University Undergraduate Student Research Awards
Investigating new approaches for narrowband but nevertheless high-precision wireless locating in multipath environments by means of iterative recursive non-linear state estimation techniques based on aperture synthesis and phase difference analysis in ant
- Grant number: 450697408
- Fiscal year: 2021
- Funding amount: $20,400
- Project category: Research Grants