Research on Clustering Ensemble Algorithms for Multi-Source Large-Scale Data Based on Heterogeneous Bipartite Graph Models

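Abstract (translated from the Chinese)
Clustering ensemble, which fuses multiple clustering results to obtain a better clustering, has in recent years become an active research direction in cluster analysis. However, existing clustering ensemble algorithms are mostly designed for datasets of ordinary scale, and most consider only the single-source setting, so they struggle with the multi-source and large-scale nature that real-world data often exhibit. In the multi-source large-scale setting, clustering ensemble research still faces several key open problems, including multi-source modeling, higher-order fusion, and efficient computation. To address these problems, this project takes clustering ensemble and the bipartite graph model as its starting point and studies a new clustering ensemble framework for multi-source large-scale data based on heterogeneous bipartite graph models. It focuses on three theoretical topics: 1) efficient construction and higher-order fusion of multi-source large-scale bipartite graphs; 2) unsupervised multi-source large-scale clustering ensemble based on two types of heterogeneous bipartite graphs; 3) semi-supervised multi-source large-scale clustering ensemble based on imbalanced label propagation. The project further plans to study applications of the resulting algorithms to multi-source surveillance video data, multi-source cancer gene data, and multi-source social network data. This work will enrich the theory and methods of data mining and big data analysis, and in particular advance research on large-scale clustering and clustering ensembles.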
English Abstract
Owing to its ability to combine multiple clusterings into a better clustering result, the clustering ensemble technique has in recent years become a popular research topic in clustering analysis. However, most, if not all, existing clustering ensemble algorithms are designed for moderate-scale and typically single-source datasets, and therefore lack the ability to handle the multi-source large-scale datasets found in real-world scenarios. For such datasets, clustering ensemble research still faces several crucial unsolved problems, such as multi-source modeling, higher-order integration, and computational efficiency. Taking clustering ensemble and the bipartite graph model as the starting point, this project plans to study clustering ensemble algorithms for multi-source large-scale data based on heterogeneous bipartite graph models. The theoretical research focuses on three sub-topics: (1) the efficient construction and higher-order integration of multi-source large-scale bipartite graphs, (2) unsupervised clustering ensemble for multi-source large-scale data based on two types of heterogeneous bipartite graphs, and (3) semi-supervised clustering ensemble for multi-source large-scale data based on imbalanced label propagation. In terms of application research, we plan to apply the newly designed clustering ensemble algorithms to multi-source visual surveillance datasets, multi-source cancer gene expression datasets, and multi-source social network datasets. The work of this project will enrich the theory and methodology of data mining and big data analytics, and in particular promote the development of large-scale clustering and clustering ensemble research.
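To make the two central notions in the abstract concrete, the following is a minimal toy sketch of an object-cluster bipartite graph built from several base clusterings and a consensus clustering derived from it. It is not the project's algorithm: the consensus step is a simple co-association threshold (evidence accumulation) swapped in for illustration, the function names `build_bipartite_graph`, `co_association`, and `consensus` and the `threshold` parameter are hypothetical, and the quadratic-time loops are only suitable for tiny inputs, not for the large-scale setting the project targets.

```python
from collections import defaultdict


def build_bipartite_graph(base_clusterings):
    """Map each cluster node to the set of objects linked to it.

    Each base clustering is a list of labels, one per object. Every
    (clustering index, label) pair is one cluster node, and object i is
    linked to the cluster node it belongs to in each base clustering.
    """
    edges = defaultdict(set)
    for m, labels in enumerate(base_clusterings):
        for i, label in enumerate(labels):
            edges[(m, label)].add(i)
    return edges


def co_association(edges, n_objects, n_clusterings):
    """Fraction of base clusterings in which two objects share a cluster."""
    counts = [[0] * n_objects for _ in range(n_objects)]
    for members in edges.values():
        for i in members:
            for j in members:
                counts[i][j] += 1
    return [[c / n_clusterings for c in row] for row in counts]


def consensus(base_clusterings, threshold=0.5):
    """Toy consensus: connected components of the thresholded co-association graph."""
    n = len(base_clusterings[0])
    edges = build_bipartite_graph(base_clusterings)
    ca = co_association(edges, n, len(base_clusterings))
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:  # depth-first search over strongly co-associated objects
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and ca[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Three base clusterings of six objects (made-up illustrative labels).
base = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
]
result = consensus(base)
```

Here objects 0 and 1 are grouped together by all three base clusterings, while objects 2 to 5 are chained together by pairs that co-occur in two of the three, so the toy consensus merges them into a single cluster.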
Journal Papers
DOI: 10.1109/tbdata.2023.3325045
Published: 2024-02
Journal: IEEE Transactions on Big Data
Impact factor: 7.2
Authors: Jinghuan Lao; Dong Huang; Changdong Wang; Jian-Huang Lai
Corresponding author: Jinghuan Lao; Dong Huang; Changdong Wang; Jian-Huang Lai
DOI: 10.1109/tetci.2023.3306233
Published: 2023-08-28
Journal: IEEE Transactions on Emerging Topics in Computational Intelligence
Impact factor: 5.3
Authors: Fang, Si-Guo; Huang, Dong; Tang, Yong
Corresponding author: Tang, Yong
Toward Multidiversified Ensemble Clustering of High-Dimensional Data: From Subspaces to Metrics and Beyond
DOI: 10.1109/tcyb.2021.3049633
Published: 2021-05-07
Journal: IEEE Transactions on Cybernetics
Impact factor: 11.8
Authors: Huang, Dong; Wang, Chang-Dong; Kwoh, Chee-Keong
Corresponding author: Kwoh, Chee-Keong
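DOI: --
Published: 2023
Journal: Journal of Computer Applications (计算机应用)
Impact factor: --
Authors: Jinghuan Lao (劳景欢); Dong Huang (黄栋); Changdong Wang (王昌栋); Jian-Huang Lai (赖剑煌)
Corresponding author: Jian-Huang Lai (赖剑煌)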
DOI: 10.1007/s10489-021-02365-8
Published: 2021-05
Journal: Applied Intelligence
Impact factor: 5.3
Authors: Guang-Yu Zhang; Xiao-Wei Chen; Yu-Ren Zhou; Changdong Wang; Dong Huang; Xiaoyu He
Corresponding author: Guang-Yu Zhang; Xiao-Wei Chen; Yu-Ren Zhou; Changdong Wang; Dong Huang; Xiaoyu He
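Research on Efficient Ensemble Clustering Algorithms for Large-Scale Graph Data
- Grant number: --
- Project type: Provincial/municipal-level project
- Funding: CNY 100,000 (10.0万元)
- Year approved: 2025
- Principal investigator: Dong Huang (黄栋)
- Host institution: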
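Research on Efficient Ensemble Clustering Algorithms for Complex Incomplete Multi-View Data
- Grant number: --
- Project type: Provincial/municipal-level project
- Funding: CNY 100,000 (10.0万元)
- Year approved: 2021
- Principal investigator: Dong Huang (黄栋)
- Host institution: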
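Research on Online Clustering Ensemble Algorithms for Multi-Source Heterogeneous Streaming Data and Their Applications
- Grant number: 61602189
- Project type: Young Scientists Fund project
- Funding: CNY 200,000 (20.0万元)
- Year approved: 2016
- Principal investigator: Dong Huang (黄栋)
- Host institution: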
