Journal of South China University of Technology (Natural Science Edition) ›› 2022, Vol. 50 ›› Issue (12): 60-70. doi: 10.12141/j.issn.1000-565X.220069

Special topic: Computer Science and Technology, 2022

• Computer Science & Technology •

Anchor Graph Based Low-Rank Incomplete Multi-View Subspace Clustering

LIU Xiaolan, SHI Zongyu, YE Zehui, LIANG Yong

  1. School of Mathematics, South China University of Technology, Guangzhou 510640, Guangdong, China
  • Received: 2022-02-21 Published: 2022-12-25 Online: 2022-07-15
  • Contact: LIANG Yong (b. 1978), male, Ph.D., lecturer; research interests: nonlinear waves and optimization algorithms. E-mail: dyliang@scut.edu.cn
  • About the author: LIU Xiaolan (b. 1979), female, Ph.D., associate professor; research interests: optimization algorithms and machine learning. E-mail: liuxl@scut.edu.cn
  • Supported by:
    the Natural Science Foundation of Guangdong Province (2020A1515010699)

Abstract:

Classical multi-view clustering assumes complete data. In practice, however, limited information acquisition channels often leave some views with missing data, which gives rise to the incomplete multi-view clustering problem. Most existing models for this problem build the clustering graph from non-negative matrix factorization or from distances; their joint optimization offers no guarantee on the quality of the solution, and they cannot fully characterize the global structure of the data. To improve the quality of the clustering graph, this paper proposes ALIMSC, a low-rank incomplete multi-view subspace clustering algorithm based on low-rank subspace clustering and anchor graphs. The algorithm first obtains a benchmark similarity matrix of the data with the anchor-graph-based incomplete multi-view clustering (APMC) algorithm and embeds it in the low-rank subspace clustering model. It then obtains a similarity matrix through dimension-raising alignment and weighted fusion, and derives the final clustering graph by making this similarity matrix as consistent as possible with the benchmark similarity matrix. By imposing a rank-minimization constraint on the similarity matrix of each view, ALIMSC characterizes the low-dimensional subspace distribution of high-dimensional data and, on top of the original anchor graph, further emphasizes the subspace structure of the data, that is, the block-diagonal structure of the clustering graph. Experimental results on several public datasets show that the proposed algorithm outperforms the incomplete multi-view clustering algorithms it is compared with.

Key words: clustering algorithm, low-rank representation, incomplete multi-view clustering, subspace clustering
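
The rank-minimization constraint mentioned in the abstract is commonly handled through the nuclear norm, whose proximal operator is singular value thresholding. The sketch below illustrates only that generic building block; the abstract does not state how ALIMSC is actually solved, so the function name `singular_value_threshold` and the threshold `tau` are assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def singular_value_threshold(M, tau):
    """Proximal operator of tau * nuclear norm (hypothetical helper).

    Shrinks every singular value of M by tau and clips at zero, which
    lowers the rank whenever some singular values fall below tau.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)
    # Scale the columns of U by the shrunken singular values, then recombine.
    return (U * s_shrunk) @ Vt


# Toy matrix with singular values 3.0, 1.0 and 0.2.
M = np.diag([3.0, 1.0, 0.2])
Z = singular_value_threshold(M, tau=0.5)

# The smallest singular value is below tau, so it is removed entirely.
print(np.round(Z, 3))
print("rank before:", np.linalg.matrix_rank(M), "rank after:", np.linalg.matrix_rank(Z))
```

In low-rank subspace clustering, a step of this kind is typically applied repeatedly inside an iterative solver, which drives each similarity matrix toward the low-rank, block-diagonal form the abstract describes.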

CLC number: