Incomplete Multi-View Clustering Algorithm Based on Sample Complementary Anchor Graph

doi:10.12141/j.issn.1000-565X.250145

Abstract:

With the widespread use of multi-view data in real-world scenarios, clustering under missing views has become an important challenge in machine learning. Traditional anchor graph-based clustering algorithms rely on complete instances to build anchor graphs. As a result, under high missing rates they have too few anchors to capture the underlying data structure, while under low missing rates they fail to exploit the advantages of anchors. To address the restricted anchor selection, rigid weight assignment, and high computational complexity of these algorithms, this paper proposes an incomplete multi-view clustering algorithm based on sample-complementary anchor graphs (IMVC-SAC). First, the algorithm designs a cross-view anchor complementation mechanism that adaptively selects anchors from both shared samples and view-specific samples, which addresses the insufficient representation of data structure under high missing rates. Second, it builds a missing-pattern-aware weighting model that adjusts each view's contribution to the similarity matrix according to the missing pattern and missing degree of the samples. Finally, by exploiting the factorizable property of doubly stochastic non-negative matrices, it reduces the time complexity of spectral clustering from cubic to linear in the number of samples. Experimental results on five public datasets show that IMVC-SAC outperforms current mainstream algorithms in clustering performance and, in particular, maintains good clustering results under high missing rates, which verifies its robustness and effectiveness.

Key words: incomplete multi-view clustering, anchor graph, sample complementarity, similarity matrix fusion, spectral clustering
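The complexity claim in the abstract can be illustrated with the standard anchor-graph construction. The sketch below is not the IMVC-SAC procedure itself; it assumes a single complete view, a hypothetical k-nearest-anchor Gaussian weighting, and anchors supplied by the caller. It shows only that a similarity matrix of the form S = Z diag(Z^T 1)^{-1} Z^T is doubly stochastic and that its leading eigenvectors can be obtained from an m x m problem, so the cost grows linearly with the number of samples n when the number of anchors m is fixed. The function names `anchor_graph` and `spectral_embedding` are illustrative.

```python
import numpy as np

def anchor_graph(X, anchors, k=3):
    """Row-stochastic n x m graph Z linking each sample to its k nearest anchors."""
    # Squared Euclidean distances between samples and anchors, shape (n, m).
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=2)
    idx = np.argsort(d2, axis=1)[:, :k]           # k nearest anchors per sample
    rows = np.arange(X.shape[0])[:, None]
    near = d2[rows, idx]
    sigma = near.mean() + 1e-12                   # assumed global kernel width
    w = np.exp(-near / sigma)
    Z = np.zeros_like(d2)
    Z[rows, idx] = w / w.sum(axis=1, keepdims=True)   # each row sums to 1
    return Z

def spectral_embedding(Z, n_clusters):
    """Leading eigenvectors of S = Z diag(Z^T 1)^{-1} Z^T without forming S."""
    col = Z.sum(axis=0)                           # anchor degrees, length m
    keep = col > 0                                # drop anchors no sample uses
    B = Z[:, keep] / np.sqrt(col[keep])           # n x m factor with S = B B^T
    # Eigen-decompose the small m x m matrix B^T B instead of the n x n matrix S.
    evals, V = np.linalg.eigh(B.T @ B)
    order = np.argsort(evals)[::-1][:n_clusters]
    lam = np.maximum(evals[order], 1e-12)
    # If (B^T B) v = lam v, then u = B v / sqrt(lam) satisfies S u = lam u.
    U = B @ V[:, order] / np.sqrt(lam)
    return U, lam
```

Running k-means on the rows of `U` would then give cluster labels. The dominant costs are the n x m distance computation and the product `B.T @ B`, both linear in n, plus an eigen-decomposition that depends only on m.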

CLC number: TP391

LIU Xiaolan, XU Yuhong. Incomplete Multi-View Clustering Algorithm Based on Sample Complementary Anchor Graph[J]. Journal of South China University of Technology (Natural Science Edition), 2026, 54(2): 16-24.

Table 3  Clustering results of the 8 algorithms on the UCI Digit and BBC Sports datasets

Dataset	Algorithm	A_CC					NMI
		r_PD=0.1	r_PD=0.3	r_PD=0.5	r_PD=0.7	r_PD=0.9	r_PD=0.1	r_PD=0.3	r_PD=0.5	r_PD=0.7	r_PD=0.9
UCI Digit (two views)	APMC	0.821 0	0.846 5	0.801 5	0.814 5	0.801 0	0.833 8	0.841 8	0.794 7	0.780 6	0.696 2
	IMVC-CBG	0.630 5	0.627 0	0.632 3	0.562 5	0.557 0	0.589 2	0.585 3	0.591 2	0.511 6	0.505 4
	PVC	0.581 9	0.551 4	0.561 6	0.576 5	0.522 6	0.595 6	0.540 7	0.549 0	0.539 9	0.486 9
	MIC	0.781 5	0.701 5	0.591 5	0.531 0	0.442 0	0.670 2	0.601 7	0.527 8	0.478 1	0.443 3
	HCP-IMSC	0.803 2	0.840 4	0.812 5	0.749 4	0.735 6	0.746 5	0.743 7	0.726 8	0.677 6	0.657 6
	OPIMC	0.224 5	0.261 0	0.189 5	0.231 5	0.252 5	0.378 9	0.301 1	0.220 5	0.195 4	0.238 9
	GIMC_FLSD	0.597 3	0.451 1	0.360 9	0.375 0	0.384 7	0.547 2	0.439 4	0.351 6	0.362 7	0.384 8
	IMVC-SAC	0.898 5± 0.056 0	0.860 5± 0.026 3	0.837 5± 0.004 8	0.835 0± 0.007 0	0.837 5± 0.001 7	0.832 0± 0.037 1	0.818 9± 0.011 9	0.795 2± 0.008 9	0.806 7± 0.007 5	0.749 9± 0.003 5
UCI Digit (three views)	APMC	0.930 0	0.923 0	0.901 5	0.892 0	0.889 5	0.880 0	0.865 0	0.842 6	0.829 3	0.817 7
	IMVC-CBG	0.755 3	0.754 7	0.761 9	0.761 4	0.760 7	0.742 2	0.740 6	0.748 1	0.747 2	0.745 6
	PVC	-	-	-	-	-	-	-	-	-	-
	MIC	0.833 5	0.712 0	0.655 5	0.623 0	0.522 5	0.729 9	0.632 4	0.576 7	0.547 5	0.521 3
	HCP-IMSC	0.714 0	0.657 0	0.591 5	0.483 0	0.491 0	0.636 5	0.592 3	0.523 3	0.457 9	0.456 3
	OPIMC	0.452 5	0.379 0	0.485 5	0.448 5	0.361 5	0.552 8	0.471 7	0.520 5	0.431 0	0.375 3
	GIMC_FLSD	0.580 1	0.355 6	0.380 1	0.321 4	0.264 0	0.581 0	0.363 1	0.387 6	0.306 3	0.212 4
	IMVC-SAC	0.933 0± 0.006 1	0.928 0± 0.008 1	0.914 0± 0.005 6	0.900 0± 0.003 4	0.893 0± 0.002 9	0.892 3± 0.007 0	0.876 8± 0.010 2	0.861 6± 0.005 0	0.850 0± 0.004 3	0.825 0± 0.002 9
BBC Sports	APMC	0.979 8	0.970 6	0.974 3	0.935 7	0.869 5	0.930 4	0.905 5	0.911 4	0.815 4	0.666 1
	IMVC-CBG	0.818 4	0.798 5	0.817 6	0.777 2	0.798 7	0.756 0	0.733 0	0.756 6	0.702 4	0.731 2
	PVC	0.531 8	0.521 5	0.492 5	0.412 6	0.401 7	0.348 4	0.335 1	0.270 2	0.213 9	0.186 5
	MIC	0.619 5	0.761 0	0.590 1	0.902 6	0.648 9	0.495 7	0.745 0	0.520 3	0.773 6	0.641 6
	HCP-IMSC	0.941 2	0.954 0	0.968 8	0.935 7	0.819 9	0.841 3	0.859 9	0.900 2	0.828 4	0.702 1
	OPIMC	0.577 2	0.549 6	0.606 6	0.676 5	0.652 6	0.566 7	0.491 8	0.457 4	0.499 3	0.378 7
	GIMC_FLSD	0.779 4	0.797 8	0.834 6	0.735 7	0.655 1	0.726 0	0.677 3	0.665 4	0.640 2	0.574 1
	IMVC-SAC	0.985 3± 0.005 2	0.972 4± 0.055 2	0.979 8± 0.004 0	0.959 6± 0.005 2	0.957 7± 0.075 2	0.946 8± 0.014 4	0.909 1± 0.028 2	0.926 5± 0.012 3	0.877 9± 0.012 4	0.877 9± 0.066 0
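
The two metrics reported in the table, clustering accuracy (A_CC) and normalized mutual information (NMI), can be computed as sketched below. This is a minimal standard-library illustration, not the evaluation code used for the reported numbers. It assumes that accuracy is taken under the best one-to-one mapping between cluster labels and class labels, and that NMI is normalized by the arithmetic mean of the two entropies (other normalizations exist and the source does not state which one is used). The brute-force search over permutations is only practical for a handful of classes; for larger label sets a Hungarian solver such as `scipy.optimize.linear_sum_assignment` would be the usual choice.

```python
import math
from collections import Counter
from itertools import permutations

def clustering_accuracy(y_true, y_pred):
    """Accuracy under the best one-to-one mapping from clusters to classes."""
    classes = sorted(set(y_true))
    clusters = sorted(set(y_pred))
    # Pad with dummy classes so every cluster can receive a distinct target.
    classes += [None] * max(0, len(clusters) - len(classes))
    pairs = Counter(zip(y_pred, y_true))          # co-occurrence counts
    best = 0
    for perm in permutations(classes, len(clusters)):
        hits = sum(pairs[(c, t)] for c, t in zip(clusters, perm))
        best = max(best, hits)
    return best / len(y_true)

def nmi(y_true, y_pred):
    """Mutual information divided by the mean of the two label entropies."""
    n = len(y_true)
    ct, cp = Counter(y_true), Counter(y_pred)
    joint = Counter(zip(y_true, y_pred))
    h_t = -sum(c / n * math.log(c / n) for c in ct.values())
    h_p = -sum(c / n * math.log(c / n) for c in cp.values())
    mi = sum(c / n * math.log(c * n / (ct[t] * cp[p]))
             for (t, p), c in joint.items())
    denom = (h_t + h_p) / 2
    return mi / denom if denom > 0 else 1.0
```

Both functions are invariant to how the cluster labels are numbered, which is why a clustering that matches the ground truth up to a relabeling scores 1 on both metrics.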

Dataset	Samples	Views	Classes	Feature dimensions per view
UCI Digit	2 000	3	10	216, 76, 64
3Source	416	3	10	3 560, 3 631, 3 068
BBCSports	544	2	5	3 183, 3 203
HW2Source	2 000	2	10	784, 256
Caltech101	8 710	2	101	512, 512

Algorithm	A_CC				NMI
	Three views	BBC-Guardian	BBC-Reuters	Guardian-Reuters	Three views	BBC-Guardian	BBC-Reuters	Guardian-Reuters
APMC	0.863 0	0.821 8	0.830 5	0.747 4	0.717 1	0.652 4	0.700 3	0.665 8
IMVC-CBG	0.442 4	0.532 5	0.473 9	0.380 0	0.549 4	0.616 5	0.573 5	0.494 0
PVC	-	0.662 1	0.577 5	0.642 4	-	0.584 5	0.569 1	0.547 7
MIC	0.675 5	0.687 5	0.728 4	0.733 2	0.677 7	0.565 7	0.562 6	0.571 4
HCP-IMSC	0.899 0	0.817 7	0.877 4	0.810 1	0.789 0	0.658 3	0.742 8	0.648 1
OPIMC	0.670 7	0.531 3	0.528 8	0.456 7	0.520 8	0.317 7	0.342 9	0.293 6
GIMC_FLSD	0.730 8	-	-	-	0.618 2	-	-	-
IMVC-SAC	0.913 5±0.004 6	0.898 5±0.006 1	0.911 5±0.009 1	0.908 9±0.005 7	0.789 5±0.006 9	0.768 3±0.009 0	0.779 0±0.011 5	0.783 9±0.009 7

Dataset	Algorithm	A_CC					NMI
		r_PD=0.1	r_PD=0.3	r_PD=0.5	r_PD=0.7	r_PD=0.9	r_PD=0.1	r_PD=0.3	r_PD=0.5	r_PD=0.7	r_PD=0.9
HW2sources	APMC	0.983 5	0.968 0	0.939 5	0.898 5	0.725 5	0.960 1	0.925 6	0.868 5	0.807 1	0.636 2
	IMVC-CBG	0.597 1	0.568 0	0.567 0	0.566 4	0.565 6	0.552 2	0.519 0	0.517 9	0.517 3	0.516 3
	PVC	0.712 9	0.659 4	0.654 6	0.592 8	0.430 3	0.609 6	0.542 5	0.534 8	0.465 0	0.321 2
	MIC	0.614 0	0.641 0	0.572 5	0.509 0	0.444 0	0.607 1	0.572 3	0.513 7	0.433 7	0.340 2
	HCP-IMSC	0.539 0	0.505 1	0.451 0	0.418 0	0.343 7	0.478 6	0.429 7	0.387 6	0.351 7	0.286 8
	OPIMC	0.535 5	0.474 0	0.452 0	0.501 5	0.298 0	0.578 5	0.556 2	0.442 8	0.482 7	0.271 2
	GIMC_FLSD	0.510 4	0.488 8	0.440 1	0.382 3	0.328 0	0.487 1	0.472 8	0.412 6	0.343 8	0.309 9
	IMVC-SAC	0.976 5± 0.002 5	0.961 5± 0.002 3	0.930 5± 0.037 7	0.907 5± 0.054 0	0.824 0± 0.027 5	0.944 3± 0.005 6	0.916 1± 0.003 2	0.853 8± 0.012 1	0.814 1± 0.011 5	0.711 9± 0.010 1
Caltech101	APMC	0.631 9	0.631 2	0.621 0	0.601 6	0.556 1	0.838 7	0.832 8	0.831 0	0.816 3	0.786 3
	IMVC-CBG	0.648 0	0.636 3	0.643 3	0.653 4	0.666 6	0.831 3	0.832 8	0.832 0	0.833 8	0.832 4
	PVC	0.609 9	0.622 5	0.598 3	0.598 7	0.594 5	0.826 8	0.825 8	0.818 7	0.815 8	0.808 8
	MIC	-	-	-	-	-	-	-	-	-	-
	HCP-IMSC	-	-	-	-	-	-	-	-	-	-
	OPIMC	0.592 9	0.568 0	0.447 1	0.327 4	0.325 8	0.758 1	0.730 7	0.649 9	0.603 9	0.581 9
	GIMC_FLSD	-	-	-	-	-	-	-	-	-	-
	IMVC-SAC	0.654 0± 0.013 3	0.645 5± 0.016 6	0.627 1± 0.025 0	0.654 7± 0.012 4	0.622 9± 0.019 0	0.843 2± 0.003 3	0.837 2± 0.005 0	0.827 7± 0.010 2	0.828 3± 0.004 0	0.801 0± 0.005 5