Density peaks clustering algorithm based on shared nearest neighbor and second-order K nearest neighbor for manifold data

ZHAO Jia; CHEN Wei-chang; XIAO Ren-bin; PAN Zheng-xiang; CUI Zhi-hua; WANG Hui

Cite this article:	ZHAO Jia, CHEN Wei-chang, XIAO Ren-bin, PAN Zheng-xiang, CUI Zhi-hua, WANG Hui. Density peaks clustering algorithm based on shared nearest neighbor and second-order K nearest neighbor for manifold data[J]. Control Theory & Applications, 2026, 43(2): 386-394.

Received: 2023-08-22  Revised: 2025-02-25

DOI: 10.7641/CTA.2024.30570

2026,43(2):386-394

Keywords: density peaks clustering; reverse nearest neighbor; shared nearest neighbor; second-order K nearest neighbor; manifold data

Funding: Supported by the National Natural Science Foundation of China (62466037, 62166027).

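Author	Affiliation	E-mail
ZHAO Jia*	School of Information Engineering, Nanchang Institute of Technology	zhaojia925@163.com
CHEN Wei-chang	School of Information Engineering, Nanchang Institute of Technology
XIAO Ren-bin	School of Artificial Intelligence and Automation, Huazhong University of Science and Technology
PAN Zheng-xiang	College of Computer Science and Engineering, Shandong University of Science and Technology
CUI Zhi-hua	School of Computer Science and Technology, Taiyuan University of Science and Technology
WANG Hui	School of Information Engineering, Nanchang Institute of Technology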

Abstract

The density peaks clustering (DPC) algorithm processes datasets quickly and efficiently without iteration. On manifold data, however, it is prone to selecting wrong cluster centers and assigning samples to the wrong clusters. This paper therefore proposes a density peaks clustering algorithm based on shared nearest neighbors and second-order K nearest neighbors for manifold data (DPC-SKNN). First, the algorithm introduces reverse nearest neighbors and shared nearest neighbors to redefine local density, taking both the local and the global information of each sample into account, which makes the correct centers of manifold clusters easier to find. Second, it divides the relationship between samples into three cases (K nearest neighbors, second-order K nearest neighbors, and non-neighbors) and designs a K-nearest-neighbor allocation strategy that strengthens the similarity among samples of the same cluster and improves allocation accuracy. DPC-SKNN is compared with eight algorithms on manifold and UCI datasets, and the experimental results show that it obtains good clustering results on all of them.
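
The abstract names four neighbor relations but does not give their formulas. As a rough illustration only, the sketch below computes them under the usual textbook definitions; it is not the paper's density formula or allocation strategy. All function names are hypothetical, and the definition of a second-order K nearest neighbor (a neighbor of a neighbor that is not itself a direct neighbor) is an assumption.

```python
from math import dist


def knn_sets(points, k):
    """For each point index, return the set of its k nearest neighbors (itself excluded)."""
    n = len(points)
    result = []
    for i in range(n):
        # Sort the other points by distance; ties are broken by index for determinism.
        others = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: (dist(points[i], points[j]), j),
        )
        result.append(set(others[:k]))
    return result


def reverse_nn_sets(knn):
    """Reverse nearest neighbors: RNN(i) = {j : i is in KNN(j)}."""
    rnn = [set() for _ in knn]
    for j, neighbors in enumerate(knn):
        for i in neighbors:
            rnn[i].add(j)
    return rnn


def shared_nn(knn, i, j):
    """Shared nearest neighbors: points that appear in both KNN(i) and KNN(j)."""
    return knn[i] & knn[j]


def second_order_knn(knn, i):
    """Assumed definition: neighbors of i's neighbors, minus i and its direct neighbors."""
    second = set()
    for j in knn[i]:
        second |= knn[j]
    return second - knn[i] - {i}


def relation(knn, i, j):
    """Classify j relative to i into the three cases named in the abstract."""
    if j in knn[i]:
        return "knn"
    if j in second_order_knn(knn, i):
        return "second-order knn"
    return "non-neighbor"


# Five points on a line; the last one is far from the rest.
points = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0)]
knn = knn_sets(points, k=2)
rnn = reverse_nn_sets(knn)
print(knn[0], rnn[4], shared_nn(knn, 0, 1))
print(relation(knn, 0, 1), relation(knn, 0, 3), relation(knn, 0, 4))
```

In this toy example the isolated point at (10, 0) has no reverse nearest neighbors, which hints at why reverse-neighbor counts can help separate dense regions from outliers. The brute-force neighbor search is quadratic in the number of points and is meant only for small illustrations.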