Density peaks clustering algorithm based on shared nearest neighbor and second-order K nearest neighbor for manifold data

ZHAO Jia; CHEN Wei-chang; XIAO Ren-bin; PAN Zheng-xiang; CUI Zhi-hua; WANG Hui

Cite this article:	ZHAO Jia, CHEN Wei-chang, XIAO Ren-bin, PAN Zheng-xiang, CUI Zhi-hua, WANG Hui. Density peaks clustering algorithm based on shared nearest neighbor and second-order K nearest neighbor for manifold data[J]. Control Theory & Applications, 2026, 43(2): 388-396.

Received: 2023-08-22  Revised: 2025-02-25

DOI: 10.7641/CTA.2024.30570

2026,43(2):388-396

Keywords: density peaks clustering; reverse nearest neighbor; shared nearest neighbor; second-order K nearest neighbor; manifold data

Funding: Supported by the National Natural Science Foundation of China (62466037, 62166027).

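Author	Affiliation	E-mail
ZHAO Jia^*	School of Information Engineering, Nanchang Institute of Technology	zhaojia925@163.com
CHEN Wei-chang	School of Information Engineering, Nanchang Institute of Technology
XIAO Ren-bin	School of Artificial Intelligence and Automation, Huazhong University of Science and Technology
PAN Zheng-xiang	College of Computer Science and Engineering, Shandong University of Science and Technology
CUI Zhi-hua	School of Computer Science and Technology, Taiyuan University of Science and Technology
WANG Hui	School of Information Engineering, Nanchang Institute of Technology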

Abstract

The density peaks clustering algorithm processes datasets quickly and efficiently without iteration. However, when applied to manifold data it tends to select wrong cluster centers and to misassign samples. This paper therefore proposes a density peaks clustering algorithm based on shared nearest neighbors and second-order K nearest neighbors for manifold data (DPC-SKNN). First, the algorithm introduces reverse nearest neighbors and shared nearest neighbors to redefine local density, taking both the local and the global information of samples into account so that the correct centers of manifold clusters are easier to identify. Second, the relationship between samples is divided into three cases (K nearest neighbors, second-order K nearest neighbors, and non-nearest neighbors), and a K-nearest-neighbor allocation strategy is designed to strengthen the similarity among samples of the same cluster and thereby improve allocation accuracy. DPC-SKNN is compared with eight algorithms on manifold and UCI datasets, and the experimental results show that it obtains good clustering results on all of these datasets.
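
The neighbor relations named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract gives neither the redefined local density nor the allocation rules, so the code only builds the underlying sets. The function name `neighbor_relations`, the use of Euclidean distance, and the reading of "second-order K nearest neighbors" as the K nearest neighbors of a sample's own K nearest neighbors (excluding the sample and its direct neighbors) are assumptions made for illustration.

```python
import numpy as np


def neighbor_relations(X, k):
    """Build the neighbor sets mentioned in the abstract (illustrative only).

    Returns (knn, rnn, snn, knn2):
      knn[i]   - set of the k nearest neighbors of sample i
      rnn[i]   - reverse nearest neighbors: samples that list i in their knn
      snn[i,j] - number of neighbors shared by samples i and j
      knn2[i]  - assumed second-order neighbors: knn of i's knn,
                 minus i itself and its direct knn
    """
    n = X.shape[0]
    # Pairwise Euclidean distances (an assumption; the paper may use another metric).
    diff = X[:, None, :] - X[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor

    knn_idx = np.argsort(dist, axis=1, kind="stable")[:, :k]
    knn = [set(row.tolist()) for row in knn_idx]

    # Reverse nearest neighbors: j belongs to rnn[i] when i is in knn[j].
    rnn = [set() for _ in range(n)]
    for j in range(n):
        for i in knn[j]:
            rnn[i].add(j)

    # Shared nearest neighbors: size of the intersection of two knn sets.
    snn = np.zeros((n, n), dtype=int)
    for i in range(n):
        for j in range(i + 1, n):
            snn[i, j] = snn[j, i] = len(knn[i] & knn[j])

    # Second-order neighbors under the assumption stated in the docstring.
    knn2 = []
    for i in range(n):
        second = set()
        for j in knn[i]:
            second |= knn[j]
        knn2.append(second - knn[i] - {i})

    return knn, rnn, snn, knn2
```

Calling `neighbor_relations(X, k)` on an `(n, d)` array returns the four structures in one pass. A density definition or an allocation strategy in the spirit of the abstract would be built on top of these sets, but its exact form has to come from the full paper.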