A dynamic density clustering algorithm for time series data
Received: 2018-12-14  Revised: 2019-04-01
DOI  10.7641/CTA.2019.80976
  2019,36(8):1304-1314
Keywords  Time series data  Data correlation  Dynamic density clustering  Cluster succession
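Funding  National Natural Science Foundation of China (61876138, 61203311, 61105064), Natural Science Special Project of the Education Department of Shaanxi Province (17JK0701), Open Project Fund of the Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing (XUPT-KLND(201804)), Graduate Innovation Fund of Xi'an University of Posts and Telecommunications (103-602080016)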
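Author  Affiliation  E-mail
Chen Hao (陈皓)  Xi'an University of Posts and Telecommunications  chenhao@xupt.edu.cn
Ji Minjie (冀敏杰)  Xi'an University of Posts and Telecommunications
Guo Ziyuan (郭紫园)  Xi'an University of Posts and Telecommunications
Xia Yu (夏雨)  Xi'an University of Posts and Telecommunications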
Abstract
      Traditional clustering algorithms mostly analyze a static data set on a single time slice, but in practice most data evolve continuously and dynamically along a time series. This paper analyzes the evolution of time series data and of its cluster structure, and finds that, under certain conditions, the data sets on adjacent time slices are strongly correlated and their cluster structures exhibit a degree of succession. This suggests that, starting from the clustering result of the previous time slice, computing only the data that changed and locally adjusting the cluster structure can be expected to give the same result as fully clustering the data on the next time slice, at a significantly lower computational cost. Based on this idea, a dynamic density clustering algorithm for time series data (DDCA/TSD) is proposed. In simulation experiments, six data sets are used to verify the proposed algorithm. The results show that DDCA/TSD, while maintaining clustering accuracy, is clearly more time-efficient than traditional clustering algorithms, and that it more effectively detects changes in the attributes of data points and the evolution of the cluster structure.
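
The incremental idea described in the abstract can be illustrated with a minimal sketch. It is not the DDCA/TSD algorithm itself, whose details are given in the full paper. The function names, the displacement threshold `tol`, and the majority-vote relabeling rule are all assumptions made for illustration, and the sketch does not handle clusters that split, merge, or newly appear.

```python
import numpy as np

def changed_points(prev, curr, tol):
    """Indices of points that moved more than tol between two time slices."""
    return np.flatnonzero(np.linalg.norm(curr - prev, axis=1) > tol)

def update_labels(prev, curr, prev_labels, eps, min_pts, tol):
    """Reuse the labels of stable points; relabel only the points that moved.

    prev, curr   : (n, d) arrays holding the same n points on adjacent slices
    prev_labels  : integer labels from the previous slice, -1 meaning noise
    eps, min_pts : density parameters (hypothetical, DBSCAN-style)
    """
    labels = prev_labels.copy()
    moved = changed_points(prev, curr, tol)
    stable = np.setdiff1d(np.arange(len(curr)), moved)
    for i in moved:
        # Clustered stable points inside the eps-neighbourhood of point i.
        d = np.linalg.norm(curr[stable] - curr[i], axis=1)
        near = stable[(d <= eps) & (prev_labels[stable] >= 0)]
        if len(near) >= min_pts:
            # Inherit the majority label of the dense neighbourhood.
            labels[i] = np.bincount(prev_labels[near]).argmax()
        else:
            labels[i] = -1  # treated as noise in this sketch
    return labels, moved
```

Only the points returned by `changed_points` trigger any distance computation, which is where the saving over re-clustering the whole slice would come from when few points change between slices.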