
基于影响空间的初始中心点优化K-means聚类算法
An Optimization K-means Clustering Algorithm of Initial Center Objects Based on Influence Space

Zhao Wenchong, Cai Jianghui, Zhang Jifu

1: School of Computer Science and Technology, Taiyuan University of Science and Technology

Abstract:

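The K-means clustering algorithm depends on its initial points, and the clustering result is strongly affected by how they are chosen. To address this weakness, this paper presents a stable K-means clustering algorithm with initial-point optimization based on influence space. The algorithm uses the influence space data structure and a defined weighted distance attraction factor to merge special center points into K micro-clusters, takes the weighted average of the data points in each micro-cluster to obtain K initial center points, and then runs the K-means algorithm. Theoretical analysis and experimental results show that the algorithm effectively reduces the influence of noisy data on the clustering result and has clear advantages in both clustering quality and the efficiency of the clustering process.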
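The last two steps of the pipeline described in the abstract (weighted averaging of each micro-cluster, then ordinary K-means) can be illustrated with a minimal sketch. It assumes the K micro-clusters have already been formed. The abstract does not define the influence space or the weighted distance attraction factor, so the per-point weights below are hypothetical stand-ins, and the names `weighted_center` and `kmeans` are illustrative, not taken from the paper.

```python
import math

def weighted_center(points, weights):
    """Weighted average of the points in one micro-cluster."""
    total = sum(weights)
    dim = len(points[0])
    return tuple(
        sum(w * p[d] for p, w in zip(points, weights)) / total
        for d in range(dim)
    )

def nearest(point, centers):
    """Index of the center closest to the point (Euclidean distance)."""
    return min(range(len(centers)), key=lambda i: math.dist(point, centers[i]))

def kmeans(data, centers, max_iter=100):
    """Standard Lloyd iterations starting from the supplied centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(max_iter):
        # Assignment step: group points by their nearest center.
        clusters = [[] for _ in centers]
        for p in data:
            clusters[nearest(p, centers)].append(p)
        # Update step: move each center to the mean of its cluster;
        # an empty cluster keeps its previous center.
        new_centers = [
            tuple(sum(col) / len(c) for col in zip(*c)) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break
        centers = new_centers
    labels = [nearest(p, centers) for p in data]
    return centers, labels

# Hypothetical micro-clusters: (points, weights). The weights stand in for
# whatever the paper's attraction factor would assign; they are assumptions.
micro_clusters = [
    ([(0.0, 0.0), (0.0, 2.0)], [1.0, 1.0]),
    ([(10.0, 10.0), (10.0, 12.0)], [3.0, 1.0]),
]
initial = [weighted_center(pts, w) for pts, w in micro_clusters]

data = [(0.0, 0.0), (0.0, 2.0), (1.0, 1.0),
        (10.0, 10.0), (10.0, 12.0), (11.0, 11.0)]
centers, labels = kmeans(data, initial)
```

Because the initial centers are computed deterministically from the micro-clusters rather than drawn at random, repeated runs on the same data give the same result, which is the kind of stability the abstract claims.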

Keywords: K-means algorithm; influence space; weighted distance attraction factor; initial point optimization

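Foundation: National Natural Science Foundation of China (41372349); Shanxi Province Social Development Key Research Project (20140313023-2); Shanxi Province Program for Outstanding Young Academic Leaders in Higher Education Institutions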




Journal of Taiyuan University of Science and Technology

2016, v.37;No.157(05) 347-353
