An Association Rule Clustering Algorithm Based on K-means
王琢,荀亚玲,张继福
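Abstract:
Association rule mining is one of the main research topics in data mining. For high-dimensional, massive data sets, and especially when the support and confidence thresholds are set too low, mining produces a large number of redundant and similar association rules, which makes the rules difficult to understand and use. This paper adopts an improved K-means approach and presents an association rule clustering algorithm. First, redundant association rules are redefined and a method for removing them is given. Next, a new similarity measure between rules is defined. Finally, following the K-means idea, the maximum triangle method is used to select the initial cluster points, and similar association rules are grouped into the same class. Experiments show that the algorithm helps users find useful association rules quickly and effectively, and that it improves the comprehensibility of the rules.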
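The three stages named in the abstract (redundancy removal, a rule-to-rule similarity measure, and seeded clustering) can be illustrated with a minimal Python sketch. The abstract does not give the paper's definitions, so everything concrete here is an assumption made for illustration: the `Rule` type, the dominance-based redundancy test, the Jaccard distance over rule items, farthest-first seeding as a stand-in for the maximum triangle method, and a medoid update in place of a K-means centroid (a mean is not defined for item sets). It is not the authors' algorithm.

```python
from itertools import combinations
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule representation: antecedent -> consequent."""
    antecedent: frozenset
    consequent: frozenset
    confidence: float


def remove_redundant(rules):
    """Assumed redundancy test (not the paper's definition): drop a rule if
    another rule has the same consequent, a strictly smaller antecedent,
    and at least the same confidence."""
    return [
        r for r in rules
        if not any(
            o.consequent == r.consequent
            and o.antecedent < r.antecedent      # proper subset
            and o.confidence >= r.confidence
            for o in rules
        )
    ]


def distance(a, b):
    """Assumed dissimilarity: Jaccard distance over all items of both rules."""
    items_a = a.antecedent | a.consequent
    items_b = b.antecedent | b.consequent
    return 1.0 - len(items_a & items_b) / len(items_a | items_b)


def seed_far_apart(rules, k):
    """Farthest-first seeding (a stand-in for the maximum triangle method):
    start from the most distant pair, then repeatedly add the rule whose
    minimum distance to the seeds chosen so far is largest."""
    if k == 1:
        return [rules[0]]
    seeds = list(max(combinations(rules, 2), key=lambda p: distance(*p)))
    while len(seeds) < k:
        candidates = (r for r in rules if r not in seeds)
        seeds.append(max(candidates,
                         key=lambda r: min(distance(r, s) for s in seeds)))
    return seeds


def cluster_rules(rules, k, max_iter=20):
    """K-means-style loop with medoids: assign each rule to its nearest
    medoid, then move each medoid to the member with the smallest total
    distance to its cluster. Assumes 1 <= k <= len(rules)."""
    medoids = seed_far_apart(rules, k)
    clusters = []
    for _ in range(max_iter):
        clusters = [[] for _ in medoids]
        for r in rules:
            nearest = min(range(k), key=lambda i: distance(r, medoids[i]))
            clusters[nearest].append(r)
        new_medoids = [
            min(c, key=lambda m: sum(distance(m, r) for r in c)) if c else medoids[i]
            for i, c in enumerate(clusters)
        ]
        if new_medoids == medoids:   # assignments are stable
            break
        medoids = new_medoids
    return clusters
```

Calling `remove_redundant` first and passing its result to `cluster_rules` mirrors the order of steps in the abstract. Swapping in the paper's own redundancy definition, similarity measure, or initialization would only require replacing the corresponding function.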
Keywords: association rule clustering algorithm; redundant association rules; similarity measure; stellar spectral data
Foundation:
Author: 王琢,荀亚玲,张继福