Outliers Mining and Application Based on Clustering
CAI Jianghui (蔡江辉), ZHANG Jifu (张继福)
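Abstract:
This paper introduces the basic concepts of outlier mining, surveys the history and current state of research in the field, and reviews the main classes of outlier mining methods. It analyzes and evaluates several typical methods, identifies the strengths and weaknesses of the traditional approaches, and outlines directions for future research.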
Keywords: outlier mining; clustering; association rules
Foundation: Supported by the National High Technology Research and Development Program of China ("863" Program), grant 2003AA133060