Density Peak Clustering Algorithm Based on Adaptive Cutoff Distance and Sample Allocation
Zhang Zhizhuang (张志壮), Gao Wenhua (高文华), Shi Hui (石慧), Dong Zengshou (董增寿)
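Abstract:
The density peak clustering algorithm has two weaknesses: the cutoff distance used to compute local sample density must be chosen subjectively, and the sample allocation strategy propagates errors. To address both, an improved density peak clustering algorithm is proposed that selects the cutoff distance adaptively and optimizes sample allocation with a constructed manifold distance. First, the algorithm uses the K nearest neighbors of each sample to choose a cutoff distance for each point adaptively: at points of high sample density a large cutoff distance is used so that cluster centers are selected accurately, and at points of low sample density a small cutoff distance is used so that outliers are identified. Second, for the remaining samples, a manifold distance is constructed from the connecting paths between samples to optimize the allocation strategy. Finally, clustering experiments on synthetic datasets compare the improved algorithm with the traditional density peak clustering algorithm and verify its accuracy in cluster center selection and sample allocation.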
Keywords: density peak clustering; cluster center; adaptive cutoff distance; manifold distance
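Foundation: Youth Science Fund of the National Natural Science Foundation of China (61703297); Shanxi Province Key Research and Development Program (201903D321012; 201903D121023); Natural Science Foundation of Shanxi Province (201801D121166; 201901D111264)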
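The abstract names two ingredients, a per-point cutoff distance derived from K nearest neighbors and a manifold distance built from connecting paths, but gives no formulas. The following is a minimal sketch under stated assumptions, not the authors' method. The function names, the linear mapping from K-nearest-neighbor density to a cutoff in `[d_min, d_max]`, and the use of shortest paths over a K-nearest-neighbor graph as the manifold distance are all illustrative choices made here.

```python
import numpy as np


def pairwise_dist(X):
    """Euclidean distance matrix for the rows of X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def adaptive_cutoff(dist, k, d_min, d_max):
    """Hypothetical rule (not from the paper): denser points get a larger cutoff."""
    # Mean distance to the k nearest neighbours; column 0 is the point itself.
    knn_mean = np.sort(dist, axis=1)[:, 1:k + 1].mean(axis=1)
    density = 1.0 / (knn_mean + 1e-12)
    spread = density.max() - density.min()
    scaled = (density - density.min()) / (spread + 1e-12)
    # Linear map of the density estimate onto [d_min, d_max].
    return d_min + scaled * (d_max - d_min)


def manifold_distance(dist, k):
    """Assumed manifold distance: shortest path over a symmetric kNN graph."""
    n = dist.shape[0]
    graph = np.full((n, n), np.inf)
    # Assumes no duplicate points, so each row's nearest entry is itself.
    nbrs = np.argsort(dist, axis=1)[:, 1:k + 1]
    rows = np.repeat(np.arange(n), k)
    cols = nbrs.ravel()
    graph[rows, cols] = dist[rows, cols]
    graph = np.minimum(graph, graph.T)  # make the graph undirected
    np.fill_diagonal(graph, 0.0)
    for m in range(n):  # Floyd-Warshall relaxation through node m
        graph = np.minimum(graph, graph[:, [m]] + graph[[m], :])
    return graph


def assign_by_manifold(geo, centers):
    """Label each sample with the index of its nearest center along the graph."""
    return np.argmin(geo[:, centers], axis=1)


# Two well-separated square groups; centers are picked by hand for illustration.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
              [10, 10], [10, 11], [11, 10], [11, 11]], dtype=float)
geo = manifold_distance(pairwise_dist(X), k=2)
labels = assign_by_manifold(geo, centers=[0, 4])
```

In this toy data the two groups are disconnected in the neighbor graph, so the path-based distance between them is infinite and each sample can only be assigned to the center in its own group. A sample unreachable from every center would fall back to the first center under `argmin`, which a fuller implementation would need to handle explicitly.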