FSDM 2017

Invited Speaker---Assoc. Prof. Juanying Xie

School of Computer Science, Shaanxi Normal University, China

Biography: Juanying Xie is an associate professor at Shaanxi Normal University in P.R. China. She was a senior member of CCF and an associate editor of HISS. She received her Ph.D. in signal and information processing from Xidian University in 2012. From 2010 to 2011 she collaborated with Prof. Xiaohui Liu at Brunel University in the UK on gene selection research. Her research interests include machine learning, data mining, and biomedical data analysis. She received a master's degree in engineering in computer application technology from Xidian University in 2004 and a Bachelor of Science degree in computer science from Shanxi Normal University in 1993, and she has worked at Shaanxi Normal University since then.

Speech Title: Robust clustering algorithms by detecting density peaks and assigning points based on K-nearest neighbors and fuzzy weighted K-nearest neighbors
Abstract: Clustering by fast search and find of Density Peaks (referred to as DPC) was introduced by Alex Rodríguez and Alessandro Laio in Science in June 2014. The DPC algorithm is based on the idea that cluster centers are characterized by having a higher density than their neighbors and by being at a relatively large distance from points with higher densities. The power of DPC was demonstrated on several test cases. It can intuitively find the number of clusters and can detect and exclude outliers automatically, while recognizing clusters regardless of their shape and of the dimensionality of the space containing them. However, DPC has some drawbacks that must be addressed before it can be widely applied. First, the local density ρ_i of point i is affected by the cutoff distance d_c and is computed in different ways depending on the size of the dataset, which can influence the clustering, especially for small real-world cases. Second, the assignment strategy for the remaining points, after the density peaks (that is, the cluster centers) have been found, can create a "Domino Effect": once one point is assigned erroneously, many more points may subsequently be misassigned. This is especially the case in real-world datasets, where several clusters of arbitrary shape may overlap each other.
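
To make the two quantities in the abstract concrete, the following is a minimal Python sketch of the basic DPC procedure, not the speaker's improved algorithms. It assumes the simple cutoff-kernel density (a count of neighbors closer than the cutoff distance) and, as a stand-in for reading centers off a decision graph, picks the points with the largest product of density and distance. The function name `dpc` and the parameters `d_c` and `n_centers` are illustrative choices, not names from the talk.

```python
import math


def dpc(points, d_c, n_centers):
    """Illustrative sketch of basic density-peak clustering (cutoff kernel)."""
    if n_centers < 1:
        raise ValueError("n_centers must be at least 1")
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density rho_i: number of other points closer than the cutoff d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]

    # Visit points from densest to sparsest (ties broken by index).
    order = sorted(range(n), key=lambda i: (-rho[i], i))

    # delta_i: distance to the nearest point that precedes i in `order`,
    # i.e. the nearest denser point. The densest point has no such neighbor
    # and is given its largest distance to any point instead.
    delta = [0.0] * n
    parent = [-1] * n
    for rank, i in enumerate(order):
        if rank == 0:
            delta[i] = max(dist[i])
            continue
        j = min(order[:rank], key=lambda k: dist[i][k])
        delta[i], parent[i] = dist[i][j], j

    # Assumption: take the points with the largest rho * delta as centers.
    # The densest point always ranks first under this score.
    centers = sorted(range(n), key=lambda i: (-rho[i] * delta[i], i))[:n_centers]
    labels = [-1] * n
    for cluster_id, c in enumerate(centers):
        labels[c] = cluster_id

    # Each remaining point inherits the label of its nearest denser neighbor.
    # Labels flow down this chain, so one wrong link mislabels every point
    # that depends on it (the "Domino Effect" described in the abstract).
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return rho, delta, labels


if __name__ == "__main__":
    # Two small, well-separated groups of five points each.
    pts = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
           (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
    rho, delta, labels = dpc(pts, d_c=1.1, n_centers=2)
    print(labels)
```

Because the density here is a plain neighbor count, changing `d_c` changes every density value and therefore the ordering used in the assignment step, which is one way to see the sensitivity to the cutoff distance that the abstract mentions.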