Rough graded possibilistic meta-outlier detection in granular clustering
Ferone A.; Maratea A.
2019-01-01
Abstract
Cluster analysis and outlier detection are strongly coupled tasks in data mining. A few points that do not belong to any cluster can easily corrupt an otherwise well-defined clustering structure. The same problem arises in meta-clustering, where different clusterings of the same data are themselves clustered, both to simplify the choice of the best partitioning and to reduce the number of alternatives to compare. In this paper, the outlier rejection problem is tackled with a rough graded possibilistic medoid meta-clustering algorithm, exploiting its ability to perform a soft transition from probabilistic to possibilistic memberships and its natural rejection of anomalous observations. Outlier detection is hence based on a threshold: a partition whose membership is low in all meta-clusters is identified as an observation to be filtered out of the clustering process. The effectiveness of the proposed approach has been assessed on synthetic data by comparing the performance of the meta-clustering algorithm with and without outlier detection, yielding promising results.
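The threshold rule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `reject_outliers`, the membership values, and the threshold of 0.1 are all hypothetical, and the memberships are assumed to have already been computed by some possibilistic meta-clustering step. Because possibilistic memberships need not sum to one across meta-clusters, a partition can have low membership in every meta-cluster at once, which is what the rule looks for.

```python
from typing import List, Sequence


def reject_outliers(memberships: Sequence[Sequence[float]],
                    threshold: float) -> List[int]:
    """Return indices of partitions whose membership is below
    `threshold` in every meta-cluster (i.e. whose maximum is below it)."""
    return [i for i, row in enumerate(memberships) if max(row) < threshold]


# Hypothetical memberships of four partitions in two meta-clusters.
# Rows do not have to sum to 1 in the possibilistic setting.
U = [
    [0.90, 0.05],
    [0.10, 0.80],
    [0.04, 0.07],  # low in both meta-clusters: candidate outlier
    [0.55, 0.40],
]

outliers = reject_outliers(U, threshold=0.1)  # hypothetical threshold
kept = [i for i in range(len(U)) if i not in outliers]
```

Here only the third partition falls below the threshold in both meta-clusters, so it would be filtered out before the meta-clustering is repeated on the remaining partitions.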