Rough graded possibilistic meta-outlier detection in granular clustering

Ferone A.; Maratea A.
2019-01-01

Abstract

Cluster analysis and outlier detection are strongly coupled tasks in data mining. A few points that do not belong to any cluster can easily corrupt an otherwise well-defined clustering structure. The same problem arises in meta-clustering, where different clusterings of the same data are themselves clustered, in order to simplify the choice of the best partitioning and to reduce the number of alternatives to compare. In this paper, the outlier rejection problem is tackled with a rough graded possibilistic medoid meta-clustering algorithm, exploiting its ability to perform a soft transition from probabilistic to possibilistic memberships and its natural rejection of anomalous observations. Outlier detection is hence based on a threshold: a partition whose membership is low in all meta-clusters is identified as an observation to be filtered out of the clustering process. The effectiveness of the proposed approach has been assessed on synthetic data by comparing the performance of the meta-clustering algorithm with and without clustering outlier detection, yielding promising results.
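The thresholding rule described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes a membership matrix has already been produced by some possibilistic meta-clustering step, and the function name `flag_outliers`, the matrix `U`, and the threshold value 0.2 are all hypothetical choices made for illustration. In a possibilistic setting the memberships of a row need not sum to one, which is why a row can be low in every meta-cluster.

```python
from typing import List, Sequence


def flag_outliers(memberships: Sequence[Sequence[float]],
                  threshold: float) -> List[int]:
    """Return indices of rows whose membership is below `threshold`
    in every column (that is, in every meta-cluster)."""
    return [
        i for i, row in enumerate(memberships)
        if all(u < threshold for u in row)
    ]


# Hypothetical memberships of four partitions in two meta-clusters.
U = [
    [0.90, 0.05],
    [0.10, 0.80],
    [0.08, 0.12],  # low in all meta-clusters: candidate outlier
    [0.55, 0.40],
]

# The threshold 0.2 is an assumed value, not one taken from the paper.
outliers = flag_outliers(U, threshold=0.2)
kept = [i for i in range(len(U)) if i not in outliers]
```

Here the third partition is the only one flagged, and the remaining indices in `kept` would be passed on to the clustering step.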
Year: 2019
ISBN: 9781450371490

Use this identifier to cite or link to this document: https://hdl.handle.net/11367/103753