A GPU-accelerated parallel K-means algorithm

Farina, Gennaro; Marcellino, L.; Toraldo, Gerardo
2019-01-01

Abstract

Clustering approaches are widely used to analyse large data sets. The K-means algorithm is well known to be too computationally intensive for large-scale data analysis problems. In this work, we focus on a parallel technique that reduces the execution time when K-means is used to cluster large datasets. We exploit the computational power of its design when Graphics Processing Units (GPUs), a massively parallel architecture, are adopted. We optimize the proposed implementation to handle (i) the limited memory space of GPUs and (ii) the host-device data transfer time. Experimental results on real and synthetic data show that our parallelization approach gives good results in terms of execution time and speed-up.

Use this identifier to cite or link to this document: https://hdl.handle.net/11367/64577