Denoising algorithms are widely studied as a means of improving image quality in many application fields, such as Magnetic Resonance Imaging (MRI). In real scenarios, computationally expensive schemes have become practicable thanks to programming methodologies that exploit massively parallel architectures, namely Graphics Processing Units (GPUs). In this paper, we propose a hybrid CPU and GPU parallel implementation of the Overcomplete Local Principal Component Analysis (OLPCA) method for image denoising. We have implemented some computational tasks of the denoising procedure on the GPU, using a memory-mapping strategy that combines shared and global memory in order to optimize the performance of OLPCA. The experimental results show improvements in GFlops and memory throughput, together with a promising speedup over the CPU version, which encourages its use as a filter for noisy images in the computationally expensive application of Diffusion Weighted Imaging (DWI).
|Title:||Local principal component analysis overcomplete method: A GPU parallel implementation combining shared and global memories|
|Internal authors:||GALLETTI, Ardelio|
|Publication date:||2016|
|Appears in types:||4.1 Contribution in conference proceedings|