A sparse nonsymmetric eigensolver for distributed memory architectures
PERLA, Francesca;ZANETTI, Paolo
2008-01-01
Abstract
In this work, we propose an efficient parallel implementation of the nonsymmetric block Lanczos algorithm for computing a few extreme eigenvalues, and the corresponding eigenvectors, of real non-Hermitian matrices on distributed memory multicomputers. The reorganisation of the block Lanczos algorithm that we implement makes it possible to exploit coarse-grained parallelism and to harness the computational power of the target architectures. The computational kernels of the algorithm are matrix–matrix multiplications with dense and sparse factors, QR factorisation and singular value decomposition. To reduce the total amount of communication involved in the matrix–matrix multiplication with a sparse factor, we replace each matrix appearing in the algorithm with its transpose, so that the sparse matrix becomes the second factor of the product. We then develop an efficient parallelisation of the matrix–matrix multiplication for the case in which the second factor is sparse. The remaining linear algebra operations are performed using the ScaLAPACK library. The parallel eigensolver has been tested on a cluster of PCs. The reported results show that the proposed algorithm is efficient on the target architectures for problems of adequate dimension.
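The transposition step described in the abstract rests on the identity (AV)^T = V^T A^T: the product with the sparse matrix can be formed with the sparse matrix as the second factor. The following is a minimal sketch of that kernel in plain Python, not the implementation of the paper. The function names, the CSR storage, the small test matrices and the row-block split of the dense factor are all illustrative assumptions; in particular, the abstract does not describe the data distribution actually used, and the sketch assumes that every process holds a full copy of the sparse factor.

```python
def dense_times_csr(dense_rows, indptr, indices, data, n_cols):
    """Return dense_rows @ S, where S is sparse in CSR form with n_cols columns."""
    out = [[0.0] * n_cols for _ in dense_rows]
    for r, row in enumerate(dense_rows):
        for k, x in enumerate(row):  # k indexes the rows of S
            if x == 0.0:
                continue
            # Add x times the nonzeros of row k of S to output row r.
            for p in range(indptr[k], indptr[k + 1]):
                out[r][indices[p]] += x * data[p]
    return out


def split_rows(rows, n_parts):
    """Split rows into n_parts contiguous blocks whose sizes differ by at most 1."""
    base, extra = divmod(len(rows), n_parts)
    blocks, start = [], 0
    for i in range(n_parts):
        size = base + (1 if i < extra else 0)
        blocks.append(rows[start:start + size])
        start += size
    return blocks


def transpose(m):
    return [list(col) for col in zip(*m)]


def dense_matmul(a, b):
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


# Hypothetical sparse second factor S (standing in for A^T), dense and CSR forms.
S_dense = [[2.0, 0.0, 0.0, 1.0],
           [0.0, 3.0, 0.0, 0.0],
           [0.0, 0.0, 4.0, 0.0],
           [5.0, 0.0, 0.0, 6.0]]
indptr = [0, 2, 3, 4, 6]
indices = [0, 3, 1, 2, 0, 3]
data = [2.0, 1.0, 3.0, 4.0, 5.0, 6.0]

# Hypothetical dense first factor V^T (a transposed block of vectors).
V_t = [[1.0, 0.0, 2.0, 0.0],
       [0.0, 1.0, 0.0, 1.0],
       [1.0, 1.0, 1.0, 1.0]]

# Product computed in one piece.
W_t = dense_times_csr(V_t, indptr, indices, data, 4)

# Same product computed block by block: each row block of V^T (one per
# simulated process) yields its rows of the result without touching the
# other blocks, under the assumption that S is replicated.
W_t_blocks = []
for block in split_rows(V_t, 2):
    W_t_blocks.extend(dense_times_csr(block, indptr, indices, data, 4))

# Reference: (A V)^T with A = S^T and V = (V^T)^T, using dense arithmetic only.
reference = transpose(dense_matmul(transpose(S_dense), transpose(V_t)))

print(W_t == W_t_blocks == reference)
```

In this toy setting, each row block of the dense factor produces its own rows of the result independently, which illustrates why placing the sparse matrix second can reduce the data exchanged between processes. How much communication is saved in practice depends on the actual distribution of both factors, which this sketch does not model.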