Generalised vectorisation for sparse matrix-vector multiplication

AN Yzelman - Proceedings of the 5th Workshop on Irregular …, 2015 - dl.acm.org
This work generalises the various ways in which a sparse matrix-vector (SpMV) multiplication can be vectorised. It arrives at a novel data structure that generalises three earlier well-known data structures for sparse computations: the Blocked CRS format, the (sliced) ELLPACK format, and segmented-scan-based formats.
The new data structure is relevant since efficient use of new hardware requires the use of increasingly wide vector registers. Normally, the use of vectorisation for sparse computations is limited due to bandwidth constraints. In cases where computations are limited by memory latencies instead of memory bandwidth, however, vectorisation can still help performance. The Intel Xeon Phi, appearing as a component in several top-500 supercomputers, displays exactly this behaviour for SpMV multiplication. On this architecture the use of the new generalised vectorisation scheme increases performance up to 178 percent.
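For readers unfamiliar with the formats named above, the following is a minimal sketch of SpMV in plain CRS and in (unsliced) ELLPACK. It is not the generalised data structure the paper proposes; the function names and the example matrix are hypothetical and chosen only for illustration. It shows why ELLPACK-style padding gives a uniform inner loop that maps naturally onto vector registers, at the cost of storing explicit zeros.

```python
def crs_spmv(row_ptr, col_idx, values, x):
    """Compute y = A x with A stored in Compressed Row Storage (CRS)."""
    y = [0.0] * (len(row_ptr) - 1)
    for i in range(len(y)):
        # Row lengths differ, so this inner loop has an irregular trip count.
        for k in range(row_ptr[i], row_ptr[i + 1]):
            y[i] += values[k] * x[col_idx[k]]
    return y


def crs_to_ellpack(row_ptr, col_idx, values):
    """Pad every row to the length of the longest row (ELLPACK layout)."""
    n_rows = len(row_ptr) - 1
    width = max((row_ptr[i + 1] - row_ptr[i] for i in range(n_rows)), default=0)
    # Padding entries use column 0 with value 0.0, so they add nothing to y.
    ell_cols = [[0] * width for _ in range(n_rows)]
    ell_vals = [[0.0] * width for _ in range(n_rows)]
    for i in range(n_rows):
        for j, k in enumerate(range(row_ptr[i], row_ptr[i + 1])):
            ell_cols[i][j] = col_idx[k]
            ell_vals[i][j] = values[k]
    return ell_cols, ell_vals


def ellpack_spmv(ell_cols, ell_vals, x):
    """Compute y = A x with A stored in the padded ELLPACK layout."""
    n_rows = len(ell_vals)
    width = len(ell_vals[0]) if n_rows else 0
    y = [0.0] * n_rows
    for j in range(width):
        # Every row does the same work for slot j: a candidate for one
        # vector instruction across a group of rows.
        for i in range(n_rows):
            y[i] += ell_vals[i][j] * x[ell_cols[i][j]]
    return y


# Hypothetical 3x3 example: A = [[4, 0, 1], [0, 2, 0], [3, 0, 5]].
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y_crs = crs_spmv(row_ptr, col_idx, values, x)
ell_cols, ell_vals = crs_to_ellpack(row_ptr, col_idx, values)
y_ell = ellpack_spmv(ell_cols, ell_vals, x)
```

In this sketch the middle row is padded from one entry to two. Sliced ELLPACK limits such padding by choosing the width separately for each group of rows rather than for the whole matrix.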