Landing CG on EARTH: A case study of fine-grained multithreading on an evolutionary path

KB Theobald, G Agrawal, R Kumar… - SC'00: Proceedings …, 2000 - ieeexplore.ieee.org
SC'00: Proceedings of the 2000 ACM/IEEE Conference on Supercomputing, 2000
We report on our work in developing a fine-grained multithreaded solution for the communication-intensive Conjugate Gradient (CG) problem. In our recent work, we developed a simple yet efficient program for sparse matrix-vector multiply (MVM) on a multithreaded system. This paper presents an effective mechanism for the reduction-broadcast phase, which is integrated with the sparse MVM, resulting in a scalable implementation of the complete CG application. Three major observations from our experiments on the EARTH multithreaded testbed are: (1) The scalability of our CG implementation is impressive, e.g., absolute speedup is 90 on 120 processors for the NAS CG class B input. (2) Our dataflow-style reduction-broadcast network based on fine-grain multithreading is twice as fast as a serial reduction scheme on the same system. (3) Slowing down the network by a factor of 2 caused no notable degradation of overall CG performance.
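To make the two phases named in the abstract concrete, the following is a minimal, sequential Python sketch of textbook CG. It is not the EARTH implementation: the CSR layout, the function names (`csr_matvec`, `tree_reduce`, `dot`, `conjugate_gradient`), and the `nodes` parameter are assumptions introduced here for illustration. It only shows where the sparse MVM and the reduction-broadcast step (the dot products) fall in each iteration, with a pairwise tree standing in for a parallel reduction network.

```python
import math


def csr_matvec(indptr, indices, data, x):
    """Sparse matrix-vector multiply y = A x, with A stored in CSR form."""
    y = [0.0] * (len(indptr) - 1)
    for row in range(len(y)):
        for k in range(indptr[row], indptr[row + 1]):
            y[row] += data[k] * x[indices[k]]
    return y


def tree_reduce(values):
    """Sum values pairwise, level by level, as a reduction tree would."""
    vals = list(values)
    if not vals:
        return 0.0
    while len(vals) > 1:
        nxt = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)]
        if len(vals) % 2:
            nxt.append(vals[-1])  # odd element passes through to the next level
        vals = nxt
    return vals[0]


def dot(u, v, nodes=4):
    """Dot product as per-node partial sums followed by a tree reduction.

    `nodes` is a hypothetical count of processors, each owning one slice.
    On a real machine the result would then be broadcast to every node.
    """
    chunk = max(1, math.ceil(len(u) / nodes))
    partials = [
        sum(a * b for a, b in zip(u[s:s + chunk], v[s:s + chunk]))
        for s in range(0, len(u), chunk)
    ]
    return tree_reduce(partials)


def conjugate_gradient(indptr, indices, data, b, tol=1e-10, max_iter=100):
    """Solve A x = b for a symmetric positive definite A in CSR form."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x, with x = 0 initially
    p = list(r)          # search direction
    rs_old = dot(r, r)   # reduction-broadcast
    for _ in range(max_iter):
        if math.sqrt(rs_old) < tol:
            break
        q = csr_matvec(indptr, indices, data, p)   # sparse MVM
        alpha = rs_old / dot(p, q)                 # reduction-broadcast
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rs_new = dot(r, r)                         # reduction-broadcast
        p = [ri + (rs_new / rs_old) * pi for ri, pi in zip(r, p)]
        rs_old = rs_new
    return x


# Usage: A = [[4, 1], [1, 3]], b = [1, 2]; the exact solution is [1/11, 7/11].
indptr, indices, data = [0, 2, 4], [0, 1, 0, 1], [4.0, 1.0, 1.0, 3.0]
solution = conjugate_gradient(indptr, indices, data, [1.0, 2.0])
```

Each iteration performs one sparse MVM and two global dot products, and every node needs each dot product's result before it can continue. That dependency is why the abstract singles out the reduction-broadcast phase as the part to optimize alongside the MVM.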