Atomic-free irregular computations on GPUs
… -based computations of irregular … computation is idempotent, this extra round does not
change the computed solution. We now discuss how idempotent computations enable atomic-free …
Data-driven versus topology-driven irregular computations on GPUs
… We obtained significant performance improvements with atomic-free worklist updates,
variable kernel configuration, kernel unrolling and intra-block work donation. We also find that a …
Automatic generation of warp-level primitives and atomic instructions for fast and portable parallel reduction on GPUs
SG De Gonzalo, S Huang… - 2019 IEEE/ACM …, 2019 - ieeexplore.ieee.org
… Abstract—Since the advent of GPU computing, GPU … is extremely challenging for GPU
application library developers. … library across three generations of GPU architectures, and show …
Atomic-free optimization on GPU based SAR raw data simulation
… processing unit (GPU), it can … GPU, which has a bad influence to simulated time. To optimize
simulated time, in this article, we put forward three GPU optimistic strategies for atomic-free …
Tigr: Transforming irregular graphs for gpu-friendly graph processing
… This work addresses the critical irregularity issue in GPU graph processing by transforming
irregular … Atomic-free irregular computations on GPUs. In Proceedings of the 6th Workshop on …
A deep collaborative computing based SAR raw data simulation on multiple CPU/GPU platform
… CPU/GPU collaborative simulation achieves up to 250× speedup. Furthermore, the irregular
reduction based atomic-free optimization boosts the performance of the single GPU method …
An efficient transaction-based GPU implementation of minimum spanning forest algorithm
S Manoochehri, B Goodarzi… - … conference on high …, 2017 - ieeexplore.ieee.org
… irregular algorithm to implement on GPUs. In this paper we show that a transaction-based
design and implementation of the Boruvka’s algorithm on GPU … that such atomic-free approach …
Nested parallelism on GPU: Exploring parallelization templates for irregular loops and recursive computations
… patterns that present uneven work distribution across iterations and recursive calls. • … GPU
in the context of two computational patterns: irregular nested loops and recursive computations…
A Customizable Lightweight STM for Irregular Algorithms on GPU
S Manoochehri, P Cristofaro… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
… atomic-free techniques for a small subset of irregular algorithms by (1) identifying certain
algebraic properties of computations [6… multicore CPUs [25] and GPUs [27] and are discussed in …
TLPGNN: A lightweight two-level parallelism paradigm for graph neural network computation on GPU
… Push, edge-centric, and GNNAdvisor all use atomic operations to update the feature vectors
of each vertex, while pull is atomic free. Table 1 shows the performance and related profiling …