Atomic-free irregular computations on GPUs

R Nasre, M Burtscher, K Pingali - … Using Graphics Processing Units, 2013 - dl.acm.org
… -based computations of irregular … computation is idempotent, this extra round does not
change the computed solution. We now discuss how idempotent computations enable atomic-free

Data-driven versus topology-driven irregular computations on GPUs

R Nasre, M Burtscher, K Pingali - 2013 IEEE 27th International …, 2013 - ieeexplore.ieee.org
… We obtained significant performance improvements with atomic-free worklist updates,
variable kernel configuration, kernel unrolling and intra-block work donation. We also find that a …

Automatic generation of warp-level primitives and atomic instructions for fast and portable parallel reduction on GPUs

SG De Gonzalo, S Huang… - 2019 IEEE/ACM …, 2019 - ieeexplore.ieee.org
… Abstract—Since the advent of GPU computing, GPU … is extremely challenging for GPU
application library developers. … library across three generations of GPU architectures, and show …

Atomic-free optimization on GPU based SAR raw data simulation

X Yao, C Hu, F Zhang, W Hu… - 2016 IEEE International …, 2016 - ieeexplore.ieee.org
… processing unit (GPU), it can … GPU, which has a bad influence to simulated time. To optimize
simulated time, in this article, we put forward three GPU optimistic strategies for atomic-free

Tigr: Transforming irregular graphs for GPU-friendly graph processing

AH Nodehi Sabet, J Qiu, Z Zhao - ACM SIGPLAN Notices, 2018 - dl.acm.org
… This work addresses the critical irregularity issue in GPU graph processing by transforming
irregular … Atomic-free irregular computations on GPUs. In Proceedings of the 6th Workshop on …

A deep collaborative computing based SAR raw data simulation on multiple CPU/GPU platform

F Zhang, C Hu, W Li, W Hu, P Wang… - IEEE Journal of Selected …, 2016 - ieeexplore.ieee.org
… CPU/GPU collaborative simulation achieves up to 250× speedup. Furthermore, the irregular
reduction based atomic-free optimization boosts the performance of the single GPU method …

An efficient transaction-based GPU implementation of minimum spanning forest algorithm

S Manoochehri, B Goodarzi… - … conference on high …, 2017 - ieeexplore.ieee.org
irregular algorithm to implement on GPUs. In this paper we show that a transaction-based
design and implementation of the Boruvka’s algorithm on GPU … that such atomic-free approach …

Nested parallelism on GPU: Exploring parallelization templates for irregular loops and recursive computations

D Li, H Wu, M Becchi - 2015 44th International Conference on …, 2015 - ieeexplore.ieee.org
… patterns that present uneven work distribution across iterations and recursive calls. • … GPU
in the context of two computational patterns: irregular nested loops and recursive computations

A Customizable Lightweight STM for Irregular Algorithms on GPU

S Manoochehri, P Cristofaro… - 2022 IEEE International …, 2022 - ieeexplore.ieee.org
atomic-free techniques for a small subset of irregular algorithms by (1) identifying certain
algebraic properties of computations [6… multicore CPUs [25] and GPUs [27] and are discussed in …

TLPGNN: A lightweight two-level parallelism paradigm for graph neural network computation on GPU

Q Fu, Y Ji, HH Huang - Proceedings of the 31st International Symposium …, 2022 - dl.acm.org
… Push, edge-centric, and GNNAdvisor all use atomic operations to update the feature vectors
of each vertex, while pull is atomic free. Table 1 shows the performance and related profiling …