Abstract
Irregular applications, i.e., programs that manipulate pointer-based data structures such as graphs and trees, constitute a challenging target for parallelization because the amount of parallelism is input dependent and changes dynamically. Traditional dependence analysis techniques are too conservative to expose this parallelism. Even manual parallelization is difficult, time consuming, and error prone. The Galois system parallelizes such applications using an optimistic approach that exploits higher-level semantics of abstract data types.
In this paper, we study the performance and scalability of a Galoised, that is, automatically parallelized, version of Delaunay mesh refinement (DR) on a shared-memory system with 128 CPUs. DR is an important irregular application that is used, e.g., in graphics and finite-element codes. The parallelized program scales to 64 threads, where it reaches a speedup of 25.8. For large numbers of threads, the performance is hampered by the load imbalance and the nonuniform memory latency, both of which grow as the number of threads increases. While these two issues will have to be addressed in future work, we believe our results already show the Galois approach to be very promising.
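To put the reported speedup in perspective, it can be converted to parallel efficiency (speedup divided by thread count). The short sketch below does that arithmetic for the figures quoted above. The efficiency value is derived here and is not a number stated in the abstract, and the function name `parallel_efficiency` is purely illustrative.

```python
def parallel_efficiency(speedup: float, threads: int) -> float:
    """Return speedup divided by thread count; 1.0 would mean perfect linear scaling."""
    if threads <= 0:
        raise ValueError("threads must be positive")
    return speedup / threads


# Figures quoted in the abstract: a speedup of 25.8 at 64 threads.
efficiency = parallel_efficiency(25.8, 64)
print(f"Parallel efficiency at 64 threads: {efficiency:.1%}")
```

Under this definition, each of the 64 threads contributes roughly 40% of what it would under ideal linear scaling, which is consistent with the load imbalance and nonuniform memory latency the abstract identifies as limiting factors.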
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Burtscher, M., Kulkarni, M., Prountzos, D., Pingali, K. (2008). On the Scalability of an Automatically Parallelized Irregular Application. In: Amaral, J.N. (eds) Languages and Compilers for Parallel Computing. LCPC 2008. Lecture Notes in Computer Science, vol 5335. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-89740-8_8
DOI: https://doi.org/10.1007/978-3-540-89740-8_8
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-89739-2
Online ISBN: 978-3-540-89740-8
eBook Packages: Computer Science, Computer Science (R0)