36th ISC 2021: Virtual Event
- Bradford L. Chamberlain, Ana Lucia Varbanescu, Hatem Ltaief, Piotr Luszczek:
High Performance Computing - 36th International Conference, ISC High Performance 2021, Virtual Event, June 24 - July 2, 2021, Proceedings. Lecture Notes in Computer Science 12728, Springer 2021, ISBN 978-3-030-78712-7
Architecture, Networks, and Storage
- Yi Dai, Kai Lu, Junsheng Chang, Xingyun Qi, Jijun Cao, Jianmin Zhang:
Microarchitecture of a Configurable High-Radix Router for the Post-Moore Era. 3-17
- Mohammadreza Bayatpour, Nick Sarkauskas, Hari Subramoni, Jahanzeb Maqbool Hashmi, Dhabaleswar K. Panda:
BluesMPI: Efficient MPI Non-blocking Alltoall Offloading Designs on Modern BlueField Smart NICs. 18-37
- Jesmin Jahan Tithi, Fabrizio Petrini, David F. Richards:
Lessons Learned from Accelerating Quicksilver on Programmable Integrated Unified Memory Architecture (PIUMA) and How That's Different from CPU. 38-56
- Narasinga Rao Miniskar, Frank Liu, Aaron R. Young, Dwaipayan Chakraborty, Jeffrey S. Vetter:
A Hierarchical Task Scheduler for Heterogeneous Computing. 57-76
Machine Learning, AI, and Emerging Technologies
- Ruobing Han, James Demmel, Yang You:
Auto-Precision Scaling for Distributed Deep Learning. 79-97
- Tian Ye, Yang Yang, Sanmukh R. Kuppannagari, Rajgopal Kannan, Viktor K. Prasanna:
FPGA Acceleration of Number Theoretic Transform. 98-117
- Kawthar Shafie Khorassani, Jahanzeb Maqbool Hashmi, Ching-Hsiang Chu, Chen-Chun Chen, Hari Subramoni, Dhabaleswar K. Panda:
Designing a ROCm-Aware MPI Library for AMD GPUs: Early Experiences. 118-136
- Kevin A. Brown, Neil McGlohon, Sudheer Chunduri, Eric Borch, Robert B. Ross, Christopher D. Carothers, Kevin Harms:
A Tunable Implementation of Quality-of-Service Classes for HPC Networks. 137-156
- Brian A. Page, Peter M. Kogge:
Scalability of Streaming Anomaly Detection in an Unbounded Key Space Using Migrating Threads. 157-175
- Pouya Fotouhi, Marjan Fariborz, Roberto Proietti, Jason Lowe-Power, Venkatesh Akella, S. J. Ben Yoo:
HTA: A Scalable High-Throughput Accelerator for Irregular HPC Workloads. 176-194
- Burak Aksar, Yijia Zhang, Emre Ates, Benjamin Schwaller, Omar Aaziz, Vitus J. Leung, Jim M. Brandt, Manuel Egele, Ayse K. Coskun:
Proctor: A Semi-Supervised Performance Anomaly Diagnosis Framework for Production HPC Systems. 195-214
HPC Algorithms and Applications
- Marko Kabic, Simon Pintarelli, Anton Kozhevnikov, Joost VandeVondele:
COSTA: Communication-Optimal Shuffle and Transpose Algorithm with Process Relabeling. 217-236
- Yicong Zhu, Peng Zhang, Changnian Han, Guojing Cong, Yuefan Deng:
Enabling AI-Accelerated Multiscale Modeling of Thrombogenesis at Millisecond and Molecular Resolutions on Supercomputers. 237-254
- Keith Obenschain, Yu Yu Khine, Raghunandan Mathur, Gopal Patnaik, Robert Rosenberg:
Evaluation of the NEC Vector Engine for Legacy CFD Codes. 255-271
- Pietro Incardona, Tommaso Bianucci, Ivo F. Sbalzarini:
Distributed Sparse Block Grids on GPUs. 272-290
- Luk Burchard, Johannes Moe, Daniel Thilo Schroeder, Konstantin Pogorelov, Johannes Langguth:
iPUG: Accelerating Breadth-First Graph Traversals Using Manycore Graphcore IPUs. 291-309
Performance Modeling, Evaluation, and Analysis
- Richard Todd Evans, Matthew Cawood, Stephen Lien Harrell, Lei Huang, Si Liu, Chun-Yaung Lu, Amit Ruhela, Yinzhi Wang, Zhao Zhang:
Optimizing GPU-Enhanced HPC System and Cloud Procurements for Scientific Workloads. 313-331
- Andrei Poenaru, Wei-Chen Lin, Simon McIntosh-Smith:
A Performance Analysis of Modern Parallel Programming Models Using a Compute-Bound Application. 332-350
- Ayesha Afzal, Georg Hager, Gerhard Wellein:
Analytic Modeling of Idle Waves in Parallel Programs: Communication, Cluster Topology, and Noise Impact. 351-371
- Masahiro Nakao, Koji Ueno, Katsuki Fujisawa, Yuetsu Kodama, Mitsuhisa Sato:
Performance of the Supercomputer Fugaku for Breadth-First Search in Graph500 Benchmark. 372-390
- István Z. Reguly, Andrew M. B. Owenson, Archie Powell, Stephen A. Jarvis, Gihan R. Mudalige:
Under the Hood of SYCL - An Initial Performance Analysis with An Unstructured-Mesh CFD Application. 391-410
- Amit Ruhela, Stephen Lien Harrell, Richard Todd Evans, Gregory J. Zynda, John M. Fonner, Matt Vaughn, Tommy Minyard, John Cazes:
Characterizing Containerized HPC Applications Performance at Petascale on CPU and GPU Architectures. 411-430
- David Böhme, Pascal Aschwanden, Olga Pearce, Kenneth Weiss, Matthew P. LeGendre:
Ubiquitous Performance Analysis. 431-449
Programming Environments and Systems Software
- Chad Wood, Giorgis Georgakoudis, David Beckingsale, David Poliakoff, Alfredo Giménez, Kevin A. Huck, Allen D. Malony, Todd Gamblin:
Artemis: Automatic Runtime Tuning of Parallel Execution Parameters Using Machine Learning. 453-472