IPDPS 2021: Portland, OR, USA - Workshops
- IEEE International Parallel and Distributed Processing Symposium Workshops, IPDPS Workshops 2021, Portland, OR, USA, June 17-21, 2021. IEEE 2021, ISBN 978-1-6654-3577-2
HCW: Heterogeneity in Computing Workshop
- Yujing Ma, Florin Rusu, Kesheng Wu, Alexander Sim:
Adaptive Stochastic Gradient Descent for Deep Learning on Heterogeneous CPU+GPU Architectures. 6-15
- Vinícius Garcia Pinto, Lucas Leandro Nesi, Marcelo Cogo Miletto, Lucas Mello Schnorr:
Providing In-depth Performance Analysis for Heterogeneous Task-based Applications with StarVZ. 16-25
- Francis O'Brien, Matthew Agostini, Tarek S. Abdelrahman:
A Streaming Accelerator for Heterogeneous CPU-FPGA Processing of Graph Applications. 26-35
- Feng Li, Moon Gi Seok, Wentong Cai:
A New Double Rank-based Multi-workflow Scheduling with Multi-objective Optimization in Cloud Environments. 36-45
- Caio S. Rohwedder, João P. L. de Carvalho, José Nelson Amaral, Guido Araújo, Giancarlo Colmenares, Kai-Ting Amy Wang:
Pooling Acceleration in the DaVinci Architecture Using Im2col and Col2im Instructions. 46-55
- Ranjan Sarpangala Venkatesh, Tony Mason, Pradeep Fernando, Greg Eisenhauer, Ada Gavrilovska:
Scheduling HPC Workflows with Intel Optane Persistent Memory. 56-65
- Rohan Kumar, Matt Baughman, Ryan Chard, Zhuozhao Li, Yadu N. Babuji, Ian T. Foster, Kyle Chard:
Coding the Computing Continuum: Fluid Function Execution in Heterogeneous Computing Environments. 66-75
- Morris Riedel, Rocco Sedona, Chadi Barakat, Pétur Helgi Einarsson, Reza Hassanian, Gabriele Cavallaro, Matthias Book, Helmut Neukirchen, Andreas Lintermann:
Practice and Experience in using Parallel and Scalable Machine Learning with Heterogenous Modular Supercomputing Architectures. 76-85
RAW: Reconfigurable Architectures Workshop
- Hirohisa Watanabe, Hiroki Matsutani:
Accelerating ODE-Based Neural Networks on Low-Cost FPGAs. 88-95
- Hirohisa Watanabe, Mineto Tsukada, Hiroki Matsutani:
An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning. 96-103
- Lorenzo Farinelli, Daniele Valentino De Vincenti, Andrea Damiani, Luca Stornaiuolo, Rolando Brondolin, Marco D. Santambrogio, Donatella Sciuto:
Plaster: an Embedded FPGA-based Cluster Orchestrator for Accelerated Distributed Algorithms. 104-107
- Nael Fasfous, Manoj Rohit Vemparala, Alexander Frickenstein, Lukas Frickenstein, Mohamed Badawy, Walter Stechele:
BinaryCoP: Binary Neural Network-based COVID-19 Face-Mask Wear and Positioning Predictor on Edge Devices. 108-115
- Danielle Tchuinkou Kwadjo, Joel Mandebi Mbongue, Christophe Bobda:
Exploring a Layer-based Pre-implemented Flow for Mapping CNN on FPGA. 116-123
- Timothy Martin, Gary Gréwal, Shawki Areibi:
A Machine Learning Approach to Predict Timing Delays During FPGA Placement. 124-127
- Daniele Paletti, Davide Conficconi, Marco D. Santambrogio:
Dovado: An Open-Source Design Space Exploration Framework. 128-135
- Lukas Weber, Lukas Sommer, Leonardo Solis-Vasquez, Tobias Vinçon, Christian Knödler, Arthur Bernhardt, Ilia Petrov, Andreas Koch:
A Framework for the Automatic Generation of FPGA-based Near-Data Processing Accelerators in Smart Storage Systems. 136-143
- Renato Campos, João M. P. Cardoso:
On Data Parallelism Code Restructuring for HLS Targeting FPGAs. 144-151
- Philipp Holzinger, Daniel Reiser, Tobias Hahn, Marc Reichenbach:
Fast HBM Access with FPGAs: Analysis, Architectures, and Applications. 152-159
- Mohamed W. Hassan, Peter M. Athanas:
Graph Analytics on Hybrid System (GAHS) Case Study: PageRank. 160-167
- Joel Mandebi Mbongue, Sujan Kumar Saha, Christophe Bobda:
Performance Study of Multi-tenant Cloud FPGAs. 168-171
- Najdet Charaf, Ahmed Kamaleldin, Martin Thümmler, Diana Göhringer:
RV-CAP: Enabling Dynamic Partial Reconfiguration for FPGA-Based RISC-V System-on-Chip. 172-179
- Quentin Berthet, Andres Upegui, Laurent Gantel, Alexandre Duc, Giulia Traverso:
An Area-Efficient SPHINCS+ Post-Quantum Signature Coprocessor. 180-187
- Jianyu Chen, Maurice Daverveldt, Zaid Al-Ars:
FPGA Acceleration of Zstd Compression Algorithm. 188-191
HiCOMB: High Performance Computational Biology
- Gulsum Gudukbay, Jashwant Raj Gunasekaran, Yilin Feng, Mahmut T. Kandemir, Anton Nekrutenko, Chita R. Das, Paul Medvedev, Björn A. Grüning, Nate Coraor, Nathan Roach, Enis Afgan:
GYAN: Accelerating Bioinformatics Tools in Galaxy with GPU-Aware Computation Mapping. 194-203
- Bryce Kille, Yunxi Liu, Nicolae Sapoval, Michael Nute, Lawrence Rauchwerger, Nancy M. Amato, Todd J. Treangen:
Accelerating SARS-CoV-2 low frequency variant calling on ultra deep sequencing datasets. 204-208
- Zülal Bingöl, Mohammed Alser, Onur Mutlu, Ozcan Ozturk, Can Alkan:
GateKeeper-GPU: Fast and Accurate Pre-Alignment Filtering in Short Read Mapping. 209
- Ahmad Hesam, Lukas Breitwieser, Fons Rademakers, Zaid Al-Ars:
GPU Acceleration of 3D Agent-Based Biological Simulations. 210-217
- Pierre Barbera, Alexandros Stamatakis:
Efficient Memory Management in Likelihood-based Phylogenetic Placement. 218-227
- Chiranjeb Mondal, Sanjay V. Rajopadhye:
Accelerating the BPMax Algorithm for RNA-RNA Interaction. 228-237
GrAPL: Graphs, Architectures, Programming, and Learning
- Gábor Szárnyas, David A. Bader, Timothy A. Davis, James Kitchen, Timothy G. Mattson, Scott McMillan, Erik Welch:
LAGraph: Linear Algebra, Network Analysis Libraries, and the Study of Graph Algorithms. 243-252
- Benjamin Brock, Aydin Buluç, Timothy G. Mattson, Scott McMillan, José E. Moreira:
Introduction to GraphBLAS 2.0. 253-262
- Jeremy Kepner, Timothy Davis, Vijay Gadepally, Hayden Jananthan, Lauren Milechin:
Mathematics of Digital Hyperspace. 263-271
- Egor Orachev, Maria Karpenko, Artem Khoroshev, Semyon V. Grigorev:
SPbLA: The Library of GPGPU-Powered Sparse Boolean Linear Algebra Operations. 272-275
- Kasimir Gabert, Ümit V. Çatalyürek:
PIGO: A Parallel Graph Input/Output Library. 276-279
- Pat Devlin, Jeremy Kepner, Ashley Luo, Erin Meger:
Hybrid Power-Law Models of Network Traffic. 280-287
- Zhaochen Gu, Sihai Tang, Beilei Jiang, Song Huang, Qiang Guan, Song Fu:
Characterizing Job-Task Dependency in Cloud Workloads Using Graph Learning. 288-297
- Kuldeep R. Kurte, Neena Imam, Ramakrishnan Kannan, S. M. Shamimul Hasan, Srikanth B. Yoginath:
Co-design of Advanced Architectures for Graph Analytics using Machine Learning. 298-307
- Catherine D. Schuman, Bill Kay, Prasanna Date, Ramakrishnan Kannan, Piyush Sao, Thomas E. Potok:
Sparse Binary Matrix-Vector Multiplication on Neuromorphic Computers. 308-311
EduPar: NSF/TCPP Workshop on Parallel and Distributed Computing Education
- Jirí Dokulil:
Let's Put the Memory Model Front and Center When Teaching Parallel Programming in C++. 315-320
- Sascha Hunold, Bartlomiej Przybylski:
Teaching Complex Scheduling Algorithms. 321-327
- Sherif G. Aly, Haidar Harmanani, Rajendra K. Raj, Sanaa Sharafeddine:
ABET Accreditation: A Way Forward for PDC Education. 328-335
- Jesús Cámara, José-Carlos Cano, Javier Cuenca, Toshiyuki Maeda, Mariano Saura-Sánchez, Lewis Tseng, Akiyoshi Wakatani, Martina Barnas:
EduPar Virtual Poster Session. 336-341
- Joel C. Adams, Richard A. Brown, Suzanne J. Matthews, Elizabeth Shoop:
Teaching PDC in the Time of COVID: Hands-on Materials for Remote Learning. 342-349
- Michael Gowanlock, Benoît Gallet:
Data-Intensive Computing Modules for Teaching Parallel and Distributed Computing. 350-357
HIPS: High-level Parallel Programming Models and Supportive Environments
- Yong Wang, Yongfa Zhou, Qi Scott Wang, Yang Wang, Qing Xu, Chen Wang, Bo Peng, Zhaojun Zhu, Katayama Takuya, Dylan Wang:
Developing medical ultrasound beamforming application on GPU and FPGA using oneAPI. 360-370
- Zheming Jin, Jeffrey S. Vetter:
Evaluating CUDA Portability with HIPCL and DPCT. 371-376
- Gregor Daiß, Mikael Simberg, Auriane Reverdell, John Biddiscombe, Theresa Pollinger, Hartmut Kaiser, Dirk Pflüger:
Beyond Fork-Join: Integration of Performance Portable Kokkos Kernels with HPX. 377-386
- Bo Qiao, Jürgen Teich, Frank Hannig:
An Efficient Approach for Image Border Handling on GPUs via Iteration Space Partitioning. 387-396
- Xinyao Yi, David Stokes, Yonghong Yan, Chunhua Liao:
CUDAMicroBench: Microbenchmarks to Assist CUDA Performance Programming. 397-406
- Poornima Nookala, Zafar Ahmad, Mohammad Mahdi Javanmard, Martin Kong, Rezaul Chowdhury, Robert J. Harrison:
Understanding Recursive Divide-and-Conquer Dynamic Programs in Fork-Join and Data-Flow Execution Models. 407-416
- Donovan Snyder, Chen Ding:
Measuring Cache Complexity Using Data Movement Distance (DMD). 417-419
- Aaron Welch, Oscar R. Hernandez, Barbara M. Chapman:
Combining Static and Dynamic Analysis to Query Characteristics of HPC Applications. 420-429
AsHES: Accelerators and Hybrid Emerging Systems
- Tetsuro Nakamura, Shogo Saito, Kei Fujimoto, Masashi Kaneko, Akinori Shiraga:
Time-Division Multiplexing for FPGA Considering CNN Model Switch Time. 433-438
- S. M. Shamimul Hasan, Neena Imam, Ramakrishnan Kannan, Srikanth B. Yoginath, Kuldeep R. Kurte:
Design Space Exploration of Emerging Memory Technologies for Machine Learning Applications. 439-448
- Felix Liu, Niclas Jansson, Artur Podobas, Albin Fredriksson, Stefano Markidis:
Accelerating Radiation Therapy Dose Calculation with Nvidia GPUs. 449-458
- Lena Oden, Jörg Keller:
Improving Cryptanalytic Applications with Stochastic Runtimes on GPUs. 459-468
- Jennifer A. Loe, Christian A. Glusa, Ichitaro Yamazaki, Erik G. Boman, Sivasankaran Rajamanickam:
Experimental Evaluation of Multiprecision Strategies for GMRES on GPUs. 469-478
- Jaemin Choi, Zane Fink, Sam White, Nitin Bhat, David F. Richards, Laxmikant V. Kalé:
GPU-aware Communication with UCX in Parallel Programming Models: Charm++, MPI, and Python. 479-488
PDCO: Parallel / Distributed Combinatorics and Optimization
- Florian Fey, Sergei Gorlatch:
CPRIC: Collaborative Parallelism for Randomized Incremental Constructions. 490-499
- Fekhr Eddine Keddous, H.-N. Nguyen, Amir Nakib:
Characters Recognition based on CNN-RNN architecture and Metaheuristic. 500-507
- Roger L. Goodwin:
Linearizing Computing the Power Set with OpenMP. 508-519
- Oswaldo Artiles, Fahad Saeed:
TurboBFS: GPU Based Breadth-First Search (BFS) Algorithms in the Language of Linear Algebra. 520-528
- Ryan J. Marshall, Lakmali Weerasena, Anthony Skjellum:
A Parallel Meta-Solver for the Multi-Objective Set Covering Problem. 529-538
- Peter Oostema, Franz Franchetti:
Leveraging High Dimensional Spatial Graph Embedding as a Heuristic for Graph Algorithms. 539-547
- Mikhail G. Babenko, Andrei Tchernykh, Luis Bernardo Pulido-Gaytan, Jorge M. Cortés-Mendoza, Egor M. Shiryaev, Elena Golimblevskaia, Arutyun Avetisyan, Sergio Nesmachnow:
RRNS Base Extension Error-Correcting Code for Performance Optimization of Scalable Reliable Distributed Cloud Data Storage. 548-553
APDCM: Advances in Parallel and Distributed Computational Models
- Jonas Posner, Lukas Reitz, Claudia Fohry:
Checkpointing vs. Supervision Resilience Approaches for Dynamic Independent Tasks. 556-565
- Masahiro Shibata, Masaki Ohyabu, Yuichi Sudo, Junya Nakamura, Yonghwan Kim, Yoshiaki Katayama:
Gathering of seven autonomous mobile robots on triangular grids. 566-575
- Kevin Buchin, Paola Flocchini, Irina Kostitsyna, Tom Peters, Nicola Santoro, Koichi Wada:
Autonomous Mobile Robots: Refining the Computational Landscape. 576-585
- Shota Nagahama, Fukuhito Ooshita, Michiko Inoue:
Terminating Grid Exploration with Myopic Luminous Robots. 586-595
- Hirotsugu Kakugawa, Sayaka Kamei:
A self-stabilizing token circulation with graceful handover on bidirectional ring networks. 596-604
- Andreas Klos, Marius Rosenbaum, Wolfram Schiffmann:
Scalable and Highly Available Multi-Objective Neural Architecture Search in Bare Metal Kubernetes Cluster. 605-610
- George Bosilca, Aurélien Bouteiller, Thomas Hérault, Valentin Le Fèvre, Yves Robert, Jack J. Dongarra:
Revisiting Credit Distribution Algorithms for Distributed Termination Detection. 611-620
- Roman Iakymchuk, Amândio Faustino, Andrew P. J. Emerson, João Barreto, Valeria Bartsch, Rodrigo Rodrigues, José C. Monteiro:
Efficient and Eventually Consistent Collective Operations. 621-630
- Andrew Rosen, Benjamin Levin, Anu G. Bourgeois:
Autonomous Load Balancing in Distributed Hash Tables Using Churn and the Sybil Attack. 631-640
- Aparna Sasidharan:
Performance Models for Hybrid Programs Accelerated by GPUs. 641-651
- Zheming Jin, Jeffrey S. Vetter:
Evaluating the Performance of Integer Sum Reduction on an Intel GPU. 652-655
- Koji Nakano, Shotaro Aoki, Yasuaki Ito, Akihiko Kasagi:
On the Computational Power of Convolution Pooling: A Theoretical Approach for Deep Learning. 656-665
PDSEC: Parallel and Distributed Scientific and Engineering Computing
- Pranav Gadikar, Patrick Diehl, Prashant K. Jha:
Load balancing for distributed nonlocal models within asynchronous many-task systems. 669-678
- Sandra Catalán, Francisco D. Igual, Rafael Rodríguez-Sánchez, Enrique S. Quintana-Ortí:
Scalable Hybrid Loop- and Task-Parallel Matrix Inversion for Multicore Processors. 679-687
- Yu-Hsuan Shih, Garrett Wright, Joakim Andén, Johannes P. Blaschke, Alex H. Barnett:
cuFINUFFT: a load-balanced GPU library for general-purpose nonuniform FFTs. 688-697
- Amin Totounferoush, Neda Ebrahimi Pour, Sabine Roller, Miriam Mehl:
Parallel Machine Learning of Partial Differential Equations. 698-703
- Jessica Imlau Dagostini, Henrique Corrêa Pereira da Silva, Vinícius Garcia Pinto, Roberto Machado Velho, Eduardo Simoes Lopes Gastal, Lucas Mello Schnorr:
Improving Workload Balance of a Marine CSEM Inversion Application. 704-713
- Hadia Ahmed, David B. Williams-Young, Khaled Z. Ibrahim, Chao Yang:
Performance Modeling and Tuning for DFT Calculations on Heterogeneous Architectures. 714-722
- Makoto Morishita, Satoshi Ohshima, Takahiro Katagiri, Toru Nagai:
Parallelization of GKV benchmark using OpenACC. 723-729
- Sergio Barrachina, Adrián Castelló, Mar Catalán, Manuel F. Dolz, José I. Mestre:
A Flexible Research-Oriented Framework for Distributed Training of Deep Neural Networks. 730-739
- Jan Verschelde:
Accelerated Polynomial Evaluation and Differentiation at Power Series in Multiple Double Precision. 740-749
iWAPT: Automatic Performance Tuning
- Yuta Sasaki, Ayumu Ishizuka, Mulya Agung, Hiroyuki Takizawa:
Evaluating I/O Acceleration Mechanisms of SX-Aurora TSUBASA. 752-759
- Kengo Nakajima, Takeshi Ogita, Masatoshi Kawai:
Efficient Parallel Multigrid Methods on Manycore Clusters with Double/Single Precision Computing. 760-769
- Chia-Chun Liang, Che-Rung Lee:
Automatic Selection of Tensor Decomposition for Compressing Convolutional Neural Networks A Case Study on VGG-type Networks. 770-778
- Kou Murakami, Kazuhiko Komatsu, Masayuki Sato, Hiroaki Kobayashi:
A Processor Selection Method based on Execution Time Estimation for Machine Learning Programs. 779-788
- Naruya Kitai, Daisuke Takahashi, Franz Franchetti, Takahiro Katagiri, Satoshi Ohshima, Toru Nagai:
An Auto-tuning with Adaptation of A64 Scalable Vector Extension for SPIRAL. 789-797
- Ayse Bagbaba, Xuan Wang:
Improving the MPI-IO Performance of Applications with Genetic Algorithm based Auto-tuning. 798-805
- Jacob O. Tørring, Jan Christian Meyer, Anne C. Elster:
Autotuning Benchmarking Techniques: A Roofline Model Case Study. 806-815
- Sai P. Chenna, Herman Lam, Greg Stitt, S. Balachandar:
Scalable Performance Prediction of Irregular Workloads in Multi-Phase Particle-in-Cell Applications. 816-825
SNACS: Scalable Networks for Advanced Computing Systems Workshop
- Ryohei Sato, Hidetoshi Kawaguchi, Yuichi Nakatani:
User Allocation for Real-Time Applications with State Sharing in Fog Computing Networks. 828-831
- Zaid Alzaid, Saptarshi Bhowmik, Xin Yuan:
Multi-Path Routing in the Jellyfish Network. 832-841
PAISE: Parallel AI and Systems for the Edge
- Enrique Nueve, Sean Shahkarami, Seongha Park, Nicola J. Ferrier:
Addressing the Constraints of Active Learning on the Edge. 845-849
- Xiaojun Ruan, Haiquan Chen:
Informed Prefetching in I/O Bounded Distributed Deep Learning. 850-857
- Gaurav Verma, Yashi Gupta, Abid M. Malik, Barbara M. Chapman:
Performance Evaluation of Deep Learning Compilers for Edge Inference. 858-865
- Martin Breitbach, Janick Edinger, Dominik Schäfer, Christian Becker:
DataVinci: Proactive Data Placement for Ad-Hoc Computing. 866-873
- André Luckow, Kartik Rattan, Shantenu Jha:
Pilot-Edge: Distributed Resource Management Along the Edge-to-Cloud Continuum. 874-878
- Bibek Shrestha, Richard Cziva, Engin Arslan:
INT Based Network-Aware Task Scheduling for Edge Computing. 879-886
- Aravind Sankaran, Paolo Bientinesi:
Performance Comparison for Scientific Computations on the Edge via Relative Performance. 887-895
RADR: Resource Arbitration for Dynamic Runtimes
- Liang Wei, Kazuyuki Shudo:
Dynamic Computing Resources Allocation for Multiple Deep Learning Tasks. 899-905
ScaDL: Scalable Deep Learning over Parallel And Distributed Infrastructures
- Quentin Anthony, Lang Xu, Hari Subramoni, Dhabaleswar K. D. K. Panda:
Scaling Single-Image Super-Resolution Training on Modern HPC Clusters: Early Experiences. 923-932
- Medha Atre, Birendra Jha, Ashwini Rao:
Distributed Deep Learning Using Volunteer Computing-Like Paradigm. 933-942
- Pankaj Rajak, Anikeya Aditya, Shogo Fukushima, Rajiv K. Kalia, Thomas Linker, Kuang Liu, Ye Luo, Aiichiro Nakano, Ken-ichi Nomura, Kohei Shimamura, Fuyuki Shimojo, Priya Vashishta:
Ex-NNQMD: Extreme-Scale Neural Network Quantum Molecular Dynamics. 943-946
- Arissa Wongpanich, Hieu Pham, James Demmel, Mingxing Tan, Quoc V. Le, Yang You, Sameer Kumar:
Training EfficientNets at Supercomputer Scale: 83% ImageNet Top-1 Accuracy in One Hour. 947-950
- Kaoutar El Maghraoui, Lorraine M. Herger, Chekuri Choudary, Kim Tran, Todd Deshane, David Hanson:
Performance Analysis of Deep Learning Workloads on a Composable System. 951-954
HPS: High-Performance Storage
- Zhe Wang, Pradeep Subedi, Matthieu Dorier, Philip E. Davis, Manish Parashar:
Facilitating Staging-based Unstructured Mesh Processing to Support Hybrid In-Situ Workflows. 960-964
- Ke Fan, Kristopher K. Micinski, Thomas Gilray, Sidharth Kumar:
Exploring MPI Collective I/O and File-per-process I/O for Checkpointing a Logical Inference Task. 965-972
ParSocial: Parallel and Distributed Processing for Computational Social Systems
- Eunice E. Santos, Vairavan Murugappan, John Korah:
Memory Efficient Edge Addition Designs for Large and Dynamic Social Networks. 975-984
- Bogdan Mucenic, Chaitanya Kaligotla, Abby Stevens, Jonathan Ozik, Nicholson T. Collier, Charles M. Macal:
Load Balancing Schemes for Large Synthetic Population-Based Complex Simulators. 985-988
- Eric Tatara, John A. Schneider, Madeline Quasebarth, Nicholson T. Collier, Harold Pollack, Basmattee Boodram, Samuel H. Friedman, Elizabeth Salisbury-Afshar, Mary Ellen Mackesy-Amiti, Jonathan Ozik:
Application of Distributed Agent-based Modeling to Investigate Opioid Use Outcomes in Justice Involved Populations. 989-997
- Kasimir Gabert, Ali Pinar, Ümit V. Çatalyürek:
Shared-Memory Scalable k-Core Maintenance on Dynamic Graphs and Hypergraphs. 998-1007
- Petros Anastasiadis, Sergiy Gogolenko, Nikela Papadopoulou, Marcin Lawenda, Hamid Arabnejad, Alireza Jahani, Imran Mahmood, Derek Groen:
P-Flee: An Efficient Parallel Algorithm for Simulating Human Migration. 1008-1011