Khaled Hamidouche
2020 – today
- 2023
- [c56] Gabriel H. Loh, Michael J. Schulte, Mike Ignatowski, Vignesh Adhinarayanan, Shaizeen Aga, Derrick Aguren, Varun Agrawal, Ashwin M. Aji, Johnathan Alsop, Paul T. Bauman, Bradford M. Beckmann, Majed Valad Beigi, Sergey Blagodurov, Travis Boraten, Michael Boyer, William C. Brantley, Noel Chalmers, Shaoming Chen, Kevin Cheng, Michael L. Chu, David Cownie, Nicholas Curtis, Joris Del Pino, Nam Duong, Alexandru Dutu, Yasuko Eckert, Christopher Erb, Chip Freitag, Joseph L. Greathouse, Sudhanva Gurumurthi, Anthony Gutierrez, Khaled Hamidouche, Sachin Hossamani, Wei Huang, Mahzabeen Islam, Nuwan Jayasena, John Kalamatianos, Onur Kayiran, Jagadish Kotra, Alan Lee, Daniel Lowell, Niti Madan, Abhinandan Majumdar, Nicholas Malaya, Srilatha Manne, Susumu Mashimo, Damon McDougall, Elliot Mednick, Michael Mishkin, Mark Nutter, Indrani Paul, Matthew Poremba, Brandon Potter, Kishore Punniyamurthy, Sooraj Puthoor, Steven E. Raasch, Karthik Rao, Gregory Rodgers, Marko Scrbak, Mohammad Seyedzadeh, John Slice, Vilas Sridharan, René van Oostrum, Eric Van Tassell, Abhinav Vishnu, Samuel Wasmundt, Mark Wilkening, Noah Wolfe, Mark Wyse, Adithya Yalavarti, Dmitri Yudanov:
A Research Retrospective on AMD's Exascale Computing Journey. ISCA 2023: 81:1-81:14
- [i1] Kishore Punniyamurthy, Bradford M. Beckmann, Khaled Hamidouche:
GPU-initiated Fine-grained Overlap of Collective Communication with Computation. CoRR abs/2305.06942 (2023)
- 2020
- [j3] Ryan E. Grant, Khaled Hamidouche:
Hot Interconnects 26. IEEE Micro 40(1): 6-7 (2020)
- [c55] Khaled Hamidouche, Michael LeBeane:
GPU initiated OpenSHMEM: correct and efficient intra-kernel networking for dGPUs. PPoPP 2020: 336-347
2010 – 2019
- 2018
- [c54] Michael LeBeane, Khaled Hamidouche, Brad Benton, Maurício Breternitz, Steven K. Reinhardt, Lizy K. John:
ComP-net: command processor networking for efficient intra-kernel communications on GPUs. PACT 2018: 29:1-29:13
- 2017
- [c53] Jahanzeb Maqbool Hashmi, Khaled Hamidouche, Hari Subramoni, Dhabaleswar K. Panda:
Kernel-Assisted Communication Engine for MPI on Emerging Manycore Processors. HiPC 2017: 84-93
- [c52] Akshay Venkatesh, Khaled Hamidouche, Sreeram Potluri, Davide Rossetti, Ching-Hsiang Chu, Dhabaleswar K. Panda:
MPI-GDS: High Performance MPI Designs with GPUDirect-aSync for CPU-GPU Control Flow Decoupling. ICPP 2017: 151-160
- [c51] Ammar Ahmad Awan, Khaled Hamidouche, Jahanzeb Maqbool Hashmi, Dhabaleswar K. Panda:
S-Caffe: Co-designing MPI Runtimes and Caffe for Scalable Deep Learning on Modern GPU Clusters. PPoPP 2017: 193-205
- [c50] Michael LeBeane, Khaled Hamidouche, Brad Benton, Maurício Breternitz, Steven K. Reinhardt, Lizy K. John:
GPU triggered networking for intra-kernel communications. SC 2017: 22
- 2016
- [j2] Khaled Hamidouche, Akshay Venkatesh, Ammar Ahmad Awan, Hari Subramoni, Ching-Hsiang Chu, Dhabaleswar K. Panda:
CUDA-Aware OpenSHMEM: Extensions and Designs for High Performance OpenSHMEM on GPU Clusters. Parallel Comput. 58: 27-36 (2016)
- [c49] Ching-Hsiang Chu, Khaled Hamidouche, Akshay Venkatesh, Ammar Ahmad Awan, Dhabaleswar K. Panda:
CUDA Kernel Based Collective Reduction Operations on Large-scale GPU Clusters. CCGrid 2016: 726-735
- [c48] Dip Sankar Banerjee, Khaled Hamidouche, Dhabaleswar K. Panda:
Re-Designing CNTK Deep Learning Framework on Modern GPU Enabled Clusters. CloudCom 2016: 144-151
- [c47] Mingzhe Li, Xiaoyi Lu, Khaled Hamidouche, Jie Zhang, Dhabaleswar K. Panda:
Mizan-RMA: Accelerating Mizan Graph Processing Framework with MPI RMA. HiPC 2016: 42-51
- [c46] Khaled Hamidouche, Ammar Ahmad Awan, Akshay Venkatesh, Dhabaleswar K. Panda:
CUDA M3: Designing Efficient CUDA Managed Memory-Aware MPI by Exploiting GDR and IPC. HiPC 2016: 52-61
- [c45] Jahanzeb Maqbool Hashmi, Khaled Hamidouche, Dhabaleswar K. Panda:
Enabling Performance Efficient Runtime Support for Hybrid MPI+UPC++ Programming Models. HPCC/SmartCity/DSS 2016: 1180-1187
- [c44] Ching-Hsiang Chu, Khaled Hamidouche, Akshay Venkatesh, Dip Sankar Banerjee, Hari Subramoni, Dhabaleswar K. Panda:
Exploiting Maximal Overlap for Non-Contiguous Data Movement Processing on Modern GPU-Enabled Systems. IPDPS 2016: 983-992
- [c43] Dip Sankar Banerjee, Khaled Hamidouche, Dhabaleswar K. Panda:
Designing high performance communication runtime for GPU managed memory: early experiences. GPGPU@PPoPP 2016: 82-91
- [c42] A. A. Awan, Khaled Hamidouche, Akshay Venkatesh, Dhabaleswar K. Panda:
Efficient Large Message Broadcast using NCCL and CUDA-Aware MPI for Deep Learning. EuroMPI 2016: 15-22
- [c41] Ching-Hsiang Chu, Khaled Hamidouche, Hari Subramoni, Akshay Venkatesh, Bracy Elton, Dhabaleswar K. Panda:
Designing High Performance Heterogeneous Broadcast for Streaming Applications on GPU Clusters. SBAC-PAD 2016: 59-66
- [c40] Khaled Hamidouche, Jie Zhang, Dhabaleswar K. Panda, Karen Tomko:
OpenSHMEM Non-blocking Data Movement Operations with MVAPICH2-X: Early Experiences. PAW@SC 2016: 9-16
- [c39] Ching-Hsiang Chu, Khaled Hamidouche, Hari Subramoni, Akshay Venkatesh, Bracy Elton, Dhabaleswar K. Panda:
Efficient Reliability Support for Hardware Multicast-Based Broadcast in GPU-enabled Streaming Applications. COMHPC@SC 2016: 29-38
- [c38] Mingzhe Li, Khaled Hamidouche, Xiaoyi Lu, Hari Subramoni, Jie Zhang, Dhabaleswar K. Panda:
Designing MPI library with on-demand paging (ODP) of infiniband: challenges and benefits. SC 2016: 433-443
- [c37] Hari Subramoni, Albert Mathews Augustine, Mark Daniel Arnold, Jonathan L. Perkins, Xiaoyi Lu, Khaled Hamidouche, Dhabaleswar K. Panda:
INAM2: InfiniBand Network Analysis and Monitoring with MPI. ISC 2016: 300-320
- 2015
- [c36] Raghunath Raja Chandrasekar, Akshay Venkatesh, Khaled Hamidouche, Dhabaleswar K. Panda:
Power-Check: An Energy-Efficient Checkpointing Framework for HPC Clusters. CCGRID 2015: 261-270
- [c35] Khaled Hamidouche, Akshay Venkatesh, Ammar Ahmad Awan, Hari Subramoni, Ching-Hsiang Chu, Dhabaleswar K. Panda:
Exploiting GPUDirect RDMA in Designing High Performance OpenSHMEM for NVIDIA GPU Clusters. CLUSTER 2015: 78-87
- [c34] Mingzhe Li, Hari Subramoni, Khaled Hamidouche, Xiaoyi Lu, Dhabaleswar K. Panda:
High Performance MPI Datatype Support with User-Mode Memory Registration: Challenges, Designs, and Benefits. CLUSTER 2015: 226-235
- [c33] Mingzhe Li, Khaled Hamidouche, Xiaoyi Lu, Jian Lin, Dhabaleswar K. Panda:
High-Performance and Scalable Design of MPI-3 RMA on Xeon Phi Clusters. Euro-Par 2015: 625-637
- [c32] Akshay Venkatesh, Khaled Hamidouche, Hari Subramoni, Dhabaleswar K. Panda:
Offloaded GPU Collectives Using CORE-Direct and CUDA Capabilities on InfiniBand Clusters. HiPC 2015: 234-243
- [c31] Mingzhe Li, Khaled Hamidouche, Xiaoyi Lu, Jie Zhang, Jian Lin, Dhabaleswar K. Panda:
High Performance OpenSHMEM Strided Communication Support with InfiniBand UMR. HiPC 2015: 244-253
- [c30] Hari Subramoni, Akshay Venkatesh, Khaled Hamidouche, Karen Tomko, Dhabaleswar K. Panda:
Impact of InfiniBand DC Transport Protocol on Energy Consumption of All-to-All Collective Algorithms. Hot Interconnects 2015: 60-67
- [c29] Jian Lin, Khaled Hamidouche, Xiaoyi Lu, Mingzhe Li, Dhabaleswar K. Panda:
High-Performance Coarray Fortran Support with MVAPICH2-X: Initial Experience and Evaluation. IPDPS Workshops 2015: 225-234
- [c28] A. A. Awan, Khaled Hamidouche, Ching-Hsiang Chu, Dhabaleswar K. Panda:
A Case for Non-blocking Collectives in OpenSHMEM: Design, Implementation, and Performance Evaluation using MVAPICH2-X. OpenSHMEM 2015: 69-86
- [c27] Antonio Gómez-Iglesias, Jérôme Vienne, Khaled Hamidouche, Christopher S. Simmons, William L. Barth, Dhabaleswar K. Panda:
Scalable Out-of-core OpenSHMEM Library for HPC. OpenSHMEM 2015: 138-153
- [c26] Jian Lin, Khaled Hamidouche, Jie Zhang, Xiaoyi Lu, Abhinav Vishnu, Dhabaleswar K. Panda:
Accelerating k-NN Algorithm with Hybrid MPI and OpenSHMEM. OpenSHMEM 2015: 164-177
- [c25] A. A. Awan, Khaled Hamidouche, Akshay Venkatesh, Jonathan L. Perkins, Hari Subramoni, Dhabaleswar K. Panda:
GPU-Aware Design, Implementation, and Evaluation of Non-blocking Collective Benchmarks. EuroMPI 2015: 9:1-9:10
- [c24] Akshay Venkatesh, Abhinav Vishnu, Khaled Hamidouche, Nathan R. Tallent, Dhabaleswar K. Panda, Darren J. Kerbyson, Adolfy Hoisie:
A case for application-oblivious energy-efficient MPI runtime. SC 2015: 29:1-29:12
- [c23] Hari Subramoni, Ammar Ahmad Awan, Khaled Hamidouche, Dmitry Pekurovsky, Akshay Venkatesh, Sourav Chakraborty, Karen Tomko, Dhabaleswar K. Panda:
Designing Non-blocking Personalized Collectives with Near Perfect Overlap for RDMA-Enabled Clusters. ISC 2015: 434-453
- [c22] Antonio Gómez-Iglesias, Dmitry Pekurovsky, Khaled Hamidouche, Jie Zhang, Jérôme Vienne:
Porting scientific libraries to PGAS in XSEDE resources: practice and experience. XSEDE 2015: 40:1-40:7
- [e1] Dhabaleswar K. Panda, Karl W. Schulz, Khaled Hamidouche, Hari Subramoni:
Proceedings of the First International Workshop on Extreme Scale Programming Models and Middleware, ESPM 2015, Austin, Texas, USA, November 15, 2015. ACM 2015, ISBN 978-1-4503-3996-4
- 2014
- [c21] Jithin Jose, Khaled Hamidouche, Xiaoyi Lu, Sreeram Potluri, Jie Zhang, Karen Tomko, Dhabaleswar K. Panda:
High performance OpenSHMEM for Xeon Phi clusters: Extensions, runtime designs and application co-design. CLUSTER 2014: 10-18
- [c20] Mingzhe Li, Xiaoyi Lu, Sreeram Potluri, Khaled Hamidouche, Jithin Jose, Karen Tomko, Dhabaleswar K. Panda:
Scalable Graph500 design with MPI-3 RMA. CLUSTER 2014: 230-238
- [c19] Rong Shi, Sreeram Potluri, Khaled Hamidouche, Jonathan L. Perkins, Mingzhe Li, Davide Rossetti, Dhabaleswar K. Panda:
Designing efficient small message transfer mechanism for inter-node MPI communication on InfiniBand GPU clusters. HiPC 2014: 1-10
- [c18] Akshay Venkatesh, Hari Subramoni, Khaled Hamidouche, Dhabaleswar K. Panda:
A high performance broadcast design with hardware multicast and GPUDirect RDMA for streaming applications on Infiniband clusters. HiPC 2014: 1-10
- [c17] Raghunath Rajachandrasekar, Sreeram Potluri, Akshay Venkatesh, Khaled Hamidouche, Md. Wasi-ur-Rahman, Dhabaleswar K. Panda:
MIC-Check: a distributed check pointing framework for the intel many integrated cores architecture. HPDC 2014: 121-124
- [c16] Rong Shi, Xiaoyi Lu, Sreeram Potluri, Khaled Hamidouche, Jie Zhang, Dhabaleswar K. Panda:
HAND: A Hybrid Approach to Accelerate Non-contiguous Data Movement Using MPI Datatypes on GPU Clusters. ICPP 2014: 221-230
- [c15] Jithin Jose, Khaled Hamidouche, Jie Zhang, Akshay Venkatesh, Dhabaleswar K. Panda:
Optimizing Collective Communication in UPC. IPDPS Workshops 2014: 361-370
- [c14] Akshay Venkatesh, Sreeram Potluri, Raghunath Rajachandrasekar, Miao Luo, Khaled Hamidouche, Dhabaleswar K. Panda:
High Performance Alltoall and Allgather Designs for InfiniBand MIC Clusters. IPDPS 2014: 637-646
- [c13] Jithin Jose, Sreeram Potluri, Hari Subramoni, Xiaoyi Lu, Khaled Hamidouche, Karl W. Schulz, Hari Sundar, Dhabaleswar K. Panda:
Designing Scalable Out-of-core Sorting with Hybrid MPI+PGAS Programming Models. PGAS 2014: 7:1-7:9
- [c12] Mingzhe Li, Jian Lin, Xiaoyi Lu, Khaled Hamidouche, Karen Tomko, Dhabaleswar K. Panda:
Scalable MiniMD Design with Hybrid MPI and OpenSHMEM. PGAS 2014: 24:1-24:4
- [c11] Miao Luo, Xiaoyi Lu, Khaled Hamidouche, Krishna Chaitanya Kandalla, Dhabaleswar K. Panda:
Initial study of multi-endpoint runtime for MPI+OpenMP hybrid programming model on multi-core systems. PPoPP 2014: 395-396
- [c10] Raghunath Rajachandrasekar, Jonathan L. Perkins, Khaled Hamidouche, Mark Daniel Arnold, Dhabaleswar K. Panda:
Understanding the Memory-Utilization of MPI Libraries: Challenges and Designs in Implementing the MPI_T Interface. EuroMPI/ASIA 2014: 97
- [c9] Hari Subramoni, Khaled Hamidouche, Akshay Venkatesh, Sourav Chakraborty, Dhabaleswar K. Panda:
Designing MPI Library with Dynamic Connected Transport (DCT) of InfiniBand: Early Experiences. ISC 2014: 278-295
- 2013
- [j1] Khaled Hamidouche, Fernando Machado Mendonca, Joel Falcou, Alba Cristina Magalhaes Alves de Melo, Daniel Etiemble:
Parallel Smith-Waterman Comparison on Multicore and Manycore Computing Platforms with BSP++. Int. J. Parallel Program. 41(1): 111-136 (2013)
- [c8] Rong Shi, Sreeram Potluri, Khaled Hamidouche, Xiaoyi Lu, Karen Tomko, Dhabaleswar K. Panda:
A scalable and portable approach to accelerate hybrid HPL on heterogeneous CPU-GPU clusters. CLUSTER 2013: 1-8
- [c7] Krishna Chaitanya Kandalla, Akshay Venkatesh, Khaled Hamidouche, Sreeram Potluri, Devendar Bureddy, Dhabaleswar K. Panda:
Designing Optimized MPI Broadcast and Allreduce for Many Integrated Core (MIC) InfiniBand Clusters. Hot Interconnects 2013: 63-70
- [c6] Sreeram Potluri, Khaled Hamidouche, Akshay Venkatesh, Devendar Bureddy, Dhabaleswar K. Panda:
Efficient Inter-node MPI Communication Using GPUDirect RDMA for InfiniBand Clusters with NVIDIA GPUs. ICPP 2013: 80-89
- [c5] Khaled Hamidouche, Sreeram Potluri, Hari Subramoni, Krishna Chaitanya Kandalla, Dhabaleswar K. Panda:
MIC-RO: enabling efficient remote offload on heterogeneous many integrated core (MIC) clusters with InfiniBand. ICS 2013: 399-408
- [c4] Mingzhe Li, Sreeram Potluri, Khaled Hamidouche, Jithin Jose, Dhabaleswar K. Panda:
Efficient and truly passive MPI-3 RMA using InfiniBand atomics. EuroMPI 2013: 91-96
- [c3] Sreeram Potluri, Devendar Bureddy, Khaled Hamidouche, Akshay Venkatesh, Krishna Chaitanya Kandalla, Hari Subramoni, Dhabaleswar K. Panda:
MVAPICH-PRISM: a proxy-based communication framework using InfiniBand and SCIF for intel MIC clusters. SC 2013: 54:1-54:11
- 2011
- [b1] Khaled Hamidouche:
Programmation des architectures hiérarchiques et hétérogènes. (Programming hierarchical and heterogeneous machines). University of Paris-Sud, Orsay, France, 2011
- [c2] Khaled Hamidouche, Fernando Machado Mendonca, Joël Falcou, Daniel Etiemble:
Parallel Biological Sequence Comparison on Heterogeneous High Performance Computing Platforms with BSP++. SBAC-PAD 2011: 136-143
- [c1] Khaled Hamidouche, Joel Falcou, Daniel Etiemble:
A framework for an automatic hybrid MPI+OpenMP code generation. SpringSim (HPC) 2011: 48-55
last updated on 2024-10-07 21:21 CEST by the dblp team
all metadata released as open data under CC0 1.0 license