ACM Transactions on Architecture and Code Optimization, Volume 18
Volume 18, Number 1, January 2021
- Ari Rasch, Richard Schulze, Michel Steuwer, Sergei Gorlatch:
Efficient Auto-Tuning of Parallel Programs with Interdependent Tuning Parameters via Auto-Tuning Framework (ATF). 1:1-1:26
- Syed Mohammad Asad Hassan Jafri, Hasan Hassan, Ahmed Hemani, Onur Mutlu:
Refresh Triggered Computation: Improving the Energy Efficiency of Convolutional Neural Network Accelerators. 2:1-2:29
- Solomon Abera, M. Balakrishnan, Anshul Kumar:
Performance-Energy Trade-off in Modern CMPs. 3:1-3:26
- Atefeh Mehrabi, Aninda Manocha, Benjamin C. Lee, Daniel J. Sorin:
Bayesian Optimization for Efficient Accelerator Synthesis. 4:1-4:25
- Minsu Kim, Jeong-Keun Park, Soo-Mook Moon:
Irregular Register Allocation for Translation of Test-pattern Programs. 5:1-5:23
- Negin Nematollahi, Mohammad Sadrosadati, Hajar Falahati, Marzieh Barkhordar, Mario Paulo Drumond, Hamid Sarbazi-Azad, Babak Falsafi:
Efficient Nearest-Neighbor Data Sharing in GPUs. 6:1-6:26
- Lorenz Braun, Sotirios Nikas, Chen Song, Vincent Heuveline, Holger Fröning:
A Simple Model for Portable and Fast Prediction of Execution Time and Power Consumption of GPU Kernels. 7:1-7:25
- Marcel Mettler, Daniel Mueller-Gritschneder, Ulf Schlichtmann:
A Distributed Hardware Monitoring System for Runtime Verification on Multi-Tile MPSoCs. 8:1-8:25
- Yu Emma Wang, Carole-Jean Wu, Xiaodong Wang, Kim M. Hazelwood, David Brooks:
Exploiting Parallelism Opportunities with Deep Learning Frameworks. 9:1-9:23
- Wenjie Liu, Shoaib Akram, Jennifer B. Sartor, Lieven Eeckhout:
Reliability-aware Garbage Collection for Hybrid HBM-DRAM Memories. 10:1-10:25
- Sanket Tavarageri, Alexander Heinecke, Sasikanth Avancha, Bharat Kaul, Gagandeep Goyal, Ramakrishna Upadrasta:
PolyDL: Polyhedral Optimizations for Creation of High-performance DL Primitives. 11:1-11:27
- Sujay Yadalam, Vinod Ganapathy, Arkaprava Basu:
SGXL: Security and Performance for Enclaves Using Large Pages. 12:1-12:25
- Kleovoulos Kalaitzidis, André Seznec:
Leveraging Value Equality Prediction for Value Speculation. 13:1-13:20
- Abhishek Singh, Shail Dave, Pantea Zardoshti, Robert Brotzman, Chao Zhang, Xiaochen Guo, Aviral Shrivastava, Gang Tan, Michael F. Spear:
SPX64: A Scratchpad Memory for General-purpose Microprocessors. 14:1-14:26
- Sooraj Puthoor, Mikko H. Lipasti:
Systems-on-Chip with Strong Ordering. 15:1-15:27
- Paolo Sylos Labini, Marco Cianfriglia, Damiano Perri, Osvaldo Gervasi, Grigori Fursin, Anton Lokhmotov, Cedric Nugteren, Bruno Carpentieri, Fabiana Zollo, Flavio Vella:
On the Anatomy of Predictive Models for Accelerating GPU Convolution Kernels and Beyond. 16:1-16:24
Volume 18, Number 2, March 2021
- Nils Voss, Bastiaan Kwaadgras, Oskar Mencer, Wayne Luk, Georgi Gaydadjiev:
On Predictable Reconfigurable System Design. 17:1-17:28
- Anirudh Mohan Kaushik, Gennady Pekhimenko, Hiren D. Patel:
Gretch: A Hardware Prefetcher for Graph Analytics. 18:1-18:25
- Nhut-Minh Ho, Himeshi De Silva, Weng-Fai Wong:
GRAM: A Framework for Dynamically Mixing Precisions in GPU Applications. 19:1-19:24
- Arnab Kumar Biswas:
Cryptographic Software IP Protection without Compromising Performance or Timing Side-channel Leakage. 20:1-20:20
- Maxime France-Pillois, Jérôme Martin, Frédéric Rousseau:
A Non-Intrusive Tool Chain to Optimize MPSoC End-to-End Systems. 21:1-21:22
- Pengyu Wang, Jing Wang, Chao Li, Jianzong Wang, Haojin Zhu, Minyi Guo:
Grus: Toward Unified-memory-efficient High-performance Graph Processing on GPU. 22:1-22:25
- Ramin Izadpanah, Christina L. Peterson, Yan Solihin, Damian Dechev:
PETRA: Persistent Transactional Non-blocking Linked Data Structures. 23:1-23:26
- Muhammad Hassan, Chang Hyun Park, David Black-Schaffer:
A Reusable Characterization of the Memory System Behavior of SPEC2017 and SPEC2006. 24:1-24:20
Volume 18, Number 3, June 2021
- Sugandha Tiwari, Neel Gala, Chester Rebeiro, V. Kamakoti:
PERI: A Configurable Posit Enabled RISC-V Core. 25:1-25:26
- George Charitopoulos, Dionisios N. Pnevmatikatos, Georgi Gaydadjiev:
MC-DeF: Creating Customized CGRAs for Dataflow Applications. 26:1-26:25
- Jose M. Rodriguez Borbon, Junjie Huang, Bryan M. Wong, Walid A. Najjar:
Acceleration of Parallel-Blocked QR Decomposition of Tall-and-Skinny Matrices on FPGAs. 27:1-27:25
- Michael Stokes, David B. Whalley, Soner Önder:
Decreasing the Miss Rate and Eliminating the Performance Penalty of a Data Filter Cache. 28:1-28:22
- Shoaib Akram:
Performance Evaluation of Intel Optane Memory for Managed Workloads. 29:1-29:26
- Ya-Shuai Lü, Hui Guo, Libo Huang, Qi Yu, Li Shen, Nong Xiao, Zhiying Wang:
GraphPEG: Accelerating Graph Processing on GPUs. 30:1-30:24
- Hamza Omar, Omer Khan:
PRISM: Strong Hardware Isolation-based Soft-Error Resilient Multicore Architecture with High Performance and Availability at Low Hardware Overheads. 31:1-31:25
- Devashree Tripathy, AmirAli Abdolrashidi, Laxmi Narayan Bhuyan, Liang Zhou, Daniel Wong:
PAVER: Locality Graph-Based Thread Block Scheduling for GPUs. 32:1-32:26
- Wim Heirman, Stijn Eyerman, Kristof Du Bois, Ibrahim Hur:
Automatic Sublining for Efficient Sparse Memory Accesses. 33:1-33:23
- Mustafa Cavus, Mohammed Shatnawi, Resit Sendag, Augustus K. Uht:
Fast Key-Value Lookups with Node Tracker. 34:1-34:26
- Weijia Song, Christina Delimitrou, Zhiming Shen, Robbert van Renesse, Hakim Weatherspoon, Lotfi Benmohamed, Frederic J. de Vaulx, Charif Mahmoudi:
CacheInspector: Reverse Engineering Cache Resources in Public Clouds. 35:1-35:25
- Daniel Rodrigues Carvalho, André Seznec:
Understanding Cache Compression. 36:1-36:27
- Daniel Thuerck, Nicolas Weber, Roberto Bifulco:
Flynn's Reconciliation: Automating the Register Cache Idiom for Cross-accelerator Programming. 37:1-37:26
- João P. L. de Carvalho, Braedy Kuzma, Ivan Korostelev, José Nelson Amaral, Christopher Barton, José E. Moreira, Guido Araujo:
KernelFaRer: Replacing Native-Code Idioms with High-Performance Library Calls. 38:1-38:22
- Ricardo Alves, Stefanos Kaxiras, David Black-Schaffer:
Early Address Prediction: Efficient Pipeline Prefetch and Reuse. 39:1-39:22
Volume 18, Number 4, December 2021
- Kaustav Goswami, Dip Sankar Banerjee, Shirshendu Das:
Towards Enhanced System Efficiency while Mitigating Row Hammer. 40:1-40:26
- Jerzy Proficz:
All-gather Algorithms Resilient to Imbalanced Process Arrival Patterns. 41:1-41:22
- Rui Xu, Sheng Ma, Yaohua Wang, Xinhai Chen, Yang Guo:
Configurable Multi-directional Systolic Array Architecture for Convolutional Neural Networks. 42:1-42:24
- Wonik Seo, Sanghoon Cha, Yeonjae Kim, Jaehyuk Huh, Jongse Park:
SLO-Aware Inference Scheduler for Heterogeneous Processors in Edge Platforms. 43:1-43:26
- Yasir Mahmood Qureshi, William Andrew Simon, Marina Zapater, Katzalin Olcoz, David Atienza:
Gem5-X: A Many-core Heterogeneous Simulation Platform for Architectural Exploration and Optimization. 44:1-44:27
- Tina Jung, Fabian Ritter, Sebastian Hack:
PICO: A Presburger In-bounds Check Optimization for Compiler-based Memory Safety Instrumentations. 45:1-45:27
- Zhibing Sha, Jun Li, Lihao Song, Jiewen Tang, Min Huang, Zhigang Cai, Lianju Qian, Jianwei Liao, Zhiming Liu:
Low I/O Intensity-aware Partial GC Scheduling to Reduce Long-tail Latency in SSDs. 46:1-46:25
- Syed Asad Alam, James Garland, David Gregg:
Low-precision Logarithmic Number Systems: Beyond Base-2. 47:1-47:25
- Candace Walden, Devesh Singh, Meenatchi Jagasivamani, Shang Li, Luyi Kang, Mehdi Asnaashari, Sylvain Dubois, Bruce L. Jacob, Donald Yeung:
Monolithically Integrating Non-Volatile Main Memory over the Last-Level Cache. 48:1-48:26
- Matthew Tomei, Shomit Das, Mohammad Seyedzadeh, Philip Bedoukian, Bradford M. Beckmann, Rakesh Kumar, David A. Wood:
Byte-Select Compression. 49:1-49:27
- Cunlu Li, Dezun Dong, Shazhou Yang, Xiangke Liao, Guangyu Sun, Yongheng Liu:
CIB-HIER: Centralized Input Buffer Design in Hierarchical High-radix Routers. 50:1-50:21
- Tobias Gysi, Christoph Müller, Oleksandr Zinenko, Stephan Herhut, Eddie Davis, Tobias Wicky, Oliver Fuhrer, Torsten Hoefler, Tobias Grosser:
Domain-Specific Multi-Level IR Rewriting for GPU: The Open Earth Compiler for GPU-accelerated Climate Simulation. 51:1-51:23
- An Zou, Huifeng Zhu, Jingwen Leng, Xin He, Vijay Janapa Reddi, Christopher D. Gill, Xuan Zhang:
System-level Early-stage Modeling and Evaluation of IVR-assisted Processor Power Delivery System. 52:1-52:27
- Aninda Manocha, Tyler Sorensen, Esin Tureci, Opeoluwa Matthews, Juan L. Aragón, Margaret Martonosi:
GraphAttack: Optimizing Data Supply for Graph Applications on In-Order Multicore Architectures. 53:1-53:26
- Joscha Benz, Oliver Bringmann:
Scenario-Aware Program Specialization for Timing Predictability. 54:1-54:26
- Shounak Chakraborty, Magnus Själander:
WaFFLe: Gated Cache-Ways with Per-Core Fine-Grained DVFS for Reduced On-Chip Temperature and Leakage Consumption. 55:1-55:25
- Sriseshan Srikanth, Anirudh Jain, Thomas M. Conte, Erik P. DeBenedictis, Jeanine E. Cook:
SortCache: Intelligent Cache Management for Accelerating Sparse Data Workloads. 56:1-56:24
- Paul Metzger, Volker Seeker, Christian Fensch, Murray Cole:
Device Hopping: Transparent Mid-Kernel Runtime Switching for Heterogeneous Systems. 57:1-57:25
- Yu Zhang, Da Peng, Xiaofei Liao, Hai Jin, Haikun Liu, Lin Gu, Bingsheng He:
LargeGraph: An Efficient Dependency-Aware GPU-Accelerated Large-Scale Graph Processing. 58:1-58:24
- M. Hüsrev Cilasun, Salonik Resch, Zamshed I. Chowdhury, Erin Olson, Masoud Zabihi, Zhengyang Zhao, Thomas Peterson, Keshab K. Parhi, Jianping Wang, Sachin S. Sapatnekar, Ulya R. Karpuzcu:
Spiking Neural Networks in Spintronic Computational RAM. 59:1-59:21