28th ICS 2014: Muenchen, Germany
- Arndt Bode, Michael Gerndt, Per Stenström, Lawrence Rauchwerger, Barton P. Miller, Martin Schulz:
2014 International Conference on Supercomputing, ICS'14, Muenchen, Germany, June 10-13, 2014. ACM 2014, ISBN 978-1-4503-2642-1
Keynote address I
- Thomas Lippert:
HPC for the human brain project. 1
Programming models
- Quan Chen, Minyi Guo, Haibing Guan:
LAWS: locality-aware work-stealing for multi-socket multi-core architectures. 3-12
- Chandan Reddy, Uday Bondhugula:
Effective automatic computation placement and data allocation for parallelization of regular programs. 13-22
- Khaled Z. Ibrahim, Katherine A. Yelick:
On the conditions for efficient interoperability with threads: an experience with PGAS languages using Cray communication domains. 23-32
- Md. Wasi-ur-Rahman, Xiaoyi Lu, Nusrat Sharmin Islam, Dhabaleswar K. Panda:
HOMR: a hybrid approach to exploit maximum overlapping in MapReduce over high performance interconnects. 33-42
Memory systems
- Zehan Cui, Sally A. McKee, Zhongbin Zha, Yungang Bao, Mingyu Chen:
DTail: a flexible approach to DRAM refresh management. 43-52
- Yingying Tian, Samira Manabi Khan, Daniel A. Jiménez, Gabriel H. Loh:
Last-level cache deduplication. 53-62
- Lingda Li, Junlin Lu, Xu Cheng:
Block value based insertion policy for high performance last-level caches. 63-72
- Sanyam Mehta, Zhenman Fang, Antonia Zhai, Pen-Chung Yew:
Multi-stage coordinated prefetching for present-day processors. 73-82
Applications and algorithms
- Ron A. Oldfield, Kenneth Moreland, Nathan Fabian, David H. Rogers:
Evaluation of methods to integrate analysis into a large-scale shock physics code. 83-92
- Shuo Chen, Xiaoming Li:
Input-adaptive parallel sparse fast Fourier transform for stream processing. 93-102
- Alejandro Chacón, Santiago Marco-Sola, Antonio Espinosa, Paolo Ribeca, Juan Carlos Moure:
Thread-cooperative, bit-parallel computation of Levenshtein distance on GPU. 103-112
- Olga Pearce, Todd Gamblin, Bronis R. de Supinski, Tom Arsenlis, Nancy M. Amato:
Load balancing n-body simulations with highly non-uniform density. 113-122
Keynote address II
- Mark D. Hill:
21st century computer architecture keynote at 2014 international conference on supercomputing (ICS). 123
MPI
- Min Si, Antonio J. Peña, Pavan Balaji, Masamichi Takagi, Yutaka Ishikawa:
MT-MPI: multithreaded MPI for many-core environments. 125-134
- Jesper Larsson Träff, Antoine Rougier, Sascha Hunold:
Implementing a classic: zero-copy all-to-all communication with MPI datatypes. 135-144
- Philip C. Roth, Jeremy S. Meredith:
Value influence analysis for message passing applications. 145-154
- Amir Bahmani, Frank Mueller:
Scalable performance analysis of exascale MPI programs through signature-based clustering algorithms. 155-164
Poster session
- Akhil Langer:
An optimal distributed load balancing algorithm for homogeneous work units. 165
- Josué Feliu, Julio Sahuquillo, Salvador Petit, José Duato:
Addressing bandwidth contention in SMT multicores through scheduling. 167
- Yang You, Shuaiwen Leon Song, Darren J. Kerbyson:
An adaptive cross-architecture combination method for graph traversal. 169
- Jun Ohno, Kei Hiraki:
Accelerating cache coherence mechanism with speculation. 171
- Takahiro Naruko:
Reducing energy consumption of NoC by router bypassing. 173
- Teruo Tanimoto, Takatsugu Ono, Kohta Nakashima, Takashi Miyoshi:
Hardware-assisted scalable flow control of shared receive queue. 175
- Bin Ren, Nishkam Ravi, Yi Yang, Min Feng, Gagan Agrawal, Srimat T. Chakradhar:
Automating and optimizing data transfers for many-core coprocessors. 177
- Muthu Manikandan Baskaran, Benoît Meister, Richard Lethin:
Parallelizing and optimizing sparse tensor computations. 179
I/O and NVRAM
- Yin Lu, Yong Chen, Robert Latham, Yu Zhuang:
Revealing applications' access pattern in collective I/O for cache management. 181-190
- Lauro Beltrão Costa, Samer Al-Kiswany, Hao Yang, Matei Ripeanu:
Supporting storage configuration for I/O intensive workflows. 191-200
- Wei Wang, Tao Xie, Deng Zhou:
Understanding the impact of threshold voltage on MLC flash memory performance and reliability. 201-210
- Fei Xia, Dejun Jiang, Jin Xiong, Mingyu Chen, Lixin Zhang, Ninghui Sun:
DWC: dynamic write consolidation for phase change memory systems. 211-220
Modeling and optimization
- Nathan R. Tallent, Adolfy Hoisie:
Palm: easing the burden of analytical performance modeling. 221-230
- Ruijin Zhou, Sankaran Sivathanu, Jinpyo Kim, Bing Tsai, Tao Li:
An end-to-end analysis of file system features on sparse virtual disks. 231-240
- Jie Shen, Ana Lucia Varbanescu, Peng Zou, Yutong Lu, Henk J. Sips:
Improving performance by matching imbalanced workloads with heterogeneous platforms. 241-250
- Shanjiang Tang, Bu-Sung Lee, Bingsheng He, Haikun Liu:
Long-term resource fairness: towards economic fairness on pay-as-you-use computing systems. 251-260
Keynote address III
- Marc Snir:
The future of supercomputing. 261-262
Accelerators
- Gordon Erlebacher, Erik Saule, Natasha Flyer, Evan F. Bollig:
Acceleration of derivative calculations with application to radial basis function: finite-differences on the Intel MIC architecture. 263-272
- Arash Ashari, Naser Sedaghati, John Eisenlohr, P. Sadayappan:
An efficient two-dimensional blocking strategy for sparse matrix-vector multiplication on GPUs. 273-282
- Xin Huo, Bin Ren, Gagan Agrawal:
A programming system for Xeon Phis with runtime SIMD parallelization. 283-292
- Ari B. Hayes, Eddy Z. Zhang:
Unified on-chip memory allocation for SIMT architecture. 293-302
Interconnect and microarchitecture
- Yigit Demir, Yan Pan, Seokwoo Song, Nikos Hardavellas, John Kim, Gokhan Memik:
Galaxy: a high-performance energy-efficient multi-chip architecture using photonic interconnects. 303-312
- Karthikeyan P. Saravanan, Paul M. Carpenter, Alex Ramírez:
A performance perspective on energy efficient HPC links. 313-322
- Hui Meen Nyew, Nilufer Onder, Soner Önder, Zhenlin Wang:
Verifying micro-architecture simulators using event traces. 323-332
Multi- and many-core systems
- Fengguang Song, Jack J. Dongarra:
Scaling up matrix computations on shared-memory manycore systems with 1000 CPU cores. 333-342
- George Michelogiannakis, Alexander Williams, Samuel Williams, John Shalf:
Collective memory transfers for multi-core chips. 343-352
- Miquel Pericàs, Kenjiro Taura, Satoshi Matsuoka:
Scalable analysis of multicore data reuse and sharing. 353-362