


CCF Transactions on High Performance Computing, Volume 6
Volume 6, Number 1, February 2024
- Yunquan Zhang, Guangming Tan, Liang Yuan:
Special issue of HPCChina 2023. 1-2
- Yidong Chen, Jingshan Pan, Zidong Han, Yonghong Hu, Meng Guo, Zhonghua Lu:
BSPADMM: block splitting proximal ADMM for sparse representation with strong scalability. 3-16
- Yueyuan Zhou, Ziyi Ren, En Shao, Lixian Ma, Qiang Hu, Leping Wang, Guangming Tan:
FILL: a heterogeneous resource scheduling system addressing the low throughput problem in GROMACS. 17-31
- Lu Bai, Weixing Ji, Qinyuan Li, Xilai Yao, Wei Xin, Wanyi Zhu:
ConvDarts: a fast and exact convolutional algorithm selector for deep learning frameworks. 32-44
- Hang Cao, Cheng Xu, Yunqi Han, Muhui Lin, Kai Shen, Geng Wang, Jinhu Li, Xiangzheng Sun, Ronghui He, Liang You, Hang Yang, Xiantao Zhang:
An efficient cloud-based elastic RDMA protocol for HPC applications. 45-53
- Haoyuan Zhang, Wenpeng Ma, Wu Yuan, Jian Zhang, Zhonghua Lu:
Mixed-precision block incomplete sparse approximate preconditioner on Tensor core. 54-67
- Dazheng Liu, Wenjuan Liu, Liangrui Pan, Yutao Dou, Jianping Wu:
Optimization of the parallel semi-Lagrangian scheme to overlap computation with communication based on grouping levels in YHGSM. 68-77
- Yang Wang, Qinglin Wang, Xiangdong Pei, Songzhu Mei, Rongchun Li, Jie Liu:
High performance dilated convolutions on multi-core DSPs. 78-93
- Zhengxian Lu, Chengkun Du, Yanfeng Jiang, Xueshuo Xie, Tao Li, Fei Yang:
Quantitative evaluation of deep learning frameworks in heterogeneous computing environment. 94-111
Volume 6, Number 2, April 2024
- Shanjiang Tang, Yusen Li:
Editorial for the special issue on heterogenous computing. 113-114
- Yang Xiao, Zeke Wang:
AIbench: a tool for benchmarking Huawei ascend AI processors. 115-129
- Shanjiang Tang, Ziyi Wang, Ce Yu, Chao Sun, Yusen Li, Jian Xiao:
Fast and accurate novelty detection for large surveillance video. 130-149
- Xinyang Shen, Yu Huang, Long Zheng, Xiaofei Liao, Hai Jin:
A heterogeneous 3-D stacked PIM accelerator for GCN-based recommender systems. 150-163
- Gang Liu, Zeting Wang, Amelie Chi Zhou, Rui Mao:
Adaptive key partitioning in distributed stream processing. 164-178
- Shiyang Li, Jingyu Zhu, Jiaxun Han, Yuting Peng, Zhuoran Wang, Xiaoli Gong, Gang Wang, Jin Zhang, Xuqiang Wang:
OneGraph: a cross-architecture framework for large-scale graph computing on GPUs based on oneAPI. 179-191
- Yuanwei Sun, Haikun Liu, Xiaofei Liao, Hai Jin, Yu Zhang:
FPGA-based acceleration architecture for Apache Spark operators. 192-205
- Yani Liu, Feng Zhang, Zaifeng Pan, Xiaoguang Guo, Yihua Hu, Xiao Zhang, Xiaoyong Du:
Compressed data direct computing for Chinese dataset on DCU. 206-220
- Yu Lu, Ce Yu, Jian Xiao, Hao Wang, Hao Fu, Bo Kang, Gang Zheng:
A large-scale heterogeneous computing framework for non-uniform sampling two-dimensional convolution applications. 221-239
Volume 6, Number 3, June 2024
- Jianbin Fang, Jidong Zhai, Zheng Wang:
Editorial for the special issue on programming models and system software for High-Performance Computing (HPC) environments. 241-242
- Junsheng Chang, Kai Lu, Yang Guo, Yongwen Wang, Zhenyu Zhao, Libo Huang, Hongwei Zhou, Yao Wang, Fei Lei, Biwei Zhang:
A survey of compute nodes with 100 TFLOPS and beyond for supercomputers. 243-262
- Jianfeng Liu, Wangrong Gao, Hanzheng Liang, Lin Peng, Ting Wang:
Towards a universal and portable assembly code size reduction: a case study of RISC-V ISA. 263-273
- Haoran Lin, Lifeng Yan, Qixin Chang, Haitian Lu, Chenlin Li, Quanjie He, Zeyu Song, Xiaohui Duan, Zekun Yin, Yuxuan Li, Zhao Liu, Wei Xue, Haohuan Fu, Lin Gan, Guangwen Yang, Weiguo Liu:
O2ath: an OpenMP offloading toolkit for the sunway heterogeneous manycore platform. 274-286
- Yicheng Sui, Yufei Sun, Changqing Shi, Haotian Wang, Zhiqiang Zhang, Jiahao Wang, Yuzhi Zhang:
Opencl-pytorch: an OpenCL-based extension of PyTorch. 287-300
- Juncheng Hu, Xilong Che, Bowen Kan, Yuhan Shao:
LS-HTC: an HTC system for large-scale jobs. 301-318
- Changqing Shi, Yufei Sun, Yicheng Sui, Yuqiao Chen, Haotian Wang, Yuzhi Zhang:
oclCUB: an OpenCL parallel computing library for deep learning operators. 319-329
- Zongjing Chen, Kangjin Huang, Yonggang Che, Chuanfu Xu, Jian Zhang, Zhe Dai, Ming Li:
Extending OP2 framework to support portable parallel programming of complex applications. 330-342
- Shaojie Tan, Qingcai Jiang, Zhenwei Cao, Xiaoyu Hao, Junshi Chen, Hong An:
Uncovering the performance bottleneck of modern HPC processor with static code analyzer: a case study on Kunpeng 920. 343-364
Volume 6, Number 4, August 2024
- Zhengxiong Hou, Hong Shen, Qiying Feng, Zhiqi Lv, Junwei Jin, Xingshe Zhou, Jianhua Gu:
Optimizing job scheduling by using broad learning to predict execution times on HPC clusters. 365-377
- Moirangthem Goldie Meitei, Ningrinla Marchang:
Altruistic user-oriented task allocation techniques for mobile crowdsensing. 378-396
- Chenyang Jiao, Zhikai Qin, Li Shen:
ScalaQC: a scalability optimization framework for full-state quantum simulation on CPU+GPU heterogeneous clusters. 397-407
- Huming Zhu, Chendi Liu, Qiuming Li, Lingyun Zhang, Libing Wang, Sifan Li, Licheng Jiao, Biao Hou:
Deep convolutional encoder-decoder networks based on ensemble learning for semantic segmentation of high-resolution aerial imagery. 408-424
- Riku Nunokawa, Yoichi Shimomura, Mulya Agung, Ryusuke Egawa, Hiroyuki Takizawa:
Conflict-aware workload co-execution on SX-aurora TSUBASA. 425-438
- Maoxue Yu, Guanghao Ma, Zhuoya Wang, Shuai Tang, Yuhu Chen, Yucheng Wang, Yuanyuan Liu, Dongning Jia, Zhiqiang Wei:
swCUDA: Auto parallel code translation framework from CUDA to ATHREAD for new generation sunway supercomputer. 439-458
Volume 6, Number 5, October 2024
- Wenxiang Yang, Jie Yu:
Trade-off topology design for hierarchical network based on job characteristics. 459-471
- D. Sirisha, S. Sambhu Prasad:
CPTF-a new heuristic based branch and bound algorithm for workflow scheduling in heterogeneous distributed computing systems. 472-487
- Yu Hu, Ziteng Li, Jianfeng Li, Junbo Tie, Lei Wang:
A security JPEG image system accelerated by NEON technology based on FT-2000/4. 488-502
- Mouzhi Yang, Peng Zhang, Jianbin Fang, Weifeng Liu, Chun Huang:
thSORT: an efficient parallel sorting algorithm on multi-core DSPs. 503-518
- Jie Jia, Xinyuan Lin, Fang Lin, Yi Liu:
DCU-CHK: checkpointing for large-scale CPU-DCU heterogeneous computing systems. 519-532
- Zhewen Xu, Xiaohui Wei, Jieyun Hao, Jiale Li, Hongliang Li, Zhaohui Ding, Sicong Li:
HiRM: Hierarchical resource management for earth system models on many-core clusters. 533-548
Volume 6, Number 6, December 2024
- Jie Liu, Yizhuo Wang, Jianhua Gao, Weixing Ji:
pSpMv: precision-based sparse matrix partition and SpMV optimization. 549-565
- Xinjie Wang, Guanghao Ma, Jiaying Song, Mingyao Geng, Wenhui Hu, Xi Duan, Zhigang Wang, Jiali Xu, Xiaogang Jin, Fang Li, Dexun Chen, Maoxue Yu:
Heterogeneous many-core optimization for Monte Carlo path-tracing on new generation Sunway HPC system. 566-587
- Jianfeng Liu, Jianbin Fang, Ting Wang, Jing Xie, Chun Huang, Zheng Wang:
Efficient compiler optimization by modeling passes dependence. 588-607
- Zhangyu Liu, Jinqiu Wang, Huijun Wu, Qingzhen Ma, Lin Peng, Zhanyong Tang:
Auto-tuning for HPC storage stack: an optimization perspective. 608-631
- Zhenshan Bao, Mengyuan Wang, Wei Bai, Wenbo Zhang:
Multi-index federated aggregation algorithm based on trusted verification. 632-645
- Zheng Liu, Meng Hao, Weizhe Zhang, Gangzhao Lu, Xueyang Tian, Siyu Yang, Mingdong Xie, Jie Dai, Chenyu Yuan, Desheng Wang, Hongwei Yang:
Optimizing depthwise separable convolution on DCU. 646-664
