Finding optimal numerical format for sub-8-bit post-training quantization of vision transformers

J Lee, Y Hwang, J Choi - ICASSP 2023-2023 IEEE …, 2023 - ieeexplore.ieee.org

ICASSP 2023-2023 IEEE International Conference on Acoustics …, 2023•ieeexplore.ieee.org

Vision Transformers (ViTs) have gained significant attention for their exceptional accuracy on computer vision applications, but their demanding memory requirements and computational complexity have hindered widespread deployment. Post-training quantization (PTQ) is a practical way to tackle this challenge by directly reducing the bit-precision of ViTs. However, the diverse data characteristics across the different operations of a ViT cannot be captured well by a single numerical format (fixed-point or floating-point). This work proposes an analytical framework that optimizes the numerical format of each matrix multiplication in a ViT for mixed-format sub-8-bit quantization. Extensive evaluation demonstrates that the proposed method reduces the PTQ error and achieves state-of-the-art accuracy for popular ViT models.
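
The abstract does not spell out the framework itself, so the following is only a toy illustration of the general idea of per-tensor format selection, not the authors' method. It assumes a mean-squared-error criterion, a symmetric integer grid, and a simplified minifloat (sign bit, subnormals, no inf/NaN); all function names (`quantize_int`, `fp_grid`, `quantize_fp`, `pick_format`) are hypothetical.

```python
import math

def quantize_int(x, bits, scale):
    """Symmetric uniform (fixed-point style) quantization of one value."""
    qmax = 2 ** (bits - 1) - 1
    q = max(-qmax, min(qmax, round(x / scale)))
    return q * scale

def fp_grid(exp_bits, man_bits):
    """Non-negative values of a toy minifloat (assumed encoding, no inf/NaN)."""
    bias = 2 ** (exp_bits - 1) - 1
    values = set()
    for e in range(2 ** exp_bits):
        for m in range(2 ** man_bits):
            frac = m / 2 ** man_bits
            if e == 0:   # subnormal range
                values.add(frac * 2.0 ** (1 - bias))
            else:        # normal range
                values.add((1 + frac) * 2.0 ** (e - bias))
    return sorted(values)

def quantize_fp(x, grid, scale):
    """Round |x| / scale to the nearest grid value, clamp, restore the sign."""
    mag = min(abs(x) / scale, grid[-1])
    nearest = min(grid, key=lambda g: abs(g - mag))
    return math.copysign(nearest * scale, x)

def mse(xs, quantizer):
    return sum((x - quantizer(x)) ** 2 for x in xs) / len(xs)

def pick_format(xs, bits=6):
    """Return the candidate format with the lowest MSE on xs, plus all errors."""
    amax = max(abs(x) for x in xs) or 1.0  # avoid a zero scale on all-zero input
    errors = {}
    int_scale = amax / (2 ** (bits - 1) - 1)
    errors[f"INT{bits}"] = mse(xs, lambda x: quantize_int(x, bits, int_scale))
    for exp_bits in range(1, bits - 1):    # one bit is reserved for the sign
        man_bits = bits - 1 - exp_bits
        grid = fp_grid(exp_bits, man_bits)
        fp_scale = amax / grid[-1]
        name = f"FP{bits}-E{exp_bits}M{man_bits}"
        errors[name] = mse(xs, lambda x: quantize_fp(x, grid, fp_scale))
    best = min(errors, key=errors.get)
    return best, errors
```

In a mixed-format setting, a routine like `pick_format` would be called once per operand of each matrix multiplication on calibration data, so that different operations can end up with different formats. The max-based scaling used here is the simplest choice and is also an assumption.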
