"Hessian-Aware KV Cache Quantization for LLMs."

Woohong Byun, Jongseok Woo, Saibal Mukhopadhyay (2024)

Details and statistics

DOI: 10.1109/MWSCAS60917.2024.10658840

access: closed

type: Conference or Workshop Paper

metadata version: 2024-10-10