
Huawei's new open-source technique shrinks LLMs so they run on less powerful, less expensive hardware
Huawei’s Computing Systems Lab in Zurich has introduced a new open-source quantization method for large language models (LLMs) aimed at reducing memory demands without sacrificing output quality. The technique, called SINQ (Sinkhorn-Normalized Quantization), is designed […]