ChenMnZ - Overview
[ICLR2024 spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.
Python 892 77
[ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models
Python 336 30
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization
Python 170 16
(AAAI 2023 Oral) PyTorch implementation of "CF-ViT: A General Coarse-to-Fine Method for Vision Transformer"
Python 107 8
[ICCV 23] An approach to improve the efficiency of Vision Transformers (ViT) by applying token pruning and token merging together, with a differentiable compression rate.
Jupyter Notebook 103 7
A framework for comparing low-bit integer and floating-point formats
Python 74 7