ChenMnZ - Overview

Pinned

[ICLR 2024 Spotlight] OmniQuant is a simple and powerful quantization technique for LLMs.

Python, 892 stars, 77 forks
[ACL 2025 Main] EfficientQAT: Efficient Quantization-Aware Training for Large Language Models

Python, 336 stars, 30 forks
An algorithm for weight-activation quantization (W4A4, W4A8) of LLMs, supporting both static and dynamic quantization

Python, 170 stars, 16 forks
(AAAI 2023 Oral) PyTorch implementation of "CF-ViT: A General Coarse-to-Fine Method for Vision Transformer"

Python, 107 stars, 8 forks
[ICCV 2023] An approach that improves the efficiency of Vision Transformers (ViT) by applying token pruning and token merging together, with a differentiable compression rate.

Jupyter Notebook, 103 stars, 7 forks
A framework for comparing low-bit integer and floating-point formats

Python, 74 stars, 7 forks