gs-olive - Overview
TensorRT-Model-Optimizer Public
Forked from NVIDIA/Model-Optimizer
A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, and speculative decoding. It compresses deep learning models for downstream deployment…
Python