gs-olive - Overview

TensorRT-Model-Optimizer Public

Forked from NVIDIA/Model-Optimizer

A unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, and speculative decoding. It compresses deep learning models for downstream deployment…

Python