turboderp - Overview

Pinned

  1. An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs

    Python · 638 stars · 69 forks

  2. A fast inference library for running LLMs locally on modern consumer-class GPUs

    Python · 4.5k stars · 327 forks

  3. The official API server for ExLlama. OpenAI-compatible, lightweight, and fast.

    Python · 1.1k stars · 144 forks

  4. Web UI for ExLlamaV2

    JavaScript · 511 stars · 46 forks