turboderp - Overview

Pinned

  1. An optimized quantization and inference library for running LLMs locally on modern consumer-class GPUs

    Python · 638 stars · 69 forks

  2. A fast inference library for running LLMs locally on modern consumer-class GPUs

    Python · 4.5k stars · 327 forks

  3. The official API server for ExLlama. OpenAI-compatible, lightweight, and fast.

    Python · 1.1k stars · 144 forks

  4. Web UI for ExLlamaV2

    JavaScript · 511 stars · 46 forks