smpanaro - Overview

Pinned

  1. CLI to demonstrate running a large language model (LLM) on Apple Neural Engine.

    Swift, 126 stars, 12 forks

  2. See the device (CPU/GPU/ANE) and estimated cost for every layer in your CoreML model.

    Swift, 25 stars, 3 forks

  3. ModernBERT model optimized for Apple Neural Engine.

    Python, 31 stars, 2 forks

  4. Run transformers (incl. LLMs) on the Apple Neural Engine.

    Python, 62 stars, 2 forks

  5. Unofficial implementation of Token Recycling self-speculative decoding method.

    Python, 9 stars, 1 fork

  6. GPU Adaptive Non-Uniform Quantization (GANQ) Unofficial Implementation

    Python, 5 stars