# HTTP API for LLM with OpenAI compatibility
## Install

```shell
> pip install llm
> pip install llm-http-api
```
## Getting Started

- Follow the directions from LLM
- Run the plugin:

  ```shell
  > llm http-api
  ```

- Visit the OpenAPI documentation at localhost:8080/docs
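Once the server is up, a quick way to sanity-check it from Python is to hit the models endpoint. A minimal stdlib sketch (the URL assumes the default host and port; nothing is sent until `urlopen` is called):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # default host/port from `llm http-api`

# Build (but do not yet send) a GET request for the model list.
req = urllib.request.Request(f"{BASE_URL}/v1/models")

# Uncomment to query a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read()))
```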
## Usage

```shell
> llm http-api --help
Usage: llm http-api [OPTIONS]

  Run a FastAPI HTTP server with OpenAI compatibility

Options:
  -h, --host TEXT         [default: 0.0.0.0]
  -p, --port INTEGER      [default: 8080]
  -l, --log-level TEXT    [default: info]
  -r, --reload
  -d, --reload-dirs LIST  [default: src]
  --help                  Show this message and exit.
```

```shell
> curl http://localhost:8080/v1/embeddings \
    -X POST \
    -H "Content-Type: application/json" \
    -d '{
      "input": "Hello world",
      "model": "jina-embeddings-v2-small-en"
    }'
{"object":"embedding","embedding":[-0.47561466693878174,-0.4471365511417389,...],"index":0}
```
## Supported OpenAI Endpoints

- Models
- Embeddings
- Chat
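Since the server mirrors the OpenAI API shapes, a chat request is a POST to `/v1/chat/completions` with the usual `model`/`messages` payload. A stdlib sketch (the model name here is only an example — substitute one your LLM install provides; nothing is sent until `urlopen` is called):

```python
import json
import urllib.request

def chat_request(messages, model, base_url="http://localhost:8080"):
    """Build an OpenAI-style chat completion request for the local server."""
    payload = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request(
    [{"role": "user", "content": "Say hello"}],
    model="gpt-3.5-turbo",  # example name; use a model your `llm` install provides
)

# To send against a running server:
# with urllib.request.urlopen(req) as resp:
#     print(json.loads(resp.read())["choices"][0]["message"]["content"])
```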
## Unsupported OpenAI Endpoints

A detailed list of unimplemented OpenAI endpoints can be found here.
## Development

This repository manages the dev environment as a Nix flake and requires Nix to be installed.

```shell
make deps.install
make deps.install/test
```
### Publish Package to PyPI
### Local LLMs

```shell
make llm.install/mlc
make llm.setup/mlc
make llm.mlc.download
make llm.mlc.download/code_llama-34b-python-q4f16
make llm.mlc.download/code_llama-34b-instruct-q0f16
make llm.mlc.download/code_llama-13b-q4f16
make llm.mlc.download/code_llama-7b-q4f16
make llm.mlc.download/wizard-coder-15b-q4f32
make llm.mlc.download/open_hermes-2.5-mistral-7b-q4f16
make llm.mlc.download/mistral-7b-instruct-q4f16
```