inference-manager
The inference-manager manages inference runtimes (e.g., vLLM and Ollama) in containers, loads models, and processes inference requests.
Inference Request Flow
Please see inference_request_flow.md.
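As an illustration of what a processed request can look like, the sketch below builds a chat-completion request payload. It assumes an OpenAI-compatible endpoint; the URL, model name, and API key are placeholders, not values taken from this repository (see inference_request_flow.md and the project docs for the actual interface).

```python
import json
import urllib.request

# Placeholder endpoint and model ID; substitute values from your deployment.
ENDPOINT = "http://localhost:8080/v1/chat/completions"
payload = {
    "model": "my-model",  # placeholder model ID
    "messages": [{"role": "user", "content": "Hello!"}],
}

req = urllib.request.Request(
    ENDPOINT,
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer <API_KEY>",  # placeholder key
    },
)
# urllib.request.urlopen(req) would send the request; it is omitted here so
# the snippet runs without a live server.
print(payload["model"])
```

The inference-manager's job is to route such a request to a container running one of the managed runtimes (e.g., vLLM or Ollama) that has the requested model loaded.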