Triton Inference Server Documentation
User Guide
The User Guide describes how to use Triton as an inference solution, including how to configure Triton, how to organize and configure your models, and how to use the C++ and Python client libraries.
- QuickStart
- Model Repository
- Model Configuration
- Model Pipeline
- Model Management
- Metrics
- Framework Custom Operations
- Client Libraries and Examples
  - C++ HTTP/GRPC Libraries
  - Python HTTP/GRPC Libraries
  - Java HTTP Library
  - GRPC Generated Libraries
- Performance Analysis
- Jetson and JetPack
Developer Guide
The Developer Guide describes how to build and test Triton, and how Triton can be extended with new functionality.