Tabrizian - Overview
NVIDIA
TensorRT LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and supports state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. Tensor…
Python · 13.4k stars · 2.3k forks
The Triton Inference Server provides an optimized cloud and edge inferencing solution.
Python · 10.6k stars · 1.8k forks
Triton backend that enables pre-processing, post-processing, and other logic to be implemented in Python.
C++ · 673 stars · 193 forks
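To illustrate what "logic implemented in Python" means for this backend: a model served through it supplies a `model.py` defining a `TritonPythonModel` class that Triton loads and calls. The sketch below follows the documented interface (`initialize`/`execute` and the `triton_python_backend_utils` helpers); the tensor names `INPUT0`/`OUTPUT0` are placeholders, and the file is a skeleton meant to run inside a Triton server rather than as a standalone script.

```python
# model.py — minimal Python-backend model skeleton for Triton.
# Loaded by the server; triton_python_backend_utils is only available
# inside the Triton Python backend's runtime.
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # args carries model config and instance info as strings;
        # one-time setup (loading weights, parsing config) goes here.
        self.model_name = args["model_name"]

    def execute(self, requests):
        # Triton batches inference requests; return one response per request.
        responses = []
        for request in requests:
            in_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            data = in_tensor.as_numpy()
            # Arbitrary Python pre/post-processing logic goes here;
            # this placeholder simply echoes the input.
            out_tensor = pb_utils.Tensor("OUTPUT0", data)
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[out_tensor])
            )
        return responses

    def finalize(self):
        # Optional cleanup when the model is unloaded.
        pass
```

This file is deployed alongside a `config.pbtxt` in the model repository; the backend handles marshalling tensors between the server and the Python process.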
Code for "Adaptive Gradient Quantization for Data-Parallel SGD", published in NeurIPS 2020.
Jupyter Notebook · 30 stars · 5 forks
Triton Model Analyzer is a CLI tool that helps users understand the compute and memory requirements of models served by the Triton Inference Server.
Python · 510 stars · 85 forks