# llm-d

Welcome to llm-d: a Kubernetes-native high-performance distributed LLM inference framework



llm-d is a Kubernetes-native high-performance distributed LLM inference framework that provides the fastest time-to-value and competitive performance per dollar. Built on vLLM, Kubernetes, and Inference Gateway, llm-d offers modular solutions for distributed inference with features like KV-cache aware routing and disaggregated serving.

## 🚀 Quick Start Guide

New to llm-d? Here's how to get started:

  1. Join our Slack 💬 → Get your invite and visit llm-d.slack.com
  2. Explore our code 📂 → GitHub Organization
  3. Join a meeting 📅 → Add calendar
  4. Pick your area 🎯 → Browse Special Interest Groups

## 📚 Key Resources

### 💬 Communication Channels

๐Ÿ—“๏ธ Regular Meetings

All meetings are open to the public! 🌟

  • 📅 Weekly Standup: Every Wednesday at 12:30pm ET - Project updates and open discussion
  • 🎯 SIG Meetings: Various times throughout the week - See SIG details for schedules

Join to participate, ask questions, or just listen and learn!

## 🎯 Special Interest Groups (SIGs)

Want to dive deeper into specific areas? Our Special Interest Groups are focused teams working on different aspects of llm-d:

  • Inference Scheduler - Intelligent request routing and load balancing
  • Benchmarking - Performance testing and optimization
  • PD-Disaggregation - Prefill/decode separation patterns
  • KV-Disaggregation - KV caching and distributed storage
  • Installation - Kubernetes integration and deployment
  • Autoscaling - Traffic-aware autoscaling and resource management
  • Observability - Monitoring, logging, and metrics

View more SIG Details →

๐Ÿค How to Contribute

### Getting Involved

### Contributing Code

  1. Read Guidelines: Review our Code of Conduct and contribution process
  2. Sign Commits: All commits require DCO sign-off (`git commit -s`)
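The sign-off step above can be sketched end to end as follows; the repository name, file, and committer identity are illustrative placeholders, not part of the llm-d workflow itself:

```shell
# Minimal DCO sign-off sketch (demo-repo, file, and identity are hypothetical).
git init demo-repo && cd demo-repo
git config user.name "Jane Developer"
git config user.email "jane@example.com"

echo "fix" > fix.txt
git add fix.txt

# -s (--signoff) appends a "Signed-off-by: Name <email>" trailer to the
# commit message, which is what the DCO check looks for on each commit.
git commit -s -m "Fix typo in docs"

# The last line of the message is now the Signed-off-by trailer:
git log -1 --format=%B
```

If you forget the flag on your most recent commit, `git commit --amend -s` adds the trailer without changing the content of the commit.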

### Ways to Contribute

  • ๐Ÿ› Bug fixes and small features - Submit PRs directly to component repos
  • ๐Ÿš€ New features with APIs - Require project proposals
  • ๐Ÿ“š Documentation - Help improve guides and examples
  • ๐Ÿงช Testing & Benchmarking - Contribute to our test coverage
  • ๐Ÿ’ก Experimental features - Start in llm-d-incubation org

## 🔒 Security & Safety

๐ŸŒ Connect With Us

Follow llm-d across social platforms for updates, discussions, and community highlights:

โ“ Need Help?

Questions? Ideas? Just want to chat? We're here to help! The llm-d community team is friendly and responsive.


License: Apache 2.0