Roadmap
Functionality
- Batched inference
- Fine-grained KV cache management
- Explore tree sparsity
- Fine-tune Medusa heads together with LM head from scratch
- Distill from any model without access to the original training data