Easy-to-use VLA deployment, fast to react, smooth in motion.
About
VLASH is an efficient and easy-to-use framework for VLA (vision-language-action) fine-tuning and inference.
VLASH is efficient through:
- Asynchronous inference for fast reaction and smooth real-time motion (>30 Hz inference frequency for $\pi_{0.5}$ on an RTX 5090)
- Future-state awareness to keep asynchronous VLA inference stable without added overhead
- Action quantization for faster robot execution
- LoRA with shared observation encoding for efficient fine-tuning on consumer GPUs
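The core idea behind asynchronous inference is that the policy predicts the next action chunk in the background while the robot is still executing the current one, so control never stalls on model latency. The sketch below illustrates this producer/consumer pattern with Python's standard `threading` and `queue` modules; `dummy_policy` is a stand-in for a real VLA model, not VLASH's actual API.

```python
import queue
import threading
import time

def dummy_policy(observation):
    """Stand-in for a VLA model: sleeps to simulate inference latency,
    then returns a 4-step action chunk (assumption, not VLASH's API)."""
    time.sleep(0.01)
    return [observation + i for i in range(4)]

def inference_worker(policy, observations, chunk_queue):
    """Producer: runs inference in the background so the control loop
    never blocks on the model."""
    for obs in observations:
        chunk_queue.put(policy(obs))

chunks = queue.Queue()
worker = threading.Thread(
    target=inference_worker, args=(dummy_policy, range(3), chunks)
)
worker.start()

executed = []
for _ in range(3):
    # Consumer: "execute" each chunk as soon as it arrives; in a real
    # deployment this loop would send actions to the robot at a fixed rate.
    executed.extend(chunks.get())
worker.join()
print(len(executed))  # 3 chunks x 4 actions = 12
```

In a real system the consumer runs at the robot's control frequency, and the producer's latency is hidden as long as inference finishes before the current chunk runs out.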
VLASH is easy to use with:
- Seamless integration with LeRobot datasets (v2.1, v3.0), models and robots
- Simple YAML-based configuration system
- Support for various policy architectures (e.g., $\pi_{0.5}$, $\pi_0$, ...)
- Easy deployment on real robot hardware
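To give a feel for the YAML-based configuration, here is an illustrative sketch of what a config file might contain. Every key below is a guess for illustration only; the real schema is defined by the files under examples/ in the repository.

```yaml
# Illustrative sketch only -- not VLASH's actual schema.
# Consult the files under examples/ for real configurations.
policy:
  type: pi05                  # hypothetical key: which architecture to load
  checkpoint: /path/to/ckpt   # hypothetical key: fine-tuned weights
inference:
  async: true                 # hypothetical key: enable async inference
  action_quant_ratio: 1       # hypothetical key: 2 would double exec speed
robot:
  fps: 30                     # hypothetical key: control frequency
```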
Demo
Getting Started
conda create -n vlash python=3.10
conda activate vlash
conda install ffmpeg=7.1.1 -c conda-forge
pip install -e .
Quick Examples
Fine-tune a VLA policy for your task, enabling smooth async inference without overhead:
vlash train examples/train/pi05/async.yaml
Run async inference on a robot:
vlash run examples/inference/async.yaml
Run async inference with 2x speedup:
vlash run examples/inference/async.yaml --action_quant_ratio=2
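One plausible reading of the quantization ratio is temporal subsampling of the predicted action chunk: keep every second action and halve the per-step duration, so the same trajectory finishes in roughly half the wall-clock time. The NumPy sketch below illustrates that interpretation; `quantize_chunk` is an illustrative helper, not VLASH's actual implementation.

```python
import numpy as np

def quantize_chunk(actions, ratio):
    """Subsample a (T, D) action chunk along the time axis by `ratio`.
    Illustrative helper (assumption), not VLASH's actual API."""
    return actions[::ratio]

# 8 timesteps of a 1-DoF trajectory from 0.0 to 1.0.
chunk = np.linspace(0.0, 1.0, num=8).reshape(8, 1)

# At ratio 2, the robot executes half as many steps over the same path,
# so execution finishes ~2x faster at coarser granularity.
fast = quantize_chunk(chunk, ratio=2)
print(fast.shape)  # (4, 1)
```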
TODO
- LoRA fine-tuning for $\pi_{0.5}$ and $\pi_0$ in under 12 GB of GPU memory
- QLoRA fine-tuning for $\pi_{0.5}$ and $\pi_0$ in under 8 GB of GPU memory
- Efficient fine-tuning with shared observation encoding
Acknowledgment
This project is built upon the following excellent open-source projects: LeRobot and PEFT.
License
Apache 2.0
