VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning
Highlights
- General Manipulation: Improves OpenVLA-7B with outcome-based multi-task reinforcement learning.
- Cutting-edge Architecture: Built with Ray+vLLM+LoRA+FSDP, the codebase is both scalable and flexible.
- Clean Implementation: Following cleanrl's philosophy, we provide a single-file implementation for easy reading and modification.
- Active Development: Work in progress; let's build it together.
TODO
- Support SERL-style Real-world RL
- Support More Environments (e.g., Roboverse)
- Support More VLAs (e.g., MiniVLA)
Installation
See INSTALL.md for installation instructions.
See ERROR_CATCH.md for help with common errors.
Quick Start
Before launching distributed training, edit the training script to set the dataset and model paths for your environment.
Training
# Usage: bash scripts/train_rl_vllm_ray_fsdp.sh <gpus> <task_ids>
bash scripts/train_rl_vllm_ray_fsdp.sh 0,1 0,1,2,3,4,5,6,7,8,9
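The training script takes two comma-separated arguments, `<gpus>` and `<task_ids>`. As a minimal sketch, the argument strings can be built programmatically rather than typed by hand. The helper `csv_range` and the variables `GPUS` and `TASK_IDS` are hypothetical names introduced here for illustration and are not part of the repository; reading `<gpus>` as a list of GPU device IDs is an assumption based on the example invocation.

```shell
# Hypothetical helper (not part of the repository):
# prints the integers from $1 to $2 joined by commas.
csv_range() {
  i=$1
  out=""
  while [ "$i" -le "$2" ]; do
    out="${out:+$out,}$i"   # prepend a comma only when out is non-empty
    i=$((i + 1))
  done
  printf '%s\n' "$out"
}

GPUS="0,1"                    # assumed to be GPU device IDs, as in the example invocation
TASK_IDS="$(csv_range 0 9)"   # expands to 0,1,2,3,4,5,6,7,8,9

# Print the command instead of running it, so the sketch has no side effects.
echo "bash scripts/train_rl_vllm_ray_fsdp.sh ${GPUS} ${TASK_IDS}"
```

Replacing the final `echo` with the command itself would launch training with the same arguments as the example invocation.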
Evaluation
# Parallel evaluation with a vectorized environment
bash scripts/eval_vllm_ray.sh 0,1
License
This repository is released under the Apache-2.0 license.
Acknowledgement
Our code is built upon open-instruct, OpenRLHF, verl, and openvla. We thank the authors for open-sourcing their code and for their contributions to the community.
Citation
If you find this repository helpful, please consider citing:
@misc{lu2025vlarl,
title={VLA-RL: Towards Masterful and General Robotic Manipulation with Scalable Reinforcement Learning},
author={Guanxing Lu and Chubin Zhang and Haonan Jiang and Yuheng Zhou and Zifeng Gao and Yansong Tang and Ziwei Wang},
year={2025},
howpublished={\url{https://congruous-farmhouse-8db.notion.site/VLA-RL-Towards-Masterful-and-General-Robotic-Manipulation-with-Scalable-Reinforcement-Learning-1953a2cd706280ecaad4e93a5bd2b8e3?pvs=4}},
note={Notion Blog}
}