GR-MG
This repo contains code for the paper:
Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy
Peiyan Li, Hongtao Wu*‡, Yan Huang*, Chilam Cheang, Liang Wang, Tao Kong
*Corresponding author ‡ Project lead
🌐 Project Website | 📄 Paper
News
- (🔥 New) (2024.12.18) Our paper was accepted by IEEE Robotics and Automation Letters (RA-L)!
- (🔥 New) (2024.08.27) We have released the code and checkpoints of GR-MG!
Preparation
Note: We have only tested GR-MG with CUDA 12.1 and Python 3.9.
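Since only one environment has been tested, it may help to check the local interpreter version before installing anything. The snippet below is a minimal sketch and not part of the repository: `version_matches` is a hypothetical helper, and it assumes `python` (and optionally `nvcc`) are on your `PATH`.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the repository): succeeds when a full
# version string such as "3.9.18" belongs to the expected "major.minor" series.
version_matches() {
    local actual="$1" expected="$2"
    case "$actual" in
        "$expected"|"$expected".*) return 0 ;;
        *) return 1 ;;
    esac
}

# Compare the interpreter found on PATH against the tested Python series.
if command -v python >/dev/null 2>&1; then
    py_version="$(python -c 'import platform; print(platform.python_version())')"
    if version_matches "$py_version" "3.9"; then
        echo "Python $py_version matches the tested 3.9 series"
    else
        echo "Warning: Python $py_version differs from the tested 3.9 series" >&2
    fi
else
    echo "Warning: no python executable found on PATH" >&2
fi

# Print the CUDA toolkit release line for manual comparison with 12.1.
if command -v nvcc >/dev/null 2>&1; then
    nvcc --version | grep -i "release"
fi
```

A mismatch only produces a warning. Other versions may work but have not been tested.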
# clone this repository
git clone https://github.com/bytedance/GR-MG.git
cd GR-MG
# install dependencies for goal image generation model
bash ./goal_gen/install.sh
# install dependencies for multi-modal goal conditioned policy
bash ./policy/install.sh
Download the pretrained InstructPix2Pix weights from Hugging Face and save them in resources/IP2P/.
Download the pretrained MAE encoder mae_pretrain_vit_base.pth and save it in resources/MAE/.
Download and unzip the CALVIN dataset.
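The directory layout described above can be prepared and checked with a short script. This is a sketch under stated assumptions: `prepare_resource_dirs` and `check_mae_checkpoint` are hypothetical helpers that are not shipped with the repository, and the downloads themselves still need to be done manually.

```shell
#!/usr/bin/env bash
# Hypothetical helper: create the resource directories named in the
# instructions under a chosen root (defaults to the current directory).
prepare_resource_dirs() {
    local root="${1:-.}"
    mkdir -p "$root/resources/IP2P" "$root/resources/MAE"
}

# Hypothetical helper: report whether the MAE checkpoint sits where the
# instructions say it should be saved. Returns non-zero when it is missing.
check_mae_checkpoint() {
    local root="${1:-.}"
    local ckpt="$root/resources/MAE/mae_pretrain_vit_base.pth"
    if [ -f "$ckpt" ]; then
        echo "found: $ckpt"
    else
        echo "missing: $ckpt" >&2
        return 1
    fi
}
```

From the repository root, `prepare_resource_dirs .` creates both folders, and `check_mae_checkpoint .` can be run after the download to confirm the file landed in the expected place.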
Checkpoints
Training
1. Train Goal Image Generation Model
# modify the variables in the script before you execute the following instruction
bash ./goal_gen/train_ip2p.sh ./goal_gen/config/train.json
2. Pretrain Multi-modal Goal Conditioned Policy
We use the method described in GR-1 and pretrain our policy with Ego4D videos. You can download the pretrained model checkpoint here. You can also pretrain the policy yourself using the scripts we provide. Before doing this, you'll need to download the Ego4D dataset.
# pretrain multi-modal goal conditioned policy
bash ./policy/main.sh ./policy/config/pretrain.json
3. Train Multi-modal Goal Conditioned Policy
After pretraining, set pretrained_model_path in ./policy/config/train.json to your pretrained checkpoint, then run the following command to train the policy.
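If you prefer to script the config edit rather than change the file by hand, something like the following may work. It is a sketch, not a tool provided by the repository: `set_pretrained_model_path` is a hypothetical helper, and it assumes that `pretrained_model_path` holds a double-quoted string on a single line of the JSON file.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the string value of "pretrained_model_path"
# in a JSON config, keeping a backup copy next to it as <config>.bak.
# Assumes the key and its quoted value are on one line, and that the new
# path contains no '|', '&', or backslash characters.
set_pretrained_model_path() {
    local config="$1" new_path="$2"
    sed -i.bak -E \
        "s|(\"pretrained_model_path\"[[:space:]]*:[[:space:]]*)\"[^\"]*\"|\1\"${new_path}\"|" \
        "$config"
}
```

For example, `set_pretrained_model_path ./policy/config/train.json /path/to/pretrained.pt` would update the entry, where `/path/to/pretrained.pt` is a placeholder for your own checkpoint. With the path set, the training command that follows can be run as usual.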
# train multi-modal goal conditioned policy
bash ./policy/main.sh ./policy/config/train.json
Evaluation
To evaluate our model on CALVIN, run the following command:
# Evaluate GR-MG on CALVIN
bash ./evaluate/eval.sh ./policy/config/train.json
In the eval.sh script, you can specify which goal image generation model and policy to use. Additionally, we provide multi-GPU evaluation code, allowing you to evaluate different training epochs of the policy simultaneously.
Contact
If you have any questions about the project, please contact peiyan.li@cripac.ia.ac.cn.
Acknowledgements
We thank the authors of the following projects for making their code and dataset open source:
Citation
If you find this project useful, please star the repository and cite our paper:
@article{li2025gr,
title={GR-MG: Leveraging Partially-Annotated Data Via Multi-Modal Goal-Conditioned Policy},
author={Li, Peiyan and Wu, Hongtao and Huang, Yan and Cheang, Chilam and Wang, Liang and Kong, Tao},
journal={IEEE Robotics and Automation Letters},
year={2025},
publisher={IEEE}
}
