GR-MG

This repo contains code for the paper:

Leveraging Partially Annotated Data via Multi-Modal Goal Conditioned Policy

Peiyan Li, Hongtao Wu^*‡, Yan Huang^*, Chilam Cheang, Liang Wang, Tao Kong

^*Corresponding author ^‡ Project lead

🌐 Project Website | 📄 Paper

News

(🔥 New) (2024.12.18) Our paper was accepted to IEEE Robotics and Automation Letters (RA-L)!
(🔥 New) (2024.08.27) We have released the code and checkpoints of GR-MG!

Preparation

Note: We have only tested GR-MG with CUDA 12.1 and Python 3.9.

# clone this repository
git clone https://github.com/bytedance/GR-MG.git
cd GR-MG
# install dependencies for goal image generation model
bash ./goal_gen/install.sh
# install dependencies for multi-modal goal conditioned policy
bash ./policy/install.sh

Download the pretrained InstructPix2Pix weights from Hugging Face and save them in resources/IP2P/. Download the pretrained MAE encoder mae_pretrain_vit_base.pth and save it in resources/MAE/. Then download and unzip the CALVIN dataset.
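
The resource layout described above can be checked before training starts. The following is a minimal sketch, assuming it is run against the repository root; the helper name `check_resources` is hypothetical and not part of this repo. Because the InstructPix2Pix file names depend on the download, the sketch only checks that resources/IP2P/ is non-empty.

```shell
# Hypothetical helper (not part of the GR-MG repo): verify the resource
# layout described in this README, relative to a given root directory.
check_resources() {
    local root="${1:-.}"
    local status=0

    # Create the directories this README expects, if they are missing.
    mkdir -p "$root/resources/IP2P" "$root/resources/MAE"

    # The MAE encoder file name is the one given in this README.
    if [ ! -f "$root/resources/MAE/mae_pretrain_vit_base.pth" ]; then
        echo "missing: resources/MAE/mae_pretrain_vit_base.pth"
        status=1
    fi

    # The InstructPix2Pix file names depend on the download, so only
    # check that something has been placed in the directory.
    if [ -z "$(ls -A "$root/resources/IP2P")" ]; then
        echo "empty: resources/IP2P/"
        status=1
    fi

    return "$status"
}
```

For example, `check_resources . && echo "resources look complete"` prints the confirmation only when both checks pass.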

Checkpoints

Training

1. Train Goal Image Generation Model

# modify the variables in the script before running the following command
bash ./goal_gen/train_ip2p.sh  ./goal_gen/config/train.json

2. Pretrain Multi-modal Goal Conditioned Policy

We use the method described in GR-1 and pretrain our policy with Ego4D videos. You can download the pretrained model checkpoint here. You can also pretrain the policy yourself using the scripts we provide. Before doing this, you'll need to download the Ego4D dataset.

# pretrain multi-modal goal conditioned policy
bash ./policy/main.sh  ./policy/config/pretrain.json

3. Train Multi-modal Goal Conditioned Policy

After pretraining, set pretrained_model_path in ./policy/config/train.json to the pretrained checkpoint, then run the following command to train the policy.

# train multi-modal goal conditioned policy
bash ./policy/main.sh  ./policy/config/train.json
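
The pretrained_model_path entry can be edited by hand, or scripted when several runs are launched. The sketch below is one possible way to do it, assuming the key maps to a double-quoted string in the JSON file; the helper name `set_pretrained_path` and the example checkpoint path are hypothetical and not part of this repo.

```shell
# Hypothetical helper (not part of the GR-MG repo): rewrite the string value
# of "pretrained_model_path" in a JSON config, leaving other lines untouched.
# Assumes the new path contains no '|', '&', '"', or backslash characters.
set_pretrained_path() {
    local config="$1"
    local ckpt="$2"
    local tmp
    tmp="$(mktemp)"

    # Replace only the quoted value that follows the key; write through a
    # temporary file so the edit works with both GNU and BSD sed.
    sed -E "s|(\"pretrained_model_path\"[[:space:]]*:[[:space:]]*)\"[^\"]*\"|\1\"${ckpt}\"|" \
        "$config" > "$tmp" && cat "$tmp" > "$config"
    rm -f "$tmp"
}

# Example (the checkpoint path is a placeholder):
# set_pretrained_path ./policy/config/train.json checkpoints/pretrained.pth
```

A text substitution is used here instead of a JSON tool so that the rest of the config file keeps its original formatting.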

Evaluation

To evaluate our model on CALVIN, run the following command:

# Evaluate GR-MG on CALVIN
bash ./evaluate/eval.sh  ./policy/config/train.json

In the eval.sh script, you can specify which goal image generation model and policy to use. Additionally, we provide multi-GPU evaluation code, allowing you to evaluate different training epochs of the policy simultaneously.
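
To illustrate the general idea of evaluating several checkpoints at once, the sketch below assigns one job per GPU in round-robin order using the standard CUDA_VISIBLE_DEVICES environment variable. It is not the multi-GPU evaluation code shipped in this repo; the helper name `run_per_gpu` is hypothetical, and the command passed to it stands in for whatever evaluates a single checkpoint in your setup.

```shell
# Hypothetical sketch (not the repo's multi-GPU evaluation code): run one
# job per checkpoint, assigning GPUs round-robin via CUDA_VISIBLE_DEVICES.
# Usage: run_per_gpu NUM_GPUS COMMAND CKPT [CKPT...]
# COMMAND is called with the checkpoint path as its only argument.
run_per_gpu() {
    local num_gpus="$1"
    local cmd="$2"
    shift 2

    local i=0
    local ckpt
    for ckpt in "$@"; do
        # Each background job sees exactly one GPU.
        CUDA_VISIBLE_DEVICES=$(( i % num_gpus )) "$cmd" "$ckpt" &
        i=$(( i + 1 ))
        # Once every GPU has a job, wait for the whole batch to finish.
        if [ $(( i % num_gpus )) -eq 0 ]; then
            wait
        fi
    done
    # Wait for any jobs left over from a partial final batch.
    wait
}
```

Waiting per batch keeps at most one job on each GPU at a time, at the cost of idling faster GPUs until the slowest job in the batch finishes.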

Contact

If you have any questions about the project, please contact peiyan.li@cripac.ia.ac.cn.

Acknowledgements

We thank the authors of the following projects for making their code and dataset open source:

Citation

If you find this project useful, please star the repository and cite our paper:

@article{li2025gr,
  title={GR-MG: Leveraging Partially-Annotated Data Via Multi-Modal Goal-Conditioned Policy},
  author={Li, Peiyan and Wu, Hongtao and Huang, Yan and Cheang, Chilam and Wang, Liang and Kong, Tao},
  journal={IEEE Robotics and Automation Letters},
  year={2025},
  publisher={IEEE}
}