GitHub - LYX0501/DiscussNav

DiscussNav

In this work, we introduce a novel zero-shot VLN framework. Within this framework, large models possessing distinct abilities are served as domain experts. Our proposed navigation agent, namely DiscussNav, can actively discuss with these experts to collect essential information before moving at every step. These discussions cover critical navigation subtasks like instruction understanding, environment perception, and completion estimation. The performances on the representative VLN task R2R show that our method surpasses the leading zero-shot VLN model by a large margin on all metrics.

DiscussNav

Requirements

Ubuntu 18.04.6 LTS

Python 3.8.17

Torch 1.13.1

Matterport3DSimulator

Recognize Anything (RAM)

InsturctBLIP

R2R Data

We have prepared R2R Val Unseen data in the tasks/data directory.

Run DiscussNav

BibTex

Please cite our paper if you find it helpful :)

@article{long2023discuss,
  title={Discuss before moving: Visual language navigation via multi-expert discussions},
  author={Long, Yuxing and Li, Xiaoqi and Cai, Wenzhe and Dong, Hao},
  journal={arXiv preprint arXiv:2309.11382},
  year={2023}
}