Vision and Language Group@ MIL

Popular repositories

  1. Deep Modular Co-Attention Networks for Visual Question Answering

    Python 458 89

  2. A lightweight, scalable, and general framework for visual question answering research

    Python 331 64

  3. A PyTorch reimplementation of bottom-up-attention models

    Jupyter Notebook 301 76

  4. Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".

    Python 278 28

  5. imp

    A family of highly capable yet efficient large multimodal models

    Python 193 15

  6. A VideoQA dataset based on videos from ActivityNet

    Python 91 10