Vision and Language Group @ MIL

Popular repositories

Deep Modular Co-Attention Networks for Visual Question Answering

Python 458 89
A lightweight, scalable, and general framework for visual question answering research

Python 331 64
A PyTorch reimplementation of bottom-up-attention models

Jupyter Notebook 301 76
Implementation of CVPR 2023 paper "Prompting Large Language Models with Answer Heuristics for Knowledge-based Visual Question Answering".

Python 278 28
imp

A family of highly capable yet efficient large multimodal models

Python 193 15
A VideoQA dataset based on videos from ActivityNet

Python 91 10