bfshi - Overview

Baifeng Shi bfshi

Pinned

  1. Scaling Vision Pre-Training to 4K Resolution

    Python, 221 stars, 10 forks

  2. When do we not need larger vision models?

    Python, 413 stars, 15 forks

  3. VILA is a family of state-of-the-art vision language models (VLMs) for diverse multimodal AI tasks across the edge, data center, and cloud.

    Python, 3.8k stars, 315 forks

  4. Official code for "TOAST: Transfer Learning via Attention Steering"

    Python, 188 stars, 10 forks

  5. Official code for "Top-Down Visual Attention from Analysis by Synthesis" (CVPR 2023 highlight)

    Jupyter Notebook, 167 stars, 13 forks