Multimedia Computing Group, Nanjing University
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Python, 1.7k stars, 160 forks
[CVPR 2022 Oral & TPAMI 2024] MixFormer: End-to-End Tracking with Iterative Mixed Attention
Python, 517 stars, 72 forks
[CVPR 2025] Multiple Object Tracking as ID Prediction
Python, 478 stars, 39 forks
[NeurIPS 2025 Spotlight] StreamForest: Efficient Online Video Understanding with Persistent Event Memory
Python, 141 stars, 4 forks
[CVPR 2026] DDT: Decoupled Diffusion Transformer
Python, 365 stars, 17 forks
SAM 2++: Tracking Anything at Any Granularity
Python, 56 stars, 5 forks