Xindi Wu

Xindi (Cindy) Wu

My research focuses on data for efficient multimodal machine learning. I develop data-centric approaches that trace model performance back to training data composition and introduce data-driven capabilities, with the goal of building more scalable and capable vision-language systems.

My work sits at the intersection of three core pillars:

I am happy to collaborate and to answer questions about my research; feel free to send me an email. I especially encourage students from underrepresented groups to reach out.

Email / CV / Google Scholar / LinkedIn / GitHub / X

Selected Publications & Preprints

Preprint 2026

Preprint 2026, ICCV Workshop on Reliable and Interactable World Models 2025

ICLR 2026

ICLR 2026

CVPR 2026

Preprint 2025

NeurIPS Datasets & Benchmarks 2024, ECCV Knowledge in Generative Models Workshop (Spotlight)

TMLR 2024, ECCV Dataset Distillation Workshop (Best Paper)

CVPR 2023

KDD 2022

CVPR 2022

CVPR 2020 (Oral)

CVPR Workshop 2020

BIBM 2019 (Oral)

Computational Biology, Codon Publications, Brisbane, Australia, 2019

ICIP 2019

DSP 2018