Haotian Liu

I am a member of technical staff at xAI. I obtained my Ph.D. from the University of Wisconsin-Madison in May 2024, under the supervision of Prof. Yong Jae Lee. During my Ph.D., I was fortunate to work with Dr. Chunyuan Li at Microsoft Research. Before this, I obtained my bachelor’s degree (with honors) from Zhejiang University, where I worked with Prof. Xiaogang Jin and Prof. Fei Wu.

I am generally interested in computer vision and machine learning. My recent focus is on building steerable large models. The first baby is LLaVA.

I am a core contributor to Grok-1.5V and Grok-2. I led the vision effort of Grok-3 and Grok-3 Reasoning.

selected publications

Blog

LLaVA-NeXT: Improved reasoning, OCR, and world knowledge

Jan 2024
Improved Baselines with Visual Instruction Tuning (LLaVA-1.5)

CVPR, 2024
Visual Instruction Tuning (LLaVA)

NeurIPS, 2023 (Oral, top 0.5%)
Learning Customized Visual Models with Retrieval-Augmented Knowledge

CVPR, 2023 (Highlight, top 2.5%)
GLIGEN: Open-Set Grounded Text-to-Image Generation

CVPR, 2023
ELEVATER: A Benchmark and Toolkit for Evaluating Language-Augmented Visual Models

Chunyuan Li*, Haotian Liu*, Liunian Li, Pengchuan Zhang, Jyoti Aneja, Jianwei Yang, Ping Jin, Houdong Hu, Zicheng Liu, Yong Jae Lee, and Jianfeng Gao

NeurIPS, Datasets and Benchmarks Track, 2022
Masked Discrimination for Self-Supervised Learning on Point Clouds

ECCV, 2022
YolactEdge: Real-time Instance Segmentation on the Edge

ICRA, 2021