Yunhao (Andy) Ge

DreamZero: World Action Models are Zero-shot Policies

Seonghyeon Ye†, Yunhao Ge*, Kaiyuan Zheng*, Shenyuan Gao*, Sihyun Yu*, George Kurian*, Suneel Indupuru*, You Liang Tan*, Chuning Zhu, Jiannan Xiang, Ayaan Malik, Kyungmin Lee, William Liang, Nadun Ranawaka, Jiasheng Gu, Yinzhen Xu, Guanzhi Wang, Fengyuan Hu, Avnish Narayan, Johan Bjorck, Jing Wang, Gwanghyun Kim, Dantong Niu, Ruijie Zheng, Yuqi Xie, Jimmy Wu, Qi Wang, Ryan Julian, Danfei Xu, Yilun Du, Yevgen Chebotar, Scott Reed, Jan Kautz, Yuke Zhu†, Linxi "Jim" Fan†, Joel Jang†
(*=Core Contributors, †=Project Lead)

[paper] [project page] [code] [huggingface]

Cosmos-Policy: Cosmos-powered Multi-agent Policy Model for Physical AI

Moo Jin Kim, Yihuai Gao, Tsung-Yi Lin, Yen-Chen Lin, Yunhao Ge, Grace Lam, Percy Liang, Shuran Song, Ming-Yu Liu, Chelsea Finn, Jinwei Gu

ICLR 2026.

[paper] [project page] [code] [huggingface]

Top 2% of submissions by average score at ICLR 2026

GR00T N1.6: An Improved Open Foundation Model for Generalist Humanoid Robots

NVIDIA (Yunhao Ge: core contributor)

[research blog] [code] [huggingface]

I-Scene: 3D Instance Models are Implicit Generalizable Spatial Learners

Lu Ling, Yunhao Ge, Yichen Sheng, Aniket Bera

CVPR 2026.

[paper] [project page] [code] [huggingface]

Cosmos-Predict2.5 and Cosmos-Transfer2.5: Improved World Simulation with Video Foundation Models for Physical AI

NVIDIA (Yunhao Ge: core contributor)

[paper] [project page] [code] [huggingface]

Cosmos-Transfer1: Conditional World Generation with Adaptive Multimodal Control

NVIDIA (Yunhao Ge: core contributor)

[paper] [project page] [code] [huggingface] [video]

Cosmos: World Foundation Model Platform for Physical AI

NVIDIA (Yunhao Ge: core contributor)

Best AI + Best overall of CES 2025

[paper] [project page] [code] [huggingface] [video] [Demo API]

Describe Anything: Detailed Localized Image and Video Captioning

Long Lian, Yifan Ding, Yunhao Ge, Sifei Liu, Hanzi Mao, Boyi Li, Marco Pavone, Ming-Yu Liu, Trevor Darrell, Adam Yala, Yin Cui

ICCV 2025.

[paper] [code] [project page] [demo]

Edify 3D: Scalable High-Quality 3D Asset Generation

NVIDIA (Yunhao Ge: core contributor)

[paper] [project page] [video]

Edify Image: High-Quality Image Generation with Pixel Space Laplacian Diffusion Models

NVIDIA (Yunhao Ge: core contributor)

[paper] [project page] [video]

GenUSD: 3D Scene Generation Made Easy

Jiashu Xu, Yunhao Ge, Yifan Ding, Yin Cui, Chen-Hsuan Lin, Xiaohui Zeng, Zekun Hao, Zhaoshuo Li, Donglai Xiang, Qianli Ma, Fangyin Wei, JP Lewis, Qinsheng Zhang, Seungjun Nah, Arun Mallya, Jingyi Jin, Hanzi Mao, Yen-Chen Lin, Pooya Jannaty, Tsung-Yi Lin, Ming-Yu Liu

ACM SIGGRAPH Real-Time Live! 2024.

[paper] [project page]

BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation

Yunhao Ge*, Yihe Tang*, Jiashu Xu*, Cem Gokmen*, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Martín, Miao Liu, Pengchuan Zhang, Ruohan Zhang, Li Fei-Fei, Jiajun Wu
(*=equal contribution)

CVPR 2024

[paper] [code] [project page] [tools]

Highlight (top 12% of all accepted papers)

Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation

Yunhao Ge, Xiaohui Zeng, Jacob Samuel Huffman, Tsung-Yi Lin, Ming-Yu Liu, Yin Cui

CVPR 2024

[paper] [video] [project page]

DreamDistribution: Prompt Distribution Learning for Text-to-Image Diffusion Models

Brian Nlong Zhao, Yuhang Xiao*, Jiashu Xu*, Xinyang Jiang, Yifan Yang, Dongsheng Li, Laurent Itti, Vibhav Vineet†, Yunhao Ge†
(*=co-second author, †=equal contribution)

ICLR 2025.

[paper] [code] [project page]

3D Copy-Paste: Physically-Plausible Object Insertion for Monocular 3D Detection

Yunhao Ge, Hong-Xing Yu, Cheng Zhao, Yuliang Guo, Xinyu Huang, Liu Ren, Laurent Itti, Jiajun Wu

NeurIPS 2023

[paper] [code] [project page]

DALL-E for Detection: Language-driven Compositional Image Synthesis for Object Detection
Beyond Generation: Harnessing Text to Image Models for Object Detection and Segmentation

Yunhao Ge*, Jiashu Xu*, Brian Nlong Zhao, Neel Joshi, Laurent Itti, Vibhav Vineet
(*=equal contribution)

[paper(Beyond Generation)] [paper(DALL-E for Detection)] [code]

Improving Zero-shot Generalization and Robustness of Multi-modal Models

Yunhao Ge*, Jie Ren*, Andrew Gallagher, Yuxiao Wang, Ming-Hsuan Yang, Hartwig Adam, Laurent Itti, Balaji Lakshminarayanan, and Jiaping Zhao
(*=equal contribution)

CVPR 2023

[paper] [code] [project page]

Neural-Sim: Learning to Generate Training Data with NeRF

Yunhao Ge, Harkirat Behl*, Jiashu Xu*, Suriya Gunasekar, Neel Joshi, Yale Song, Xin Wang, Laurent Itti, and Vibhav Vineet
(*=equal contribution as second author)

ECCV 2022

[paper] [code]

Zero-shot Synthesis with Group-Supervised Learning

Yunhao Ge, Sami Abu-El-Haija, Gan Xin, and Laurent Itti

ICLR 2021

[paper] [code] [project page] [Fonts Dataset] [USC Viterbi Press] [知乎] [AI科技评论]
[USC News] [Tech Xplore] [Technology Networks]

NeurIPS 2023, 2022, 2021
CVPR 2023, 2022
ECCV 2022
ICCV 2023, 2021
ICLR 2023, 2022
ICML 2022
WACV 2023