Baoxiong Jia @ BIGAI
News
- New Invited tutorial at EIS 2025 hosted by ACM SIGEMBED China, check out the slides!
- New Invited talk at EAIRCon 2025 on 3D Gaussian World Models, check out the slides!
- New SceneWeaver receives the Best Paper Award at RoboGen@IROS25, check out the slides and talk (EN)!
- New We won first place at the IROS 25 UniTree Dancing Challenge!
- New RoboVerse receives the Best Open-source Award at RoboGen@IROS25!
- 2025/10 Invited talk at HKU and 3DCVer on UniFP and COLA, check out the slides and talk (CN)!
- 2025/09 UniFP receives the Best Paper Award at CoRL 2025! Oral talk available here!
- 2025/09 One paper on Agentic 3D Scene Generation is accepted by NeurIPS 2025.
- 2025/08 We won the humanoid dancing championship at the World Humanoid Robot Games (WHRG)!
- 2025/06 One paper on Unified Force and Position Control is accepted by CoRL 2025 as an Oral!
- 2025/06 Two papers on 4D World Model and Embodied Vision Language are accepted by ICCV 2025!
- 2025/06 I’m co-organizing the 5th 3D Scene Understanding workshop at CVPR 2025. See you in Nashville!
- 2025/04 RoboVerse is accepted by RSS 2025! Go check it out here!
- 2025/03 I recently gave a summary of our work at Boston Dynamics. Check out the slides!
- 2025/02 Four papers on 3D Scene Understanding and Reconstruction are accepted by CVPR 2025!
- 2025/01 Two papers on Mobile Manipulation and Articulated Part Generation are accepted by ICRA 2025!
- 2025/01 One paper on Articulated Object Reconstruction is accepted by ICLR 2025!
Selected Recent Publications (All publications)

SceneWeaver: All-in-One 3D Scene Synthesis with an Extensible and Self-Reflective Agent
Advances in Neural Information Processing Systems (NeurIPS) 2025 (RoboGen@IROS 2025 Best Paper Award)
(* indicates equal contribution. # indicates corresponding author.)

Learning Unified Force and Position Control for Legged Loco-Manipulation
Conference on Robot Learning (CoRL) 2025 (Best Paper Award)
(* indicates equal contribution. # indicates corresponding author.)

GWM: Toward Scalable Gaussian World Models for Robotic Manipulation
International Conference on Computer Vision (ICCV) 2025
(* indicates equal contribution. # indicates corresponding author.)

MetaScenes: Towards Automated Replica Creation for Real-world 3D Scans
Huangyue Yu*, Baoxiong Jia*, Yixin Chen*, Yandan Yang, Puhao Li, Rongpeng Su, Jiaxin Li, Qing Li, Wei Liang, Song-Chun Zhu, Tengyu Liu, Siyuan Huang.
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2025
(* indicates equal contribution.)
Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2025
(* indicates equal contribution.)

Masked Point-Entity Contrast for Open-Vocabulary 3D Scene Understanding
IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2025
(* indicates equal contribution.)

RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning
Haoran Geng*, Feishi Wang*, , Yuyang Li*, Bangjun Wang*, Boshi An*, Charlie Tianyue Cheng*, Haozhe Lou, Peihao Li, Yen-Jen Wang, Yutong Liang, Dylan Goetting, Chaoyi Xu, Haozhe Chen, Yuxi Qian, Yiran Geng, Jiageng Mao, Weikang Wan, Mingtong Zhang, Jiangran Lyu, Siheng Zhao, Jiazhao Zhang, Jialiang Zhang, Chengyang Zhao, Haoran Lu, Yufei Ding, Ran Gong, , Yuxuan Kuang, Ruihai Wu, Baoxiong Jia, Carlo Sferrazza, Hao Dong, Siyuan Huang#, Yue Wang#, Jitendra Malik#, Pieter Abbeel#.
Robotics: Science and Systems (RSS) 2025 (RoboGen@IROS 2025 Best Open-source Award)
(* indicates equal contribution. # indicates corresponding author.)

Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting
International Conference on Learning Representations (ICLR) 2025
(* indicates equal contribution.)

Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V
International Conference on Robotics and Automation (ICRA) 2025
(* indicates equal contribution. # indicates corresponding author.)

MSR3D: Multi-modal Situated Reasoning in 3D Scenes
Advances in Neural Information Processing Systems (NeurIPS) 2024
(* indicates equal contribution. # indicates corresponding author.)

SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding
European Conference on Computer Vision (ECCV) 2024
OpenSUN3D @ ECCV 2024 (* indicates equal contribution.)

SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields
European Conference on Computer Vision (ECCV) 2024
Wild3D @ ECCV 2024 (* indicates equal contribution.)

An Embodied Generalist Agent in 3D World
International Conference on Machine Learning (ICML) 2024
GenAI4DM & AGI @ ICLR 2024 (* indicates equal contribution.)

PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI
Conference on Computer Vision and Pattern Recognition (CVPR) 2024 (Highlight)
AI3DG @ CVPR 2024 (* indicates equal contribution.)

Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance
Conference on Computer Vision and Pattern Recognition (CVPR) 2024 (Highlight)
HuMoGen @ CVPR 2024