Baoxiong Jia @ BIGAI

News

New Invited tutorial at EIS 2025 hosted by ACM SIGEMBED China, check out the slides!
New Invited talk at EAIRCon 2025 on 3D Gaussian World Models, check out the slides!
New SceneWeaver receives the Best Paper Award at RoboGen@IROS25, check out the slides and talk (EN)!
New We won first place at the IROS 25 UniTree Dancing Challenge!
New RoboVerse receives the Best Open-source Award at RoboGen@IROS25!
2025/10 Invited talk at HKU and 3DCVer on UniFP and COLA, check out the slides and talk (CN)!
2025/09 UniFP receives the Best Paper Award at CoRL 2025! Oral talk available here!
2025/09 One paper on Agentic 3D Scene Generation is accepted by NeurIPS 2025.
2025/08 We won the humanoid dancing championship at the World Humanoid Robot Games (WHRG)!
2025/06 One paper on Unified Force and Position Control is accepted by CoRL 2025 as Oral!
2025/06 Two papers on 4D World Model and Embodied Vision Language are accepted by ICCV 2025!
2025/06 I’m co-organizing the 5th 3D Scene Understanding workshop at CVPR 2025. See you in Nashville!
2025/04 RoboVerse is accepted by RSS 2025! Go check it out here!
2025/03 I recently gave a summary of our work at Boston Dynamics. Check out the slides!
2025/02 Four papers on 3D Scene Understanding and Reconstruction are accepted by CVPR 2025!
2025/01 Two papers on Mobile Manipulation and Articulated Part Generation are accepted by ICRA 2025!
2025/01 One paper on Articulated Object Reconstruction is accepted by ICLR 2025!

Selected Recent Publications (All publications)

SceneWeaver: All-in-One 3D Scene Synthesis with an Extensible and Self-Reflective Agent

Advances in Neural Information Processing Systems (NeurIPS) 2025 (RoboGen@IROS 2025 Best Paper Award)
(* indicates equal contribution. # indicates corresponding author.)

Learning Unified Force and Position Control for Legged Loco-Manipulation

Conference on Robot Learning (CoRL) 2025 (Best Paper Award)
(* indicates equal contribution. # indicates corresponding author.)

GWM: Toward Scalable Gaussian World Models for Robotic Manipulation

International Conference on Computer Vision (ICCV) 2025
(* indicates equal contribution. # indicates corresponding author.)

MetaScenes: Towards Automated Replica Creation for Real-world 3D Scans

Huangyue Yu*, Baoxiong Jia*, Yixin Chen*, Yandan Yang, Puhao Li, Rongpeng Su, Jiaxin Li, Qing Li, Wei Liang, Song-Chun Zhu, Tengyu Liu, Siyuan Huang.

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2025
(* indicates equal contribution.)

Unveiling the Mist over 3D Vision-Language Understanding: Object-centric Evaluation with Chain-of-Analysis

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2025
(* indicates equal contribution.)

Masked Point-Entity Contrast for Open-Vocabulary 3D Scene Understanding

IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2025
(* indicates equal contribution.)

RoboVerse: Towards a Unified Platform, Dataset and Benchmark for Scalable and Generalizable Robot Learning

Haoran Geng*, Feishi Wang*, Songlin Wei*, Yuyang Li*, Bangjun Wang*, Boshi An*, Charlie Tianyue Cheng*, Haozhe Lou, Peihao Li, Yen-Jen Wang, Yutong Liang, Dylan Goetting, Chaoyi Xu, Haozhe Chen, Yuxi Qian, Yiran Geng, Jiageng Mao, Weikang Wan, Mingtong Zhang, Jiangran Lyu, Siheng Zhao, Jiazhao Zhang, Jialiang Zhang, Chengyang Zhao, Haoran Lu, Yufei Ding, Ran Gong, Yuran Wang, Yuxuan Kuang, Ruihai Wu, Baoxiong Jia, Carlo Sferrazza, Hao Dong, Siyuan Huang#, Yue Wang#, Jitendra Malik#, Pieter Abbeel#.

Robotics: Science and Systems (RSS) 2025 (RoboGen@IROS 2025 Best Open-source Award)
(* indicates equal contribution. # indicates corresponding author.)

Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting

International Conference on Learning Representations (ICLR) 2025
(* indicates equal contribution.)

Closed-Loop Open-Vocabulary Mobile Manipulation with GPT-4V

International Conference on Robotics and Automation (ICRA) 2025
(* indicates equal contribution. # indicates corresponding author.)

MSR3D: Multi-modal Situated Reasoning in 3D Scenes

Advances in Neural Information Processing Systems (NeurIPS) 2024
(* indicates equal contribution. # indicates corresponding author.)

SceneVerse: Scaling 3D Vision-Language Learning for Grounded Scene Understanding

European Conference on Computer Vision (ECCV) 2024
OpenSUN3D @ ECCV 2024 (* indicates equal contribution.)

SlotLifter: Slot-guided Feature Lifting for Learning Object-centric Radiance Fields

European Conference on Computer Vision (ECCV) 2024
Wild3D @ ECCV 2024 (* indicates equal contribution.)

An Embodied Generalist Agent in 3D World

International Conference on Machine Learning (ICML) 2024
GenAI4DM & AGI @ ICLR 2024 (* indicates equal contribution.)

PhyScene: Physically Interactable 3D Scene Synthesis for Embodied AI

Conference on Computer Vision and Pattern Recognition (CVPR) 2024 (Highlight)
AI3DG @ CVPR 2024 (* indicates equal contribution.)

Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene Affordance

Conference on Computer Vision and Pattern Recognition (CVPR) 2024 (Highlight)
HuMoGen @ CVPR 2024