Yiming Dou

  • 08/2025: 🥳 Our paper "Cross-Sensor Touch Generation" is selected as an oral presentation at CoRL 2025!
  • 08/2025: 🎉 One paper accepted to CoRL 2025! See you in Seoul!
  • 02/2025: 🎉 One paper accepted to CVPR 2025! See you in Nashville!
  • 01/2025: 🎉 Two papers accepted to ICRA 2025! See you in Atlanta!
  • 09/2024: 🥳 Selected as an Outstanding Reviewer for ECCV 2024!
  • 02/2024: 🎉 Three papers accepted to CVPR 2024! See you in Seattle!
  • Humans perceive the world through multiple senses, from which we form abstract concepts to understand it. Building on these concepts, we develop the ability to reason logically, which in turn enables remarkable achievements. Inspired by this, my dream is to design human-like multisensory intelligent systems, which I break down into four specific problems:

  • Multimodal Perception: how to perceive and model the multimodal physical world.
  • Concept Learning: how to abstract the perceived information into high-level concepts.
  • Reasoning: how to perform causal reasoning on the basis of concepts.
  • Robot Learning: how to enable robots to actively interact with the real-world environments and humans.
  • Publications

    (* indicates equal contribution)

    Cross-Sensor Touch Generation
    Samanta Rodriguez*, Yiming Dou*, Miquel Oller, Andrew Owens, Nima Fazeli
    CoRL 2025 (Oral)
    paper / project page

    We learn to translate touch signals captured from one touch sensor to another, which allows us to transfer object manipulation policies between sensors.

    Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes
    Yiming Dou, Wonseok Oh, Yuqing Luo, Antonio Loquercio, Andrew Owens
    CVPR 2025
    paper / project page / code

    We make 3D scene reconstruction interactive by predicting the sounds of human hands physically interacting with the scene.

    Tactile-Augmented Radiance Fields
    Yiming Dou, Fengyu Yang, Yi Liu, Antonio Loquercio, Andrew Owens
    CVPR 2024
    paper / project page / code

    We present a visuo-tactile 3D scene representation that can estimate the visual and tactile signals for a given 3D position within the scene.

    The ObjectFolder Benchmark: Multisensory Learning with Neural and Real Objects
    Ruohan Gao*, Yiming Dou*, Hao Li*, Tanmay Agarwal, Jeannette Bohg, Yunzhu Li, Li Fei-Fei, Jiajun Wu
    CVPR 2023
    paper / project page / code / interactive demo / video

    We introduce a benchmark suite for multisensory object-centric learning with sight, sound, and touch. We also introduce a dataset of multisensory measurements for real-world objects.

  • Conference reviewer: ICRA (2025), ICLR (2025, 2026), AAAI (2025), ECCV (2024), CVPR (2023, 2024, 2025, 2026), ICCV (2023, 2025), ACMMM (2025)
  • Journal reviewer: IEEE RA-L (2025)
  • Outstanding Reviewer, ECCV 2024
  • Zhiyuan Scholarship (top 30 students), SJTU, 2023
  • Outstanding Graduate, SJTU, 2023
  • Zhiyuan Honors Scholarship (top 5%), SJTU, 2019-2022
  • As someone who works on building multisensory systems, I also enjoy being a multisensory embodied agent outside of work:

  • 📷 Photography: I've been learning to take photos since I was 7 years old, and have been fortunate to capture some impressive moments along the way. See some of them here!
  • 🎵 Classical music: I love listening to classical music, especially works from the Viennese Classical period through the Romantic period.
  • 💪 Tennis: Despite playing for 2+ years, I still consider myself a beginner -- probably around NTRP level 3.0? -- but I really enjoy it and look forward to getting better!