
I’m a Ph.D. candidate in the Robotics and Engineering Design Group at the Norwegian University of Science and Technology (NTNU), where I focus on applying learning methods to robotic planning and control.

Research Interests

My research centers on Imitation Learning, specifically the use of Flow- and Diffusion Models for robotic planning and control. While these methods are powerful out of the box, important and interesting challenges arise when applying them to robots. I’ve worked on models that speed up prediction and enable more complex, long-horizon planning.

My most recent work includes:

  • Streaming Diffusion Policy (ICRA 2025): A novel inference paradigm for diffusion-based policies for robotic visuomotor control.
  • Hybrid Diffusion Planning (Under review): A diffusion-based planner that achieves significantly higher success rates on long-horizon tasks than baselines by concurrently constructing a high-level symbolic plan.

In addition, I have collaborated on several other research projects. I’m currently finishing my Ph.D. thesis, so if my experience sounds interesting, please reach out!

Background

I completed my master’s degree at NTNU, which included an academic exchange at ETH Zürich. I have completed coursework in Machine Learning, Robotics, and Computer Vision. My master’s thesis focused on Reinforcement Learning methods for robotic grasping, comparing different algorithms and discussing the challenges of applying RL to robotic manipulation tasks.

Download my CV (PDF)

Selected Publications

  1. RSS Workshop


    Hybrid Diffusion for Simultaneous Symbolic and Continuous Planning

    2nd Workshop on Semantic Reasoning and Goal Understanding in Robotics (SemRob) at the Robotics: Science and Systems Conference (RSS), 2025

    Constructing robots to accomplish long-horizon tasks is a long-standing challenge within artificial intelligence. Approaches using generative methods, particularly Diffusion Models, have gained attention due to their ability to model continuous robotic trajectories for planning and control. However, we show that these models struggle with long-horizon tasks that involve complex decision-making and, in general, are prone to confusing different modes of behavior, leading to failure. To remedy this, we propose to augment continuous trajectory generation by simultaneously generating a high-level symbolic plan. We show that this requires a novel mix of discrete variable diffusion and continuous diffusion, which dramatically outperforms the baselines. In addition, we illustrate how this hybrid diffusion process enables flexible trajectory synthesis, allowing us to condition synthesized actions on partial and complete discrete conditions.

  2. Under Review


    NoisyBCT: Robust and Reactive Imitation Learning from Image Sequences

    Aksel Vaaler, Sigmund Hennum Høeg, Helle Stige, and Christian Holden

    2025

    Under review

    Robotic imitation learning (IL) in dynamic environments—where object positions or external forces change unpredictably—poses a major challenge for current state-of-the-art methods. These methods often rely on multi-step, open-loop action execution for temporal consistency, but this approach hinders reactivity and adaptation under dynamic conditions. We propose Noise Augmented Behavior Cloning Transformer (NoisyBCT), a robust and responsive IL method that predicts single-step actions based on a sequence of past image observations. To mitigate the susceptibility to covariate shift that arises from longer observation horizons, NoisyBCT injects adversarial noise into low-dimensional spatial image embeddings during training. This enhances robustness to out-of-distribution states while preserving semantic content. We evaluate NoisyBCT on three simulated manipulation tasks and one real-world task, each featuring dynamic disturbances. NoisyBCT consistently outperforms the vanilla BC Transformer and the state-of-the-art Diffusion Policy across all environments. Our results demonstrate that NoisyBCT enables both temporally consistent and reactive policy learning for dynamic robotic tasks.

  3. RSS Workshop


    Flexible Multitask Learning with Factorized Diffusion Policy

    Chaoqi Liu, Haonan Chen, Sigmund Hennum Høeg, Shaoxiong Yao, Yunzhu Li, Kris Hauser, and Yilun Du

    2nd Workshop on Semantic Reasoning and Goal Understanding in Robotics (SemRob) at the Robotics: Science and Systems Conference (RSS), 2025

    Spotlight

    In recent years, large-scale behavioral cloning has emerged as a promising paradigm for training general-purpose robot policies. However, effectively fitting policies to complex task distributions is often challenging, and existing models often underfit the action distribution. In this paper, we present a novel modular diffusion policy framework that factorizes modeling the complex action distributions as a composition of specialized diffusion models, each capturing a distinct sub-mode of the multimodal behavior space. This factorization enables each composed model to specialize and capture a subset of the task distribution, allowing the overall task distribution to be more effectively represented. In addition, this modular structure enables flexible policy adaptation to new tasks by simply fine-tuning a subset of components or adding new ones for novel tasks, while inherently mitigating catastrophic forgetting. Empirically, across both simulation and real-world robotic manipulation settings, we illustrate how our method consistently outperforms strong modular and monolithic baselines, achieving a 24% average relative improvement in multitask learning and a 34% improvement in task adaptation across all settings.

  4. ICRA


    Streaming Diffusion Policy: Fast Policy Synthesis with Variable Noise Diffusion Models

    Sigmund Hennum Høeg, Yilun Du, and Olav Egeland

    2025 IEEE International Conference on Robotics and Automation (ICRA), 2025

    Introduces a fast diffusion-based robotic control policy. The method enables real-time robot control while maintaining the quality of diffusion-based policies, demonstrating strong performance across various robotic manipulation tasks.

  5. CoRL Workshop


    More than eleven thousand words: Towards using language models for robotic sorting of unseen objects into arbitrary categories

    Sigmund Hennum Høeg and Lars Tingelstad

    Workshop on Language and Robotics at the Conference on Robot Learning (CoRL), 2022

    Analyzes the performance of language models at sorting unseen objects into arbitrary categories, measuring performance metrics and discussing failure modes.

  6. Master’s Thesis


    Learning to grasp: A study of learning-based methods for robotic grasping

    Sigmund Hennum Høeg

    2022

    A study of Reinforcement Learning methods for robotic grasping. We compare the performance of different methods and discuss the challenges of applying RL algorithms to robotic grasping, using Robosuite as a simulated benchmark.