The IEEE International Conference on Robotics and Automation (ICRA) will take place June 1-5, 2026, in Vienna, Austria. As a leading forum for robotics, ICRA brings together top researchers and industry experts to collaborate, share insights, and showcase research results that advance robotics innovation. RAI Institute research groups are presenting numerous papers at this year’s ICRA, spanning a broad range of subjects, including stunts performed by bike robots and sampling-based model predictive control.
Planning-Guided Diffusion Policy Learning for Contact-Rich Bimanual Object Reorientation
Contact-rich bimanual manipulation involves precise coordination of two arms to change object states through strategically selected contacts and motions. Due to the inherent complexity of these tasks, acquiring sufficient demonstration data and training policies that generalize to unseen scenarios remains a largely unresolved challenge.
Building on recent advances in planning through contacts, we introduce Planning-Guided Diffusion Policy Learning (LIDE), an approach that learns to solve contact-rich bimanual manipulation tasks by using model-based motion planners to generate demonstration data in high-fidelity physics simulation. Through efficient planning in randomized environments, our approach generates large-scale, high-quality synthetic motion trajectories for tasks involving diverse objects and transformations. We then train a task-conditioned diffusion policy via behavior cloning using these demonstrations. To reduce the sim-to-real gap, we propose a set of designs in feature extraction, action prediction, and data augmentation that enable learning robust prediction of smooth action sequences and generalization to unseen scenarios. Through experiments in both simulation and the real world, we demonstrate that our approach can enable a bimanual robotic system to effectively manipulate objects of diverse geometries, dimensions, and physical properties.
Robotic Dexterous Manipulation via Anisotropic Friction Modulation using Passive Rollers
Controlling friction at the fingertip is fundamental to dexterous manipulation, yet remains difficult to realize in robotic hands. We present the design and analysis of a robotic fingertip equipped with passive rollers that can be selectively braked or pivoted to modulate contact friction and constraint directions.
When unbraked, the rollers permit unconstrained sliding of the contact point along the rolling direction. When braked, they resist motion like a conventional fingertip. The rollers are mounted on a pivoting mechanism, allowing reorientation of the constraint frame to accommodate different manipulation tasks. We develop a constraint-based model of the fingertip integrated into a parallel-jaw gripper and analyze its ability to support diverse manipulation strategies. Experiments show that the proposed design enables a wide range of dexterous actions that are conventionally challenging for robotic grippers, including sliding and pivoting within the grasp, robust adaptation to uncertain contacts, multi-object or multi-part manipulation, and interactions requiring asymmetric friction across fingers. These results demonstrate the versatility of passive roller fingertips as a low-complexity, mechanically efficient approach to friction modulation, advancing the development of more adaptable and robust robotic manipulation.
Judo: A User-Friendly Open-Source Package for Sampling-Based Model Predictive Control
Sampling-based model predictive control (MPC) is experiencing a resurgence in robotics following both recent hardware successes and advancements in parallelized physics simulation. However, to build on this progress, the robotics community needs to develop shared tools for prototyping, benchmarking, and deploying sampling-based controllers. We introduce judo, a software package designed to address this need.
To facilitate rapid prototyping and evaluation, judo provides robust implementations of common sampling-based MPC algorithms and a comprehensive suite of benchmark tasks. It emphasizes usability with simple but extensible interfaces for controller and task definitions, asynchronous execution for straightforward simulation-to-hardware transfer, and a highly customizable GUI for tuning controllers interactively. While the high-level library is written in Python, judo leverages MuJoCo as its physics backend to achieve real-time performance.
We present example benchmarking results using judo to compare standard sampling-based controllers across its tasks. We also provide real-world case studies in deploying judo on hardware for two contact-rich tasks: in-hand cube rotation and quadrupedal loco-manipulation.
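To make the idea of sampling-based MPC concrete, here is a minimal sketch of one simple variant, predictive sampling, on a toy one-dimensional double integrator. It does not use judo's interfaces or MuJoCo; the dynamics, cost weights, and every function name below are assumptions chosen purely for illustration.

```python
import random

def step(state, u, dt=0.05):
    """Toy double integrator: state = (position, velocity), control = acceleration."""
    pos, vel = state
    vel = vel + u * dt
    pos = pos + vel * dt
    return (pos, vel)

def rollout_cost(state, controls, goal=1.0):
    """Accumulate position-error, velocity, and effort costs along a rollout."""
    cost = 0.0
    for u in controls:
        state = step(state, u)
        pos, vel = state
        cost += (pos - goal) ** 2 + 0.1 * vel ** 2 + 0.001 * u ** 2
    return cost

def predictive_sampling(state, nominal, rng, num_samples=32, sigma=0.5, u_max=2.0):
    """Return the cheapest sequence among the nominal plan and noisy copies of it."""
    best, best_cost = nominal, rollout_cost(state, nominal)
    for _ in range(num_samples):
        # Perturb every control with Gaussian noise and clamp to the actuator limit.
        candidate = [max(-u_max, min(u_max, u + rng.gauss(0.0, sigma))) for u in nominal]
        cost = rollout_cost(state, candidate)
        if cost < best_cost:
            best, best_cost = candidate, cost
    return best

def run_mpc(steps=200, horizon=20, seed=0):
    """Receding-horizon loop: replan, apply the first control, shift the plan."""
    rng = random.Random(seed)
    state = (0.0, 0.0)
    plan = [0.0] * horizon
    for _ in range(steps):
        plan = predictive_sampling(state, plan, rng)
        state = step(state, plan[0])
        plan = plan[1:] + [0.0]  # warm-start the next solve with the shifted plan
    return state

final_position, final_velocity = run_mpc()
```

Because the nominal plan is always kept as a candidate, each replanning step can only match or improve on the warm-started plan under the model. In practice the rollouts would be evaluated in parallel in a physics simulator rather than in a Python loop.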
Flip Stunts on Bike-Like Robots using Iterative Motion Imitation
Wheeled and legged robots have recently demonstrated remarkable dynamic capabilities, driven in large part by advances in Reinforcement Learning (RL) and motion imitation. By tracking demonstrations from motion capture, animal locomotion, or model-based controllers, robots can learn parkour and agile behaviors.
However, if the original motion references are dynamically or kinematically infeasible, the imitation policy may fail to train or lead to unsafe behaviors not suitable for real-world deployment. Blindly tracking these infeasible references may exceed the robot’s torque or joint limits or cause self-collisions. To address this, we propose Iterative Motion Imitation (IMI), a method that iteratively imitates trajectories generated by prior policy rollouts.
Starting from an initial reference that is kinematically or dynamically infeasible, IMI helps train policies that lead to feasible and agile behaviors. We demonstrate our method on the Ultra-Mobility Vehicle (UMV), a bicycle robot designed to enable agile behaviors. From a self-colliding table-to-ground flip reference generated by a model-based controller, we are able to train policies that enable ground-to-ground and ground-to-table front-flips. We show that, compared to single-shot motion imitation, IMI results in policies with higher success rates that transfer robustly to the real world. To our knowledge, this is the first unassisted acrobatic flip behavior on such a platform.
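The iterative structure described above (train on a reference, roll out the policy, reuse the rollout as the next reference) can be sketched as a short loop. This is only a structural illustration: the toy "training" and "rollout" functions stand in for RL training and physics simulation, and the joint limit, rate limit, and all names are hypothetical.

```python
JOINT_LIMIT = 1.0   # hypothetical joint limit of the toy robot
MAX_STEP = 0.3      # hypothetical per-step rate limit (a crude stand-in for torque limits)

def toy_train_policy(reference):
    """Stand-in for RL motion imitation: returns a policy that targets the reference."""
    def policy(t, q):
        return reference[t]  # desired joint angle at time step t
    return policy

def toy_rollout(policy, horizon, q0=0.0):
    """Stand-in for simulation: the rollout always respects the robot's limits."""
    q, trajectory = q0, []
    for t in range(horizon):
        delta = max(-MAX_STEP, min(MAX_STEP, policy(t, q) - q))
        q = max(-JOINT_LIMIT, min(JOINT_LIMIT, q + delta))
        trajectory.append(q)
    return trajectory

def iterative_motion_imitation(reference, train_policy, rollout, num_iterations=3):
    """Repeatedly imitate the rollout produced by the previous iteration's policy."""
    history = [list(reference)]
    policy = None
    for _ in range(num_iterations):
        policy = train_policy(reference)
        reference = rollout(policy, len(reference))  # feasible by construction
        history.append(reference)
    return policy, history

# An infeasible initial reference: it exceeds both the joint limit and the rate limit.
initial_reference = [0.5, 1.5, 2.0, 1.5, 0.5, 0.0]
policy, history = iterative_motion_imitation(initial_reference, toy_train_policy, toy_rollout)
```

In this toy setting the reference becomes feasible after the first iteration and then stops changing. The point is only the data flow, in which every reference after the first comes from a policy rollout rather than from the original, infeasible motion.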
NovaFlow: Zero-Shot Manipulation via Actionable Flow from Generated Videos
Enabling robots to execute novel manipulation tasks zero-shot is a central goal in robotics. Most existing methods assume in-distribution tasks or rely on fine-tuning with embodiment-matched data, limiting transfer across platforms. We present NovaFlow, an autonomous manipulation framework that converts a task description into an actionable plan for a target robot without any demonstrations.
Given a task description, NovaFlow synthesizes a video using a video generation model and distills it into 3D actionable object flow using off-the-shelf perception modules. From the object flow, it computes relative poses for rigid objects and realizes them as robot actions via grasp proposals and trajectory optimization. For deformable objects, this flow serves as a tracking objective for model-based planning with a particle-based dynamics model. By decoupling task understanding from low-level control, NovaFlow naturally transfers across embodiments. We validate on rigid, articulated, and deformable object manipulation tasks using a tabletop Franka arm and a Spot quadrupedal mobile robot, and achieve effective zero-shot execution without demonstrations or embodiment-specific training.
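As a rough illustration of how a relative pose can be recovered from object flow, the sketch below fits a least-squares rigid transform to tracked points before and after a motion. It is a planar (2D) simplification with a closed-form solution; the 3D case would typically use an SVD-based fit such as the Kabsch algorithm. The abstract does not say which method NovaFlow uses, so the function and its name are assumptions.

```python
import math

def fit_planar_rigid_transform(src, dst):
    """Least-squares rotation angle and translation mapping src points onto dst points."""
    n = len(src)
    cx_s = sum(p[0] for p in src) / n
    cy_s = sum(p[1] for p in src) / n
    cx_d = sum(q[0] for q in dst) / n
    cy_d = sum(q[1] for q in dst) / n
    s_dot = s_cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - cx_s, py - cy_s   # centered source point
        bx, by = qx - cx_d, qy - cy_d   # centered destination point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)  # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the destination centroid.
    tx = cx_d - (c * cx_s - s * cy_s)
    ty = cy_d - (s * cx_s + c * cy_s)
    return theta, (tx, ty)

# Hypothetical flow: four tracked object points rotated by 30 degrees and shifted.
true_theta, true_t = math.radians(30.0), (0.2, -0.1)
before = [(0.0, 0.0), (0.1, 0.0), (0.1, 0.05), (0.0, 0.05)]
after = [
    (math.cos(true_theta) * x - math.sin(true_theta) * y + true_t[0],
     math.sin(true_theta) * x + math.cos(true_theta) * y + true_t[1])
    for x, y in before
]
theta, translation = fit_planar_rigid_transform(before, after)
```

With noise-free correspondences the fit recovers the applied rotation and translation up to floating-point error. With noisy flow it returns the transform that minimizes the summed squared point error.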
Lazy Anytime Planning for the Dubins Moving Target Traveling Salesman Problem with Obstacles
The Dubins Moving Target Traveling Salesman Problem with Obstacles (Dubins MT-TSP-O) seeks an obstacle-free trajectory for an agent with a fixed speed and minimum turning radius that intercepts several moving targets. To tackle this NP-hard problem, we introduce the Lazy Iterated Random Generalized TSP (Lazy IRG) algorithm.
Each iteration of Lazy IRG samples a set of possible interception points in space-time along the trajectories of the targets. Lazy IRG then manages the high computational cost of motion planning by alternating between two steps: first, it optimistically selects a sequence of interception points by solving a Generalized TSP (GTSP) assuming an obstacle-free world; second, it searches for obstacle-free trajectories between consecutive points in the sequence using an obstacle-aware RRT-Connect planner. If a trajectory is not found, Lazy IRG solves the GTSP again; otherwise, Lazy IRG enters its next iteration and samples new interception points. By deferring expensive collision-checking, our method efficiently focuses computational effort on the most promising solutions. Numerical results show that Lazy IRG finds significantly lower-cost solutions within a 1-minute time budget compared to the existing IRG-PGLNS algorithm.
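The alternation between optimistic sequencing and deferred collision checking can be sketched on a heavily simplified problem. The assumptions here are substantial: targets are static, the agent moves in straight lines rather than along Dubins curves, a brute-force search replaces the GTSP solver, a segment-versus-circle test replaces the RRT-Connect planner, and an edge that fails validation is simply excluded from later solves. All names are hypothetical.

```python
import itertools
import math

def segment_hits_circle(a, b, center, radius):
    """True if the straight segment a-b passes through the circular obstacle."""
    (ax, ay), (bx, by), (cx, cy) = a, b, center
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    t = 0.0 if length_sq == 0 else max(0.0, min(1.0, ((cx - ax) * dx + (cy - ay) * dy) / length_sq))
    return math.hypot(ax + t * dx - cx, ay + t * dy - cy) < radius

def solve_gtsp_optimistic(start, clusters, blocked):
    """Pick one point per cluster and a visiting order, ignoring obstacles."""
    best_tour, best_cost = None, math.inf
    for order in itertools.permutations(range(len(clusters))):
        for choice in itertools.product(*(clusters[i] for i in order)):
            tour = [start, *choice]
            edges = list(zip(tour, tour[1:]))
            if any(edge in blocked for edge in edges):
                continue  # this edge already failed a collision check
            cost = sum(math.dist(a, b) for a, b in edges)
            if cost < best_cost:
                best_tour, best_cost = tour, cost
    return best_tour, best_cost

def lazy_plan(start, clusters, obstacles):
    """Alternate optimistic sequencing with collision checks on the chosen edges only."""
    blocked = set()
    while True:
        tour, cost = solve_gtsp_optimistic(start, clusters, blocked)
        if tour is None:
            return None, math.inf
        failed = [(a, b) for a, b in zip(tour, tour[1:])
                  if any(segment_hits_circle(a, b, c, r) for c, r in obstacles)]
        if not failed:
            return tour, cost
        blocked.update(failed)  # re-solve without the edges that failed

clusters = [[(2.0, 0.0), (2.0, 2.0)], [(4.0, 0.0), (4.0, 2.0)]]  # candidate points per target
obstacles = [((1.0, 0.0), 0.5)]                                   # (center, radius)
tour, cost = lazy_plan((0.0, 0.0), clusters, obstacles)
```

Only edges that appear in an optimistic solution are ever collision-checked, which is where the laziness saves work. The loop terminates because the blocked set grows on every failed attempt and the number of edges is finite.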
Real-is-Sim: Bridging the Sim-to-Real Gap with a Dynamic Digital Twin
We introduce real-is-sim, a new approach to integrating simulation into behavior cloning pipelines. In contrast to real-only methods, which lack the ability to safely test policies before deployment, and sim-to-real methods, which require complex adaptation to cross the sim-to-real gap, our framework allows policies to seamlessly switch between running on real hardware and running in parallelized virtual environments.
At the center of real-is-sim is a dynamic digital twin, powered by the Embodied Gaussian simulator, that synchronizes with the real world at 60 Hz. This twin acts as a mediator between the behavior cloning policy and the real robot. Policies are trained using representations derived from simulator states and always act on the simulated robot, never the real one. During deployment, the real robot simply follows the simulated robot’s joint states, and the simulation is continuously corrected with real-world measurements. This setup, where the simulator drives all policy execution and maintains real-time synchronization with the physical world, shifts the responsibility of crossing the sim-to-real gap to the digital twin’s synchronization mechanisms, instead of the policy itself. We demonstrate real-is-sim on a long-horizon manipulation task (PushT), showing that virtual evaluations are consistent with real-world results.
We further show how real-world data can be augmented with virtual rollouts, and we compare policies trained on different representations derived from the simulator state, including object poses and rendered images from both static and robot-mounted cameras. Our results highlight the flexibility of the real-is-sim framework across training, evaluation, and deployment stages.
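The deployment loop described above, in which the policy acts on the twin, the real robot follows the twin, and the twin is corrected from measurements, can be sketched with a one-joint toy model. This is not the Embodied Gaussian simulator or its correction mechanism; the proportional correction, the gains, and all names are assumptions for illustration.

```python
def toy_policy(q_sim, goal=1.0):
    """Hypothetical policy: sees only the simulated state and nudges it toward a goal."""
    return 0.1 * (goal - q_sim)

def twin_step(q_sim, action, q_real_measured, correction_gain=0.2):
    """Apply the policy action to the twin, then pull the twin toward the measurement."""
    q_sim = q_sim + action                                        # policy acts on the simulated robot only
    q_sim = q_sim + correction_gain * (q_real_measured - q_sim)   # correct with a real measurement
    return q_sim

def real_robot_step(q_real, q_command, tracking_gain=0.5):
    """Toy real robot: imperfectly tracks the commanded (simulated) joint state."""
    return q_real + tracking_gain * (q_command - q_real)

def run_deployment(num_ticks=300, q_sim=0.0, q_real=0.3):
    """One loop iteration per synchronization tick (60 Hz in the text above)."""
    for _ in range(num_ticks):
        action = toy_policy(q_sim)
        q_sim = twin_step(q_sim, action, q_real)
        q_real = real_robot_step(q_real, q_sim)  # real robot follows the twin's joint state
    return q_sim, q_real

q_sim_final, q_real_final = run_deployment()
```

The policy never reads `q_real` directly. Any mismatch between simulation and reality is absorbed by the correction inside `twin_step`, which mirrors the idea that the twin's synchronization, not the policy, is responsible for crossing the sim-to-real gap.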
Accelerating Residual Reinforcement Learning with Uncertainty Estimation
Residual Reinforcement Learning (RL) is a popular approach for adapting pre-trained policies by learning a lightweight residual policy that provides corrective actions. While Residual RL is more sample-efficient than fine-tuning the entire base policy, existing methods struggle with sparse rewards and are designed for deterministic base policies.
We propose two improvements to Residual RL that further enhance its sample efficiency and make it suitable for stochastic base policies. First, we leverage uncertainty estimates of the base policy to focus exploration on regions in which the base policy is not confident. Second, we propose a simple modification to off-policy residual learning that allows it to observe base actions and better handle stochastic base policies. We evaluate our method with both Gaussian-based and diffusion-based stochastic base policies on tasks from Robosuite and D4RL, and compare against state-of-the-art fine-tuning methods, demo-augmented RL methods, and other Residual RL methods. Our algorithm significantly outperforms existing baselines in a variety of simulation benchmark environments. We also deploy our learned policies in the real world to demonstrate their robustness with zero-shot sim-to-real transfer.
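The two ideas above can be sketched in a few lines: the residual policy receives the sampled base action as part of its input, and exploration noise is scaled by an estimate of the base policy's uncertainty. The abstract does not specify the uncertainty estimator or the gating rule, so the sample-variance estimate, the threshold, the toy base policy, and all names below are assumptions.

```python
import random
import statistics

def toy_base_policy(state, rng):
    """Hypothetical stochastic base policy: deterministic for x <= 0, noisy for x > 0."""
    x = state[0]
    std = 0.0 if x <= 0 else 0.5
    return [-x + rng.gauss(0.0, std)]

def action_uncertainty(policy, state, rng, num_samples=16):
    """Assumed estimator: mean per-dimension variance over repeated base-policy samples."""
    samples = [policy(state, rng) for _ in range(num_samples)]
    dims = len(samples[0])
    return sum(statistics.pvariance([s[d] for s in samples]) for d in range(dims)) / dims

def residual_input(state, base_action):
    """The residual policy observes the state together with the sampled base action."""
    return list(state) + list(base_action)

def act(state, policy, residual_policy, rng, threshold=0.05, noise_std=0.3):
    """Combine base and residual actions; explore only where the base policy is unsure."""
    base = policy(state, rng)
    scale = min(1.0, action_uncertainty(policy, state, rng) / threshold)
    residual = residual_policy(residual_input(state, base))
    noise = [scale * rng.gauss(0.0, noise_std) for _ in base]
    action = [b + r + n for b, r, n in zip(base, residual, noise)]
    return action, scale

rng = random.Random(0)
constant_residual = lambda observation: [0.1]  # placeholder for a learned residual network
confident_action, confident_scale = act([-1.0], toy_base_policy, constant_residual, rng)
unsure_action, unsure_scale = act([1.0], toy_base_policy, constant_residual, rng)
```

Where the base policy is deterministic, the exploration scale is zero and the executed action is just the base action plus the residual. Where the base policy's samples disagree, the scale becomes positive and exploration noise is added.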