Single-Demonstration Imitation with Residual Reinforcement Learning for Dual-Arm Robotic Bottle Opening

Authors

DOI:

https://doi.org/10.64117/simposioscea.v2i2.209

Abstract

Learning manipulation from extremely limited data remains challenging for robotic systems. We present a framework that combines Behavioural Cloning (BC) from a single kinesthetic demonstration with residual Reinforcement Learning (RL) to solve a long-horizon dual-arm bottle unscrewing task. A base policy is first trained via supervised imitation, capturing nominal behaviour. A residual policy is then learned in simulation using PPO to produce bounded corrective actions, improving robustness and generalization to variations in bottle geometry. Results show that while single-demonstration BC performs reliably under nominal conditions, it degrades under distribution shifts. The residual formulation preserves nominal performance and significantly improves robustness. The final controller is deployed in a one-shot sim-to-real transfer, achieving successful execution on different bottle types.
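The residual formulation described in the abstract can be illustrated with a minimal sketch: the executed action is the base (BC) action plus a bounded correction from the residual policy. The abstract does not say how the bound is enforced, so the tanh scaling, the `limit` parameter, and the function names `bounded_residual` and `compose_action` are assumptions made for illustration only, not the authors' implementation.

```python
import math
from typing import List, Sequence


def bounded_residual(raw: Sequence[float], limit: float) -> List[float]:
    """Squash unbounded residual outputs into [-limit, +limit].

    Assumption: the bound is enforced with a scaled tanh; the source only
    states that the corrective actions are bounded.
    """
    if limit < 0:
        raise ValueError("limit must be non-negative")
    return [limit * math.tanh(r) for r in raw]


def compose_action(
    base_action: Sequence[float],
    raw_residual: Sequence[float],
    limit: float,
) -> List[float]:
    """Return base action plus bounded correction, element by element."""
    if len(base_action) != len(raw_residual):
        raise ValueError("base action and residual must have the same length")
    correction = bounded_residual(raw_residual, limit)
    return [b + c for b, c in zip(base_action, correction)]


if __name__ == "__main__":
    # Hypothetical 3-dimensional action (e.g. joint velocity commands).
    base = [0.10, -0.20, 0.05]      # output of the BC base policy
    raw = [0.0, 5.0, -5.0]          # unbounded output of the residual policy
    print(compose_action(base, raw, limit=0.02))
```

With a zero residual the controller reproduces the base policy exactly, which is one way to see why a residual of this form can preserve nominal behaviour while still allowing small corrections under variation in bottle geometry.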

Published

2026-05-28