Joint MECHATRONICS 2025, ROBOTICS 2025 Paper Abstract


Paper WeAT5.1

Nguyen, Huy Hoang (AIT Austrian Institute of Technology), Vu, Minh Nhat (Automation & Control Institute (ACIN), TU Wien, Austria), Beck, Florian (TU Wien), Ebmer, Gerald (TU Wien), Nguyen, Anh (University of Liverpool), Kemmetmueller, Wolfgang (TU Wien), Kugi, Andreas (TU Wien)

Language-Driven Closed-Loop Grasping with Model-Predictive Trajectory Optimization

Scheduled for presentation during the Regular Session "Robot Hand Control" (WeAT5), Wednesday, July 16, 2025, 10:00−10:20, Room 109

Joint 10th IFAC Symposium on Mechatronic Systems and 14th Symposium on Robotics, July 15-18, 2025, Paris, France

This information is tentative and subject to change. Compiled on July 16, 2025

Keywords: Multi-fingered hand control, Sensory based robot control, Modeling and identification

Abstract

Integrating a vision module into a closed-loop control system for the seamless movement of a robot in a manipulation task is challenging because the modules involved run at inconsistent update rates. The task is even more difficult in a dynamic environment, e.g., when objects are moving. This paper presents a modular zero-shot framework for language-driven manipulation of (dynamic) objects through a closed-loop control system with real-time trajectory replanning and online 6D object pose localization. We segment an object within 0.5 s by leveraging a vision-language model prompted with language commands. Then, guided by natural language commands, a closed-loop system, comprising unified pose estimation and tracking and online trajectory planning, continuously tracks this object and computes the optimal trajectory in real time. The proposed zero-shot framework provides a smooth trajectory that avoids jerky movements and ensures that the robot can grasp a non-stationary object. Experimental results demonstrate the real-time capability of the proposed zero-shot modular framework to accurately and efficiently grasp moving objects. The framework achieves update rates of up to 30 Hz for the online 6D pose localization module and 10 Hz for the receding-horizon trajectory optimization. These advantages highlight the modular framework's potential applications in robotics and human-robot interaction. A video is available at https://language-driven-closed-loop-grasping.github.io/
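
The two update rates quoted in the abstract (30 Hz for pose localization, 10 Hz for trajectory replanning) imply a multi-rate loop in which the planner always works from the most recent pose estimate. The following is a minimal sketch of that scheduling idea only, not the authors' implementation: the 1D state, the constant-velocity object, the perfect pose measurement, and the `replan` function (a speed-limited proportional rule standing in for the receding-horizon optimizer) are all illustrative assumptions.

```python
from dataclasses import dataclass

POSE_RATE_HZ = 30  # pose-localization rate quoted in the abstract
PLAN_RATE_HZ = 10  # trajectory-replanning rate quoted in the abstract
TICKS_PER_PLAN = POSE_RATE_HZ // PLAN_RATE_HZ  # replan on every 3rd pose update


@dataclass
class State:
    """Hypothetical 1D scene: end-effector and object positions in metres."""
    robot: float = 0.0
    obj: float = 0.5
    obj_vel: float = 0.1  # assumed constant object velocity [m/s]


def replan(robot, obj_est, obj_vel_est, horizon_s=0.3, max_speed=0.5):
    """Stand-in planner (NOT the paper's optimizer).

    Predicts the object position at the end of the horizon and returns a
    speed-limited velocity command that would reach it in that time.
    """
    target = obj_est + obj_vel_est * horizon_s
    cmd = (target - robot) / horizon_s
    return max(-max_speed, min(max_speed, cmd))


def simulate(duration_s=5.0):
    """Run the fast (pose) loop and trigger the slow (planning) loop from it."""
    dt = 1.0 / POSE_RATE_HZ
    s = State()
    cmd = 0.0
    n_pose = n_plan = 0
    for tick in range(int(round(duration_s * POSE_RATE_HZ))):
        # Fast loop: refresh the pose estimate (perfect measurement assumed).
        obj_est = s.obj
        n_pose += 1
        # Slow loop: replan from the latest estimate at the lower rate.
        if tick % TICKS_PER_PLAN == 0:
            cmd = replan(s.robot, obj_est, s.obj_vel)
            n_plan += 1
        # Apply the most recent command until the next replanning step.
        s.robot += cmd * dt
        s.obj += s.obj_vel * dt
    return s, n_pose, n_plan


if __name__ == "__main__":
    final, n_pose, n_plan = simulate()
    print(f"pose updates: {n_pose}, replans: {n_plan}, "
          f"gap: {abs(final.obj - final.robot):.6f} m")
```

Driving the slow loop from the fast loop's tick counter keeps the two rates in a fixed integer ratio. A real system would more likely run the modules asynchronously and share the latest pose through a buffer.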

 

Technical Content Copyright © IFAC. All rights reserved.

