🚀 ICRA 2026: NeuroVLA: Surgical Scenario-Aware Learning of Debulking Skills in Endoscopic Robotic Neurosurgery via Vision-Language-Action Model 🤖🧲 – Robotics, Embodied AI, Navigation in vivo

We present NeuroVLA, a scenario-aware model designed for the motion control of a parallel continuum neurosurgical robot.

Robotic surgery systems have attracted significant attention for their precision and efficiency, yet performing tasks autonomously in complex neurosurgical environments remains challenging. Although Vision-Language-Action (VLA) models hold great potential, their development is constrained by the scarcity of data on surgical environments and robot kinematics. To address this, the paper proposes NeuroVLA, a VLA model designed specifically for robotic tumor debulking in neurosurgery. Through phantom experiments on a flexible parallel continuum robot, we constructed a dataset and decomposed the debulking task into four skill-based instructions. NeuroVLA uses a Vision-Language Model (VLM) as its backbone for scene reasoning, enabling the robot to comprehend the surgical scene and its own state. Experimental results show that, after training on 90 debulking segments, NeuroVLA can infer actions from images, language instructions, and the robot’s state. It achieved average pixel distance errors of 29.10 pixels and 21.55 pixels for the “alignment” and “transfer” skills, respectively, and success rates of 88.89% and 100% for the “grasping” and “release” skills, respectively.

🧠 Technical Framework:

● End-to-End scenario-aware VLA model

● Skill-based scenario inference mechanism

● Debulking task dataset in neurosurgery
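As a rough illustration of the skill-based interface, the sketch below bundles the three inputs the post names (an image, a language instruction, and the robot's state) into one record. Everything beyond the four skill names is an assumption made for illustration: the class `PolicyInput`, its field names, the ordering of the skills, and the instruction wording are hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Sequence


class Skill(Enum):
    # The four skill names appear in the post; the order here is an assumption.
    ALIGNMENT = "alignment"
    GRASPING = "grasping"
    TRANSFER = "transfer"
    RELEASE = "release"


@dataclass(frozen=True)
class PolicyInput:
    image_path: str                # hypothetical: path to one endoscopic frame
    skill: Skill                   # which skill-based instruction is active
    robot_state: Sequence[float]   # hypothetical: robot state vector

    def instruction(self) -> str:
        # Hypothetical prompt wording; the real instruction text is not given.
        return f"Perform the {self.skill.value} skill."


sample = PolicyInput("frame_0001.png", Skill.GRASPING, (0.0, 0.0, 0.0))
print(sample.instruction())  # Perform the grasping skill.
```

A VLA policy would map such a record to a robot action; that mapping is the learned part and is not sketched here.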

🎯 Experimental Results:

● In the “alignment” and “transfer” skills, NeuroVLA achieves pixel distance (PD) errors of 29.10 px and 21.55 px, well below those of baseline models such as Octo (79.72 px and 65.46 px).

● In the “grasping” and “release” skills, NeuroVLA exhibits greater robustness, achieving a grasping success rate of 88.89% and a release success rate of 100%. In contrast, baseline models often misinterpret incomplete forceps closure as task completion, leading to grasping failures.
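For readers who want to see how the two kinds of figures reported here could be computed, the following is a minimal sketch. It assumes that pixel distance means the Euclidean distance between a predicted and a target point in image coordinates, averaged over trials, and that success rate is the percentage of successful trials; the post does not spell out either definition. The function names and the sample values are made up for illustration and are not data from the paper.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float]


def mean_pixel_distance(predicted: Sequence[Point], target: Sequence[Point]) -> float:
    """Average Euclidean distance, in pixels, between paired image points."""
    if len(predicted) != len(target) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(math.dist(p, t) for p, t in zip(predicted, target)) / len(predicted)


def success_rate(outcomes: Sequence[bool]) -> float:
    """Percentage of trials marked successful."""
    if not outcomes:
        raise ValueError("need at least one trial")
    return 100.0 * sum(outcomes) / len(outcomes)


# Made-up values for illustration only; not data from the paper.
pred = [(3.0, 4.0), (10.0, 10.0)]
goal = [(0.0, 0.0), (10.0, 16.0)]
print(mean_pixel_distance(pred, goal))            # 5.5
print(success_rate([True, True, True, False]))    # 75.0
```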

#ICRA2026
