Thrilled to share our latest international collaboration! At NVIDIA GTC 2026 in San Jose, CA, the team led by Professor Hongliang Ren from The Chinese University of Hong Kong (CUHK), in partnership with NVIDIA and 35 leading global institutions, officially released Open-H-Embodiment, the world's first and largest open-source dataset for medical robotics, now available on Hugging Face.
During the GTC keynote, Kimberly Powell, NVIDIA's VP of Healthcare, highlighted this milestone. Our lab is honored to be a primary contributor, filling the critical gap in Embodied AI for medical robotics by providing high-fidelity data for contact dynamics and closed-loop control.
🧠✨ What we contributed & developed:
This project breaks the “perception-heavy, execution-light” limitation of traditional medical AI. Key highlights include:
🔹 778 Hours of Massive Multimodal Data: The dataset covers 400 complete clinical surgeries and 9 major robotic platforms (e.g., dVRK, CMR Versius, Kuka). It includes 65% clinical data, 23% bench-top experiments, and 12% simulation data.
🔹 Three High-Value Specialized Datasets from Our Lab:
- Dual-Source Ultrasound Dataset: Expert-level trajectories covering in-vivo porcine endoscopic ultrasound (EUS) and human forearm scanning, addressing the challenges of complex organ environments and multi-device calibration.
- Robotic Surgery Skill Dataset: Multi-modal data (RGB/RGB-D + Kinematics) for tissue manipulation and suturing, featuring millisecond-level synchronization and dual-mode control (teleoperation & automation).
- Flexible Endoscope Tracking Baseline: A standardized dataset addressing hysteresis and deformation in flexible endoscopy, supporting nanosecond-level time synchronization.
🔹 Surgical VLA & World Models:
- GR00T-H: A 3B-parameter Vision-Language-Action model based on NVIDIA Isaac GR00T, capable of long-horizon dexterous tasks like end-to-end suturing.
- Cosmos-H-Surgical-Simulator: An action-conditioned world model that boosts simulation efficiency by over 70x, bridging the sim-to-real gap.
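For anyone curious what timestamp synchronization across modalities can look like in practice, here is a minimal Python sketch. It is not the dataset's actual tooling. It assumes each stream carries integer nanosecond timestamps sorted in ascending order; the function name `align_nearest`, the tolerance, and the sample values are all hypothetical.

```python
from bisect import bisect_left
from typing import List, Optional


def align_nearest(
    frame_ts: List[int], kin_ts: List[int], tolerance_ns: int
) -> List[Optional[int]]:
    """For each video frame timestamp, return the index of the nearest
    kinematics sample, or None if the nearest one is beyond the tolerance.

    Both lists are assumed to be sorted ascending, in nanoseconds.
    """
    matches: List[Optional[int]] = []
    for t in frame_ts:
        i = bisect_left(kin_ts, t)  # first kinematics sample at or after t
        # The nearest sample is either the one just before or the one at i.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(kin_ts)]
        best = min(candidates, key=lambda j: abs(kin_ts[j] - t), default=None)
        if best is not None and abs(kin_ts[best] - t) > tolerance_ns:
            best = None  # no kinematics sample close enough to this frame
        matches.append(best)
    return matches


# Hypothetical streams: ~30 fps video and unevenly sampled kinematics.
frames = [0, 33_000_000, 66_000_000]
kinematics = [0, 10_000_000, 20_000_000, 30_000_000, 40_000_000, 70_000_000]
print(align_nearest(frames, kinematics, tolerance_ns=5_000_000))
```

Tightening `tolerance_ns` trades coverage for precision: frames without a close enough kinematics sample are marked `None` rather than paired with stale data.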
🎯 Key Results:
✅ Global Standardization: The first effort to unify medical robotic data across different devices and institutions under a CC-BY-4.0 license.
✅ Efficiency Boost: Accelerated surgical simulation (600 simulations in 40 minutes) to generate high-fidelity video-action pairs.
✅ Clinical Relevance: Roughly 500 hours of real-world clinical data captured from hernia, gallbladder, and uterine surgeries.
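A quick back-of-the-envelope check of the headline numbers, as a small Python sketch. It assumes the 65/23/12 split applies to the 778-hour total (the post does not say whether the percentages are measured by hours) and that the 600 simulations ran back to back; the helper names are hypothetical.

```python
TOTAL_HOURS = 778
SHARES = {"clinical": 0.65, "bench-top": 0.23, "simulation": 0.12}


def hours_by_source(total_hours: float, shares: dict) -> dict:
    """Split a total number of hours by fractional share, rounded to 0.1 h."""
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return {name: round(total_hours * share, 1) for name, share in shares.items()}


def seconds_per_rollout(num_rollouts: int, minutes: float) -> float:
    """Average wall-clock seconds per simulated rollout."""
    return minutes * 60 / num_rollouts


for source, hours in hours_by_source(TOTAL_HOURS, SHARES).items():
    print(f"{source}: {hours} h")
print(f"{seconds_per_rollout(600, 40)} s per simulation")
```

Under these assumptions the clinical share works out to just over 500 hours, which lines up with the clinical figure quoted above, and each simulation averages about 4 seconds.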
💡 Why it matters: This initiative lays the bedrock for Medical Physical AI. By sharing high-quality, synchronized data for surgery, ultrasound, and endoscopy, we are lowering the barrier for researchers worldwide to develop autonomous surgical agents that are both explainable and adaptive.
🌱 What's next? Our lab is continuing to deepen research in:
🔹 Reasoning-based autonomous control for surgical robots.
🔹 Cross-platform generalization of Medical VLA models.
🔹 Clinical translation of Embodied AI to improve patient outcomes.
Dataset: https://huggingface.co/datasets/nvidia/PhysicalAI-Robotics-Open-H-Embodiment
Project website: https://github.com/open-h
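If you want to pull files from the Hugging Face link above, one possible route is `snapshot_download` from the third-party `huggingface_hub` package. This is only a sketch: the helper names are hypothetical, and the repository's internal file layout is not described in this post, so any file pattern you pass is an assumption to check against the dataset card first.

```python
from typing import List
from urllib.parse import urlparse

DATASET_URL = (
    "https://huggingface.co/datasets/nvidia/PhysicalAI-Robotics-Open-H-Embodiment"
)


def repo_id_from_url(url: str) -> str:
    """Extract '<owner>/<name>' from a Hugging Face dataset URL."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 3 or parts[0] != "datasets":
        raise ValueError(f"not a Hugging Face dataset URL: {url}")
    return "/".join(parts[1:])


def download_subset(url: str, patterns: List[str], local_dir: str) -> str:
    """Download only files matching `patterns` (glob style) into `local_dir`.

    Requires `pip install huggingface_hub`; imported lazily so the URL
    helper above works without it.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id_from_url(url),
        repo_type="dataset",      # this is a dataset repo, not a model repo
        allow_patterns=patterns,  # assumption: pick patterns from the dataset card
        local_dir=local_dir,
    )


print(repo_id_from_url(DATASET_URL))
```

Filtering with `allow_patterns` matters here because the full release spans hundreds of hours of multimodal recordings; starting with the documentation files before fetching any recordings is a cautious first step.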
#NVIDIAGTC2026 #MedicalRobotics #EmbodiedAI #HuggingFace #CUHK #OpenSource #HealthcareInnovation

