We are excited to share our paper “OSSAR: Towards Open-Set Surgical Activity Recognition in Robot-assisted Surgery”, which has been accepted to the IEEE International Conference on Robotics and Automation (ICRA) 2024!

In this work, we tackle the challenge of open-set recognition in surgical robotics. Our novel OSSAR framework improves the ability to classify known surgical activities while also detecting unknown activities that weren’t seen during training.

Key contributions:

• A hyperspherical reciprocal point strategy to better separate known and unknown classes

• A calibration technique to reduce overconfident misclassifications

• New open-set benchmarks on the JIGSAWS dataset and our novel DREAMS dataset for endoscopic procedures

• State-of-the-art performance on open-set surgical activity recognition tasks
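
For readers new to open-set recognition, here is a minimal sketch of the general reciprocal-point idea on a unit hypersphere. It is not the OSSAR implementation: the cosine distance, the rejection threshold, the function names, and the toy 2-D points are all assumptions made for illustration. Each reciprocal point stands for "everything that is not class k", so a sample is scored for class k by how far it lies from that point, and it is rejected as unknown when no class score is large enough.

```python
import math

def normalize(v):
    """Scale a vector to unit length so it lies on the hypersphere."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_distance(a, b):
    """1 - cosine similarity of two unit vectors (0 = same direction, 2 = opposite)."""
    return 1.0 - sum(x * y for x, y in zip(a, b))

def classify_open_set(feature, reciprocal_points, threshold):
    """Return (label, score); label is None when the sample is rejected as unknown.

    reciprocal_points[k] represents "not class k", so the score for class k is
    the distance between the feature and that point (farther = more class-k-like).
    """
    f = normalize(feature)
    scores = [cosine_distance(f, normalize(p)) for p in reciprocal_points]
    best = max(range(len(scores)), key=scores.__getitem__)
    if scores[best] < threshold:  # not far enough from any reciprocal point
        return None, scores[best]
    return best, scores[best]

# Hypothetical 2-D setup: class 0 lives near +x, class 1 near -x.
reciprocal_points = [[-1.0, 0.0], [1.0, 0.0]]
print(classify_open_set([2.0, 0.1], reciprocal_points, threshold=1.5))  # accepted as class 0
print(classify_open_set([0.0, 3.0], reciprocal_points, threshold=1.5))  # rejected as unknown
```

In practice the reciprocal points would be learned jointly with the feature extractor and the threshold tuned on validation data; both are hard-coded here only to keep the sketch short.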

This research takes an important step towards more robust and generalizable AI systems for surgical robots. We hope it will help pave the way for safer and more capable robot-assisted surgeries.

Thanks to all the amazing co-authors Long Bai, Guankun Wang, Jie Wang, Xiaoxiao Yang, Huxin Gao, Xin Liang, An Wang, Mobarakol Islam, and Hongliang Ren, and to our institutions (The Chinese University of Hong Kong, Beijing Institute of Technology, Qilu Hospital of Shandong University, Tongji University, University College London, National University of Singapore) for their support.

You can find more details in our paper https://lnkd.in/gDsjVDSP

🎉 Our recent work “Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question-Localized Answering in Robotic Surgery” has been accepted by Information Fusion!

This paper is an extended version of our #ICRA2023 Surgical-VQLA. Our method can serve as an effective and reliable tool to assist in surgical education and clinical decision-making by providing more insightful analyses of surgical scenes.

✨ Key Contributions in the journal version:

– A dual calibration module is proposed to align and normalize multimodal representations. 

– A contrastive training strategy with adversarial examples is employed to enhance robustness.

– Various optimization functions are extensively explored.

– The EndoVis-18-VQLA & EndoVis-17-VQLA datasets are further extended.

– Our proposed solution presents superior performance and robustness against real-world image corruption.

Conference Version (ICRA 2023): https://lnkd.in/gHscT3eN

Journal Version (Information Fusion): https://lnkd.in/gQNWwHmt

Code & Dataset: https://lnkd.in/g7CTuyAH

Thanks to all of the collaborators for their effort: Long Bai, Guankun Wang, An Wang, and Prof. Hongliang Ren from CUHK, Dr. Mobarakol Islam from WEISS, UCL, and Dr. Lalithkumar Seenivasan from JHU.

🎉 Check out our #MICCAI2024 accepted paper “EndoUIC: Promptable Diffusion Transformer for Unified Illumination Correction in Capsule Endoscopy”.

In this work, by incorporating a set of learnable parameters that prompt the learning targets, the diffusion model can effectively address the unified illumination correction challenge in capsule endoscopy. We also propose a new capsule endoscopy dataset that includes underexposed and overexposed images together with their ground truth.

Thanks to all of our collaborators from multiple institutions: Long Bai, Qiaozhi Tan, Zhicheng He, Sishen YUAN, Prof. Hongliang Ren from CUHK & SZRI, Tong Chen from USYD, Wan Jun Nah from Universiti Malaya, Yanheng Li from CityU HK, Prof. Zhen CHEN, Prof. Jinlin Wu, Prof. Hongbin Liu from CAIR HK, Dr. Mobarakol Islam from WEISS, UCL, and Dr. Zhen Li from Qilu Hospital of SDU.

Paper: https://lnkd.in/gJaqikqj

Code & Dataset: https://lnkd.in/ghYauAGM

📊 **Empowering Robotic Surgery with SAM 2: An Empirical Study on Surgical Instrument Segmentation** 🤖👨‍⚕️

We’re excited to share our latest research on the Segment Anything Model (SAM) 2. This empirical evaluation uncovers SAM 2’s robustness and generalization capabilities in surgical image/video segmentation, a critical component for enhancing precision and safety in the operating room.

🔬 **Key Findings**:

– In general, SAM 2 outperforms its predecessor in instrument segmentation, showing a much improved zero-shot generalization capability to the surgical domain.

– Utilizing bounding box prompts, SAM 2 achieves remarkable results, setting a new benchmark in surgical image segmentation.

– With a single point as the prompt on the first frame, SAM 2 demonstrates substantial improvements on video segmentation over SAM, which requires point prompts on every frame. This suggests great potential in addressing video-based surgical tasks.

– Resilience Under Common Corruptions: SAM 2 shows impressive robustness against real-world image corruption, maintaining performance under various challenges such as compression, noise, and blur.
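
To make the corruption finding concrete, here is a minimal sketch of how such a robustness check can be set up. It is not the study's evaluation code: the corruption functions, the toy threshold "segmenter" standing in for a real model, the Dice score as the metric, and the 4x4 test image are all assumptions, written in plain Python to stay dependency-free.

```python
import random

def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise to a grayscale image and clip to [0, 255]."""
    rng = random.Random(seed)
    return [[min(255.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
            for row in image]

def box_blur(image):
    """3x3 mean filter; border pixels average only over neighbours that exist."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [image[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            row.append(sum(window) / len(window))
        out.append(row)
    return out

def threshold_segment(image, level=128.0):
    """Toy stand-in for a segmentation model: foreground = pixels above level."""
    return [[px > level for px in row] for row in image]

def dice(mask_a, mask_b):
    """Dice overlap of two boolean masks (1.0 when both are empty)."""
    inter = sum(a and b for ra, rb in zip(mask_a, mask_b) for a, b in zip(ra, rb))
    size = sum(map(sum, mask_a)) + sum(map(sum, mask_b))
    return 1.0 if size == 0 else 2.0 * inter / size

# Hypothetical image: a bright 2x2 "instrument" on a dark background.
image = [[20.0] * 4 for _ in range(4)]
for y in (1, 2):
    for x in (1, 2):
        image[y][x] = 200.0

clean_mask = threshold_segment(image)
for name, corrupted in [("noise", add_gaussian_noise(image, sigma=10.0)),
                        ("blur", box_blur(image))]:
    # Compare the prediction on the corrupted input with the clean prediction.
    print(name, dice(clean_mask, threshold_segment(corrupted)))
```

A real evaluation would swap `threshold_segment` for the model under test, compare against annotated masks rather than the clean prediction, and sweep several severity levels per corruption type.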

🔧 **Practical Implications**:

– With faster inference speeds, SAM 2 is poised to provide quick, accurate segmentation, making it a valuable asset in the clinical setting.

🔗 **Learn More**:

For those interested in the technical depth, our paper is available on [arXiv](https://lnkd.in/gHfdrvj3).

We’re eager to engage with the community and explore how SAM 2 can revolutionize surgical applications.

Thanks to the team for their contributions: Jieming YU, An Wang, Wenzhen Dong, Mengya Xu, Jie Wang, Long Bai, and Hongliang Ren from the Department of Electronic Engineering, The Chinese University of Hong Kong and Shenzhen Research Institute of CUHK, and Mobarakol Islam from WEISS – Wellcome / EPSRC Centre for Interventional and Surgical Sciences, UCL.

๐Ÿ› ๏ธ Introducing CAT-SD: Privacy-Centric AI in Robotic Surgery ๐Ÿค–

In our recent work, “Privacy-Preserving Synthetic Continual Semantic Segmentation for Robotic Surgery”, which was published in IEEE Transactions on Medical Imaging, we propose a state-of-the-art framework for continual semantic segmentation in robotic surgery. This breakthrough addresses catastrophic forgetting in DNNs, enhancing surgical precision without compromising patient privacy.

🔒 Privacy-First Synthetic Data: We’ve crafted a solution that blends open-source instrument data with synthesized backgrounds, ensuring real patient data remains confidential.

💡 Innovative Features:

– Class-Aware Temperature Normalization (CAT) to prevent forgetting of previously learned tasks.

– Multi-Scale Shifted-Feature Distillation (SD) to preserve spatial relationships for robust feature learning.
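
As background for the temperature idea, here is a minimal sketch of temperature-scaled knowledge distillation, a common way to keep a new model's predictions close to an old model's. It is not the CAT module from the paper: this post does not say how the class-aware temperatures are chosen, so the per-class temperature vector, the function names, and the example logits below are assumptions.

```python
import math

def softmax_with_temperature(logits, temperatures):
    """Softmax where logit k is divided by temperatures[k] before normalizing."""
    scaled = [z / t for z, t in zip(logits, temperatures)]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(old_logits, new_logits, temperatures):
    """KL(old || new) between the temperature-softened class distributions."""
    p = softmax_with_temperature(old_logits, temperatures)
    q = softmax_with_temperature(new_logits, temperatures)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

# Hypothetical logits from the previous-task model and the current model.
old_logits = [2.0, 1.0, 0.1]
new_logits = [1.5, 1.2, 0.3]

uniform = [1.0, 1.0, 1.0]      # plain softmax
class_aware = [1.0, 2.0, 4.0]  # assumed per-class temperatures, illustration only
print(distillation_loss(old_logits, new_logits, uniform))
print(distillation_loss(old_logits, new_logits, class_aware))
```

A larger temperature flattens the distribution, so the loss pays more attention to how the old model ranked its less likely classes; a per-class vector lets that softening differ between classes.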

Check the paper at https://lnkd.in/eTy8KAC5

Code is also available at https://lnkd.in/eMzNs2Be

Co-authors: Mengya Xu, Mobarakol Islam, Long Bai, Hongliang Ren
