🎉 Our recent work “Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question-Localized Answering in Robotic Surgery” has been accepted by Information Fusion!

This paper is an extended version of our #ICRA2023 Surgical-VQLA. Our method can serve as an effective and reliable tool to assist in surgical education and clinical decision-making by providing more insightful analyses of surgical scenes.

✨ Key Contributions in the journal version:

– A dual calibration module is proposed to align and normalize multimodal representations.

– A contrastive training strategy with adversarial examples is employed to enhance robustness.

– Various optimization functions are extensively explored.

– The EndoVis-18-VQLA & EndoVis-17-VQLA datasets are further extended.

– Our proposed solution achieves superior performance and robustness against real-world image corruption.

Conference Version (ICRA 2023): https://lnkd.in/gHscT3eN

Journal Version (Information Fusion): https://lnkd.in/gQNWwHmt

Code & Dataset: https://lnkd.in/g7CTuyAH

Thanks to all of the collaborators for their efforts: Long Bai, Guankun Wang, An Wang, and Prof. Hongliang Ren from CUHK, Dr. Mobarakol Islam from WEISS, UCL, and Dr. Lalithkumar Seenivasan from JHU.
