{"id":3254,"date":"2025-08-04T12:50:03","date_gmt":"2025-08-04T12:50:03","guid":{"rendered":"http:\/\/www.labren.org\/mm\/?p=3254"},"modified":"2025-08-05T12:58:01","modified_gmt":"2025-08-05T12:58:01","slug":"%f0%9f%8e%89our-recent-work-surgical-vqla-adversarial-contrastive-learning-for-calibrated-robust-visual-question-localized-answering-in-robotic-surgery-has-been-accepted-by-information-fusion","status":"publish","type":"post","link":"http:\/\/www.labren.org\/mm\/news\/%f0%9f%8e%89our-recent-work-surgical-vqla-adversarial-contrastive-learning-for-calibrated-robust-visual-question-localized-answering-in-robotic-surgery-has-been-accepted-by-information-fusion\/","title":{"rendered":"\ud83c\udf89Our recent work &#8220;Surgical-VQLA++: Adversarial Contrastive Learning for Calibrated Robust Visual Question-Localized Answering in Robotic Surgery&#8221; has been accepted by Information Fusion!"},"content":{"rendered":"\n<p>This paper is an extended version of our <strong>#ICRA2023<\/strong> Surgical-VQLA. 
Our method can serve as an effective and reliable tool to assist in surgical education and clinical decision-making by providing more insightful analyses of surgical scenes.<\/p>\n\n\n\n<p>\u2728 Key Contributions in the journal version:<\/p>\n\n\n\n<p>&#8211; A dual calibration module is proposed to align and normalize multimodal representations.<\/p>\n\n\n\n<p>&#8211; A contrastive training strategy with adversarial examples is employed to enhance robustness.<\/p>\n\n\n\n<p>&#8211; Various optimization functions are extensively explored.<\/p>\n\n\n\n<p>&#8211; The EndoVis-18-VQLA &amp; EndoVis-17-VQLA datasets are further extended.<\/p>\n\n\n\n<p>&#8211; Our proposed solution achieves superior performance and robustness against real-world image corruption.<\/p>\n\n\n\n<p>Conference Version (ICRA 2023): https:\/\/lnkd.in\/gHscT3eN<\/p>\n\n\n\n<p>Journal Version (Information Fusion): https:\/\/lnkd.in\/gQNWwHmt<\/p>\n\n\n\n<p>Code &amp; Dataset: https:\/\/lnkd.in\/g7CTuyAH<\/p>\n\n\n\n<p>Thanks to all of the collaborators for their efforts: <a href=\"https:\/\/www.linkedin.com\/company\/103371608\/admin\/page-posts\/published\/#\">Long Bai<\/a>, Guankun Wang, <a href=\"https:\/\/www.linkedin.com\/company\/103371608\/admin\/page-posts\/published\/#\">An Wang<\/a>, and Prof. <a href=\"https:\/\/www.linkedin.com\/company\/103371608\/admin\/page-posts\/published\/#\">Hongliang Ren<\/a> from CUHK, Dr. <a href=\"https:\/\/www.linkedin.com\/company\/103371608\/admin\/page-posts\/published\/#\">Mobarakol Islam<\/a> from WEISS, UCL, and Dr. 
<a href=\"https:\/\/www.linkedin.com\/company\/103371608\/admin\/page-posts\/published\/#\">Lalithkumar Seenivasan<\/a> from JHU.<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/media.licdn.com\/dms\/image\/v2\/D5622AQEEUKcp5PRW9w\/feedshare-shrink_800\/feedshare-shrink_800\/0\/1723433432659?e=1756944000&amp;v=beta&amp;t=Yx0k9xtYPHmheA7y7hJm_sCPlZ9xp6rba_wfNhl6pUw\" alt=\"No alternative text description for this image\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/media.licdn.com\/dms\/image\/v2\/D5622AQGHxLbh7wP6jw\/feedshare-shrink_800\/feedshare-shrink_800\/0\/1723433433353?e=1756944000&amp;v=beta&amp;t=vB2_J6FoDH7EU9AbeJmaNIhieIu4_fyZLWorLLekRPg\" alt=\"No alternative text description for this image\" \/><\/figure>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/media.licdn.com\/dms\/image\/v2\/D5622AQEfF4bi4Wnaig\/feedshare-shrink_800\/feedshare-shrink_800\/0\/1723433434110?e=1756944000&amp;v=beta&amp;t=esP5VRSaGhysCbqFHiIhyAqtEzv86ASH3ZR04Um7dPw\" alt=\"No alternative text description for this image\" \/><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>This paper is an extended version of our #ICRA2023 Surgical-VQLA. Our method can serve as an effective and reliable tool to assist in surgical education and clinical decision-making by providing more insightful analyses of surgical scenes. 
\u2728 Key Contributions in the journal version: &#8211; A dual calibration module is proposed\u2026 <a class=\"continue-reading-link\" href=\"http:\/\/www.labren.org\/mm\/news\/%f0%9f%8e%89our-recent-work-surgical-vqla-adversarial-contrastive-learning-for-calibrated-robust-visual-question-localized-answering-in-robotic-surgery-has-been-accepted-by-information-fusion\/\">Continue reading<\/a><\/p>\n","protected":false},"author":17,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"footnotes":""},"categories":[4],"tags":[126,140,146,142,11,145,118,127,141,144,143],"class_list":["post-3254","post","type-post","status-publish","format-standard","hentry","category-news","tag-cuhk","tag-informationfusion","tag-jhu","tag-multimodal","tag-robotics","tag-surgery","tag-surgicalai","tag-ucl","tag-visionlanguage","tag-visualgrounding","tag-vqa"],"_links":{"self":[{"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/posts\/3254","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/comments?post=3254"}],"version-history":[{"count":1,"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/posts\/3254\/revisions"}],"predecessor-version":[{"id":3255,"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/posts\/3254\/revisions\/3255"}],"wp:attachment":[{"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/media?parent=3254"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/categories?post=3254"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/tags?post=3254"}],"curies":[{
"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}