{"id":3356,"date":"2025-08-05T05:38:32","date_gmt":"2025-08-05T05:38:32","guid":{"rendered":"http:\/\/www.labren.org\/mm\/?p=3356"},"modified":"2025-08-05T10:41:25","modified_gmt":"2025-08-05T10:41:25","slug":"%f0%9f%8e%89-%f0%9f%8e%89-we-are-thrilled-to-announce-that-our-latest-work-on-surgical-workflow-recognition-has-been-accepted-by-the-information-fusion-journal-if-14-8","status":"publish","type":"post","link":"http:\/\/www.labren.org\/mm\/news\/%f0%9f%8e%89-%f0%9f%8e%89-we-are-thrilled-to-announce-that-our-latest-work-on-surgical-workflow-recognition-has-been-accepted-by-the-information-fusion-journal-if-14-8\/","title":{"rendered":"\ud83c\udf89 \ud83c\udf89 We are thrilled to announce that our latest work on surgical workflow recognition has been accepted by the Information Fusion journal (IF: 14.8)."},"content":{"rendered":"\n<p>\ud83d\uddde\ufe0f We propose a multimodal Graph Representation network with Adversarial feature Disentanglement (GRAD) for robust surgical workflow recognition in challenging scenarios with domain shifts or corrupted data. Specifically, we introduce a Multimodal Disentanglement Graph Network (MDGNet) that captures fine-grained visual information while explicitly modeling the complex relationships between vision and kinematic embeddings through graph-based message modeling. To align feature spaces across modalities, we propose a Vision-Kinematic Adversarial (VKA) framework that leverages adversarial training to reduce modality gaps and improve feature consistency. Furthermore, we design a Contextual Calibrated Decoder, incorporating temporal and contextual priors to enhance robustness against domain shifts and corrupted data.<\/p>\n\n\n\n<p>\ud83d\uddde\ufe0f Extensive comparative and ablation experiments demonstrate the effectiveness of our model and proposed modules. Specifically, we achieved an accuracy of 86.87% and 92.38% on two public datasets, respectively. 
Moreover, our robustness experiments show that our method effectively handles data corruption during storage and transmission, exhibiting excellent stability. Our approach aims to advance automated surgical workflow recognition, addressing the complexities and dynamism inherent in surgical procedures.<\/p>\n\n\n\n<p>\ud83d\udcd1 Paper link: https:\/\/lnkd.in\/g9SWYPJg<\/p>\n\n\n\n<p>\ud83d\udc4f \ud83d\udc4f We extend our gratitude to our co-authors and collaborators from around the world, including Long Bai, Boyi Ma, Ruohan Wang, Guankun Wang, Beilei Cui, Zhongliang Jiang, Mobarakol Islam, Zhe Min, Jiewen Lai, Nassir Navab, and Hongliang Ren!<\/p>\n\n\n\n<figure class=\"wp-block-image\"><img decoding=\"async\" src=\"https:\/\/media.licdn.com\/dms\/image\/v2\/D5622AQH44AWzdj86Vw\/feedshare-shrink_800\/B56ZdLm7CkHUAk-\/0\/1749320212095?e=1757548800&amp;v=beta&amp;t=RSgBF11AR4aN9XMVeqpBDMw3Ev3szbMF8c8g7NqUeLU\" alt=\"No alternative text description for this image\" \/><\/figure>\n","protected":false},"excerpt":{"rendered":"<p>\ud83d\uddde\ufe0f We propose a multimodal Graph Representation network 
with Adversarial feature Disentanglement (GRAD) for robust surgical workflow recognition in challenging scenarios with domain shifts or corrupted data. Specifically, we introduce a Multimodal Disentanglement Graph Network (MDGNet) that captures fine-grained visual information while explicitly modeling the complex relationships between vision and\u2026 <a class=\"continue-reading-link\" href=\"http:\/\/www.labren.org\/mm\/news\/%f0%9f%8e%89-%f0%9f%8e%89-we-are-thrilled-to-announce-that-our-latest-work-on-surgical-workflow-recognition-has-been-accepted-by-the-information-fusion-journal-if-14-8\/\">Continue reading<\/a><\/p>\n","protected":false},"author":17,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"ngg_post_thumbnail":0,"footnotes":""},"categories":[4],"tags":[],"class_list":["post-3356","post","type-post","status-publish","format-standard","hentry","category-news"],"_links":{"self":[{"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/posts\/3356","targetHints":{"allow":["GET"]}}],"collection":[{"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/users\/17"}],"replies":[{"embeddable":true,"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/comments?post=3356"}],"version-history":[{"count":1,"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/posts\/3356\/revisions"}],"predecessor-version":[{"id":3357,"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/posts\/3356\/revisions\/3357"}],"wp:attachment":[{"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/media?parent=3356"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/categories?post=3356"},{"taxonomy":"post_tag","embeddable":true,"href":"http:\/\/www.labren.org\/mm\/wp-json\/wp\/v2\/tags?post=335
6"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}