Thrilled to share our latest work on enabling robust sparse-to-dense reconstruction for endoscopic surgical robots — bridging the gap between 𝐬𝐩𝐚𝐫𝐬𝐞 𝐬𝐞𝐧𝐬𝐨𝐫 𝐝𝐚𝐭𝐚 𝐚𝐧𝐝 𝐡𝐢𝐠𝐡-𝐪𝐮𝐚𝐥𝐢𝐭𝐲 𝟑𝐃 𝐦𝐚𝐩𝐩𝐢𝐧𝐠 using a novel 𝐝𝐢𝐟𝐟𝐮𝐬𝐢𝐨𝐧-𝐛𝐚𝐬𝐞𝐝 framework.
Fine-tuning foundation models often fails due to a lack of dense ground truth, and self-supervised methods struggle with scale ambiguity; sparse depth sensors, by contrast, offer a reliable geometric prior.
This motivated us to develop EndoDDC, a method that robustly generates dense depth maps by fusing RGB images with sparse depth inputs.
🧠✨ 𝐖𝐡𝐚𝐭 𝐰𝐞 𝐝𝐞𝐯𝐞𝐥𝐨𝐩𝐞𝐝:
A diffusion-driven depth completion architecture that:
🔹 Integrates sparse depth and RGB inputs to overcome the limitations of pure visual estimation.
🔹 Utilizes a Multi-scale Feature Extraction and Depth Gradient Fusion module to capture fine-grained surface orientation and local structure.
🔹 Optimizes depth maps iteratively using a conditional diffusion model, refining geometry even in regions with weak textures or reflections (see the sketch after this list).
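For the curious, here is a minimal PyTorch sketch of the kind of conditional sampling loop described above. To be clear: `DepthDenoiser`, `complete_depth`, and all hyperparameters below are illustrative placeholders under generic DDPM assumptions, not the actual EndoDDC implementation, and the real model's multi-scale feature extraction and depth gradient fusion modules are omitted.

```python
import torch
import torch.nn as nn

class DepthDenoiser(nn.Module):
    """Toy stand-in for a denoising network: predicts the noise in a
    depth map conditioned on the RGB frame and a sparse depth prior."""
    def __init__(self, hidden=32):
        super().__init__()
        # 1 noisy depth + 3 RGB + 1 sparse depth + 1 validity mask = 6 channels
        self.net = nn.Sequential(
            nn.Conv2d(6, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, hidden, 3, padding=1), nn.ReLU(),
            nn.Conv2d(hidden, 1, 3, padding=1),
        )

    def forward(self, noisy_depth, rgb, sparse_depth, mask, t):
        # A real model would also inject a timestep embedding for t;
        # omitted here for brevity.
        x = torch.cat([noisy_depth, rgb, sparse_depth, mask], dim=1)
        return self.net(x)

@torch.no_grad()
def complete_depth(model, rgb, sparse_depth, steps=50):
    """DDPM-style reverse process: start from pure noise and iteratively
    denoise a dense depth map conditioned on RGB and the sparse prior."""
    mask = (sparse_depth > 0).float()       # validity mask for sparse points
    depth = torch.randn_like(sparse_depth)  # x_T ~ N(0, I)
    betas = torch.linspace(1e-4, 0.02, steps)
    alphas = 1.0 - betas
    alpha_bar = torch.cumprod(alphas, dim=0)
    for t in reversed(range(steps)):
        eps = model(depth, rgb, sparse_depth, mask, t)
        # Standard DDPM posterior-mean update
        coef = betas[t] / torch.sqrt(1.0 - alpha_bar[t])
        depth = (depth - coef * eps) / torch.sqrt(alphas[t])
        if t > 0:  # add noise on all but the final step
            depth = depth + torch.sqrt(betas[t]) * torch.randn_like(depth)
    return depth

rgb = torch.rand(1, 3, 64, 64)                  # endoscopic RGB frame
sparse = torch.zeros(1, 1, 64, 64)
sparse[..., ::8, ::8] = torch.rand(1, 1, 8, 8)  # ~1.5% sparse depth samples
dense = complete_depth(DepthDenoiser(), rgb, sparse)
print(dense.shape)  # torch.Size([1, 1, 64, 64])
```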
🎯 𝐊𝐞𝐲 𝐑𝐞𝐬𝐮𝐥𝐭𝐬:
✅ 25.55% and 9.03% improvements in accuracy on the StereoMIS and C3VD datasets, respectively, compared to SOTA surgical depth estimators such as EndoDAC.
✅ 7.35% and 5.28% reductions in RMSE on StereoMIS and C3VD, respectively, compared to the best depth completion baseline (OGNI-DC).
✅ Outperformed foundation models (DepthAnything-v2) and standard depth completion methods (Marigold-DC) in both accuracy and robustness.
💡 𝐖𝐡𝐲 𝐢𝐭 𝐦𝐚𝐭𝐭𝐞𝐫𝐬:
This work demonstrates that diffusion models can effectively solve the “sparse-to-dense” challenge in medical imaging. By providing accurate depth completion despite complex lighting and texture conditions, EndoDDC has the potential to significantly enhance autonomous navigation, procedural safety, and spatial awareness in minimally invasive surgery.
🔖 #DepthCompletion #DiffusionModel #EndoscopicSurgery #SurgicalNavigation #ICRA #CUHKEngineering #CUHK