Surgical Tracking Based on Stereo Vision and Depth Sensing

Project Goals:

The objective of this research is to incorporate multiple sensors across a broad spectrum, including stereo infrared (IR) cameras, color (RGB) cameras, and depth sensors, to perceive the surgical environment. Features extracted from each modality contribute to the understanding of complex surgical environments and procedures, and their combination can provide greater robustness and accuracy than any single sensing modality. As a preliminary study, we propose a multi-sensor fusion approach for localizing surgical instruments, and we developed an integrated dual-Kinect tracking system to validate the proposed hierarchical tracking approach.


This project addresses the problem of improving surgical instrument tracking accuracy through multi-sensor fusion techniques from computer vision. We propose a hierarchical fusion algorithm that integrates the tracking results from the depth sensor, the IR camera pair, and the RGB camera pair. Fig. 1 summarizes the algorithm, which is divided into “low-level” and “high-level” fusion.

Fig. 1 Block diagram of hierarchical fusion algorithm.

The low-level fusion improves the speed and robustness of marker feature extraction before the tool tip position is triangulated in the IR and RGB camera pairs. The IR and RGB cameras are modeled as pinhole cameras. The depth data of the tool serve as a prior for marker detection: the working area of the tracked tool is assumed to be confined to a bounded volume v(x, y, z), which is used to narrow the search area for feature extraction and thereby reduce the computational cost for real-time applications.
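
As a rough illustration of how a depth-derived working volume can narrow the image search area, the sketch below projects the eight corners of an axis-aligned volume through a pinhole camera and takes their bounding box as the region of interest. The function names, the intrinsic matrix K, the pose (R, t), the volume limits, and the pixel margin are all hypothetical values chosen for the example and are not taken from the project; the sketch also assumes the whole volume lies in front of the camera.

```python
import numpy as np

def project_points(K, R, t, pts_world):
    """Project N x 3 world points to pixel coordinates with a pinhole model."""
    pts_cam = pts_world @ R.T + t        # world frame -> camera frame
    uvw = pts_cam @ K.T                  # apply the intrinsic matrix
    return uvw[:, :2] / uvw[:, 2:3]      # perspective division

def roi_from_volume(K, R, t, vmin, vmax, image_size, margin=5):
    """Return (u0, v0, u1, v1), the image bounding box of a 3D working volume."""
    xs, ys, zs = zip(vmin, vmax)         # (min, max) pair for each axis
    corners = np.array([[x, y, z] for x in xs for y in ys for z in zs], dtype=float)
    uv = project_points(K, R, t, corners)
    width, height = image_size
    # Pad the box by a small margin and clip it to the image borders.
    u0 = max(int(np.floor(uv[:, 0].min())) - margin, 0)
    v0 = max(int(np.floor(uv[:, 1].min())) - margin, 0)
    u1 = min(int(np.ceil(uv[:, 0].max())) + margin, width)
    v1 = min(int(np.ceil(uv[:, 1].max())) + margin, height)
    return u0, v0, u1, v1

# Hypothetical camera: 500 px focal length, principal point at the image center.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
R = np.eye(3)                            # camera frame aligned with world frame
t = np.zeros(3)

# Hypothetical working volume in metres, as it might be derived from depth data.
roi = roi_from_volume(K, R, t,
                      vmin=(-0.1, -0.1, 0.8),
                      vmax=(0.1, 0.1, 1.0),
                      image_size=(640, 480))
print(roi)
```

Marker detection would then run only on the pixels inside the returned box rather than on the full frame, which is where the reduction in computational cost comes from.
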
The high-level fusion produces a more accurate tracking result by fusing the two measurements, one from the IR camera pair and one from the RGB camera pair. We employ the covariance intersection (CI) algorithm to estimate a new tracking result with a smaller covariance.
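
A minimal sketch of covariance intersection is given below, assuming two 3D position estimates with known covariances and unknown cross-correlation. The fused information matrix is a convex combination of the two input information matrices, and the weight w is chosen here by a simple grid search that minimizes the trace of the fused covariance. The trace criterion, the grid search, the function name, and all numbers are assumptions made for illustration; the project may select the weight differently.

```python
import numpy as np

def covariance_intersection(x1, P1, x2, P2, n_grid=101):
    """Fuse two estimates (x1, P1) and (x2, P2) by covariance intersection.

    The weight w in [0, 1] is picked by grid search to minimize the trace
    of the fused covariance (an assumed criterion for this sketch).
    """
    I1, I2 = np.linalg.inv(P1), np.linalg.inv(P2)   # information matrices
    best = None
    for w in np.linspace(0.0, 1.0, n_grid):
        P = np.linalg.inv(w * I1 + (1.0 - w) * I2)
        cost = np.trace(P)
        if best is None or cost < best[0]:
            best = (cost, w, P)
    _, w, P = best
    x = P @ (w * I1 @ x1 + (1.0 - w) * I2 @ x2)
    return x, P, w

# Hypothetical tool tip estimates in millimetres from the two trackers.
x_ir = np.array([10.2, 19.9, 500.5])
P_ir = np.diag([1.0, 4.0, 9.0])
x_rgb = np.array([9.8, 20.3, 499.5])
P_rgb = np.diag([4.0, 1.0, 9.0])

x_fused, P_fused, w_fused = covariance_intersection(x_ir, P_ir, x_rgb, P_rgb)
print("weight:", w_fused)
print("fused position:", x_fused)
print("fused covariance trace:", np.trace(P_fused))
```

Because w = 0 and w = 1 reproduce the individual estimates, the fused covariance chosen this way never has a larger trace than the better of the two inputs.
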


To demonstrate the proposed algorithm, we designed a hybrid marker-based tracking tool (Fig. 2) that combines a cross-based feature in the visible modality with a retro-reflective marker feature in the infrared modality to obtain a fused tracking result for the customized tool tip. To evaluate the performance of the proposed method, we built an experimental setup with two Kinects. Fig. 3 shows the prototype of the multi-sensor fusion tracker used in the experiment. The results indicate that the CI-based fusion approach outperforms the separate IR and RGB trackers: both the mean error and the deviation of the fused result are improved.
Fig. 2 Hybrid marker

Fig. 3 Dual Kinect tracking system

People Involved

Staff: Wei LIU, Shuang SONG, Andy Lim
Advisor: Dr. Hongliang Ren
Collaborator: Wei ZHANG


[1] Ren, H.; Liu, W. & Lim, A. Marker-Based Instrument Tracking Using Dual Kinect Sensors for Navigated Surgery. IEEE Transactions on Automation Science and Engineering, 2013.
[2] Liu, W.; Ren, H.; Zhang, W. & Song, S. Cognitive Tracking of Surgical Instruments Based on Stereo Vision and Depth Sensing. ROBIO 2013, IEEE International Conference on Robotics and Biomimetics, 2013.

Related FYP Project

Andy Lim: Marker-Based Surgical Tracking With Multiple Modalities Using Microsoft Kinect


