This paper compares the performance of three neural network architectures based on the U-Net for tissue segmentation. The models under study incorporate temporal layers, such as Long Short-Term Memory (LSTM) cells, and Attention Gate blocks. Results show that the model benefits from combining temporal layers with attention-based layers, even when the dataset is limited. The proposed method extracts fundamental features from the scene, which can be fed to a system that performs autonomous surgical gestures such as tissue retraction, suturing, and ablation.

To achieve a robust model for tissue segmentation, Long Short-Term Memory cells were implemented to model the temporal dependencies between consecutive frames of an endoscopic video. Additionally, Attention Gate blocks were adopted to further enhance the model’s performance compared with a standard feed-forward model (i.e., the U-Net).
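
To make the role of an Attention Gate more concrete, the following is a minimal sketch of an additive attention gate applied to a U-Net skip connection. It is not the authors' implementation: the text above does not specify the exact formulation, so the sketch assumes the commonly used additive form, with 1x1 convolutions written as per-pixel matrix products. The function name `attention_gate`, the parameter names, and all tensor sizes are hypothetical, and the gating features are assumed to be already resampled to the spatial size of the skip features.

```python
import numpy as np


def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))


def attention_gate(x, g, w_x, w_g, b, psi, b_psi):
    """Additive attention gate on channels-last feature maps (illustrative).

    x     : skip-connection features, shape (H, W, C_x)
    g     : gating features, shape (H, W, C_g), assumed to share x's H and W
    w_x   : (C_x, C_int) weights, equivalent to a 1x1 convolution on x
    w_g   : (C_g, C_int) weights, equivalent to a 1x1 convolution on g
    b     : (C_int,) bias
    psi   : (C_int,) weights mapping to one attention logit per pixel
    b_psi : scalar bias for the attention logit
    """
    # Project both inputs to a shared intermediate space, add, apply ReLU.
    q = np.maximum(x @ w_x + g @ w_g + b, 0.0)   # (H, W, C_int)
    # One attention coefficient in (0, 1) per spatial location.
    alpha = sigmoid(q @ psi + b_psi)             # (H, W)
    # Rescale the skip features by the attention map.
    return x * alpha[..., None], alpha


# Hypothetical sizes and random weights, for illustration only.
rng = np.random.default_rng(0)
H, W, C_x, C_g, C_int = 4, 4, 8, 16, 4
x = rng.standard_normal((H, W, C_x))
g = rng.standard_normal((H, W, C_g))
w_x = rng.standard_normal((C_x, C_int)) * 0.1
w_g = rng.standard_normal((C_g, C_int)) * 0.1
b = np.zeros(C_int)
psi = rng.standard_normal(C_int) * 0.1
b_psi = 0.0

gated, alpha = attention_gate(x, g, w_x, w_g, b, psi, b_psi)
```

In a segmentation network of this kind, `gated` would replace the raw skip features before they are concatenated with the decoder features, so that regions with low attention coefficients contribute less to the output. Temporal layers such as LSTM cells would act separately, across consecutive frames, and are not modeled in this sketch.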

doi: 10.1109/TMRB.2021.3054326. https://ieeexplore.ieee.org/document/9335948