Among them, the encoding function $F$ can be implemented by a simple depthwise separable convolution or other modules. The simplified code of the PEG part is as follows. The input feat_token is a tensor of shape
To test algorithms on in vivo data, we used the Benchmark for Automatic Glottis Segmentation (BAGLS) dataset10. We extended BAGLS by manually annotating the training and the test dataset of single endoscopic images using anatomical landmarks, such as anterior commissure and arytenoid cartilages ...
The two heads consist of different numbers of Depthwise Separable Convolution Modules that combine the depthwise separable convolution and residual connection. The details of Depthwise Separable Convolution Module that represent the DW Module in Figure 11 can be seen in [15]. In decoder, K represents...