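Structure of the ARFB. To obtain more prediction parameters through the ARFB, fully connected (FC) layers are added to both the residual branch and the main branch for parameter expansion, as shown in the figure above. In this process, information from multiple levels can be fused, and because of the added FC layers, the dimensions of all branches match. The final step of the ARFB module is an addition between the main channel and the residual channel, which lets the network learn to combine the residuals of each residual unit. Compared with the original...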
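The description above (an FC layer on each branch so that dimensions match, followed by an element-wise addition) can be illustrated with a minimal pure-Python sketch. This is not the ARFB implementation from the source: the function names (`init_fc`, `linear`, `arfb_like_block`), the ReLU on the main branch, the layer sizes, and the initialization range are all assumptions made for illustration only.

```python
import random


def init_fc(in_dim, out_dim, rng):
    """Create a (weight, bias) pair for a fully connected layer (hypothetical init)."""
    weight = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weight, bias


def linear(x, weight, bias):
    """Apply a fully connected layer to one input vector: y = W x + b."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weight, bias)]


def relu(v):
    """Element-wise ReLU, used here as an assumed main-branch nonlinearity."""
    return [max(0.0, a) for a in v]


def arfb_like_block(x, main_fc, residual_fc):
    """Expand both branches with FC layers, then add them element-wise.

    Assumption: the main branch applies ReLU before its FC layer, and the
    residual branch applies only its FC layer. The source does not specify this.
    """
    main_out = linear(relu(x), *main_fc)        # main branch, expanded to out_dim
    residual_out = linear(x, *residual_fc)      # residual branch, expanded to out_dim
    # Both branches now share the same dimension, so the addition is well defined.
    return [m + r for m, r in zip(main_out, residual_out)]


if __name__ == "__main__":
    rng = random.Random(0)
    main_fc = init_fc(4, 8, rng)
    residual_fc = init_fc(4, 8, rng)
    print(arfb_like_block([1.0, -2.0, 0.5, 3.0], main_fc, residual_fc))
```

The point of the sketch is only the shape logic: because each branch ends in an FC layer with the same output size, the final addition needs no extra projection.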
Weakly-supervised object detection (WSOD) aims to train an object detector using only image-level annotations. Recently, some works have selected accurate boxes generated by a well-trained WSOD network and used them to supervise a semi-supervised detection framework for better performance. ...
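As a rough illustration of the box-selection idea, the sketch below keeps only high-scoring predicted boxes as pseudo labels and treats images without any such box as unlabeled. The excerpt does not state the actual selection criterion, so the confidence threshold, the `Box` type, and the `split_by_confidence` helper are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Box:
    """A predicted box with a class label and a detector confidence score."""
    x1: float
    y1: float
    x2: float
    y2: float
    label: str
    score: float


def split_by_confidence(
    predictions: Dict[str, List[Box]], score_threshold: float = 0.8
) -> Tuple[Dict[str, List[Box]], List[str]]:
    """Split images into a pseudo-labeled set and an unlabeled set.

    Assumption: a box counts as "accurate" when its score reaches the
    threshold. The source does not specify how accuracy is judged.
    """
    pseudo_labeled: Dict[str, List[Box]] = {}
    unlabeled: List[str] = []
    for image_id, boxes in predictions.items():
        kept = [b for b in boxes if b.score >= score_threshold]
        if kept:
            pseudo_labeled[image_id] = kept   # supervises the detector like ground truth
        else:
            unlabeled.append(image_id)        # left to the semi-supervised branch
    return pseudo_labeled, unlabeled


if __name__ == "__main__":
    preds = {
        "img1": [Box(0, 0, 10, 10, "cat", 0.9), Box(2, 2, 5, 5, "dog", 0.3)],
        "img2": [Box(1, 1, 4, 4, "cat", 0.2)],
    }
    labeled, rest = split_by_confidence(preds)
    print(labeled, rest)
```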
3. Yan Liu, Zhijie Zhang, Li Niu, Junjie Chen, Liqing Zhang, “Mixed Supervised Object Detection by Transferring Mask Prior and Semantic Similarity”, NeurIPS, 2021. 4. Li Niu, “Weak Novel Categories without Tears: A Survey on Weak-Shot Learning”, arXiv preprint arXiv:2110.02651, 2021....
Yan Liu, Zhijie Zhang, Li Niu, Junjie Chen, Liqing Zhang, "Mixed Supervised Object Detection by Transferring Mask Prior and Semantic Similarity", NeurIPS, 2021. Li Niu, "Weak Novel Categories without Tears: A Survey on Weak-Shot Learning", arXiv preprint arXiv:2110.02651, 2021. Yuanyi Zhong...
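2. Weakly Supervised Salient Object Detection Using Image Labels This paper tackles weakly supervised salient object detection with an iterative approach. The method did not seem to offer much worth borrowing, so I did not read it closely. 3. Weakly-Supervised Semantic Segmentation by Iteratively Mining Common Object Features 1. Weakly Supervised Instance Segmentati...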
supervised object detection (WSOD) and segmentation by utilizing the pre-learned world knowledge contained in a vision foundation model, i.e., the Segment Anything Model (SAM). WeakSAM addresses two critical limitations in traditional WSOD retraining, i.e., pseudo ground truth (PGT) incompleteness...
Annotating labels for object detection involves either pixel-by-pixel labeling of class information or marking of tight bounding boxes for every image in a dataset. Consequently, creating datasets for artificial intelligence (AI) models for insect detection in a supervised manner can be very laborious...
Recently, several studies [35,50] have focused on weakly supervised violence detection, where only video-level labels are available in the training set. Compared with annotating frame-level labels, assigning video-level labels is labor-saving. Thus, forming large-scale datasets of untrimmed videos and tr...
Additionally, in comparison to ShapeNet, the gap between AML and fully-supervised approaches (Dai17 and Sup) is surprisingly small—not reflecting the difference in supervision. This means that even under full supervision, these object categories are difficult to complete. In terms of accuracy (Acc...