Region-based convolutional neural networks (or regions with CNN features, R-CNNs) are a pioneering approach to applying deep models to object detection [Girshick et al., 2014]. In this section, we discuss R-CNNs and a series of improvements to them: Fast R-CNN [Girshick, 2015], Faster R-CNN [Ren et al., 2015], and Mask R-CNN...
This conclusion is supported by the experiments we conducted in multiple real-world outdoor scenarios, using the data acquired from advertising panels placed in crowded urban environments. Ehsan Yaghoubi, Pendar Alirezazadeh, Eduardo Assunção, João C. Neves, Hugo Proença. Conference paper...
However, existing end-to-end detection frameworks of convolutional neural networks (CNNs) are mostly designed for 2D images. In this paper, we propose 3D context enhanced region-based CNN (3DCE) to incorporate 3D context information efficiently by aggregating feature maps of 2D images. 3DCE is...
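Multi-scale (M): we adopt the multi-scale strategy of SPPnet (also used by FRCN and MR-CNN). The scale is defined as the length of the image's shortest side. During training, one scale is chosen at random, whereas at test time inference is run on all scales. For the VGG16 network, we use s ∈ {480, 576, 688, 864, 900} for training and s ∈ {480, 576, 688, 864, 1000} for testing, with the largest dimension capped at 1000. The scales and the cap were chosen because of GPU...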
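Faster R-CNN evaluates a 10-layer subnetwork for each region to achieve good accuracy, whereas the per-region cost of R-FCN is negligible. With 300 RoIs at test time, Faster R-CNN takes 0.42s per image, 2.5× slower than our R-FCN, which takes only 0.17s per image (on a K40 GPU; this number is 0.11s on a Titan X GPU). R-FCN also trains faster than Faster R-CNN. Moreover, hard example...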
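The R-CNN family of object detectors splits object detection into two steps, convolutional feature extraction and region proposal classification, which are connected by an RoI pooling layer. Convolutional feature extraction is independent of the RoIs, whereas the computation after RoI pooling is done per RoI and cannot be shared. This is due to historical reasons: early network models such as AlexNet and VGG Nets consist of two subnetworks, a convolutional subnetwork that ends with a spatial pooling layer, followed by fully connected layers. This spatial pooling layer then evolved into the later...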
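To make the role of the RoI pooling layer concrete, here is a minimal pure-Python sketch of RoI max pooling on a single-channel feature map. It is an illustration under stated assumptions rather than any library's implementation: the function name `roi_max_pool`, the RoI format `(y0, x0, y1, x1)` in feature-map cells with exclusive end coordinates, and the floor/ceil bin boundaries are all choices made for this example.

```python
import math


def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool one RoI of a 2D feature map into a fixed out_h x out_w grid.

    feature_map: list of rows (a single channel, computed once per image).
    roi: (y0, x0, y1, x1) in feature-map cells; y1 and x1 are exclusive.
    """
    y0, x0, y1, x1 = roi
    roi_h, roi_w = y1 - y0, x1 - x0
    pooled = []
    for i in range(out_h):
        # Floor for the bin start and ceil for the bin end, so that
        # every bin is non-empty and the bins together cover the RoI.
        ys = y0 + math.floor(i * roi_h / out_h)
        ye = y0 + math.ceil((i + 1) * roi_h / out_h)
        row = []
        for j in range(out_w):
            xs = x0 + math.floor(j * roi_w / out_w)
            xe = x0 + math.ceil((j + 1) * roi_w / out_w)
            row.append(max(feature_map[y][x]
                           for y in range(ys, ye)
                           for x in range(xs, xe)))
        pooled.append(row)
    return pooled


# One shared feature map, two RoIs of different sizes.
shared_features = [[0, 1, 2, 3],
                   [4, 5, 6, 7],
                   [8, 9, 10, 11],
                   [12, 13, 14, 15]]
whole_map = roi_max_pool(shared_features, (0, 0, 4, 4))   # [[5, 7], [13, 15]]
lower_right = roi_max_pool(shared_features, (1, 1, 4, 4))  # [[10, 11], [14, 15]]
```

Both RoIs read from the same `shared_features`, which mirrors the shared convolutional stage, while each call produces its own fixed-size output that a per-RoI subnetwork would then process separately.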
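R-FCN also trains faster than Faster R-CNN. Moreover, hard example mining [22] adds no cost to R-FCN training (Table 3). Training R-FCN remains feasible when mining from 2000 RoIs, in which case Faster R-CNN is 6× slower than R-FCN (2.9s vs. 0.46s). However, experiments show that mining from a larger candidate set (e.g. 2000) brings no benefit (Table 3), so in the rest of this paper we use 300...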
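The excerpt does not spell out the mining rule, so the following is only a sketch of one common form of online hard example mining: evaluate the loss of every candidate RoI in the forward pass, then keep the highest-loss RoIs for the backward pass. The function name `select_hard_examples` and the `batch_size` parameter are hypothetical, and the loss values are made up for illustration.

```python
def select_hard_examples(roi_losses, batch_size):
    """Return the indices of the batch_size RoIs with the highest loss.

    roi_losses: one loss value per candidate RoI, from the forward pass.
    Indices are returned hardest first; ties keep their original order.
    """
    order = sorted(range(len(roi_losses)),
                   key=lambda i: roi_losses[i],
                   reverse=True)
    return order[:batch_size]


# Illustrative losses for six candidate RoIs (invented numbers).
example_losses = [0.10, 2.00, 0.50, 1.20, 0.05, 0.90]
hard_indices = select_hard_examples(example_losses, batch_size=3)  # [1, 3, 5]
```

Because the selection only needs per-RoI losses, its cost is dominated by the forward pass over all candidates. This is consistent with the excerpt's point that mining is nearly free when the per-RoI computation is negligible, and expensive when each RoI passes through a deep subnetwork.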
Compared to Faster R-CNN and the single shot multibox detector (SSD), the proposed algorithm achieves better results for FOD detection on airfield pavement in the experiment. Keywords: foreign object debris; object detection; convolutional neural network; vehicular imaging sensors ...
However, because of the single output of the CNN, the boundaries of overlapping parts are difficult to determine. The individual optimization of CNNs in multiple fragmented processes increases algorithm complexity and makes quality control difficult. In contrast, the primary technique used in our ...
(CNN). After fine-tuning the models, the output from the last convolutional layer is given to a grouped 2D-local binary pattern descriptor (G2DLBP). Statistical features from the descriptor are applied to different classifiers, and the best classifier for each image region model is identified....