The output feature of the $l$-th layer will be denoted as $f_l$. Consequently, the input image will be denoted as $f_0$ and the final output feature map will correspond to $f_L$. Each convolutional layer has its own configuration containing three parameter values: kernel size, stride, and padding. We'll ...
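To make the role of the three per-layer parameters concrete, here is a minimal sketch of how kernel size, stride, and padding determine the spatial size of each layer's output along one axis. It uses the standard relation `floor((n + 2p - k) / s) + 1`; the function names `conv_output_size` and `trace_sizes` and the example layer stack are illustrative assumptions, not taken from the original text.

```python
def conv_output_size(n, kernel_size, stride=1, padding=0):
    """Spatial output size (one axis) of a convolution or pooling layer.

    Uses the standard formula floor((n + 2p - k) / s) + 1.
    """
    if n + 2 * padding < kernel_size:
        raise ValueError("kernel does not fit in the padded input")
    return (n + 2 * padding - kernel_size) // stride + 1


def trace_sizes(input_size, layers):
    """Return the spatial size of the input and of every layer output.

    `layers` is a list of (kernel_size, stride, padding) tuples, one per layer.
    """
    sizes = [input_size]
    for kernel_size, stride, padding in layers:
        sizes.append(conv_output_size(sizes[-1], kernel_size, stride, padding))
    return sizes


# Hypothetical stack: 3x3 conv (s=1, p=1), 2x2 pooling (s=2), 3x3 conv (s=1, p=1).
layers = [(3, 1, 1), (2, 2, 0), (3, 1, 1)]
print(trace_sizes(32, layers))  # [32, 32, 16, 16]
```

The first entry of the returned list is the size of the input image and the last is the size of the final output feature map, so the list follows the same layer indexing as the notation above.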
$K_i$ is the output achieved after the two convolution layers, and the spatial loss of these convolutional operations is recovered by dense feature concatenation. The first aggregated rich, dense concatenated feature, $A_{\text{Concat.1}}$, is the output of $F_i$ and $K_i$ given by Equation (2): $A_{\text{Concat.1}}$...
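As a rough illustration of feature concatenation, the sketch below joins two feature maps of equal height and width along the channel axis. This is an assumption about what the concatenation does: Equation (2) is truncated in the text, so the exact operation is not confirmed here. The function `concat_channels` and the toy values in `f_i` and `k_i` are hypothetical.

```python
def concat_channels(a, b):
    """Concatenate two feature maps along the channel axis.

    Each map is a nested list indexed as [height][width][channel].
    Assumption: both maps share the same height and width.
    """
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("feature maps must share height and width")
    # At every spatial position, append b's channels after a's channels.
    return [[pa + pb for pa, pb in zip(ra, rb)] for ra, rb in zip(a, b)]


# Toy inputs: a 2x2 map with 2 channels and a 2x2 map with 1 channel.
f_i = [[[1.0, 2.0], [3.0, 4.0]],
       [[5.0, 6.0], [7.0, 8.0]]]
k_i = [[[0.1], [0.2]],
       [[0.3], [0.4]]]

a_concat = concat_channels(f_i, k_i)
print(a_concat[0][0])  # [1.0, 2.0, 0.1]
```

The result keeps the spatial size of the inputs and has as many channels as the two inputs combined (here 2 + 1 = 3), which is why both maps must have matching height and width before they can be joined.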
| Layer | Size (Height × Width × Number of Channels), (Stride) |
| --- | --- |
| Pool-3 | 2 × 2 (S = 2) |
| | 3 × 3 × 256 (S = 1) |
| | 3 × 3 × 256 (S = 1) |
| Concatenation-4 | 1 × 1 (S = 1) |
| Pool-4 | 2 × 2 (S = 2) |
| UnPool-4 | 2 × 2 (S = 2) |
| | 3 × 3 × 256 (S = 1) |
| | 3 × 3 ×... |