A pretrained CNN (the VGG network) extracts image features, which are then used to evaluate the similarity between two different types of images (BIM-rendered images and real images). Experiments were performed in real buildings to verify the method, and the matching accuracy was 91.61% for a total of 143 ...
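
The matching step that follows feature extraction can be illustrated with a minimal sketch. It assumes that each image has already been reduced to a fixed-length feature vector by the pretrained network and that similarity is measured with cosine similarity. The text above does not specify the similarity measure, so this choice is an assumption. The function names `cosine_similarity` and `best_match` and the toy vectors are hypothetical and serve only as illustration.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_a * norm_b)


def best_match(
    query: Sequence[float], candidates: List[Sequence[float]]
) -> Tuple[int, float]:
    """Return (index, score) of the candidate most similar to the query."""
    if not candidates:
        raise ValueError("at least one candidate is required")
    scores = [cosine_similarity(query, c) for c in candidates]
    index = max(range(len(scores)), key=scores.__getitem__)
    return index, scores[index]


# Toy 4-dimensional vectors standing in for CNN features (made-up values,
# not data from the experiments described above).
real_image = [0.9, 0.1, 0.0, 0.4]
bim_renders = [
    [0.0, 1.0, 0.2, 0.0],
    [0.8, 0.2, 0.1, 0.5],
    [0.1, 0.0, 1.0, 0.1],
]

index, score = best_match(real_image, bim_renders)
print(f"best BIM render: {index}, similarity: {score:.3f}")
```

In this toy setting the real image is matched to the BIM-rendered candidate with the highest score. In practice the vectors would be much longer, and a different distance measure could be substituted without changing the structure of the search.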