The weights assigned to the layers are normalized and updated dynamically during pre-training; the absolute value of each weight indicates how important the corresponding layer is for the reconstruction task. We track how these weights change over the course of pre-training and plot them in Figure 1. ...
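To make the idea of normalized, dynamically updated layer weights concrete, the following is a minimal sketch and not the method used here. It assumes L1 normalization (absolute values sum to 1), a squared-error reconstruction loss, and a plain gradient step followed by renormalization; the text specifies none of these. The function names `l1_normalize`, `reconstruct`, and `track_layer_weights`, as well as the toy feature vectors, are hypothetical.

```python
def l1_normalize(weights):
    """Scale weights so their absolute values sum to 1 (assumed normalization)."""
    total = sum(abs(w) for w in weights)
    if total == 0.0:
        raise ValueError("cannot normalize an all-zero weight vector")
    return [w / total for w in weights]


def reconstruct(layer_features, weights):
    """Weighted sum of per-layer feature vectors."""
    dim = len(layer_features[0])
    return [
        sum(w * feats[i] for w, feats in zip(weights, layer_features))
        for i in range(dim)
    ]


def track_layer_weights(layer_features, target, steps=100, lr=0.1):
    """Update the layer weights and record them after every step.

    Each step takes a gradient step on 0.5 * ||reconstruction - target||^2
    with respect to the weights, then renormalizes. Returns a list of
    steps + 1 weight vectors, starting from uniform weights.
    """
    weights = l1_normalize([1.0] * len(layer_features))
    history = [list(weights)]
    for _ in range(steps):
        output = reconstruct(layer_features, weights)
        residual = [o - t for o, t in zip(output, target)]
        # Gradient for layer k is the dot product of the residual with its features.
        grads = [sum(r * f for r, f in zip(residual, feats)) for feats in layer_features]
        weights = l1_normalize([w - lr * g for w, g in zip(weights, grads)])
        history.append(list(weights))
    return history


if __name__ == "__main__":
    # Toy data (hypothetical): three "layers", each a 2-dimensional feature vector.
    features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    history = track_layer_weights(features, target=[1.0, 0.0], steps=50)
    for step in (0, 1, 10, 50):
        print(step, [round(abs(w), 3) for w in history[step]])
```

The recorded `history` holds one weight vector per step, so plotting the absolute value of each entry against the step index gives one curve per layer, the same kind of view as a weight-trajectory figure.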