To construct Hiera’s full 56×56 position embedding, we tile the window embedding and interpolate the global embedding, then add the results (see Fig. 3). This is similar to the learned behavior of absolute embeddings (Fig. 2), but we can now interpolate properly (Fig. 1e): tile the...