models import resnet50 from vit_pytorch.distill import DistillableViT, DistillWrapper teacher = resnet50(pretrained = True) v = DistillableViT( image_size = 256, patch_size = 32, num_classes = 1000, dim = 1024, depth = 6, heads = 8, mlp_dim = 2048, dropout = 0.1, emb_dropout ...
Pytorch reimplementation of the Vision Transformer (An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale) - ViT-pytorch/utils/scheduler.py at main · jeonsworld/ViT-pytorch