...self_attention_mask (using torch.gt) and enc_self_attn...
if attn_mask is not None:
    scores.masked_fill_(attn_mask, -1e9)  # Fills elements of self tensor with value where mask is True.
attn = nn.Softmax(dim=-1)(scores)
context = torch.matmul(attn, V)
return context, attn ...
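
For context, the masking step above can be sketched as a self-contained function. This is only an illustration under stated assumptions: the excerpt does not show how `scores` is computed, so the scaled dot product `Q @ K^T / sqrt(d_k)` is assumed, and the function name `masked_attention`, the tensor shapes, and the example mask are hypothetical.

```python
import math
import torch


def masked_attention(Q, K, V, attn_mask=None):
    """Attention with an optional boolean mask (True = position to hide)."""
    d_k = Q.size(-1)
    # Assumption: scores are the scaled dot product of queries and keys.
    scores = torch.matmul(Q, K.transpose(-1, -2)) / math.sqrt(d_k)
    if attn_mask is not None:
        # Masked positions get a large negative score, so softmax
        # assigns them a weight that is effectively zero.
        scores = scores.masked_fill(attn_mask, -1e9)
    attn = torch.softmax(scores, dim=-1)
    context = torch.matmul(attn, V)
    return context, attn


torch.manual_seed(0)
Q = torch.randn(1, 3, 4)  # (batch, query_len, d_k), hypothetical sizes
K = torch.randn(1, 3, 4)  # (batch, key_len, d_k)
V = torch.randn(1, 3, 4)  # (batch, key_len, d_v)

# Hypothetical padding mask: the last key position is hidden for every query.
attn_mask = torch.tensor([[[False, False, True]] * 3])  # (batch, query_len, key_len)

context, attn = masked_attention(Q, K, V, attn_mask)
```

The sketch uses the out-of-place `masked_fill` rather than `masked_fill_`, so the unmasked scores are not overwritten; the result is the same. The mask must be a boolean tensor that broadcasts against `scores`.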