Transformer architectures have been successfully used in learning source code representations. The fusion between a graph representation likeSyntax Tree (AST) and a source code sequence makes the use of current approaches computationally intractable for large input sequence lengths. Source code can have ...