I tried

```python
processed = tokenizer("this <test1> that <test2> this", return_special_tokens_mask=True)
```

but it didn't recognize `<test1>` and `<test2>` as special tokens. The workaround is to build the mask yourself after tokenizing:

```python
processed = tokenizer("this <test1> that <test2> this")
processed['special_tokens_mask'] = tokenizer.get_special_tokens_mask(
    processed['input_ids'], already_has_special_tokens=True
)
```
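
For context, here is a minimal end-to-end sketch of the workaround. It assumes a BERT tokenizer (the checkpoint name is just an example) and that the markers were registered via `additional_special_tokens`, which puts them in `tokenizer.all_special_ids`; with `already_has_special_tokens=True`, `get_special_tokens_mask` flags exactly those ids:

```python
from transformers import AutoTokenizer

# Assumption: any PreTrainedTokenizer works the same way; BERT is used here
# only as a concrete example.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# Register the custom markers so the tokenizer keeps them as single tokens
# and includes them in all_special_ids.
tokenizer.add_special_tokens({"additional_special_tokens": ["<test1>", "<test2>"]})

processed = tokenizer("this <test1> that <test2> this")

# already_has_special_tokens=True makes get_special_tokens_mask check each id
# against tokenizer.all_special_ids, which now covers <test1> and <test2>.
processed["special_tokens_mask"] = tokenizer.get_special_tokens_mask(
    processed["input_ids"], already_has_special_tokens=True
)

print(processed["special_tokens_mask"])  # 1s at [CLS], <test1>, <test2>, [SEP]
```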