For details about the original model, see the papers BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding and HuggingFace's Transformers: State-of-the-art Natural Language Processing. Tokenization is performed with the BERT tokenizer (see the demo code for impl...
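As a rough illustration of what the BERT tokenizer does under the hood, the sketch below implements greedy longest-match-first WordPiece tokenization over a toy vocabulary. The vocabulary and word here are invented for illustration; a real BERT tokenizer uses a learned vocabulary of roughly 30,000 pieces and adds special tokens such as [CLS] and [SEP].

```python
def wordpiece_tokenize(word, vocab):
    """Greedy longest-match-first WordPiece, the scheme used by the BERT tokenizer.

    Non-initial subwords carry the '##' continuation prefix; if no match is
    found at some position, the whole word maps to '[UNK]'.
    """
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, shrinking until a vocab hit.
        while start < end:
            piece = word[start:end]
            if start > 0:
                piece = "##" + piece  # mark word-internal pieces
            if piece in vocab:
                match = piece
                break
            end -= 1
        if match is None:
            return ["[UNK]"]
        tokens.append(match)
        start = end
    return tokens


# Toy vocabulary (hypothetical, for illustration only)
vocab = {"un", "##aff", "##able", "play", "##ing"}
print(wordpiece_tokenize("unaffable", vocab))  # ['un', '##aff', '##able']
print(wordpiece_tokenize("playing", vocab))    # ['play', '##ing']
```

In practice one would not reimplement this: `transformers.BertTokenizer.from_pretrained("bert-base-uncased")` loads the real vocabulary and handles lowercasing, punctuation splitting, and special tokens.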