BERT is built on the Transformer, an attention-based architecture that learns the semantic roles of words (or sub-words) in a text. The Transformer's attention mechanism is the core component of BERT: it helps extract the semantic meaning of a term in a sentence, which is frequently tied...
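To make this concrete, below is a minimal NumPy sketch of scaled dot-product attention, the operation at the heart of each Transformer layer. The function name, toy shapes, and random inputs are illustrative assumptions, not BERT's actual implementation; in BERT, the query, key, and value matrices come from learned linear projections of the token embeddings, and many such heads run in parallel.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention (illustrative sketch).

    Q, K, V: arrays of shape (seq_len, d_k) holding the query, key,
    and value projections of the token embeddings.
    """
    d_k = Q.shape[-1]
    # Similarity of every token's query with every token's key,
    # scaled by sqrt(d_k) to keep the scores well-behaved.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax over keys: each row says how strongly one token
    # attends to every other token in the sentence.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    # Each token's output is a context-weighted mix of value vectors,
    # which is how a word's meaning becomes tied to its context.
    return weights @ V

# Toy example: 4 tokens, 8-dimensional projections (hypothetical sizes).
rng = np.random.default_rng(0)
Q = rng.normal(size=(4, 8))
K = rng.normal(size=(4, 8))
V = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8): one context-aware vector per token
```

Because every output row mixes information from all positions, each token's representation reflects the whole sentence, which is the property the surrounding text attributes to BERT's attention mechanism.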