The second cell says to restart the runtime after running it. After restarting the kernel (or even if I don't), I get this error: ModuleNotFoundError: No module named 'transformers.tokenization_bert'. It comes from the first import in the third cell, from nemo.collections import...
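One way to see which module layout the installed transformers package uses is to try both import paths. The sketch below is only diagnostic. It assumes the cause is a newer transformers release, where the BERT tokenizer module lives at transformers.models.bert.tokenization_bert instead of transformers.tokenization_bert. The helper name first_importable is made up for illustration.

```python
import importlib


def first_importable(candidates):
    """Return the first module in `candidates` that imports cleanly, else None.

    `first_importable` is a hypothetical helper, not part of transformers or NeMo.
    """
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # ModuleNotFoundError is a subclass of ImportError, so a missing
            # package or a missing submodule both end up here.
            continue
    return None


if __name__ == "__main__":
    # Old (pre-4.x) path first, then the path used by transformers 4.x.
    module = first_importable([
        "transformers.tokenization_bert",
        "transformers.models.bert.tokenization_bert",
    ])
    if module is None:
        print("Neither path could be imported; is transformers installed?")
    else:
        print("BERT tokenizer module found at:", module.__name__)
```

If only the second path is found, the installed transformers is newer than the code doing the import expects. This check does not change NeMo's own import, so the likely remedy is installing a transformers version that matches the NeMo release in use.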
Maybe this can help someone using TensorFlow 2 and bert-for-tf2. There was a small change in how to create an instance of FullTokenizer: from bert import bert_tokenization ... tokenizer = bert_tokenization.FullTokenizer(vocab_file="resources/model/bert/bert_vocab.txt", do_lower_case=True) ...
Exception: The tokenized stories directory F:\Data Science\NLP\BERT\PreSumm\merged_stories_tokenized contains 0 files, but it should contain the same number as F:\Data Science\NLP\BERT\PreSumm\raw_stories (which has 2 files). Was there an error during tokenization?
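To narrow this down, it can help to reproduce the file-count comparison outside the preprocessing script and confirm whether the tokenizer wrote anything at all. This is a minimal sketch, not the PreSumm code itself; count_files and check_tokenized are hypothetical names, and the paths in the usage part are taken from the error message above.

```python
from pathlib import Path


def count_files(directory):
    """Count regular files directly inside `directory` (not recursive)."""
    return sum(1 for entry in Path(directory).iterdir() if entry.is_file())


def check_tokenized(raw_dir, tokenized_dir):
    """Return (counts_match, num_raw, num_tokenized) for the two directories."""
    num_raw = count_files(raw_dir)
    num_tokenized = count_files(tokenized_dir)
    return num_raw == num_tokenized, num_raw, num_tokenized


if __name__ == "__main__":
    # Paths copied from the exception text; adjust them to your setup.
    raw = r"F:\Data Science\NLP\BERT\PreSumm\raw_stories"
    tokenized = r"F:\Data Science\NLP\BERT\PreSumm\merged_stories_tokenized"
    ok, num_raw, num_tokenized = check_tokenized(raw, tokenized)
    print(f"raw={num_raw} tokenized={num_tokenized} match={ok}")
```

A tokenized count of 0 means the tokenization step produced no output, so its own console output is the next place to look for the underlying error.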