The Python packages spaCy and GiNZA were used to remove Japanese stop words and perform tokenization. The tokenized words were then rejoined with white-space characters in their original order. The Python package scikit-learn was used to convert the white-space-joined texts into unigram tokens and calculate ...