Description:
ad - The number of documents in corpus
cw - The number of words in all documents
unq - The number of unique words collected in corpus

Vocab
\data\
ad=1
cw=23832
unq=9390

\words:
33 а 244 | 1 | 0.010238 | 0.000000 | -3.581616
34 б 11 | 1 | 0.000462 | 0.000000 | -...
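As a rough illustration of how the sample listing above could be read, here is a minimal Python sketch. It assumes that header entries have the form `key=value`, that each word line starts with a numeric id and the word itself, and that the remaining values are separated by `|`. The function names `parse_header` and `parse_word_line` are hypothetical and are not part of the tool described here.

```python
def parse_header(lines):
    """Turn header entries such as 'ad=1' into a dict of integers.

    Assumption: every header entry has the form key=value.
    """
    return {key: int(value) for key, value in (ln.split("=", 1) for ln in lines)}


def parse_word_line(line):
    """Split one word line into (id, word, numeric fields).

    Assumption: the line starts with a numeric id and the word,
    followed by values separated by '|'.
    """
    head, *rest = [part.strip() for part in line.split("|")]
    word_id, word, first_value = head.split()
    fields = [float(first_value)] + [float(x) for x in rest]
    return int(word_id), word, fields


# Values copied from the sample listing.
header = parse_header(["ad=1", "cw=23832", "unq=9390"])
word_id, word, fields = parse_word_line(
    "33 а 244 | 1 | 0.010238 | 0.000000 | -3.581616"
)
print(header)
print(word_id, word, fields)
```

The numeric fields are kept as a plain list here because only part of the column description is available.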
Description:
oc - Occurrence in corpus
dc - Occurrence in documents
tf - Term frequency: the ratio of a word's occurrences to the total number of words in a document. Thus, the importance of a word is evaluated within a single document. The calculation formula is: [tf = oc / cw]
idf - ...
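The formula [tf = oc / cw] can be checked against the sample vocab values (cw=23832, with oc=244 and oc=11 for the two listed words). The sketch below is only an illustration of that formula; the function name `term_frequency` is hypothetical.

```python
def term_frequency(oc: int, cw: int) -> float:
    """Compute tf = oc / cw, following the formula in the description."""
    if cw <= 0:
        raise ValueError("cw must be positive")
    return oc / cw


# Values copied from the sample vocab listing.
cw = 23832
tf_first = term_frequency(244, cw)
tf_second = term_frequency(11, cw)

print(f"{tf_first:.6f}")   # 0.010238
print(f"{tf_second:.6f}")  # 0.000462
```

Both results match the third column of the sample word lines when rounded to six decimal places.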