Singaporean English (STB), and Hindi-Indian English code-switching data (Hi-En-CS). Three resources are in French (Frb, xUGC, FSMB); one includes CS data in French and transliterated dialectal North-African Arabic (NBZ)
The similarity will lie in the [-1, 1] range, with 1 meaning an exact match. Pros: Computationally cheap. Only unimodal embeddings are required. Unimodal encoding is faster than joint encoding. Suitable for retrieval in large collections. Cons: