A main factor could be that the use of social media data, potentially one of the main sources for extracting Cantonese text in natural contexts and building benchmarks for NLP tasks, faces substantial legal obstacles. The use of those data must comply with the requirements of the ...
Apart from the well-established challenges that language use poses (e.g., ambiguity, sarcasm, dialects, slang, neologisms), two factors in the event add further linguistic complexity, namely actor role and associated context. In contrast to tasks where adequate information is provided in ...