Tasks: Question Answering, Visual Question Answering (VQA), Optical Character Recognition (OCR). Similar Datasets: TallyQA, InfographicVQA, TextCaps, DocVQA. Usage...
Many experiments demonstrate that, without pretraining, our proposed method achieves better performance than the standard transformer and outperforms some state-of-the-art methods on the VQA-v2 dataset. doi:10.1007/s10489-023-04564-x Haiying Xia...
a short answer to the question (one or a few words). The illustration below shows two different triplets from the VQA dataset that share the same image. Models need to learn rich multimodal representations to give the right answers. The VQA task is still...