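For the NarrativeQA dataset, the original paper proposes a strong baseline model, the Multi-Hop Pointer-Generator Model (MHPGM), which can reason over the context, gathering and synthesizing disjoint pieces of information to generate an answer. Specifically, MHPGM uses multiple attention mechanisms to perform multi-hop reasoning and a pointer-generator decoder to synthesize the answer, so that it can effectively read and reason within long passages and produce an answer consistent with the question. In addition...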
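The pointer-generator decoding step mentioned above can be illustrated with a minimal sketch. This shows only the generic pointer-generator mixture (a generation distribution blended with a copy distribution derived from attention over source tokens); it is not MHPGM's actual implementation. The function name `pointer_generator_mix`, the toy probabilities, and the extended-vocabulary layout are all assumptions made for illustration.

```python
def pointer_generator_mix(p_gen, vocab_probs, attention, source_ids, extended_size):
    """Blend a generation distribution with a copy distribution.

    p_gen         -- probability of generating from the fixed vocabulary (0..1)
    vocab_probs   -- generation probabilities over the fixed vocabulary
    attention     -- attention weights over source positions (sums to 1)
    source_ids    -- extended-vocabulary id of the token at each source position
    extended_size -- fixed vocabulary size plus the number of source-only tokens
    """
    if len(attention) != len(source_ids):
        raise ValueError("attention and source_ids must have the same length")

    final = [0.0] * extended_size
    # Generation part: scale the vocabulary distribution by p_gen.
    for token_id, prob in enumerate(vocab_probs):
        final[token_id] = p_gen * prob
    # Copy part: add the attention mass, scaled by (1 - p_gen), to each source token.
    for weight, token_id in zip(attention, source_ids):
        final[token_id] += (1.0 - p_gen) * weight
    return final


# Toy example (hypothetical numbers): a 3-word vocabulary and a 2-token source
# whose second token (id 3) is out of vocabulary and can only be copied.
vocab_probs = [0.5, 0.3, 0.2]
attention = [0.6, 0.4]
source_ids = [1, 3]
final = pointer_generator_mix(0.7, vocab_probs, attention, source_ids, 4)
```

In this toy setup, the out-of-vocabulary source token still receives probability mass through the copy path, which is what lets such a decoder reuse words from a long passage when composing an answer.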
However, by simply evaluating the correctness of the answers, it is unclear to what extent these systems have learned the ability to perform multi-hop reasoning. In this paper, we propose an additional sub-question evaluation for the multi-hop QA dataset HotpotQA, in order to shed some light...
Our dataset is created by utilizing three existing multi-hop datasets: HotpotQA, 2WikiMultihopQA, and MuSiQue. Instead of relying solely on factual reasoning, we enhance the existing multi-hop questions by adding another layer of questioning that involves one, two, or all three of the ...
Table 1: Comparison of VidQA benchmarks. Our proposed benchmark focuses on assessing multi-hop reasoning and grounding abilities within long-form egocentric videos. Columns: Dataset | Annotation | Avg. Duration (s) | Ego? | Time Labels | Multi-Spans. Conventional VidQA Benchmarks: MovieQA | Manual...
Latest large language model (LLM) paper abstracts | Towards Robust Temporal Reasoning of Large Language Models via a Multi-Hop QA Dataset and Pseudo-Instruction Tuning. Authors: Qingyu Tan, Hwee Tou Ng, Lidong Bing. Knowledge in the real world is being updated constantly. However, it is costly to frequently update large...
machine-reading-comprehension, multi-hop-reasoning, multi-hop-dataset. Updated Feb 22, 2023. Python. Code for KU Leuven LIIR lab's submission to the TextGraphs-14 shared task on Multi-Hop Inference for Explanation Regeneration. natural-language-processing, deep-learning, transformers, pytorch, multi-hop-reasoning, explanation-...
However, building a dataset that contains complex questions with sub-questions and their corresponding documents requires costly human annotation. To address this issue, we propose a new method for weakly supervised multi-hop retriever pre-training that requires no human effort. Our method includes 1) a pre-...
The HotpotQA dataset's questions are designed to require multi-hop reasoning, i.e., searching for the answer across different sources. The queries are also highly diverse and not limited to any pre-existing knowledge bases or schema, while the provided sentence-level supporting facts enable QA systems to ...
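As a rough illustration of how sentence-level supporting facts can be used, the sketch below scores a set of predicted supporting sentences against the gold ones with precision, recall, and F1. It assumes each supporting fact is represented as a (paragraph title, sentence index) pair; the function name `supporting_fact_f1` and the example data are hypothetical, and this is not the official evaluation script.

```python
def supporting_fact_f1(predicted, gold):
    """Return (precision, recall, F1) over sets of (title, sentence_index) pairs."""
    pred_set = {tuple(p) for p in predicted}
    gold_set = {tuple(g) for g in gold}

    true_positives = len(pred_set & gold_set)
    precision = true_positives / len(pred_set) if pred_set else 0.0
    recall = true_positives / len(gold_set) if gold_set else 0.0
    if true_positives == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical two-hop question whose evidence sits in two different paragraphs.
gold = [("Paragraph A", 0), ("Paragraph B", 2)]
pred = [("Paragraph A", 0), ("Paragraph B", 1), ("Paragraph B", 2)]
precision, recall, f1 = supporting_fact_f1(pred, gold)
```

A score like this rewards a system for citing evidence from every source the question depends on, rather than only for producing the right answer string.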
[ISWC 2021] Graphhopper: Multi-hop Scene Graph Reasoning for Visual Question Answering. [ACL 2021] In Factuality: Efficient Integration of Relevant Facts for Visual Question Answering. [KDD 2021] Select, Substitute, Search: A New Benchmark for Knowledge-Augmented Visual Question Answering. ...
We introduce VIMQA, a new Vietnamese dataset with over 10,000 Wikipedia-based multi-hop question-answer pairs. The dataset is human-generated and has four main features: (1) The questions require advanced reasoning over multiple paragraphs. (2) Sentence-level supporting facts are provided, ...