Question-Answer Generation for Data Augmentation.

However, data augmentation in natural language processing is much less studied. Here, we describe two methods for data augmentation for Visual Question Answering (VQA). ... The first uses existing semantic annotations to generate new questions. The second method is a generative approach using recurrent neural networks. ... Data augmentation is generating new training data from existing examples. In this paper, we explore two data augmentation methods for generating new question-answer (QA) pairs for images. ...

doi:10.18653/v1/w17-3529 dblp:conf/inlg/KafleYK17 fatcat:qjatdchlcfg4bfd355jgmorltq

However, in low-resource settings, the amount of seed data samples to use for data augmentation is very small, which makes generated samples suboptimal and less diverse. ... existing LLM-powered data augmentation baselines. ... : {question 3} Answer Options: {answer options 3} Answer: {answer 3} Question: {question} I want you to act as a question and answer generator. ...

arXiv:2402.13482v1 fatcat:fpoym6cwerdgbj2v52o46gswpi

This paper introduces a generative model for data augmentation by leveraging the correlations among multiple modalities. ... Experiments on Visual Question Answering as downstream task demonstrate the effectiveness of the proposed generative model, which is able to improve strong UpDn-based models to achieve state-of-the-art ... tasks for cross-modal data augmentation. ...

arXiv:2105.04780v2 fatcat:jhstbw4wbzgq7dwaca62xyrphq

In this paper, we propose a novel data augmentation method, referred to as Controllable Rewriting based Question Data Augmentation (CRQDA), for machine reading comprehension (MRC), question generation, ... We treat the question data augmentation task as a constrained question rewriting problem to generate context-relevant, high-quality, and diverse question data samples. ... Conclusion: In this work, we present a novel question data augmentation method, called CRQDA, for context-relevant answerable and unanswerable question generation. ...

arXiv:2010.01475v1 fatcat:5ncgxm5okzbmdbl4slqggrrsum

With the augmented dataset, we design a contrastive training objective for learning to rank question answer pairs. ... In this work, we propose a novel and easy-to-apply data augmentation strategy, namely Bilateral Generation (BiG), with a contrastive training objective for improving the performance of ranking question ... ., 2020) as the conditional generation model for data augmentation. ...

arXiv:2106.11096v2 fatcat:3j5icauh4bhmzhhyxp2dlnxqcm

Finally, we connect between robustness and generalization, demonstrating the predictive power of RAD for performance on unseen augmentations. ... Our proposed augmentations are designed to make a focused intervention on a specific property of the question such that the answer changes. ... In our augmentations, we generate "yes/no" questions from "number" and "other" questions. For example, consider the question-answer pair "What color is the vehicle? ...

arXiv:2106.04484v2 fatcat:astp5refr5fdrk7yxtlnvmiwne
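The snippet above describes turning "number" and "other" questions into "yes/no" questions. As a rough illustration only, here is a minimal template-based sketch of that idea. It is not the paper's implementation: the function name `to_yes_no`, the regular expression, and the example answer "red" are all hypothetical, since the snippet's own example is truncated.

```python
import re
from typing import Optional, Tuple

# Hypothetical pattern: only handles questions shaped like "What color is <subject>?".
COLOR_QUESTION = re.compile(r"^what color is (?P<subject>.+?)\?$", re.IGNORECASE)


def to_yes_no(question: str, answer: str, candidate: str) -> Optional[Tuple[str, str]]:
    """Rewrite a color question into a yes/no question about `candidate`.

    Returns (new_question, new_answer), or None when the question does not
    match the single template this sketch supports.
    """
    match = COLOR_QUESTION.match(question.strip())
    if match is None:
        return None
    subject = match.group("subject")
    new_question = f"Is {subject} {candidate}?"
    # The new answer is "yes" only when the candidate equals the original answer.
    new_answer = "yes" if candidate.lower() == answer.strip().lower() else "no"
    return new_question, new_answer


# Illustrative usage with an assumed answer ("red" is not from the source).
positive = to_yes_no("What color is the vehicle?", "red", "red")
negative = to_yes_no("What color is the vehicle?", "red", "blue")
unsupported = to_yes_no("How many vehicles are there?", "2", "red")
```

Pairing each original question with both a matching and a non-matching candidate yields one "yes" and one "no" example, so the answer changes with a focused edit to the question.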

This work serves as the first analysis of LLMs as synthetic data augmenters for QA systems, highlighting the unique opportunities and challenges. ... Additionally, we release augmented versions of low-resource datasets that will allow the research community to create further benchmarks for evaluation of generated datasets. ... Our approach begins by generating supplementary contexts, questions, and answers to augment training sets. ...

arXiv:2309.12426v1 fatcat:pvn72q6otbdhvm62fpcc46lqme

In this work, we propose a data augmentation technique by automatically generating relevant unanswerable questions according to an answerable question paired with its corresponding paragraph that contains ... We also present a way to construct training data for our question generation models by leveraging the existing reading comprehension dataset. ... Acknowledgments We thank anonymous reviewers for their helpful comments. Qin and Liu were supported by National Natural Science Foundation of China (NSFC) via grants 61632011 and 61772156. ...

doi:10.18653/v1/p19-1415 dblp:conf/acl/ZhuDWWQL19 fatcat:biioiptm65bzznmmcbxqaqlqi4

Question Answering (QA) is key to enabling robust communication between humans and machines. ... Modern language models used for QA have surpassed human performance in several essential tasks; however, these models require large amounts of human-generated training data which are costly and time-consuming ... Before this paper, (i) generating training data for SQuAD question-answering and (ii) using unsupervised methods [instead of supervised methods] to generate training data directly on question-answering ...

arXiv:2010.01611v2 fatcat:6hdetreda5egharx3kfo7ok7ja

Additionally, for both passage retrieval and answer generation, we augmented the training data provided by the task organizers with automatically generated question-answer pairs created from Wikipedia ... We devised several approaches combining different model variants for three main components: Data Augmentation, Passage Retrieval, and Answer Generation. ... Due to computational limitations, in our data augmented setting for generation model fine-tuning, we use 2K question-answer pairs with positive/negative passages for each language for our final results ...

arXiv:2205.14981v1 fatcat:72s2goyk2zhzvii7knfvicwkku

We leverage Large Language Models (LLMs), which have been shown to have strong reasoning ability, as an automatic data annotator that generates question-answer annotations for chart images. ... We hope our work underscores the potential of synthetic data and encourages further exploration of data augmentation using LLMs for reasoning-heavy tasks. ... Prompts: Tab. 8 shows the prompts given to the LLM-based data generator for controllably generating questions and answers. D. ...

arXiv:2403.16385v2 fatcat:3noigtasabbohmunjwplkwfqyy

METHODOLOGY: The data collection consists of 100,000 human-generated question-answer pairs with 50,000 unanswerable questions. ... INTRODUCTION: Question generation (QG) and question answering (QA) are challenging machine reading comprehension tasks. ... The robustness of U-Net was evaluated using the modified dataset containing augmented data. ...

doi:10.5281/zenodo.7146028 fatcat:neakwbluxvfolca7tcnihxfnwu

Question generation (QG), a method for augmenting QA datasets, can be a solution for such performance degradation if QG can properly debias QA datasets. ... Question answering (QA) models for reading comprehension have been demonstrated to exploit unintended dataset biases such as question-context lexical overlap. ... Acknowledgements We would like to thank the anonymous reviewers for their detailed and valuable comments. ...

arXiv:2109.11256v1 fatcat:jatzah33qjbkxjqprpud6jctb4

In this paper, instead of directly manipulating images and questions, we use generated adversarial examples for both images and questions as the augmented data. ... On the other hand, data augmentation, as one of the major tricks for DNNs, has been widely used in many computer vision tasks. ... Although there are a few works studying the data augmentation problem for VQA [18, 35, 33, 1], they merely generate either new questions or images. ...

arXiv:2007.09592v1 fatcat:y7wzntyz6rbhtcvugijeaz445i

I have used augmentation techniques to increase the amount of data and VGG19 to extract features from a picture and make predictions. VQGR is implemented using TensorFlow. ... In addition, systems able to understand clinical pictures and answer questions about their content can assist objective decision making and objective education. ... Technique of Data Augmentation: Data augmentation is a method used to increase the size of data by adding slightly modified copies of already available data or recently generated ...

dblp:conf/clef/Chebbi21 fatcat:hbspwbgbqvahhkcnlglr2xzttq

Data Augmentation for Visual Question Answering

Retrieval-Augmented Data Augmentation for Low-Resource Domain Tasks [article]

Cross-Modal Generative Augmentation for Visual Question Answering [article]

Tell Me How to Ask Again: Question Data Augmentation with Controllable Rewriting in Continuous Space [article]

Learning to Rank Question Answer Pairs with Bilateral Contrastive Data Augmentation [article]

Are VQA Systems RAD? Measuring Robustness to Augmented Data with Focused Interventions [article]

Can LLMs Augment Low-Resource Reading Comprehension Datasets? Opportunities and Challenges [article]

Learning to Ask Unanswerable Questions for Machine Reading Comprehension

When in Doubt, Ask: Generating Answerable and Unanswerable Questions, Unsupervised [article]

ZusammenQA: Data Augmentation with Specialized Models for Cross-lingual Open-retrieval Question Answering System [article]

Synthesize Step-by-Step: Tools, Templates and LLMs as Data Generators for Reasoning-Based Chart VQA [article]

A Framework for Evaluating MRC Approaches with Unanswerable Questions

Can Question Generation Debias Question Answering Models? A Case Study on Question-Context Lexical Overlap [article]

Semantic Equivalent Adversarial Data Augmentation for Visual Question Answering [article]

Chabbiimen at VQA-Med 2021: Visual Generation of Relevant Natural Language Questions from Radiology Images for Anomaly Detection
