GDA: Generative Data Augmentation Techniques for Relation Extraction Tasks

Hu, Xuming; Liu, Aiwei; Tan, Zeqi; Zhang, Xin; Zhang, Chenwei; King, Irwin; Yu, Philip S.

arXiv:2305.16663 (cs)

[Submitted on 26 May 2023 (v1), last revised 15 Jun 2023 (this version, v2)]

Abstract: Relation extraction (RE) tasks show promising performance in extracting relations between two entities mentioned in sentences, given sufficient annotations during training. Such annotations are labor-intensive to obtain in practice. Existing work adopts data augmentation techniques to generate pseudo-annotated sentences beyond the limited annotations. These techniques neither preserve the semantic consistency of the original sentences when rule-based augmentations are adopted, nor preserve the syntax structure of sentences when expressing relations using seq2seq models, resulting in less diverse augmentations. In this work, we propose a dedicated augmentation technique for relational texts, named GDA, which uses two complementary modules to preserve both semantic consistency and syntax structures. We adopt a generative formulation and design a multi-tasking solution to achieve synergies. Furthermore, GDA adopts entity hints as the prior knowledge of the generative model to augment diverse sentences. Experimental results on three datasets under a low-resource setting show that GDA can bring a 2.0% F1 improvement compared with using no augmentation technique. Source code and data are available.
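
For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score for relation extraction is often computed. It is not taken from the paper: the abstract does not say which F1 variant is reported, so this assumes micro-averaged F1 that ignores a negative class. The function name `micro_f1` and the label `no_relation` are hypothetical choices for illustration.

```python
from typing import Sequence


def micro_f1(
    gold: Sequence[str],
    pred: Sequence[str],
    negative_label: str = "no_relation",  # assumed name for the "no relation" class
) -> float:
    """Micro-averaged F1 over relation labels, ignoring the negative class."""
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    # A prediction counts as correct only if it matches the gold label
    # and that label is an actual relation (not the negative class).
    correct = sum(1 for g, p in zip(gold, pred) if g == p and g != negative_label)
    predicted = sum(1 for p in pred if p != negative_label)  # relations the model proposed
    actual = sum(1 for g in gold if g != negative_label)     # relations in the annotations

    if correct == 0 or predicted == 0 or actual == 0:
        return 0.0

    precision = correct / predicted
    recall = correct / actual
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Toy labels, invented for this example only.
    gold = ["born_in", "no_relation", "works_for", "born_in"]
    pred = ["born_in", "works_for", "works_for", "no_relation"]
    print(f"micro-F1 = {micro_f1(gold, pred):.3f}")
```

Under this convention, an improvement in F1 from augmentation would mean that a model trained on the original plus generated sentences scores higher on held-out data than a model trained on the original sentences alone.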

Comments:	Accepted to ACL 2023 (Findings), Long Paper, 12 pages
Subjects:	Computation and Language (cs.CL)
MSC classes:	68T01
ACM classes:	I.2.7
Cite as:	arXiv:2305.16663 [cs.CL]
	(or arXiv:2305.16663v2 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2305.16663
Journal reference:	ACL 2023

Submission history

From: Aiwei Liu
[v1] Fri, 26 May 2023 06:21:01 UTC (1,529 KB)
[v2] Thu, 15 Jun 2023 02:43:12 UTC (1,528 KB)
