A Closer Look at Data Bias in Neural Extractive Summarization Models

Zhong, Ming; Wang, Danqing; Liu, Pengfei; Qiu, Xipeng; Huang, Xuanjing

arXiv:1909.13705 (cs)

[Submitted on 30 Sep 2019]

Abstract: In this paper, we take stock of the current state of summarization datasets and explore how different factors of datasets influence the generalization behaviour of neural extractive summarization models. Specifically, we first propose several properties of datasets, which matter for the generalization of summarization models. Then we build the connection between priors residing in datasets and model designs, analyzing how different properties of datasets influence the choices of model structure design and training methods. Finally, by taking a typical dataset as an example, we rethink the process of the model design based on the experience of the above analysis. We demonstrate that when we have a deep understanding of the characteristics of datasets, a simple approach can bring significant improvements to the existing state-of-the-art model.

Comments:	EMNLP 2019 Workshop on New Frontiers in Summarization
Subjects:	Computation and Language (cs.CL)
Cite as:	arXiv:1909.13705 [cs.CL]
	(or arXiv:1909.13705v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.1909.13705

Submission history

From: Pengfei Liu
[v1] Mon, 30 Sep 2019 13:55:10 UTC (358 KB)
