SummPip

This code accompanies the SIGIR 2020 paper "SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression".

Python version: this code is written for Python 3.6.

Dataset

Source data, which has had minimal text pre-processing

Target data (for evaluation)

Test SummPip

Step 1: Place the downloaded dataset in the folder ./dataset/multi_news/.

Step 2: Download the pre-trained word2vec model and place it in the folder ./word_vec/multi_news.

If you want to run SummPip on your own dataset, you first need to pre-train a word2vec (W2V) model on that dataset with gensim.
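As a rough illustration of pre-training a W2V model with gensim, the sketch below tokenizes raw documents and trains a model on them. The helper names (to_token_lists, train_word2vec), the regex tokenizer, the hyperparameters, and the output path are all assumptions made for this example. The file format that SummPip expects when loading the model is not stated here, so check the loading code before relying on model.save.

```python
import re


def to_token_lists(documents):
    """Split each document into sentences, then into lowercase word tokens.

    This is a deliberately simple regex tokenizer (an assumption for this
    sketch); swap in your own pre-processing if your data needs it.
    """
    token_lists = []
    for doc in documents:
        # Break on whitespace that follows sentence-ending punctuation.
        for sentence in re.split(r"(?<=[.!?])\s+", doc.strip()):
            tokens = re.findall(r"[a-z0-9]+", sentence.lower())
            if tokens:
                token_lists.append(tokens)
    return token_lists


def train_word2vec(token_lists, out_path):
    """Train a gensim Word2Vec model on tokenized sentences and save it.

    gensim is imported lazily so the tokenizer above works without it.
    Only arguments shared by gensim 3.x and 4.x are passed; the vector
    size is left at the library default.
    """
    from gensim.models import Word2Vec  # third-party: pip install gensim

    model = Word2Vec(token_lists, window=5, min_count=1, workers=1)
    model.save(out_path)  # out_path is a hypothetical location
    return model


# Example usage (requires gensim; the path is a placeholder):
# docs = ["First article text. It has two sentences.", "Second article."]
# train_word2vec(to_token_lists(docs), "./word_vec/my_dataset/w2v.model")
```

On a real corpus you would typically raise min_count so that very rare words are dropped from the vocabulary.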

Step 3: Unsupervised Extractive Summarisation

python run_main.py

When applying SummPip to your own dataset, you may want to change -nb_clusters and -nb_words to control the length of the output summary.
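One way to compare several length settings is to generate the command line for each combination. The sketch below is only an illustration: build_command is a hypothetical helper, the values are arbitrary placeholders rather than recommended settings, and it assumes both flags take integer values.

```python
def build_command(nb_clusters, nb_words, script="run_main.py"):
    """Return the argument list for one SummPip run.

    Assumes -nb_clusters and -nb_words each take a single integer value.
    """
    return [
        "python", script,
        "-nb_clusters", str(nb_clusters),
        "-nb_words", str(nb_words),
    ]


# Arbitrary example settings; tune these for your own dataset.
for nb_clusters, nb_words in [(5, 10), (10, 20)]:
    cmd = build_command(nb_clusters, nb_words)
    print(" ".join(cmd))
    # To actually launch the run, uncomment the next two lines:
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Printing the commands first makes it easy to check the flag values before starting a long run.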

Citation

Please cite our paper if you use this code in production or publications:

@inproceedings{zhao2020summpip,
  title={SummPip: Unsupervised Multi-Document Summarization with Sentence Graph Compression},
  author={Zhao, Jinming and Liu, Ming and Gao, Longxiang and Jin, Yuan and Du, Lan and Zhao, He and Zhang, He and Haffari, Gholamreza},
  booktitle={Proceedings of the 43rd International ACM SIGIR Conference on Research and Development in Information Retrieval},
  pages={1949--1952},
  year={2020}
}
