This repository collects the controllable text summarization (CTS) papers covered in our survey, "Controllable Text Summarization: Unraveling Challenges, Approaches, and Prospects -- A Survey".
You can cite our paper as follows:
@misc{urlana2023controllable,
title={Controllable Text Summarization: Unraveling Challenges, Approaches, and Prospects -- A Survey},
author={Ashok Urlana and Pruthwik Mishra and Tathagato Roy and Rahul Mishra},
year={2023},
eprint={2311.09212},
archivePrefix={arXiv},
primaryClass={cs.CL}
}
We group the papers by the aspect they control: Length, Coverage, Style, Abstractivity, Salience, Entity, Topic, Role, Diversity, and Structure.

Length

Paper | Datasets Used |
---|---|
MACSUM: Controllable Summarization with Mixed Attributes TACL-2023 code data | CNN Daily Mail, QMSum |
Abstractive Document Summarization with Summary-length Prediction EACL-2023 | CNNDM, NYT, WikiHow |
Length Control in Abstractive Summarization by Pretraining Information Selection ACL-2022 code | CNN-DailyMail, XSUM |
Generating Multiple-Length Summaries via Reinforcement Learning for Unsupervised Sentence Summarization EMNLP-2022 code | DUC2004 |
A Character-Level Length-Control Algorithm for Non-Autoregressive Sentence Summarization NeurIPS-2022 code | Gigaword, DUC2004 |
CTRLSUM: Towards Generic Controllable Text Summarization EMNLP-2022 code | CNNDM, arXiv, BIGPATENT |
A New Approach to Overgenerating and Scoring Abstractive Summaries NAACL-2021 code data | Gigaword, Newsroom |
Controllable Summarization with Constrained Markov Decision Process TACL-2021 code | CNNDM, Newsroom, DUC-2002 |
Lenatten: An effective length controlling unit for text summarization ACL-2021 code | CNNDM |
Interpretable multi headed attention for abstractive summarization at controllable lengths COLING-2020 | MSR Narratives and Thinking-Machines |
Positional Encoding to Control Output Sequence Length NAACL-2019 code | JAMUS corpus (Japanese), with summaries of varying character lengths |
Global Optimization under Length Constraint for Neural Text Summarization ACL-2019 | CNNDM, Mainichi |
A Large-Scale Multi-Length Headline Corpus for Analyzing Length-Constrained Headline Generation Model Evaluation INLG-2019 data | JAMUS corpus (Japanese), with summaries of varying character lengths |
Controllable Abstractive Summarization ACL-NMT(W)-2018 | CNN-DailyMail |
Unsupervised Sentence Compression using Denoising Auto-Encoders CoNLL-2018 code | Gigaword |
Controlling Length in Abstractive Summarization Using a Convolutional Neural Network EMNLP-2018 code | CNNDM, DMQA |
Controlling Output Length in Neural Encoder-Decoders EMNLP-2016 code | DUC2004, Gigaword |
A Neural Attention Model for Abstractive Sentence Summarization EMNLP-2015 | NYT, DUC2004 |

Coverage

Paper | Datasets Used |
---|---|
MACSUM: Controllable Summarization with Mixed Attributes TACL-2023 code data | CNN Daily Mail, QMSum |
SWING: Balancing Coverage and Faithfulness for Dialogue Summarization EACL-2023 code | DialogSum, SAMSum |
Unsupervised Multi-Granularity Summarization EMNLP-2022 data | GranuDUC, MultiNews, DUC2004, Arxiv |
Multi-LexSum: Real-World Summaries of Civil Rights Lawsuits at Multiple Granularities NeurIPS-2022 code data | Multi-LexSum |
Controllable Abstractive Dialogue Summarization with Sketch Supervision ACL-IJCNLP-2021 code | SAMSum |
SemSUM: Semantic Dependency Guided Neural Abstractive Summarization AAAI-2020 data | Gigaword, DUC2004 and MSR abstractive summarization dataset |
Get to the point: Summarization with pointer generator networks ACL-2017 code | CNNDM |

Style

Paper | Datasets Used |
---|---|
Overview of the BioLaySumm 2023 Shared Task on Lay Summarization of Biomedical Research Articles ACL-BioNLP(W)-2023 | PLOS and eLife |
Generating Summaries with Controllable Readability Levels EMNLP-2023 code | CNNDM |
HYDRASUM: Disentangling Style Features in Text Summarization with Multi-Decoder Models EMNLP-2022 code | CNN Daily Mail, XSUM, Newsroom |
Readability Controllable Biomedical Document Summarization EMNLP-2022 data | TS and PLS |
Inference time style control for summarization NAACL-2021 code | CNNDM |
Hooks in the Headline: Learning to Generate Headlines with Controlled Styles ACL-2020 code | NYT, CNN |
Generating Formality-tuned Summaries Using Input-dependent Rewards CoNLL-2019 | CNN Daily Mail + Webis-TLDR-17 corpus |
Controllable Abstractive Summarization ACL-NMT(W)-2018 | CNN-DailyMail |

Abstractivity

Paper | Datasets Used |
---|---|
Controllable Summarization with Constrained Markov Decision Process TACL-2021 code | CNNDM, Newsroom, DUC-2002 |
Controlling the Amount of Verbatim Copying in Abstractive Summarization AAAI-2020 code | Gigaword, Newsroom |
Improving Abstraction in Text Summarization EMNLP-2018 | CNNDM |
Get to the point: Summarization with pointer generator networks ACL-2017 code | CNNDM |
SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents AAAI-2017 code | CNN/DM, DUC2002 |

Salience

Paper | Datasets Used |
---|---|
Incorporating Question Answering-Based Signals into Abstractive Summarization via Salient Span Selection EACL-2023 | CNNDM, XSUM, NYTimes |
SOCRATIC Pretraining: Question-Driven Pretraining for Controllable Summarization ACL-2023 code | QMSum and SQuALITY |
Guiding Generation for Abstractive Text Summarization based on Key Information Guide Network NAACL-HLT-2018 | CNNDM |
SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents AAAI-2017 code | CNN/DM, DUC2002 |

Entity

Paper | Datasets Used |
---|---|
SOCRATIC Pretraining: Question-Driven Pretraining for Controllable Summarization ACL-2023 code | QMSum and SQuALITY |
Extractive Entity-Centric Summarization as Sentence Selection using Bi-Encoders AACL-2022 | EntSum |
CTRLSUM: Towards Generic Controllable Text Summarization EMNLP-2022 code | CNNDM, arXiv, BIGPATENT |
ENTSUM: A Data Set for Entity-Centric Summarization ACL-2022 code data | CNNDM, NYT |
Controllable Summarization with Constrained Markov Decision Process TACL-2021 code | CNNDM, Newsroom, DUC-2002 |
Controllable Neural Dialogue Summarization with Personal Named Entity Planning EMNLP-2021 code | SAMSum |
Controllable Abstractive Sentence Summarization with Guiding Entities COLING-2020 code | Gigaword, DUC2004 |
Controllable Abstractive Summarization ACL-NMT(W)-2018 | CNN-DailyMail |

Topic

Paper | Datasets Used |
---|---|
MACSUM: Controllable Summarization with Mixed Attributes TACL-2023 code data | CNN Daily Mail, QMSum |
Topic-aware Multimodal Summarization AACL-2022 code data | MSMO |
NEWTS: A Corpus for News Topic-Focused Summarization ACL-2022 data | NEWTS |
ASPECTNEWS: Aspect-Oriented Summarization of News Documents ACL-2022 code data | ASPECTNEWS |
Aspect-controllable opinion summarization EMNLP-2021 code | SPACE, OPOSUM+ |
Decision-Focused Summarization EMNLP-2021 code data | Yelp's businesses, reviews, and user data |
CATS: Customizable Abstractive Topic-based Summarization ACM-2021 code | CNNDM, AMI, ICSI, ADSE |
WikiAsp: A Dataset for Multi-domain Aspect-based Summarization TACL-2021 code data | WikiAsp |
Summarizing Text on Any Aspects: A Knowledge-Informed Weakly-Supervised Approach EMNLP-2020 code | CNN-DailyMail, MA News, All the News |
OPINIONDIGEST: A Simple Framework for Opinion Summarization ACL-2020 code | Hotel, Yelp |
Read what you need: Controllable Aspect-based Opinion Summarization of Tourist Reviews SIGIR-2020 code data | Tourism Reviews |
Generating topic-oriented summaries using neural attention NAACL-HLT-2018 | CNNDM |
Vocabulary Tailored Summary Generation ACL-2018 | CNNDM |

Role

Paper | Datasets Used |
---|---|
Other Roles Matter! Enhancing Role-Oriented Dialogue Summarization via Role Interactions ACL-2022 code data | CSDS, MC |
Towards Modeling Role-Aware Centrality for Dialogue Summarization AACL-2022 data | CSDS, MC |
CSDS: A fine-grained Chinese dataset for customer service dialogue summarization EMNLP-2021 code data | CSDS |

Diversity

Paper | Datasets Used |
---|---|
A Well-Composed Text is Half Done! Composition Sampling for Diverse Conditional Generation ACL-2022 code | CNN/DailyMail, XSum, and SQuAD (question generation) |

Structure

Paper | Datasets Used |
---|---|
STRONG – Structure Controllable Legal Opinion Summary Generation IJCNLP-AACL-2023 | CanLII |
SentBS: Sentence-level beam search for controllable summarization EMNLP-2022 code | Meta Review Dataset (MReD) |
MReD: A Meta-Review Dataset for Structure-Controllable Text Generation ACL-2022 code data | MReD |
Planning with Learned Entity Prompts for Abstractive Summarization TACL-2021 | CNN/DailyMail, XSum, SAMSum, and BillSum |