Ensemble summarization models to leverage performance on CoronaNet

Date

2021-05

Abstract

The COVID-19 pandemic is the fastest-spreading and most devastating event in recent history. The CoronaNet Research Project produced a COVID-19 dataset documenting how governments responded to the pandemic. The dataset provides a hand-written summary and source URL links for each recorded case. The text behind these links is generally a long article, and we apply NLP summarization in this context. There are two approaches to summarization. Extractive methods, which select text directly from the source to form summaries, are simpler but usually less flexible. Abstractive methods, which generate new text, are usually more complicated than extractive methods but more flexible in semantics. With recent advances in deep learning, NLP summarization has seen many successes and applications in daily life. In this thesis, we built a system that ensembles pre-trained NLP models to improve abstractive summarization performance. We included a focal loss function to improve performance by focusing on the samples with lower scores. Our proposed ensemble method improved the overall ROUGE scores compared to the individual models.
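
The focal loss mentioned in the abstract can be illustrated with a minimal sketch. It assumes the standard formulation, FL(p) = -(1 - p)^gamma * log(p), where p is the probability a model assigns to the correct target. The abstract does not say how the thesis maps ROUGE scores onto this loss, so the function names, the gamma value, and the sample probabilities below are illustrative assumptions only.

```python
import math


def cross_entropy(p_true, eps=1e-12):
    """Plain negative log-likelihood for one sample."""
    p = min(max(p_true, eps), 1.0)
    return -math.log(p)


def focal_loss(p_true, gamma=2.0, eps=1e-12):
    """Standard focal loss for one sample (assumed form, not the thesis's code).

    The factor (1 - p)^gamma shrinks the loss of well-predicted samples,
    so poorly predicted samples dominate the total.
    """
    p = min(max(p_true, eps), 1.0)
    return -((1.0 - p) ** gamma) * math.log(p)


if __name__ == "__main__":
    # Hypothetical probabilities for an easy, a medium, and a hard sample.
    for p in (0.9, 0.5, 0.1):
        ce = cross_entropy(p)
        fl = focal_loss(p)
        print(f"p={p:.1f}  cross-entropy={ce:.4f}  focal={fl:.4f}  ratio={fl / ce:.2f}")
```

With gamma = 0 the focal loss reduces to ordinary cross-entropy. Larger gamma values down-weight easy samples more strongly, which is one way to shift the training signal toward the lower-scoring samples the abstract describes.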

Keywords

NLP, Text Summarization, Deep Learning