I am currently working on summarizing chat context so that an agent can quickly understand the previous conversation. I am interested in applying deep learning models to existing datasets and seeing how they perform. I believe news articles are rich in grammar and vocabulary, which allows us to gain greater insights.
The dataset consists of 4515 examples, each containing Author_name, Headlines, Url of Article, Short text, and Complete Article. I gathered the summarized news from Inshorts and scraped the full news articles only from Hindu, Indian Times and Guardian. The time period ranges from February to August 2017.
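As a rough sketch of how these records might be read, the snippet below uses Python's standard `csv` module. The column names (`author`, `headlines`, `url`, `short_text`, `complete_article`), the file name `news_summary.csv`, and the UTF-8 encoding are assumptions made for illustration, not names confirmed by the actual file; adjust them to match the header of your copy.

```python
import csv
import io
from dataclasses import dataclass
from typing import List, TextIO


@dataclass
class NewsExample:
    """One row of the dataset. Field names are illustrative assumptions."""
    author: str
    headline: str
    url: str
    short_text: str
    complete_article: str


def load_examples(handle: TextIO) -> List[NewsExample]:
    """Read rows from an open CSV handle into NewsExample objects.

    Missing or empty cells become empty strings, so scraped rows with
    gaps do not raise errors.
    """
    reader = csv.DictReader(handle)
    examples = []
    for row in reader:
        examples.append(
            NewsExample(
                author=(row.get("author") or "").strip(),
                headline=(row.get("headlines") or "").strip(),
                url=(row.get("url") or "").strip(),
                short_text=(row.get("short_text") or "").strip(),
                complete_article=(row.get("complete_article") or "").strip(),
            )
        )
    return examples


# Tiny in-memory demo so the sketch runs without the real file.
demo_csv = io.StringIO(
    "author,headlines,url,short_text,complete_article\n"
    "Jane Doe,Sample headline,http://example.com/a,Short summary.,Full article text.\n"
)
demo_examples = load_examples(demo_csv)
print(demo_examples[0].headline)

# With the real file (hypothetical name and encoding), usage would look like:
# with open("news_summary.csv", newline="", encoding="utf-8") as f:
#     examples = load_examples(f)
```

Reading through a file handle rather than a path keeps the function easy to try on a small in-memory sample before pointing it at the full file.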
I would like to thank the authors of Inshorts for their amazing work.
- Generating short descriptions (headlines) from text (news articles).
- Summarizing large amounts of information so that it can be represented in a compressed space.
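Both use cases above come down to pairing a longer source text with a shorter target. The following is a minimal sketch, assuming each record is a dict with the hypothetical keys `complete_article`, `short_text`, and `headlines` (not confirmed column names), and using whitespace word counts as a crude measure of length.

```python
from typing import Dict, List, Tuple


def make_pairs(
    records: List[Dict[str, str]], source_key: str, target_key: str
) -> List[Tuple[str, str]]:
    """Build (source, target) training pairs, skipping rows with empty fields."""
    pairs = []
    for record in records:
        source = (record.get(source_key) or "").strip()
        target = (record.get(target_key) or "").strip()
        if source and target:
            pairs.append((source, target))
    return pairs


def compression_ratio(source: str, target: str) -> float:
    """Target length divided by source length, counted in whitespace tokens."""
    source_len = len(source.split())
    return len(target.split()) / source_len if source_len else 0.0


# Made-up records for illustration only; keys are assumed names.
records = [
    {
        "complete_article": "The city council met on Monday and approved "
                            "a new budget for road repairs",
        "short_text": "Council approves road repair budget",
        "headlines": "Budget approved",
    },
    {"complete_article": "", "short_text": "Orphan summary", "headlines": ""},
]

# Summarization: full article -> short text.
summary_pairs = make_pairs(records, "complete_article", "short_text")
# Headline generation: full article -> headline.
headline_pairs = make_pairs(records, "complete_article", "headlines")

ratio = compression_ratio(*summary_pairs[0])
print(f"{len(summary_pairs)} summary pair(s), compression ratio {ratio:.2f}")
```

Rows with a missing source or target are dropped rather than padded, since an empty side gives a model nothing to learn from.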
When I was working on the summarization task, I couldn't find any open-source datasets to work on. I believe there are people just like me who are working on these tasks, and I hope this dataset helps them.
It would be really helpful if anyone who finds interesting insights in this data could share their work. Thank you!