Project

Article Summary Deep Learning

About

Summary

Using deep learning and scraping to analyze/summarize articles! Just drop in any URL!

Installation & Usage

GitHub Repo

# You will need to have Python 3 and pip installed

$ git clone https://github.com/ianramzy/article-summary-deep-learning.git

$ cd article-summary-deep-learning

$ pip install -r requirements.txt

$ python -m spacy download en_core_web_sm

$ python3 app.py

Inspiration

In a day and age where information is not only abundant but overflowing, I wanted to create a faster way to parse all of it. I originally wanted a browser extension that would summarize any page I was reading in a few bullet points so that I could save time, but I later settled on a web app for more flexibility.

What It Does

The application takes a URL and requests that page, scraping its content. The scraped text is then cleaned to remove noise, and named entity recognition (NER) is performed, labeling every entity in the text (that's the cool highlighted visualization). The app then takes the most common entity in the text, assumes it is the subject of the article, extracts facts about it from the text itself, and renders the frontend.
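The "most common entity is the subject" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from app.py: the names `SAMPLE_ENTITIES`, `guess_subject` and `sentences_mentioning` are hypothetical, the entity list is hard-coded (in a spaCy pipeline it would come from iterating over `doc.ents`), and the sentence filter is a naive stand-in for the real fact extraction.

```python
from collections import Counter

# Hypothetical input: (text, label) pairs such as an NER model might emit.
# In a spaCy pipeline these would come from iterating over doc.ents.
SAMPLE_ENTITIES = [
    ("Obama", "PERSON"),
    ("Hawaii", "GPE"),
    ("Obama", "PERSON"),
    ("Chicago", "GPE"),
    ("Obama", "PERSON"),
]


def guess_subject(entities):
    """Return the most frequently mentioned entity text, or None if empty."""
    counts = Counter(text.strip() for text, _label in entities if text.strip())
    if not counts:
        return None
    # most_common(1) returns a list holding one (text, count) tuple.
    return counts.most_common(1)[0][0]


def sentences_mentioning(subject, sentences):
    """Naive stand-in for fact extraction: keep sentences naming the subject."""
    return [s for s in sentences if subject.lower() in s.lower()]


subject = guess_subject(SAMPLE_ENTITIES)
facts = sentences_mentioning(
    subject,
    ["Obama was born in Hawaii.", "Chicago is windy.", "Obama is a lawyer."],
)
print(subject, facts)
```

Counting by exact text means "Obama" and "Barack Obama" would be tallied separately, so a real implementation would probably want some normalization before counting.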

Technologies

The web application is built with Flask, and the frontend is pure CSS and HTML. beautifulsoup4 was used for web scraping, and the deep learning was done with spaCy and textacy: spaCy for named entity recognition (NER) and textacy for sentence structure analysis.
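To give a feel for the scraping and cleaning step, here is a minimal sketch of turning raw HTML into plain text. The project itself uses beautifulsoup4 for this. The sketch swaps in Python's built-in `html.parser` so that it runs without extra installs, and the names `VisibleTextParser` and `html_to_text` are hypothetical rather than taken from the repo.

```python
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collect text nodes, skipping tags that rarely hold article content."""

    SKIP_TAGS = {"script", "style", "noscript"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped tag
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Join the chunks, then collapse the whitespace left by removed markup.
        return " ".join(" ".join(self._chunks).split())


def html_to_text(html):
    """Return the visible text of an HTML string as one cleaned line."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()  # flush any buffered text
    return parser.text()


sample = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<p>Obama spoke in Chicago.</p><script>var x = 1;</script>"
    "<p>He was born in Hawaii.</p></body></html>"
)
print(html_to_text(sample))
```

Joining chunks with a space keeps neighbouring paragraphs from running together, at the cost of an occasional stray space around inline tags such as `<b>`. The resulting string is the kind of input that would then be handed to the NER model.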

Future Ideas and What's Next

I would like to gather more data for training the models. I would also like to add a feature where the algorithm looks for more sources on the internet about a topic (e.g. Obama) and gathers more data to analyze from them.

Another project by Ian Ramzy