Author: Jun Zhang <[email protected]>
# Updated on Feb 17th, 2020 - Added support for the GPT-style Transformer model
In this project, I propose a new way to predict the Oscar winners with sentiment analysis. We collect all the judges' reviews of each movie and apply sentiment analysis to them. The result is treated as evidence of whether the judges are likely to vote for that movie. We then aggregate these votes and output the most probable winner. The presentation of the project can be found here.
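For illustration, the aggregation step could look like the sketch below. The names `predict_winner`, `reviews_by_movie`, and `sentiment_model.predict` are hypothetical and not taken from this repository.

```python
from collections import Counter

def predict_winner(reviews_by_movie, sentiment_model):
    """Count one 'vote' per positively classified judge review and return
    the movie with the most votes. sentiment_model.predict is assumed to
    return 'positive' or 'negative' for a single review string."""
    votes = Counter()
    for movie, reviews in reviews_by_movie.items():
        votes[movie] = sum(
            1 for review in reviews
            if sentiment_model.predict(review) == "positive"
        )
    # The most probable winner is the movie with the most positive votes.
    return votes.most_common(1)[0][0]
```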
Currently the sentiment analysis part is implemented as a RESTful API with Flask and hosted on an Azure server. The simple website now supports three different models (more models are explored in the jupyter_notebook folder): trigram + SVM, a BERT-based model fine-tuned on the movie review data, and a GPT-style Transformer with fine-tuning. The trigram + SVM model scores around 90% accuracy and F1, but is mediocre on short reviews and handles negation poorly, while the BERT-based model scores more than 93% on both accuracy and F1 and performs quite well on short, especially emotional, reviews. The GPT-style Transformer model scores 92% on accuracy and F1. The latter two models both handle ambiguous cases well, with the BERT-based model being slightly better. Test examples for the BERT-based model and the GPT-style Transformer can be seen in Figure 1 and Figure 2. A reasonable explanation is that contextual models like BERT capture the deep contextual meaning of each word within a sentence, while n-gram models cannot.
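For illustration, a client call against the Flask API could look like the sketch below. The `/predict` route, the JSON field names, and the response format are assumptions and may differ from the actual routes defined in app.py.

```python
import requests

# Hypothetical endpoint and payload; the real route and field names in
# app.py may differ.
API_URL = "http://localhost:5000/predict"

response = requests.post(API_URL, json={"review": "A stunning, heartfelt performance."})
print(response.json())  # e.g. {"sentiment": "positive"}
```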
To run the program with trigram + SVM model:
python3 app.py
To run the program with BERT model:
python3 app.py -model bert -modelPath #where you store bert model#
To run the program with GPT-style Transformer model:
python3 app.py -model transformer -modelPath #where you store transformer model#
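As a rough sketch, the -model and -modelPath flags shown above could be handled with argparse as below. This is an assumption about how app.py parses its arguments, not the actual implementation.

```python
import argparse

# Minimal sketch of parsing the -model / -modelPath flags (assumed, not
# the actual app.py argument handling).
parser = argparse.ArgumentParser(description="Movie review sentiment API")
parser.add_argument("-model", default="svm",
                    choices=["svm", "bert", "transformer"],
                    help="which sentiment model to serve")
parser.add_argument("-modelPath", default=None,
                    help="directory holding the fine-tuned BERT / Transformer weights")
args = parser.parse_args()

if args.model == "svm":
    print("Loading the trigram + SVM pipeline")
elif args.modelPath is None:
    parser.error("-modelPath is required for the bert and transformer models")
else:
    print(f"Loading the {args.model} model from {args.modelPath}")
```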
A ready-to-use model that I trained with BERT base uncased on the movie review data can be found here.
A fine-tuned Transformer model trained on the IMDB dataset can be found here.
To install the dependencies:
pip3 install -r requirements.txt
A live demo can be tried out here.
This repository is organized as follows:
- app.py: main application
- utlis.py: preprocessing functions
- utlis_bert.py: preprocessing functions for the BERT model
- test.db: SQLAlchemy database
- data: three datasets used in the project, the IMDB dataset (movie_data), the Rotten Tomatoes dataset from Kaggle (rottenTomatoes), and one dataset from the company (movie_review_data)
- model: pretrained models used for feature extraction and prediction
- jupyter_notebook: source code for the training experiments
- env: virtual environment
- static, templates: source code for the webpage
- IMG: figures of the test examples
Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
Radford, Alec, et al. "Improving language understanding by generative pre-training." OpenAI (2018).