LSTM-with-attention classifier models for the Kaggle Quora Insincere Questions Classification competition dataset. See the competition page (https://www.kaggle.com/c/quora-insincere-questions-classification) for a definition of sincere and insincere questions.
Disclaimer: none of this code will win you the competition, probably ;-)
attention-sandwich-model/attention-sandwich-model.ipynb --- A number of the public kernels share the same code for a drop-in attention layer that sits on top of the two LSTMs making up the recurrent part of the model, and which does improve performance on this NLP classification task.
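For reference, a minimal sketch of that style of drop-in layer (the exact class circulating in the public kernels differs in its details, and the mask/bias handling is simplified away here):

```python
# Hypothetical re-creation of the public-kernel-style attention layer:
# score each timestep of an LSTM's output, softmax the scores over time,
# and return the attention-weighted sum as a fixed-size vector.
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer

class AttentionPooling(Layer):
    def build(self, input_shape):
        # input_shape: (batch, timesteps, features)
        self.w = self.add_weight(name="att_w",
                                 shape=(int(input_shape[-1]), 1),
                                 initializer="glorot_uniform")
        super().build(input_shape)

    def call(self, h):
        e = K.squeeze(K.tanh(K.dot(h, self.w)), axis=-1)  # (batch, timesteps)
        a = K.expand_dims(K.softmax(e))                   # weights sum to 1 over time
        return K.sum(h * a, axis=1)                       # (batch, features)

    def compute_output_shape(self, input_shape):
        return (input_shape[0], input_shape[-1])
```

It drops in directly after any recurrent layer built with return_sequences=True and feeds a Dense classification head.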
I wanted to dig around in the guts of how to implement an attention mechanism for NLP classification. In this notebook I implement an attention algorithm that sits between two LSTMs (or more, if we decide to keep stacking, why not) to create an attention sandwich. It makes the hidden states of the first LSTM at every timestep accessible to every timestep of the second LSTM, weighted by attention weights that are also learned. The advantage is that at each point in its sequence, the second LSTM (which we'll call, for no particular reason, LSTM_q) can look at a much wider input (context) than just its own hidden state and the current output of the previous LSTM (which we'll call LSTM_p). It also means the two LSTMs don't need to be aligned (i.e., they don't have to have the same number of timesteps).
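A minimal sketch of the sandwich in tf.keras; all sizes here are illustrative, and the built-in AdditiveAttention layer (Bahdanau-style, with learned parameters) stands in for the notebook's hand-rolled attention:

```python
from tensorflow.keras import layers, Model

MAXLEN, VOCAB, EMB_DIM = 70, 50000, 300  # hypothetical hyperparameters

inp = layers.Input(shape=(MAXLEN,))
emb = layers.Embedding(VOCAB, EMB_DIM)(inp)

# LSTM_p: the first slice of bread, emitting a hidden state per timestep
h_p = layers.LSTM(128, return_sequences=True)(emb)

# the filling: every timestep attends over *all* of LSTM_p's hidden states
# (query and value are both h_p here) and gets back a context vector
ctx = layers.AdditiveAttention()([h_p, h_p])

# LSTM_q: the second slice, seeing its aligned input plus the wider context
h_q = layers.LSTM(128)(layers.Concatenate()([h_p, ctx]))

out = layers.Dense(1, activation="sigmoid")(h_q)
model = Model(inp, out)
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
```

Because the attention scores, rather than sequence positions, tie the two LSTMs together, the query and value sequences need not be the same length in the general case.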
The resulting model is quite slow to train, but even without ensembling it gets better results than the alternative Attention class in the public kernels. Whether this is simply due to having more units in my LSTMs I'm not sure, but I think it's worth persisting with.
An attempt to improve performance by running data augmentation through a fit generator. The generator makes noisy copies of each batch of embedded word vectors. Problems: the augmentation algorithm is sloooow and needs optimising. There are some indications that it improves performance a little, but it is currently too costly to run on a Kaggle kernel within the time limit. Work in progress.
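A minimal sketch of the generator idea, assuming the model consumes pre-embedded sequences of shape (batch, timesteps, emb_dim); the class name, noise scale, and batch size are illustrative, not the notebook's values:

```python
import numpy as np
from tensorflow.keras.utils import Sequence

class NoisyEmbeddingBatches(Sequence):
    """Yields each batch twice: once as-is, once with Gaussian jitter
    added to every embedded word vector."""

    def __init__(self, x_emb, y, batch_size=512, noise_sd=0.01):
        self.x, self.y = x_emb, y
        self.batch_size, self.noise_sd = batch_size, noise_sd

    def __len__(self):
        return int(np.ceil(len(self.x) / self.batch_size))

    def __getitem__(self, i):
        sl = slice(i * self.batch_size, (i + 1) * self.batch_size)
        xb, yb = self.x[sl], self.y[sl]
        # vectorised numpy noise; far cheaper than perturbing word-by-word
        noisy = xb + np.random.normal(0.0, self.noise_sd, xb.shape)
        return np.concatenate([xb, noisy]), np.concatenate([yb, yb])
```

An instance can be passed straight to model.fit (or fit_generator on older Keras) in place of the raw arrays.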
- (optional) create yourself a new, clean Anaconda environment with something like
conda create --name kaggle --clone myenv
where myenv is the environment you use for deep learning (i.e. tensorflow, keras, GPU-enabled if you have one)
- clone this repo into an appropriate place using
git clone [email protected]:nicksexton/quora-insincere-questions
cd quora-insincere-questions
- (if you don't have it already) install kaggle CLI with
pip install kaggle
. You'll then need to generate a Kaggle access token and copy/paste it into the CLI's config file (~/.kaggle/kaggle.json); this is documented in the instructions for the kaggle CLI API.
- download the dataset by running
kaggle competitions download -c quora-insincere-questions-classification
. Note that this will trigger a very large (~8GiB?) download, including four word embeddings files, so you may want to tweak this (see the kaggle CLI API docs).
- unzip the embeddings files
- the directory structure should be such that the input directory sits alongside the kernel directories, i.e.:
quora-insincere-questions/input/embeddings/...
and quora-insincere-questions/attention-sandwich-model/...
etc.