Provides examples of potential preprocessing techniques to improve SVM performance. This repo is set up and tested to train on Google Cloud.
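As a rough illustration of the kind of tweet preprocessing meant here, the sketch below normalizes raw text before it reaches a classifier. It is not taken from this repo's code: the function name `clean_tweet`, the `<url>` and `<user>` placeholder tokens, and the specific cleaning steps are all assumptions chosen for the example.

```python
import re

# Hypothetical patterns for this sketch; the repo's own rules may differ.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
HASHTAG_RE = re.compile(r"#(\w+)")
REPEAT_RE = re.compile(r"(.)\1{2,}")   # any character repeated 3+ times
SPACE_RE = re.compile(r"\s+")


def clean_tweet(text):
    """Normalize a raw tweet into a simpler string for feature extraction."""
    text = text.lower()
    text = URL_RE.sub("<url>", text)        # replace links with a placeholder
    text = MENTION_RE.sub("<user>", text)   # replace @mentions with a placeholder
    text = HASHTAG_RE.sub(r"\1", text)      # keep the hashtag word, drop the '#'
    text = REPEAT_RE.sub(r"\1\1", text)     # "loooove" -> "loove", "!!!" -> "!!"
    return SPACE_RE.sub(" ", text).strip()  # collapse runs of whitespace


example = clean_tweet("@Bob I loooove this!!! http://t.co/abc #Happy")
# example == "<user> i loove this!! <url> happy"
```

Collapsing long character runs to two (rather than one) keeps a hint that the word was emphasized while still reducing vocabulary size, which is one common trade-off for bag-of-words features.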
These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.
You'll need Python 3.5+. You'll also need the Sentiment140 Dataset available at: http://help.sentiment140.com/for-students
You'll need to retrieve the dataset from either the Google Drive or Stanford link on that page. The download contains two files. Only the training.x.x.csv file is needed; the other (manual) file can be ignored.
Rename this training dataset to "stanford140.csv" and save it in svm_trainer/data/.
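Once the file is in place, a quick way to confirm it reads correctly is to load it with Python's standard `csv` module. The sketch below is not part of this repo. It assumes the usual Sentiment140 training layout (no header row, six columns in the order polarity, id, date, query, user, text, with polarity `0` for negative and `4` for positive) and Latin-1 encoding. The function name `load_sentiment140` and the 0/1 label mapping are hypothetical.

```python
import csv

# Assumed column order of the Sentiment140 training file (it has no header row).
COLUMNS = ["polarity", "id", "date", "query", "user", "text"]


def load_sentiment140(path="svm_trainer/data/stanford140.csv"):
    """Return (texts, labels) where labels are 1 for positive, 0 otherwise."""
    texts, labels = [], []
    # latin-1 is assumed here; it decodes any byte sequence without errors.
    with open(path, newline="", encoding="latin-1") as handle:
        for row in csv.reader(handle):
            record = dict(zip(COLUMNS, row))
            texts.append(record["text"])
            labels.append(1 if record["polarity"] == "4" else 0)
    return texts, labels
```

Calling `texts, labels = load_sentiment140()` from the project root and checking `len(texts)` is a simple sanity check that the rename and save step worked.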
Set up the project as follows:
# set up virtual environment
virtualenv venv
source venv/bin/activate
# install python requirements
pip install -r requirements.txt
# run locally with:
python test.py
Google Cloud deployment instructions coming soon.
- Ben Krig - Initial work
- Salvatore Nicosia - Initial work
- Nick Schiffer - Initial work
- Darren Truong - Initial work
This project is licensed under the MIT License. See the LICENSE.md file for details.