SVM Sentiment Analysis on Stanford140 Twitter Dataset

benkrig/svm-twitter-analysis

Code style: black License: MIT

Feature Extraction Optimization for SVM

This repo provides examples of preprocessing techniques that may improve SVM performance. It is set up and tested to train on Google Cloud.
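As an illustration of the kind of preprocessing meant here, the sketch below normalizes raw tweet text before feature extraction. It is not taken from this repo's code: the function name `normalize_tweet` and the placeholder tokens `URL` and `USER` are assumptions chosen for the example, and the specific steps (lowercasing, masking links and mentions, shortening repeated characters) are common choices for tweet data rather than the techniques this project necessarily uses.

```python
import re

# Hypothetical patterns for this sketch; not part of the repo.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
MENTION_RE = re.compile(r"@\w+")
REPEAT_RE = re.compile(r"(.)\1{2,}")  # any character repeated 3+ times


def normalize_tweet(text: str) -> str:
    """Return a normalized copy of a tweet for bag-of-words style features."""
    text = text.lower()
    text = URL_RE.sub("URL", text)        # mask links with a single token
    text = MENTION_RE.sub("USER", text)   # mask @mentions with a single token
    text = REPEAT_RE.sub(r"\1\1", text)   # "loooove" -> "loove"
    return " ".join(text.split())         # collapse runs of whitespace


print(normalize_tweet("@Bob I LOOOOVE this!!! http://t.co/abc  "))
```

Masking links and mentions keeps the vocabulary small, which tends to matter for sparse linear models such as an SVM over token counts.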

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes. See deployment for notes on how to deploy the project on a live system.

Prerequisites

You'll need Python 3.5+. You'll also need the Sentiment140 Dataset available at: http://help.sentiment140.com/for-students

Retrieve the dataset from either the Google Drive or the Stanford link on that page. After downloading the data, there will be two (2) files present. We only care about the training.x.x.csv file. Ignore the manual.

Rename this training dataset to "stanford140.csv" and save it in svm_trainer/data/.
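To check that the file was saved correctly, a quick read of the first few rows can help. The sketch below assumes the standard Sentiment140 layout: six comma-separated columns with no header row (polarity, id, date, query, user, text), where polarity is 0 for negative and 4 for positive. The helper name `read_labeled_tweets` is hypothetical and not part of this repo, and the `latin-1` encoding in the usage comment is an assumption that commonly works for this file.

```python
import csv
from typing import Iterable, Iterator, Tuple


def read_labeled_tweets(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (polarity, text) pairs from Sentiment140-style CSV lines."""
    for row in csv.reader(lines):
        if len(row) != 6:
            continue  # skip malformed rows instead of failing
        yield int(row[0]), row[5]


# Example usage (assumes the file has been saved as described above):
#
# with open("svm_trainer/data/stanford140.csv",
#           encoding="latin-1", newline="") as f:
#     for i, (polarity, text) in enumerate(read_labeled_tweets(f)):
#         print(polarity, text)
#         if i == 4:
#             break
```

Passing an open file handle rather than a path keeps the function easy to try on a few in-memory lines before running it on the full dataset.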

Set up the project as follows:

# set up virtual environment
virtualenv venv
source venv/bin/activate

# install python requirements
pip install -r requirements.txt

# run locally with:
python test.py

Deployment

Google Cloud deployment instructions coming soon.

Authors

  • Ben Krig - Initial work
  • Salvatore Nicosia - Initial work
  • Nick Schiffer - Initial work
  • Darren Truong - Initial work

License

This project is licensed under the MIT License. See the LICENSE.md file for details.
