forked from WladimirSidorenko/SemEval-2016
Twitter Sentiment System for SemEval 2016
Herbchn/SemEval-2016
******************************************************
* SemEval-2016 Task 4: Sentiment Analysis on Twitter *
*                                                    *
*                TRAINING + DEV DATA                 *
*                                                    *
*       http://alt.qcri.org/semeval2016/task4/       *
*                 [email protected]                  *
*                                                    *
******************************************************

TRAINING + DEV dataset for SemEval-2016 Task 4

Version 1.0: October 15, 2015

Task organizers:

 * Preslav Nakov, Qatar Computing Research Institute, HBKU
 * Alan Ritter, The Ohio State University
 * Sara Rosenthal, Columbia University
 * Fabrizio Sebastiani, Qatar Computing Research Institute, HBKU
 * Veselin Stoyanov, Facebook


NOTES

1. Please note that by downloading the Twitter data you agree to abide by the Twitter terms of service (https://twitter.com/tos), and in particular you agree not to redistribute the data and to delete tweets that are marked deleted in the future.

2. The distribution consists of a set of Twitter status IDs with annotations for Subtasks A, B, C, D, and E: topic polarity and trends toward a topic. There are exactly 100 tweets provided per topic and a total of 100 topics. You should use the downloading script to obtain the corresponding tweets: https://github.com/aritter/twitter_download

3. The "neutral" label in the annotations stands for objective_OR_neutral.


FILES

data/train/src/100_topics_100_tweets.topic-two-point.subtask-BD.train.txt -- training input for subtasks B and D
data/train/src/100_topics_100_tweets.topic-five-point.subtask-CE.train.txt -- training input for subtasks C and E
data/dev/src/100_topics_100_tweets.topic-two-point.subtask-BD.dev.txt -- dev input for subtasks B and D
data/dev/src/100_topics_100_tweets.topic-five-point.subtask-CE.dev.txt -- dev input for subtasks C and E


INPUT DATA FORMAT

-----------------------SUBTASK A-----------------------------------------

The format for the training/dev file is as follows:

	id<TAB>label

where "label" can be 'positive', 'neutral' or 'negative'.
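
As an illustration of the Subtask A format above, here is a minimal Python sketch of a reader for id<TAB>label lines. The function name parse_subtask_a and the strict validation are assumptions made for this example; they are not part of the official distribution or of this repository's code. The sketch also assumes the file still has exactly two columns, i.e. it has not yet been extended with tweet text by the downloading script.

```python
from typing import Iterable, List, Tuple

# Labels allowed for Subtask A according to the format description.
SUBTASK_A_LABELS = {"positive", "neutral", "negative"}


def parse_subtask_a(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Parse id<TAB>label lines into (tweet_id, label) pairs.

    Hypothetical helper: blank lines are skipped, and any malformed
    line or unknown label raises ValueError.
    """
    records = []
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) != 2:
            raise ValueError(f"line {line_no}: expected 2 fields, got {len(fields)}")
        tweet_id, label = fields
        if label not in SUBTASK_A_LABELS:
            raise ValueError(f"line {line_no}: unknown label {label!r}")
        # Keep the status ID as a string so it is never altered by numeric conversion.
        records.append((tweet_id, label))
    return records
```

Any iterable of lines works as input, so an open file handle can be passed directly, for example `with open(path, encoding="utf-8") as f: records = parse_subtask_a(f)`, where `path` points to a local Subtask A file.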

-----------------------SUBTASKS B,D--------------------------------------

** Task we might deal with.

The format for the training/dev file is as follows:

	topic<TAB>id<TAB>label

where "label" can be 'positive' or 'negative' (note: no 'neutral'!).

-----------------------SUBTASKS C,E--------------------------------------

* Task we are dealing with.

The format for the training/dev file is as follows:

	topic<TAB>id<TAB>label

where "label" can be -2, -1, 0, 1, or 2, corresponding to "strongly negative", "negative", "negative or neutral", "positive", and "strongly positive".


LICENSE

The accompanying dataset is released under a Creative Commons Attribution 3.0 Unported License (http://creativecommons.org/licenses/by/3.0/).


CITATION

You can cite the following paper when referring to the dataset:

@InProceedings{Rosenthal-EtAl:2015:SemEval,
  author    = {Sara Rosenthal and Alan Ritter and Veselin Stoyanov and Svetlana Kiritchenko and Saif Mohammad and Preslav Nakov},
  title     = {SemEval-2015 Task 10: Sentiment Analysis in Twitter},
  booktitle = {Proceedings of the 9th International Workshop on Semantic Evaluation (SemEval 2015)},
  year      = {2015},
  publisher = {Association for Computational Linguistics},
}


USEFUL LINKS:

Google group: [email protected]
SemEval-2016 Task 4 website: http://alt.qcri.org/semeval2016/task4/
SemEval-2016 website: http://alt.qcri.org/semeval2016/
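
The topic-based formats for Subtasks B, D and C, E described in INPUT DATA FORMAT can be read with a short Python sketch like the one below. The function name parse_topic_file, the five_point flag, and the choice to group tweets by topic are assumptions made for this illustration; they do not describe how this repository's system loads its data. The sketch expects exactly three tab-separated columns, as in the files listed under FILES before the tweet text is downloaded.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple, Union

# Two-point labels (Subtasks B, D); note that 'neutral' is not allowed.
TWO_POINT_LABELS = {"positive", "negative"}

# Five-point labels (Subtasks C, E) with the names given in the README.
FIVE_POINT_LABELS = {
    -2: "strongly negative",
    -1: "negative",
    0: "negative or neutral",
    1: "positive",
    2: "strongly positive",
}

Label = Union[int, str]


def parse_topic_file(
    lines: Iterable[str], five_point: bool
) -> Dict[str, List[Tuple[str, Label]]]:
    """Group topic<TAB>id<TAB>label lines by topic.

    Hypothetical helper: returns {topic: [(tweet_id, label), ...]},
    with integer labels when five_point is True and string labels
    otherwise. Malformed lines and unknown labels raise ValueError.
    """
    by_topic: Dict[str, List[Tuple[str, Label]]] = defaultdict(list)
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue
        fields = line.split("\t")
        if len(fields) != 3:
            raise ValueError(f"line {line_no}: expected 3 fields, got {len(fields)}")
        topic, tweet_id, raw_label = fields
        if five_point:
            label: Label = int(raw_label)  # raises ValueError on non-integers
            if label not in FIVE_POINT_LABELS:
                raise ValueError(f"line {line_no}: label {label} outside -2..2")
        else:
            label = raw_label
            if label not in TWO_POINT_LABELS:
                raise ValueError(f"line {line_no}: unknown label {raw_label!r}")
        by_topic[topic].append((tweet_id, label))
    return dict(by_topic)
```

Grouping by topic is a convenience for the topic-level subtasks; a flat list of (topic, id, label) triples would serve equally well if per-topic access is not needed.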