Grid search implemented for some of the classes (#32)
* Added emotion feature
* Added emotion feature fully working
* Gridsearch implemented for LR and SVM
* Added gridsearch for Multinomial NB
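The bullets above mention grid search for LR, SVM, and Multinomial NB, but the changed model files are not shown on this page. As a rough illustration only, an exhaustive grid search scores every combination of candidate hyperparameter values and keeps the best one. The sketch below uses only the standard library. `grid_search`, `param_grid`, and `toy_score` are hypothetical names, not code from this commit. The keys `alpha` and `fit_prior` mirror scikit-learn's `MultinomialNB` parameters, but the grid actually used in the commit is unknown.

```python
from itertools import product
from typing import Any, Callable, Dict, List, Tuple


def grid_search(
    param_grid: Dict[str, List[Any]],
    score_fn: Callable[[Dict[str, Any]], float],
) -> Tuple[Dict[str, Any], float]:
    """Score every combination in param_grid and return the best one."""
    names = list(param_grid)
    best_params: Dict[str, Any] = {}
    best_score = float("-inf")
    # product(...) yields one tuple of values per hyperparameter combination.
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(params: Dict[str, Any]) -> float:
    """Toy stand-in for a cross-validated accuracy (assumption: higher is better)."""
    return -abs(params["alpha"] - 0.5) + (0.1 if params["fit_prior"] else 0.0)


best_params, best_score = grid_search(
    {"alpha": [0.1, 0.5, 1.0], "fit_prior": [True, False]},
    toy_score,
)
print(best_params, best_score)
```

In a real pipeline, `score_fn` would presumably train the model with the given parameters and return its mean k-fold validation score.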
Showing 3 changed files with 49 additions and 17 deletions.
@@ -1,22 +1,23 @@
# Contains the raw data downloaded from https://www.kaggle.com/c/reddit-comment-classification-comp-551/data
raw_data_dir_path: str = "../data/raw_data"
# Contains all the data in feature vector form
# Contains all the cleaned processed raw data
processed_dir_path: str = "../data/processed_data"
# Contain csv files of different vocabularies
vocabularies_dir_path: str = "../data/vocabularies"
# Path to which scripts will dump data
results_dir_path: str = "../results"

# These are all the different dictionary names ("LEMMA", "STEM")
vocabularies_to_run = ["LEMMA"]
vocabularies_to_run = ["STEM", "LEMMA"]

# These are all the different vectorizers to run ("BINARY", "TFIDF")
vectorizers_to_run = ["TFIDF"]

# These are all the models to run and compare performance on a k fold cross validation ("LR", "NB", "MNNB", "KNN", "DT", "RF", "SVM", "SUPER")
models_to_run = ["MNNB", "LR", "SVM", "RF", "DT"]
models_to_run = ["MNNB"]

# If this is true, run gridsearch on each model (This will significantly increase the runtime of the validation pipeline for model types that support gridsearch)
run_grid_search = True

# Config to run for kaggle
kaggle_vocab = "LEMMA"
kaggle_vocab = "STEM"
kaggle_vectorizer = "TFIDF"
kaggle_model = "MNNB"
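To show what these settings do: assuming the validation pipeline runs one experiment per (vocabulary, vectorizer, model) combination, which this diff does not show, the lists would expand as sketched below. `planned_runs` is a hypothetical helper, and the values are copied from one of the variants of each setting visible in the diff.

```python
from itertools import product
from typing import Dict, List

# Values copied from one variant of each setting shown in the diff.
vocabularies_to_run: List[str] = ["STEM", "LEMMA"]
vectorizers_to_run: List[str] = ["TFIDF"]
models_to_run: List[str] = ["MNNB"]
run_grid_search: bool = True


def planned_runs(
    vocabularies: List[str],
    vectorizers: List[str],
    models: List[str],
    grid_search: bool,
) -> List[Dict[str, object]]:
    """Return one entry per (vocabulary, vectorizer, model) combination."""
    return [
        {"vocabulary": v, "vectorizer": vec, "model": m, "grid_search": grid_search}
        for v, vec, m in product(vocabularies, vectorizers, models)
    ]


runs = planned_runs(
    vocabularies_to_run, vectorizers_to_run, models_to_run, run_grid_search
)
for run in runs:
    print(run)
```

Under the same assumption, the longer model list `["MNNB", "LR", "SVM", "RF", "DT"]` would give 10 combinations for these vocabularies and vectorizers, and the runtime cost noted in the `run_grid_search` comment would apply to each of them.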