Contains code for the out-of-sample (OOS) analysis for the Energy Project
- Preprocessing for rolling topics
./NYtime_dtm.py
./dtm_numeric.py --dtmPath=/shared/share_mamaysky-glasserman/energy_drivers/2023/DataProcessing/NYtime_dtm_Clustering_C --outputPath=/shared/share_mamaysky-glasserman/energy_drivers/2023/DataProcessing/NYtime_dtm_numeric_441
- Obtain rolling topic models using Louvain and the Genetic Algorithm, and store the membership (at most 30 min per rolling model)
./run_Rolling_Topic_Models.sh
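The idea of a rolling, backward-looking model can be sketched as follows: each target month gets a topic model fit only on the months before it. This is a minimal illustration, not the logic of run_Rolling_Topic_Models.sh; the function names and the window length are assumptions, since the actual window used by the project is not stated here.

```python
def month_range(start, end):
    """Yield (year, month) tuples from start to end, inclusive."""
    year, month = start
    while (year, month) <= end:
        yield (year, month)
        month += 1
        if month > 12:
            year, month = year + 1, 1


def rolling_windows(start, end, window_months):
    """Map each target month to the window_months months strictly before it.

    window_months is a hypothetical parameter; months without a full
    preceding window are skipped.
    """
    months = list(month_range(start, end))
    windows = {}
    for i in range(window_months, len(months)):
        # Only past months enter the window, so the model is out-of-sample
        # with respect to the target month.
        windows[months[i]] = months[i - window_months:i]
    return windows
```

For example, rolling_windows((1999, 11), (2000, 3), 2) assigns the months (1999, 11) and (1999, 12) to the target month (2000, 1).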
- Generate clustering_C.csv (refer to louvain_rolling.ipynb) after storing monthly dtm word frequencies
./concat.py --concat_info='' --concat_dtm='' --dtmPath=/shared/share_mamaysky-glasserman/energy_drivers/2023/DataProcessing/NYtime_dtm_Clustering_C --save_monthly_freq=True
- Run topic allocation for each month based on backward-looking topic models
./run_topic_allocation.sh
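A rough sketch of what topic allocation from a stored membership could look like is given below. It assumes the membership is a word-to-topic mapping and that an article's allocation is the share of its topic-word counts falling in each topic; both are assumptions for illustration, and run_topic_allocation.sh may compute the allocation differently.

```python
from collections import Counter


def allocate_topics(word_counts, membership):
    """Return the share of an article's topic-word counts in each topic.

    word_counts: dict mapping word -> count for one article (hypothetical input).
    membership:  dict mapping word -> topic id from a backward-looking model
                 (hypothetical format).
    Words missing from the membership are ignored.
    """
    topic_counts = Counter()
    for word, count in word_counts.items():
        topic = membership.get(word)
        if topic is not None:
            topic_counts[topic] += count
    total = sum(topic_counts.values())
    if total == 0:
        # No word of the article belongs to any topic.
        return {}
    return {topic: count / total for topic, count in topic_counts.items()}
```

Because the membership comes from a model fit on earlier months only, the allocation for a given month uses no information from that month or later.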
- Combine info
./run_info.sh
NOTE: There is no need to concatenate all files.
- Fix dates
./run_date_fixed_measures.sh
- Daily aggregates (~25min)
./agg_daily.py --concatPath=/shared/share_mamaysky-glasserman/energy_drivers/2023/DataProcessing/rolling_combined_info --local_topic_model=True
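The daily aggregation step can be pictured as collapsing article-level rows to one value per date. The sketch below takes a simple mean per day; the row format, the key names, and the choice of an unweighted mean are assumptions, and agg_daily.py may aggregate differently (for example with sums or weights).

```python
from collections import defaultdict


def aggregate_daily(rows, value_key="value", date_key="date"):
    """Average an article-level measure by date.

    rows: iterable of dicts, one per article (hypothetical format).
    Returns a dict mapping date -> mean value, ordered by date.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        sums[row[date_key]] += row[value_key]
        counts[row[date_key]] += 1
    return {day: sums[day] / counts[day] for day in sorted(sums)}
```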
Go through train_demo.ipynb to generate results stored under ./res_Forward_Lasso. Then run through Main_Table.ipynb to generate the tables.