1st place solution in Yandex Cup 2023 - ML RecSys (audio classification).
Make final submission (inference of each model + blending):
pip install -r requirements.txt
- Download and unpack embeddings
- Download and unpack models. Structure is exactly the same as in models, but the pytorch_model.bin files are in checkpoint-* directories.
- Modify models_dir, embeddings_dir, blended_output_path (optional, ensemble output), ensemble_path (optional, ensemble config path: choose between the two files in the ensembles directory) in bin/predict_ensemble.sh
- Run inference:
cd bin
bash predict_ensemble.sh
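The blending step can be pictured as a weighted average of per-model scores. The sketch below is only an illustration under assumptions: the function name `blend`, the dict-based inputs, and the weights are hypothetical and are not taken from bin/predict_ensemble.sh or the ensemble config files, whose actual format may differ. Plain Python lists are used to keep it dependency-free.

```python
def blend(predictions, weights):
    """Weighted average of per-model score matrices (hypothetical helper).

    predictions: dict model_name -> list of rows, each row a list of per-class scores
    weights:     dict model_name -> non-negative float
    """
    names = list(weights)
    total = sum(weights[name] for name in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    n_rows = len(predictions[names[0]])
    n_cols = len(predictions[names[0]][0])
    blended = [[0.0] * n_cols for _ in range(n_rows)]

    for name in names:
        w = weights[name] / total  # normalise so the weights sum to 1
        for i, row in enumerate(predictions[name]):
            for j, score in enumerate(row):
                blended[i][j] += w * score
    return blended


# Toy data: two models, two tracks, two classes (values are made up).
preds = {
    "model_a": [[0.8, 0.2], [0.4, 0.6]],
    "model_b": [[0.6, 0.4], [0.0, 1.0]],
}
result = blend(preds, {"model_a": 1.0, "model_b": 1.0})
```

With equal weights this reduces to a plain mean of the two models' scores.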
I have uploaded the two best submissions:
- ensembles/ensemble_v30.tsv: public score 0.3095, private score 0.3128.
- ensembles/ensemble_v31.tsv: public score 0.3087, private score 0.3120. Was selected as final.
Re-train the whole ensemble (train each model on 3 splits of 10-fold k-fold cross-validation):
cd bin
bash re_train_ensemble.sh
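For orientation, here is one possible reading of the split scheme, assuming it means three independently shuffled 10-fold partitions of the training set. This is an assumption: re_train_ensemble.sh may instead train on only 3 of the 10 folds, or split differently. The function `repeated_kfold` and its parameters are hypothetical names introduced for this sketch.

```python
import random


def repeated_kfold(n_samples, n_folds=10, n_repeats=3, base_seed=0):
    """Yield (repeat, fold, train_idx, valid_idx) for every split.

    Each repeat reshuffles the sample indices with its own seed and then
    cuts them into n_folds disjoint validation folds.
    """
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        random.Random(base_seed + repeat).shuffle(indices)
        for fold in range(n_folds):
            valid_idx = indices[fold::n_folds]  # every n_folds-th shuffled index
            valid_set = set(valid_idx)
            train_idx = [i for i in indices if i not in valid_set]
            yield repeat, fold, train_idx, valid_idx


# Toy run on 50 samples: 3 repeats x 10 folds.
splits = list(repeated_kfold(n_samples=50))
```

Under this reading, each model would be trained once per (repeat, fold) pair and validated on the held-out indices.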
Code for model selection: src/beam_search.py
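As a rough idea of how beam search can pick ensemble members, here is a minimal generic sketch. It is not the contents of src/beam_search.py: the function `beam_search_select`, the `score_fn` callback, and the toy scoring rule are all hypothetical. In a real setting `score_fn` would presumably blend the chosen models' out-of-fold predictions and return the validation metric.

```python
def beam_search_select(candidates, score_fn, beam_width=3, max_size=5):
    """Search for a high-scoring subset of candidates (higher score is better).

    candidates: list of hashable model names
    score_fn:   callable taking a sorted tuple of names and returning a float
    """
    beam = [((), float("-inf"))]
    best_subset, best_score = (), float("-inf")

    for _ in range(max_size):
        expanded = {}
        for subset, _ in beam:
            for name in candidates:
                if name in subset:
                    continue
                new_subset = tuple(sorted(subset + (name,)))
                if new_subset not in expanded:  # score each subset only once
                    expanded[new_subset] = score_fn(new_subset)
        if not expanded:
            break
        # Keep only the beam_width best subsets of the current size.
        ranked = sorted(expanded.items(), key=lambda kv: kv[1], reverse=True)
        beam = ranked[:beam_width]
        if beam[0][1] > best_score:
            best_subset, best_score = beam[0]

    return best_subset, best_score


# Toy score: made-up per-model gains minus a fixed cost per added model.
gains = {"a": 0.5, "b": 0.4, "c": 0.1, "d": 0.3}


def toy_score(subset):
    return sum(gains[m] for m in subset) - 0.2 * len(subset)


selected, selected_score = beam_search_select(
    list(gains), toy_score, beam_width=2, max_size=4
)
```

Compared with greedy forward selection (a beam width of 1), a wider beam keeps several partial ensembles alive, which can help when the best pair does not contain the single best model.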