GitHub - ololo123321/yandex-cup-2023

1st place solution in Yandex Cup 2023 - ML RecSys (audio classification).

Make final submission (inference of each model + blending):

1. `pip install -r requirements.txt`
2. Download and unpack embeddings.
3. Download and unpack models. The directory structure is exactly the same as in `models`, except that the `pytorch_model.bin` files are inside the `checkpoint-*` directories.
4. In `bin/predict_ensemble.sh`, set `models_dir` and `embeddings_dir`, and optionally `blended_output_path` (ensemble output) and `ensemble_path` (ensemble config path: choose one of the two files in the `ensembles` directory).
5. Run inference:
```
cd bin
bash predict_ensemble.sh
```
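
The blending step can be pictured as a weighted average of per-model score matrices. The snippet below is only a minimal sketch of that idea, not the code behind `predict_ensemble.sh`: the `blend` helper, the model names, and the weights are all hypothetical, and the actual blending scheme may differ.

```python
from typing import Dict, List, Optional

# One row per track, one column per tag (assumed layout, for illustration only).
Scores = List[List[float]]


def blend(predictions: Dict[str, Scores],
          weights: Optional[Dict[str, float]] = None) -> Scores:
    """Weighted average of per-model score matrices (hypothetical helper)."""
    if not predictions:
        raise ValueError("need at least one model")
    names = sorted(predictions)
    if weights is None:
        # Default to a plain average.
        weights = {name: 1.0 for name in names}
    total = sum(weights[name] for name in names)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")

    first = predictions[names[0]]
    n_rows, n_cols = len(first), len(first[0])
    blended = [[0.0] * n_cols for _ in range(n_rows)]
    for name in names:
        scores = predictions[name]
        if len(scores) != n_rows or any(len(row) != n_cols for row in scores):
            raise ValueError(f"shape mismatch for model {name!r}")
        w = weights[name] / total  # normalize so the weights sum to 1
        for i, row in enumerate(scores):
            for j, value in enumerate(row):
                blended[i][j] += w * value
    return blended


if __name__ == "__main__":
    preds = {"model_a": [[0.0, 1.0]], "model_b": [[1.0, 1.0]]}
    print(blend(preds))
    print(blend(preds, {"model_a": 3.0, "model_b": 1.0}))
```

Normalizing the weights keeps the blended scores on the same scale as the inputs, whatever the raw weight values are.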

I have uploaded the two best submissions:

- `ensembles/ensemble_v30.tsv` - public score: 0.3095, private score: 0.3128.
- `ensembles/ensemble_v31.tsv` - public score: 0.3087, private score: 0.3120. This one was selected as final.

Re-train the whole ensemble (each model is trained on 3 splits of a 10-fold KFold):

```
cd bin
bash re_train_ensemble.sh
```
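
For illustration, here is one way to read "3 splits of a 10-fold KFold": shuffle the data once, cut it into 10 folds, and train on only the first 3 train/validation splits. This is an assumption. The phrase could also mean three independent 10-fold partitions, and the snippet does not reproduce the splits stored in this repository. The function name `kfold_splits` and its parameters are hypothetical.

```python
import random
from typing import Iterator, List, Tuple


def kfold_splits(n_samples: int,
                 n_folds: int = 10,
                 n_used: int = 3,
                 seed: int = 0) -> Iterator[Tuple[int, List[int], List[int]]]:
    """Yield (fold, train_indices, val_indices) for the first n_used folds."""
    if not 1 <= n_used <= n_folds <= n_samples:
        raise ValueError("require 1 <= n_used <= n_folds <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # one fixed, reproducible shuffle
    for fold in range(n_used):
        # Every n_folds-th shuffled index, starting at `fold`, is held out.
        val = sorted(indices[fold::n_folds])
        held_out = set(val)
        train = [i for i in range(n_samples) if i not in held_out]
        yield fold, train, val


if __name__ == "__main__":
    for fold, train, val in kfold_splits(n_samples=20):
        print(fold, len(train), len(val))
```

Training on a subset of the folds trades some variance reduction for a shorter training run; each of the resulting checkpoints can then be averaged at inference time.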

Code for model selection: `src/beam_search.py`
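
As a rough picture of how beam search can pick ensemble members, the sketch below grows subsets of models one at a time and keeps only the best few subsets at each size. It is a generic illustration and is not taken from `src/beam_search.py`: the function name, its parameters, and the toy scoring function are all hypothetical, and in practice the score would be a validation metric of the blended predictions.

```python
from typing import Callable, Dict, FrozenSet, Optional, Sequence, Tuple

Subset = FrozenSet[str]


def beam_search_select(candidates: Sequence[str],
                       score: Callable[[Subset], float],
                       beam_width: int = 3,
                       max_size: int = 5) -> Tuple[Subset, float]:
    """Return the best-scoring subset found by beam search (higher is better)."""
    if not candidates:
        raise ValueError("need at least one candidate model")
    beam = [frozenset()]
    best: Optional[Tuple[Subset, float]] = None
    for _ in range(max_size):
        # Expand every subset in the beam by one model it does not contain yet.
        expanded: Dict[Subset, float] = {}
        for subset in beam:
            for name in candidates:
                if name in subset:
                    continue
                grown = subset | {name}
                if grown not in expanded:
                    expanded[grown] = score(grown)
        if not expanded:
            break  # every candidate is already in every subset
        ranked = sorted(expanded.items(), key=lambda kv: kv[1], reverse=True)
        top = ranked[:beam_width]
        beam = [subset for subset, _ in top]
        if best is None or top[0][1] > best[1]:
            best = top[0]
    assert best is not None
    return best


if __name__ == "__main__":
    # Toy score: individual model value minus a penalty for ensemble size.
    value = {"a": 5, "b": 4, "c": 1}
    toy_score = lambda s: sum(value[m] for m in s) - len(s) ** 2
    print(beam_search_select(list(value), toy_score, beam_width=2, max_size=3))
```

With `beam_width=1` this reduces to greedy forward selection; a wider beam explores more combinations at the cost of more score evaluations.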
