
Huge memory consumption during training #83

Open
LaCandela opened this issue Apr 23, 2024 · 0 comments

LaCandela commented Apr 23, 2024

I noticed that the training process consumes a lot of CPU memory. For example, with 8 dataloader workers and a batch size of 45, memory consumption exceeds 50 GB and slowly climbs toward 60 GB, at which point my system shuts down.
Has anyone noticed this before? Which part of the training pipeline is memory-intensive? Could it be related to an inefficient mmcv operation?
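
For anyone trying to reproduce this, here is a minimal sketch of how the resident memory could be tracked over time (assuming psutil is installed; `log_memory` is just an illustrative helper, not part of the repo). It sums the RSS of the main training process and its dataloader worker children:

```python
# Minimal memory-tracking sketch (assumes psutil is available).
# Call log_memory() periodically, e.g. once per epoch or every N iterations.
import os
import psutil


def log_memory() -> None:
    main = psutil.Process(os.getpid())
    procs = [main] + main.children(recursive=True)  # dataloader workers are child processes
    total_gb = sum(p.memory_info().rss for p in procs) / 1024 ** 3
    print(f"{len(procs)} processes, total RSS: {total_gb:.1f} GB")
    for p in procs:
        print(f"  pid={p.pid} rss={p.memory_info().rss / 1024 ** 3:.1f} GB")
```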

I start the training with this command:

```bash
CUDA_VISIBLE_DEVICES=0 python tools/train.py --config configs/fastbev/exp/paper/fastbev_m0_r18_s256x704_v200x200x4_c192_d2_f4.py --work-dir /Fast-BEV/temp
```
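
For context, the batch size and worker count I mentioned come from the `data` section of that config. The field names below follow the usual mmdet-style convention and are an assumption on my part, not a verbatim copy of the Fast-BEV config file:

```python
# Assumed mmdet/mmcv-style dataloader settings (standard field names in
# mmdet-based configs; not copied verbatim from the Fast-BEV config).
data = dict(
    samples_per_gpu=45,  # per-GPU batch size from my runs
    workers_per_gpu=8,   # dataloader worker processes per GPU
    # train=dict(...), val=dict(...), test=dict(...) stay as in the original config
)
```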

Could it be a problem that I don't use slurm?
