Time-Efficient Training for 3D Unet Model using MONAI Framework

Time-Efficient Training for 3D Unet Model using MONAI Framework  

  By: rezasafdari on July 15, 2024, 10:26 a.m.

Hello there,

I have developed a basic 3D UNet model for Task 1 of this competition using the MONAI framework. In my code, I randomly sample 6 patches of size 96x96x96 from each volume to train the model on two L30s Nvidia GPUs. However, training is quite time-consuming, taking roughly an hour per epoch, and the situation is even worse for 5-fold training. Has anyone else encountered a similar issue? I am open to any suggestions for addressing this problem.
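For context, the patch sampling is roughly equivalent to this plain-NumPy sketch (my actual pipeline uses MONAI transforms; the function name and shapes here are just illustrative):

```python
import numpy as np

def sample_random_patches(volume, patch_size=96, num_patches=6, rng=None):
    """Extract random cubic patches from a 3D volume (illustration only)."""
    if rng is None:
        rng = np.random.default_rng()
    d, h, w = volume.shape
    patches = []
    for _ in range(num_patches):
        # pick a random corner so the patch stays inside the volume
        z = rng.integers(0, d - patch_size + 1)
        y = rng.integers(0, h - patch_size + 1)
        x = rng.integers(0, w - patch_size + 1)
        patches.append(volume[z:z + patch_size,
                              y:y + patch_size,
                              x:x + patch_size])
    return np.stack(patches)  # shape: (num_patches, 96, 96, 96)
```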

 Last edited by: rezasafdari on July 15, 2024, 12:09 p.m., edited 1 time in total.

Re: Time-Efficient Training for 3D Unet Model using MONAI Framework  

  By: shadab on July 16, 2024, 6 a.m.

Have you tried mixed precision training? And are you distributing your training across the two GPUs you have (via DistributedDataParallel() or something similar)?
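In case it helps, a minimal mixed-precision training step looks roughly like this (a sketch only; the model, optimizer, and shapes are placeholders, not your actual setup, and it falls back to bfloat16 on CPU):

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
# float16 on GPU; CPU autocast only supports bfloat16
amp_dtype = torch.float16 if device == "cuda" else torch.bfloat16

model = nn.Conv3d(1, 8, kernel_size=3, padding=1).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
# GradScaler guards against fp16 gradient underflow; only needed on CUDA
scaler = torch.cuda.amp.GradScaler(enabled=(device == "cuda"))

def train_step(batch, target):
    optimizer.zero_grad(set_to_none=True)
    # forward pass runs in reduced precision where it is safe to do so
    with torch.autocast(device_type=device, dtype=amp_dtype):
        loss = nn.functional.mse_loss(model(batch), target)
    scaler.scale(loss).backward()
    scaler.step(optimizer)
    scaler.update()
    return loss.item()
```

For two GPUs you would additionally wrap the model in DistributedDataParallel(); the autocast/scaler pattern stays the same inside each process.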

Re: Time-Efficient Training for 3D Unet Model using MONAI Framework  

  By: rezasafdari on July 16, 2024, 8:22 a.m.

Thanks for replying. Yes, I have employed both mixed precision and distributed training, using the DistributedDataParallel() module.

Re: Time-Efficient Training for 3D Unet Model using MONAI Framework  

  By: jdex on July 19, 2024, 10:06 p.m.

This sounds like a classic data bottleneck. Have you tried pre-computing the patches? If you have enough disk space, you could also save them in an uncompressed file format, e.g. *.npy. Sometimes the network can also be a limiting factor, or your data may sit on a slow hard drive. For the data-centric code I also did the sampling and transformations in advance.

Best, Jakob
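A pre-computation pass could look something like this (a NumPy sketch under my assumptions; the function name, paths, and patch counts are placeholders):

```python
import numpy as np
from pathlib import Path

def precompute_patches(volume, out_dir, case_id,
                       patch_size=96, num_patches=6, seed=0):
    """Sample random patches once and save each as an uncompressed .npy file."""
    rng = np.random.default_rng(seed)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i in range(num_patches):
        # random corner so the patch stays inside the volume
        z, y, x = (int(rng.integers(0, s - patch_size + 1))
                   for s in volume.shape)
        patch = volume[z:z + patch_size,
                       y:y + patch_size,
                       x:x + patch_size]
        path = out / f"{case_id}_patch{i:02d}.npy"
        np.save(path, patch)  # uncompressed: cheap to write, near-instant to read
        paths.append(path)
    return paths
```

At training time the dataloader then only needs np.load(path, mmap_mode="r"), which memory-maps the file instead of re-sampling and transforming the full volume every epoch.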

Re: Time-Efficient Training for 3D Unet Model using MONAI Framework  

  By: rezasafdari on July 24, 2024, 9:18 a.m.

Hello jdex,

Your recommendation proved extremely valuable in resolving the data bottleneck I encountered. By pre-computing patches and storing them as uncompressed *.npy files, I've seen a significant speedup in training. I truly appreciate your assistance. Thank you!

 Last edited by: rezasafdari on July 24, 2024, 9:18 a.m., edited 1 time in total.