Call for training environment suggestions Final Phase

  By: LuukBoulogne on Feb. 24, 2022, 9:21 a.m.

Dear participants,

To decide on the training environment specifics for the Final phase, we would like to hear what environment your Algorithm needs. With your suggestions, we hope to find a balance of not using excessive computational resources, while still satisfying your algorithm's appetite for them.

Variables to discuss include available GPU memory, number of GPUs, CPU memory, training time, the need for multiple successful training runs, etc. If we decide to allow multiple training runs, we would also like to know what feedback you would find useful (e.g. your algorithm's predictions on the public training set). We're happy to hear your thoughts.

Kind regards, Luuk

Re: Call for training environment suggestions Final Phase  

  By: mono1ith on Feb. 24, 2022, 10:16 a.m.

I suggest a (virtual) machine with at least 4 GPUs, each with at least 24 GB of GPU memory. At least 32 GB of CPU memory may be required per GPU, so the total CPU memory would be 128 GB. On such a machine, it may take less than 10 hours to train a 3D ResNet-34 for 100 epochs with an input size of 256x224x224 and 2K data samples.

Cross-validation is a good way to generate multiple models that can be ensembled to boost accuracy. The number of folds is an arbitrary choice that depends on the computational resources, the amount of data, and personal preference. For the 10K samples provided in the final round, I guess 3-5 folds may be sufficient. With the machine described above, training may take 10 hours x 5 (five times as many samples as 2K) x 0.5 (half as many epochs) x 4 folds = 100 hours, or about 4.17 days. So I guess a week may be sufficient.
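The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes training time scales linearly with the number of samples and epochs and that the folds are trained one after another. The function and variable names are made up for illustration.

```python
# Rough training-time estimate using the figures quoted in the post above.
# Assumption: time scales linearly with sample count and epoch count,
# and cross-validation folds run sequentially on the same machine.

BASE_HOURS = 10.0             # ~100 epochs on 2K samples on the 4-GPU machine
DATA_SCALE = 10_000 / 2_000   # final round has 5x as many samples
EPOCH_SCALE = 0.5             # train for half as many epochs
N_FOLDS = 4                   # number of cross-validation folds


def total_training_hours(base_hours, data_scale, epoch_scale, n_folds):
    """Return the estimated wall-clock hours for all folds combined."""
    return base_hours * data_scale * epoch_scale * n_folds


hours = total_training_hours(BASE_HOURS, DATA_SCALE, EPOCH_SCALE, N_FOLDS)
days = hours / 24
print(f"{hours:.0f} hours, or about {days:.2f} days")
```

Changing `N_FOLDS` to 3 or 5 shows how the total moves across the 3-5 fold range mentioned above.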

Hope it helps.

Re: Call for training environment suggestions Final Phase  

  By: simon.j on Feb. 24, 2022, 10:53 a.m.

Hello,

For my part, I am in favor of allowing multiple training runs (e.g. 3 to 5) and of getting access to a text file containing information about these runs.

Re: Call for training environment suggestions Final Phase  

  By: miriamelia on March 3, 2022, 3:11 p.m.

Hi Luuk,

one question: what's the deadline for input?

Thanks, Miriam

Re: Call for training environment suggestions Final Phase  

  By: lorenjul on March 4, 2022, 3:54 p.m.

Hello,

for the training environment, we would be happy to get access to two A100 GPUs with at least 40 GB of memory each, 16 CPU cores, and a total of 96 GB of CPU memory. Our training should run in about half a week.

Thanks!

Re: Call for training environment suggestions Final Phase  

  By: LuukBoulogne on March 7, 2022, 8:56 a.m.

Thank you all for your suggestions. We are now considering inviting the five best-performing teams to train on the full training set with two A100 GPUs (40 GB each), 128 GB of CPU memory, 16 CPUs, and 120 hours of training time.

We are discussing whether to allow multiple training runs and to send log files to the participants as feedback, or to provide the results on the public training set. So far, we have not come up with a way to do so that guarantees the security of the private training data.

We will post the final training environment specifics at the start of next week. We aim to incorporate any additional feedback that gets posted this week. Because this is very close to the deadline for the Qualification phase, the Qualification phase deadline will be extended by at least two weeks.

Best, Luuk

Re: Call for training environment suggestions Final Phase  

  By: simon.j on March 7, 2022, 5:19 p.m.

Hello,

Thanks for the update. I don't understand why the Qualification phase needs to be extended, given the details you provided. There are still two weeks to submit models and, after that, one to four weeks to prepare the scripts for the final round.

Regarding the multiple training runs, one possibility is to specify a maximum number of runs together with a total compute budget (e.g. 4 runs in less than 240 GPU-hours).
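As a rough illustration of how such a budget could be checked, here is a minimal sketch. It assumes GPU-hours are counted as number of GPUs times wall-clock hours, and it treats the budget as an inclusive upper bound. The function names and the example runs are hypothetical, not part of any challenge tooling.

```python
# Minimal sketch of a "number of runs + total compute budget" check.
# Assumption: GPU-hours = number of GPUs x wall-clock hours per run.
# All names and example values here are hypothetical.


def gpu_hours(n_gpus, wall_hours):
    """GPU-hours consumed by a single training run."""
    return n_gpus * wall_hours


def fits_budget(runs, max_runs, max_gpu_hours):
    """Return (ok, used) for a list of (n_gpus, wall_hours) runs."""
    used = sum(gpu_hours(n, h) for n, h in runs)
    ok = len(runs) <= max_runs and used <= max_gpu_hours
    return ok, used


# Example: four runs of 30 hours each on two GPUs.
example_runs = [(2, 30.0)] * 4
ok, used = fits_budget(example_runs, max_runs=4, max_gpu_hours=240.0)
print(ok, used)
```

For reference, a single run on two GPUs for 120 hours would use the same 240 GPU-hours.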

If only 5 teams qualify, maybe you can manually check that no sensitive data is leaked in the logs before releasing them? Otherwise, providing the results on the public training set would be great.

Best, Simon

Re: Call for training environment suggestions Final Phase  

  By: 王呆鹅 on March 8, 2022, 3:03 a.m.

Hello,

I am confused about the 120 hours of training time. Is it the maximum time for each experiment? Or is it a total limit, meaning we can use at most 120 hours for training between April 1st and April 21st?

Re: Call for training environment suggestions Final Phase  

  By: Sidraaleem on March 12, 2022, 5:21 p.m.

@LuukBoulogne, the challenge is very interesting, and it would be great if the deadline were extended.

Re: Call for training environment suggestions Final Phase  

  By: LuukBoulogne on March 18, 2022, 10:20 a.m.

Hi everyone,

Thank you all for your input. After some internal discussion, we posted announcements yesterday covering the rules, the training environment, and a deadline extension. If you have any further comments or questions, do not hesitate to post them.

Best, Luuk