Data Preparation and Training Resources
By: capparella.1746513 on July 12, 2022, 6:25 p.m.
Hello!
This is the very first "Medical Challenge" I have joined. I am doing it for research and as part of my Master's thesis, so I am working solo and with limited resources. For these reasons I have some doubts about data preparation and training resources:
1) I saw on the MIC-DKFZ nnUNet GitHub page that nnUNet requires both a folder structure and a data format close to the ones used for the Medical Segmentation Decathlon (MSD). I do see some similarities with yours (as shown in picai_baseline and picai_prep), but I also saw some parts that did not match:
From the MIC-DKFZ instructions on dataset conversion:
"...Imaging modalities are identified by nnU-Net by their suffix: a four-digit integer at the end of the filename. Imaging files must therefore follow the following naming convention: case_identifier_XXXX.nii.gz. Hereby, XXXX is the modality identifier..." and they refer explicitly to the MSD 'BrainTumor' task modalities, i.e.: FLAIR (0000), T1w (0001), T1gd (0002) and T2w (0003). These codes (in particular 0000, 0001 and 0002) match the suffixes of the files created in 'imagesTr' by prepare_data.py, but the modalities they stand for in the BrainTumor task are not the modalities used in this challenge. So: did I configure 'prepare_data' incorrectly? Am I missing something? Or can these codes have an arbitrary meaning that depends on the challenge?
2) I have very limited resources (GTX 1050, 8 GB, mobile version) and I have not been guaranteed any others, so I was wondering whether continuing with this challenge is feasible for me: do you have any benchmarks, expected minimum requirements, and so on?
3) When I ran 'prepare_data.py', it took nearly 2 hours to complete the archive generation. Is that normal? Is it due to my resources or to some bad setting?
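To make question 1 concrete, the naming convention quoted above can be sketched in Python as follows. This is only a minimal illustration of how the quoted rule reads (case_identifier_XXXX.nii.gz, with XXXX a four-digit suffix). The names `split_nnunet_name` and `BRAIN_TUMOR_MODALITIES` are hypothetical and are not part of nnUNet, picai_prep or picai_baseline, and the lookup table only reproduces the MSD 'BrainTumor' example from the quote, not the modalities of this challenge.

```python
import re

# Regex for the quoted convention: <case_identifier>_<XXXX>.nii.gz,
# where XXXX is a four-digit integer (my reading of the quoted rule).
NNUNET_NAME = re.compile(r"^(?P<case>.+)_(?P<suffix>\d{4})\.nii\.gz$")

# Hypothetical lookup table, copied from the MSD 'BrainTumor' example
# in the quote. It says nothing about the modalities of this challenge.
BRAIN_TUMOR_MODALITIES = {0: "FLAIR", 1: "T1w", 2: "T1gd", 3: "T2w"}


def split_nnunet_name(filename):
    """Split a filename into (case_identifier, integer modality suffix)."""
    match = NNUNET_NAME.match(filename)
    if match is None:
        raise ValueError(f"not in case_identifier_XXXX.nii.gz form: {filename!r}")
    return match.group("case"), int(match.group("suffix"))


if __name__ == "__main__":
    # Made-up filenames, used only to exercise the parser.
    for name in ["BRATS_001_0000.nii.gz", "BRATS_001_0003.nii.gz"]:
        case, suffix = split_nnunet_name(name)
        label = BRAIN_TUMOR_MODALITIES.get(suffix, "unknown")
        print(case, suffix, label)
```

In this sketch the parser only recovers the integer suffix; what that integer means is looked up in a separate table, which is exactly the point I am unsure about for this challenge.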
Thanks in advance for your help; I hope to hear from you soon!
Mattia