The algorithm failed on one or more cases.

  By: kabbas5707efb29ff33724134 on Aug. 14, 2023, 7:10 p.m.

Hello, could you please share the error for my phase-2 submission? I am using the same image that I ran for phase-1, so I am not sure why it generates an error this time in phase-2.

Thanks Cheers Abbas

Re: The algorithm failed on one or more cases.  

  By: giansteve on Aug. 14, 2023, 7:31 p.m.

Hi, I could see the following message in the Stderr log: "2023-08-14T18:51:42.973000+00:00 Bus error (core dumped)". The overall reason for failing is: "Time limit exceeded"

I hope this helps

Best Gian Marco

Re: The algorithm failed on one or more cases.  

  By: kabbas5707efb29ff33724134 on Aug. 14, 2023, 7:35 p.m.

I see, thanks for your reply. Could you please tell me the dimensions of all the images? I am not sure which image it failed on. If the time limit is 50 minutes for five images, did it stop before that? It stopped after maybe 20 minutes.

Cheers Abbas

Re: The algorithm failed on one or more cases.  

  By: giansteve on Aug. 14, 2023, 7:38 p.m.

It seems it stopped at the last picture. The limit is still the same as in phase 1, i.e. 10 min per case.

Re: The algorithm failed on one or more cases.  

  By: kabbas5707efb29ff33724134 on Aug. 14, 2023, 7:41 p.m.

Yes, but it did not run for 50 minutes. It died before that.

Re: The algorithm failed on one or more cases.  

  By: giansteve on Aug. 14, 2023, 7:45 p.m.

I am pasting all the information I have here; I hope this answers your question :)

Stdout:

2023-08-14T18:49:38.942000+00:00 nnUNet_raw is not defined and nnU-Net can only be used on data for which preprocessed files are already present on your system. nnU-Net cannot be used for experiment planning and preprocessing like this. If this is not intended, please read documentation/setting_up_paths.md for information on how to set this up properly.
2023-08-14T18:49:38.942000+00:00 nnUNet_preprocessed is not defined and nnU-Net can not be used for preprocessing or training. If this is not intended, please read documentation/setting_up_paths.md for information on how to set this up.
2023-08-14T18:49:38.942000+00:00 nnUNet_results is not defined and nnU-Net cannot be used for training or inference. If this is not intended behavior, please read documentation/setting_up_paths.md for information on how to set this up.
2023-08-14T18:49:38.942000+00:00 input FILES  ['c61c27ee-fd41-4c28-8b87-33cff36059bc.mha']
2023-08-14T18:49:38.942000+00:00 hash                                 -1774304799504854648
2023-08-14T18:49:38.942000+00:00 path    /input/images/ct/c61c27ee-fd41-4c28-8b87-33cff...
2023-08-14T18:49:38.942000+00:00 Name: 0, dtype: object
2023-08-14T18:49:38.942000+00:00 0
2023-08-14T18:49:38.942000+00:00 Image Name  c61c27ee-fd41-4c28-8b87-33cff36059bc.mha
2023-08-14T18:49:38.942000+00:00 Working on device:  cuda
2023-08-14T18:49:38.942000+00:00 Model_1 input files ['test_1_0000.nrrd']
2023-08-14T18:49:38.942000+00:00 There are 1 cases in the source folder
2023-08-14T18:49:38.942000+00:00 I am process 0 out of 1 (max process ID is 0, we start counting with 0!)
2023-08-14T18:49:38.942000+00:00 There are 1 cases that I would like to predict
2023-08-14T18:50:02.948000+00:00 nnUNet_raw is not defined and nnU-Net can only be used on data for which preprocessed files are already present on your system. nnU-Net cannot be used for experiment planning and preprocessing like this. If this is not intended, please read documentation/setting_up_paths.md for information on how to set this up properly.
2023-08-14T18:50:02.948000+00:00 nnUNet_preprocessed is not defined and nnU-Net can not be used for preprocessing or training. If this is not intended, please read documentation/setting_up_paths.md for information on how to set this up.
2023-08-14T18:50:05.948000+00:00 
2023-08-14T18:50:05.948000+00:00 Predicting test_1:
2023-08-14T18:50:05.948000+00:00 perform_everything_on_gpu: True
2023-08-14T18:50:05.948000+00:00 Input shape: torch.Size([1, 180, 234, 234])
2023-08-14T18:50:05.948000+00:00 step_size: 0.5
2023-08-14T18:50:05.948000+00:00 mirror_axes: (0, 1, 2)
2023-08-14T18:50:05.948000+00:00 n_steps 18, image size is torch.Size([180, 234, 234]), tile_size [128, 128, 128], tile_step_size 0.5
2023-08-14T18:50:05.948000+00:00 steps:
2023-08-14T18:50:05.948000+00:00 [[0, 52], [0, 53, 106], [0, 53, 106]]
2023-08-14T18:50:05.948000+00:00 preallocating arrays
2023-08-14T18:50:05.948000+00:00 running prediction
2023-08-14T18:50:21.953000+00:00 Input shape: torch.Size([1, 180, 234, 234])
2023-08-14T18:50:21.953000+00:00 step_size: 0.5
2023-08-14T18:50:21.953000+00:00 mirror_axes: (0, 1, 2)
2023-08-14T18:50:21.953000+00:00 n_steps 18, image size is torch.Size([180, 234, 234]), tile_size [128, 128, 128], tile_step_size 0.5
2023-08-14T18:50:21.953000+00:00 steps:
2023-08-14T18:50:21.953000+00:00 [[0, 52], [0, 53, 106], [0, 53, 106]]
2023-08-14T18:50:21.953000+00:00 preallocating arrays
2023-08-14T18:50:21.953000+00:00 running prediction
2023-08-14T18:50:35.956000+00:00 Input shape: torch.Size([1, 180, 234, 234])
2023-08-14T18:50:35.957000+00:00 step_size: 0.5
2023-08-14T18:50:35.957000+00:00 mirror_axes: (0, 1, 2)
2023-08-14T18:50:35.957000+00:00 n_steps 18, image size is torch.Size([180, 234, 234]), tile_size [128, 128, 128], tile_step_size 0.5
2023-08-14T18:50:35.957000+00:00 steps:
2023-08-14T18:50:35.957000+00:00 [[0, 52], [0, 53, 106], [0, 53, 106]]
2023-08-14T18:50:35.957000+00:00 preallocating arrays
2023-08-14T18:50:35.957000+00:00 running prediction
2023-08-14T18:50:49.960000+00:00 Input shape: torch.Size([1, 180, 234, 234])
2023-08-14T18:50:49.960000+00:00 step_size: 0.5
2023-08-14T18:50:49.960000+00:00 mirror_axes: (0, 1, 2)
2023-08-14T18:50:49.960000+00:00 n_steps 18, image size is torch.Size([180, 234, 234]), tile_size [128, 128, 128], tile_step_size 0.5
2023-08-14T18:50:49.960000+00:00 steps:
2023-08-14T18:50:49.960000+00:00 [[0, 52], [0, 53, 106], [0, 53, 106]]
2023-08-14T18:50:49.960000+00:00 preallocating arrays
2023-08-14T18:50:49.960000+00:00 running prediction
2023-08-14T18:51:04.964000+00:00 Input shape: torch.Size([1, 180, 234, 234])
2023-08-14T18:51:04.964000+00:00 step_size: 0.5
2023-08-14T18:51:04.964000+00:00 mirror_axes: (0, 1, 2)
2023-08-14T18:51:04.964000+00:00 n_steps 18, image size is torch.Size([180, 234, 234]), tile_size [128, 128, 128], tile_step_size 0.5
2023-08-14T18:51:04.964000+00:00 steps:
2023-08-14T18:51:04.964000+00:00 [[0, 52], [0, 53, 106], [0, 53, 106]]
2023-08-14T18:51:04.964000+00:00 preallocating arrays
2023-08-14T18:51:04.964000+00:00 running prediction

Stderr:

2023-08-14T18:50:20.952000+00:00 
2023-08-14T18:50:20.952000+00:00   0%|          | 0/18 [00:00<?, ?it/s]
2023-08-14T18:50:20.952000+00:00   6%|         | 1/18 [00:01<00:33,  2.00s/it]
2023-08-14T18:50:20.952000+00:00  11%|         | 2/18 [00:02<00:20,  1.29s/it]
2023-08-14T18:50:20.952000+00:00  17%|█▋        | 3/18 [00:03<00:15,  1.06s/it]
2023-08-14T18:50:20.952000+00:00  22%|██▏       | 4/18 [00:04<00:13,  1.06it/s]
2023-08-14T18:50:20.952000+00:00  28%|██▊       | 5/18 [00:05<00:11,  1.12it/s]
2023-08-14T18:50:20.952000+00:00  33%|███▎      | 6/18 [00:05<00:10,  1.17it/s]
2023-08-14T18:50:20.952000+00:00  39%|███▉      | 7/18 [00:06<00:09,  1.20it/s]
2023-08-14T18:50:20.952000+00:00  44%|████▍     | 8/18 [00:07<00:08,  1.22it/s]
2023-08-14T18:50:20.952000+00:00  50%|█████     | 9/18 [00:08<00:07,  1.24it/s]
2023-08-14T18:50:20.952000+00:00  56%|█████▌    | 10/18 [00:09<00:06,  1.25it/s]
2023-08-14T18:50:20.952000+00:00  61%|██████    | 11/18 [00:09<00:05,  1.26it/s]
2023-08-14T18:50:20.952000+00:00  67%|██████▋   | 12/18 [00:10<00:04,  1.26it/s]
2023-08-14T18:50:20.952000+00:00  72%|███████▏  | 13/18 [00:11<00:03,  1.27it/s]
2023-08-14T18:50:20.952000+00:00  78%|███████▊  | 14/18 [00:12<00:03,  1.27it/s]
2023-08-14T18:50:20.952000+00:00  83%|████████▎ | 15/18 [00:12<00:02,  1.27it/s]
2023-08-14T18:50:20.952000+00:00  89%|████████▉ | 16/18 [00:13<00:01,  1.27it/s]
2023-08-14T18:50:20.952000+00:00  94%|█████████▍| 17/18 [00:14<00:00,  1.27it/s]
2023-08-14T18:50:20.952000+00:00 100%|██████████| 18/18 [00:15<00:00,  1.27it/s]
2023-08-14T18:50:20.952000+00:00 100%|██████████| 18/18 [00:15<00:00,  1.17it/s]
2023-08-14T18:50:34.956000+00:00 
2023-08-14T18:50:34.956000+00:00   0%|          | 0/18 [00:00<?, ?it/s]
2023-08-14T18:50:34.956000+00:00   6%|         | 1/18 [00:00<00:07,  2.42it/s]
2023-08-14T18:50:34.956000+00:00  11%|         | 2/18 [00:01<00:10,  1.58it/s]
2023-08-14T18:50:34.956000+00:00  17%|█▋        | 3/18 [00:01<00:10,  1.42it/s]
2023-08-14T18:50:34.956000+00:00  22%|██▏       | 4/18 [00:02<00:10,  1.36it/s]
2023-08-14T18:50:34.956000+00:00  28%|██▊       | 5/18 [00:03<00:09,  1.32it/s]
2023-08-14T18:50:34.956000+00:00  33%|███▎      | 6/18 [00:04<00:09,  1.31it/s]
2023-08-14T18:50:34.956000+00:00  39%|███▉      | 7/18 [00:05<00:08,  1.29it/s]
2023-08-14T18:50:34.956000+00:00  44%|████▍     | 8/18 [00:05<00:07,  1.29it/s]
2023-08-14T18:50:34.956000+00:00  50%|█████     | 9/18 [00:06<00:07,  1.28it/s]
2023-08-14T18:50:34.956000+00:00  56%|█████▌    | 10/18 [00:07<00:06,  1.28it/s]
2023-08-14T18:50:34.956000+00:00  61%|██████    | 11/18 [00:08<00:05,  1.27it/s]
2023-08-14T18:50:34.956000+00:00  67%|██████▋   | 12/18 [00:09<00:04,  1.27it/s]
2023-08-14T18:50:34.956000+00:00  72%|███████▏  | 13/18 [00:09<00:03,  1.27it/s]
2023-08-14T18:50:34.956000+00:00  78%|███████▊  | 14/18 [00:10<00:03,  1.27it/s]
2023-08-14T18:50:34.956000+00:00  83%|████████▎ | 15/18 [00:11<00:02,  1.27it/s]
2023-08-14T18:50:34.956000+00:00  89%|████████▉ | 16/18 [00:12<00:01,  1.27it/s]
2023-08-14T18:50:34.956000+00:00  94%|█████████▍| 17/18 [00:13<00:00,  1.27it/s]
2023-08-14T18:50:34.956000+00:00 100%|██████████| 18/18 [00:13<00:00,  1.27it/s]
2023-08-14T18:50:34.956000+00:00 100%|██████████| 18/18 [00:13<00:00,  1.30it/s]
2023-08-14T18:50:49.960000+00:00 
2023-08-14T18:50:49.960000+00:00   0%|          | 0/18 [00:00<?, ?it/s]
2023-08-14T18:50:49.960000+00:00   6%|         | 1/18 [00:00<00:07,  2.38it/s]
2023-08-14T18:50:49.960000+00:00  11%|         | 2/18 [00:01<00:10,  1.57it/s]
2023-08-14T18:50:49.960000+00:00  17%|█▋        | 3/18 [00:02<00:10,  1.41it/s]
2023-08-14T18:50:49.960000+00:00  22%|██▏       | 4/18 [00:02<00:10,  1.35it/s]
2023-08-14T18:50:49.960000+00:00  28%|██▊       | 5/18 [00:03<00:09,  1.32it/s]
2023-08-14T18:50:49.960000+00:00  33%|███▎      | 6/18 [00:04<00:09,  1.30it/s]
2023-08-14T18:50:49.960000+00:00  39%|███▉      | 7/18 [00:05<00:08,  1.29it/s]
2023-08-14T18:50:49.960000+00:00  44%|████▍     | 8/18 [00:05<00:07,  1.28it/s]
2023-08-14T18:50:49.960000+00:00  50%|█████     | 9/18 [00:06<00:07,  1.28it/s]
2023-08-14T18:50:49.960000+00:00  56%|█████▌    | 10/18 [00:07<00:06,  1.27it/s]
2023-08-14T18:50:49.960000+00:00  61%|██████    | 11/18 [00:08<00:05,  1.27it/s]
2023-08-14T18:50:49.960000+00:00  67%|██████▋   | 12/18 [00:09<00:04,  1.27it/s]
2023-08-14T18:50:49.960000+00:00  72%|███████▏  | 13/18 [00:09<00:03,  1.27it/s]
2023-08-14T18:50:49.960000+00:00  78%|███████▊  | 14/18 [00:10<00:03,  1.26it/s]
2023-08-14T18:50:49.960000+00:00  83%|████████▎ | 15/18 [00:11<00:02,  1.26it/s]
2023-08-14T18:50:49.960000+00:00  89%|████████▉ | 16/18 [00:12<00:01,  1.26it/s]
2023-08-14T18:50:49.960000+00:00  94%|█████████▍| 17/18 [00:13<00:00,  1.26it/s]
2023-08-14T18:50:49.960000+00:00 100%|██████████| 18/18 [00:13<00:00,  1.26it/s]
2023-08-14T18:50:49.960000+00:00 100%|██████████| 18/18 [00:13<00:00,  1.30it/s]
2023-08-14T18:51:03.963000+00:00 
2023-08-14T18:51:03.964000+00:00   0%|          | 0/18 [00:00<?, ?it/s]
2023-08-14T18:51:03.964000+00:00   6%|         | 1/18 [00:00<00:07,  2.40it/s]
2023-08-14T18:51:03.964000+00:00  11%|         | 2/18 [00:01<00:10,  1.56it/s]
2023-08-14T18:51:03.964000+00:00  17%|█▋        | 3/18 [00:02<00:10,  1.41it/s]
2023-08-14T18:51:03.964000+00:00  22%|██▏       | 4/18 [00:02<00:10,  1.35it/s]
2023-08-14T18:51:03.964000+00:00  28%|██▊       | 5/18 [00:03<00:09,  1.31it/s]
2023-08-14T18:51:03.964000+00:00  33%|███▎      | 6/18 [00:04<00:09,  1.29it/s]
2023-08-14T18:51:03.964000+00:00  39%|███▉      | 7/18 [00:05<00:08,  1.28it/s]
2023-08-14T18:51:03.964000+00:00  44%|████▍     | 8/18 [00:05<00:07,  1.27it/s]
2023-08-14T18:51:03.964000+00:00  50%|█████     | 9/18 [00:06<00:07,  1.27it/s]
2023-08-14T18:51:03.964000+00:00  56%|█████▌    | 10/18 [00:07<00:06,  1.27it/s]
2023-08-14T18:51:03.964000+00:00  61%|██████    | 11/18 [00:08<00:05,  1.26it/s]
2023-08-14T18:51:03.964000+00:00  67%|██████▋   | 12/18 [00:09<00:04,  1.26it/s]
2023-08-14T18:51:03.964000+00:00  72%|███████▏  | 13/18 [00:09<00:03,  1.26it/s]
2023-08-14T18:51:03.964000+00:00  78%|███████▊  | 14/18 [00:10<00:03,  1.26it/s]
2023-08-14T18:51:03.964000+00:00  83%|████████▎ | 15/18 [00:11<00:02,  1.26it/s]
2023-08-14T18:51:03.964000+00:00  89%|████████▉ | 16/18 [00:12<00:01,  1.26it/s]
2023-08-14T18:51:03.964000+00:00  94%|█████████▍| 17/18 [00:13<00:00,  1.26it/s]
2023-08-14T18:51:03.964000+00:00 100%|██████████| 18/18 [00:13<00:00,  1.26it/s]
2023-08-14T18:51:03.964000+00:00 100%|██████████| 18/18 [00:13<00:00,  1.29it/s]
2023-08-14T18:51:18.968000+00:00 
2023-08-14T18:51:18.968000+00:00   0%|          | 0/18 [00:00<?, ?it/s]
2023-08-14T18:51:18.968000+00:00   6%|         | 1/18 [00:00<00:07,  2.38it/s]
2023-08-14T18:51:18.968000+00:00  11%|         | 2/18 [00:01<00:10,  1.56it/s]
2023-08-14T18:51:18.968000+00:00  17%|█▋        | 3/18 [00:02<00:10,  1.41it/s]
2023-08-14T18:51:18.968000+00:00  22%|██▏       | 4/18 [00:02<00:10,  1.34it/s]
2023-08-14T18:51:18.968000+00:00  28%|██▊       | 5/18 [00:03<00:09,  1.31it/s]
2023-08-14T18:51:18.968000+00:00  33%|███▎      | 6/18 [00:04<00:09,  1.29it/s]
2023-08-14T18:51:18.968000+00:00  39%|███▉      | 7/18 [00:05<00:08,  1.28it/s]
2023-08-14T18:51:18.968000+00:00  44%|████▍     | 8/18 [00:05<00:07,  1.27it/s]
2023-08-14T18:51:18.968000+00:00  50%|█████     | 9/18 [00:06<00:07,  1.27it/s]
2023-08-14T18:51:18.968000+00:00  56%|█████▌    | 10/18 [00:07<00:06,  1.26it/s]
2023-08-14T18:51:18.968000+00:00  61%|██████    | 11/18 [00:08<00:05,  1.26it/s]
2023-08-14T18:51:18.968000+00:00  67%|██████▋   | 12/18 [00:09<00:04,  1.26it/s]
2023-08-14T18:51:18.968000+00:00  72%|███████▏  | 13/18 [00:09<00:03,  1.26it/s]
2023-08-14T18:51:18.968000+00:00  78%|███████▊  | 14/18 [00:10<00:03,  1.26it/s]
2023-08-14T18:51:18.968000+00:00  83%|████████▎ | 15/18 [00:11<00:02,  1.25it/s]
2023-08-14T18:51:18.968000+00:00  89%|████████▉ | 16/18 [00:12<00:01,  1.25it/s]
2023-08-14T18:51:18.968000+00:00  94%|█████████▍| 17/18 [00:13<00:00,  1.25it/s]
2023-08-14T18:51:18.968000+00:00 100%|██████████| 18/18 [00:13<00:00,  1.25it/s]
2023-08-14T18:51:18.968000+00:00 100%|██████████| 18/18 [00:13<00:00,  1.29it/s]
2023-08-14T18:51:42.973000+00:00 Bus error (core dumped)

Best Gian
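
For readers following the log above: the tile positions and the count of 18 tiles per pass can be reproduced with a short sketch. It assumes evenly spaced sliding-window tiles whose spacing is at most `tile_size * tile_step_size`. The function name `sliding_window_steps` is hypothetical and is not taken from the submitted container; it is only meant to show where the logged numbers come from.

```python
import math
from typing import List, Sequence


def sliding_window_steps(
    image_size: Sequence[int],
    tile_size: Sequence[int],
    tile_step_size: float,
) -> List[List[int]]:
    """Return the tile start offsets per axis (hypothetical helper).

    Assumption: tiles are spread evenly so that the first tile starts at 0,
    the last tile ends at the image border, and neighbouring tiles are at
    most tile_size * tile_step_size voxels apart.
    """
    steps = []
    for size, tile in zip(image_size, tile_size):
        target_step = tile * tile_step_size          # e.g. 128 * 0.5 = 64
        n = math.ceil((size - tile) / target_step) + 1
        max_start = size - tile                      # last valid start offset
        actual_step = max_start / (n - 1) if n > 1 else 0.0
        steps.append([int(round(actual_step * i)) for i in range(n)])
    return steps


# Values taken from the Stdout log in this thread.
steps = sliding_window_steps([180, 234, 234], [128, 128, 128], 0.5)
n_tiles = math.prod(len(axis_steps) for axis_steps in steps)
print(steps)    # tile start offsets per axis
print(n_tiles)  # number of tiles per prediction pass
```

With the logged input this gives two start offsets along the first axis and three along each of the other two, i.e. 2 x 3 x 3 = 18 tiles, which matches `n_steps 18` and the `18/18` progress bars. Each pass takes roughly 14 seconds in the log, so the five passes shown account for only a small part of a 10-minute budget.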

Re: The algorithm failed on one or more cases.  

  By: kabbas5707efb29ff33724134 on Aug. 14, 2023, 7:49 p.m.

Thanks for the detailed error. Do we need to change any path from phase-1? My network finds only 2 cases for prediction, not 5.

Cheers Abbas

Re: The algorithm failed on one or more cases.  

  By: apepe on Aug. 14, 2023, 8:03 p.m.

Dear Abbas,

There's no change required from phase 1, unless you have a specific assertion on the number of cases.

The time limit is per case, and the executions are not strictly parallel. GC can also execute one segmentation job at a time, so the time limit can be triggered earlier than the combined budget for all cases (if one case exceeds the limit or fails).

Antonio
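
Since the limit applies to each case separately, it can help to time every case locally before submitting. Below is a minimal sketch under stated assumptions: `predict_case` is a hypothetical stand-in for whatever function runs the full pipeline on one image (loading, inference, post-processing, writing the output), and the 10-minute value is the per-case limit quoted in this thread. Local timings will differ from the platform's hardware, so treat the result as a rough indication only.

```python
import time
from typing import Callable, Dict, Iterable, List

# Per-case limit quoted in this thread (10 minutes), in seconds.
TIME_LIMIT_S = 10 * 60


def time_cases(
    case_ids: Iterable[str],
    predict_case: Callable[[str], None],
    clock: Callable[[], float] = time.perf_counter,
) -> Dict[str, float]:
    """Run predict_case (hypothetical) on each case and record wall time."""
    durations = {}
    for case_id in case_ids:
        start = clock()
        predict_case(case_id)  # full pipeline for one image
        durations[case_id] = clock() - start
    return durations


def over_limit(durations: Dict[str, float], limit_s: float = TIME_LIMIT_S) -> List[str]:
    """Return the cases whose wall time exceeded the per-case limit."""
    return [case_id for case_id, d in durations.items() if d > limit_s]
```

A single slow case is enough to fail the whole submission, so the list returned by `over_limit` matters more than the average duration.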

Re: The algorithm failed on one or more cases.  

  By: kabbas5707efb29ff33724134 on Aug. 14, 2023, 8:06 p.m.

I see, thanks for your reply. We expect the same type of images in phase-2 as in phase-1, so I am not sure what to do next, since my algorithm is unchanged.

Re: The algorithm failed on one or more cases.  

  By: apepe on Aug. 14, 2023, 9:35 p.m.

Your algorithm takes about 10 minutes for each case and exceeds 10 minutes on case 3. This causes the failure.

Perhaps you still have a way to reduce the complexity.

Best Antonio

Re: The algorithm failed on one or more cases.  

  By: kabbas5707efb29ff33724134 on Aug. 14, 2023, 10:37 p.m.

Thanks for the details. Could you please share the error from my latest submission? I have tried to remove some of the more complex operations during inference.

Thanks Cheers ABBAS

Re: The algorithm failed on one or more cases.  

  By: apepe on Aug. 14, 2023, 10:45 p.m.

Still the same issue. You need about 5 minutes (not 10) per case, with the exception of case 3, where you need more than 10 minutes. Therefore the job is killed by the platform.

Best Antonio

Re: The algorithm failed on one or more cases.  

  By: kabbas5707efb29ff33724134 on Aug. 15, 2023, 12:14 a.m.

Thanks for the update. I just made another submission; it ran longer than the previous ones but died again. Is it the same case?

Cheers Abbas

Re: The algorithm failed on one or more cases.  

  By: apepe on Aug. 15, 2023, 7:30 a.m.

Yes. Your algorithm seems to get stuck in a loop on that case.

Re: The algorithm failed on one or more cases.  

  By: kabbas5707efb29ff33724134 on Aug. 15, 2023, 10:37 a.m.

Hello, could you please send me the log files of my last two submissions? Thanks Cheers Abbas

 Last edited by: kabbas5707efb29ff33724134 on Aug. 15, 2023, 12:59 p.m., edited 1 time in total.