The algorithm failed on one or more cases

  By: fory on July 31, 2023, 10:49 a.m.

Dear organizers,

My algorithm passed the offline test with the provided "test.sh" in uNet_baseline and the online test "try out algorithm" in 4 minutes. However, it failed after I submitted it to the preliminary test phase, and I cannot find any error information about it. Could you tell me how to solve this problem?

Thanks!

 Last edited by: fory on Aug. 15, 2023, 12:59 p.m., edited 1 time in total.

Re: The algorithm failed on one or more cases  

  By: Marcel.Frueh on Aug. 4, 2023, 4:05 p.m.

Hi!

The error is: RuntimeError: Input type (torch.cuda.DoubleTensor) and weight type (torch.cuda.FloatTensor) should be the same
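
This kind of dtype mismatch usually means the input volume reached the network as float64 (for example after loading through NumPy) while the model weights are float32. A minimal sketch of one possible fix follows; the helper name `to_model_dtype` and the tiny `Conv3d` stand-in network are illustrative assumptions, not part of the baseline code.

```python
import torch
import torch.nn as nn


def to_model_dtype(x: torch.Tensor, model: nn.Module) -> torch.Tensor:
    """Cast and move `x` so it matches the model's first parameter.

    Hypothetical helper for illustration only.
    """
    reference = next(model.parameters())
    return x.to(device=reference.device, dtype=reference.dtype)


# Stand-in network (assumption); PyTorch creates float32 weights by default.
model = nn.Conv3d(in_channels=1, out_channels=2, kernel_size=3, padding=1)

# A float64 input, as you would get from torch.from_numpy on a float64 array.
volume = torch.zeros((1, 1, 8, 8, 8), dtype=torch.float64)

with torch.no_grad():
    # Casting first avoids the "Input type ... and weight type ..." error.
    output = model(to_model_dtype(volume, model))

print(output.dtype, tuple(output.shape))
```

Calling `volume.float()` before the forward pass has the same effect when the model is known to be float32.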

Re: The algorithm failed on one or more cases  

  By: anissa218 on Sept. 3, 2023, 7:40 a.m.

Hi, my algorithm also failed after passing test.sh. Do you have the error message?

Thanks,

Anissa

Re: The algorithm failed on one or more cases  

  By: Sepideh on Sept. 5, 2023, 12:54 p.m.

Dear organizers, I got the same error for my last submission to the preliminary test set. Could you please send me the error message? Thanks a lot.

Re: The algorithm failed on one or more cases  

  By: Marcel.Frueh on Sept. 6, 2023, 9:32 a.m.

Hi everyone:

@Sepideh: torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 1.95 GiB (GPU 0; 14.76 GiB total capacity; 8.99 GiB already allocated; 1.90 GiB free; 11.95 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

@anissa: "Error Message: Algorithm container image would not start", which is likely due to some error within your Docker container.

Best Marcel
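
The out-of-memory message above itself suggests setting max_split_size_mb through PYTORCH_CUDA_ALLOC_CONF. A minimal sketch of doing that from Python follows; the helper name `configure_cuda_allocator` and the value of 128 MB are assumptions for illustration, and the setting only helps with fragmentation, not when the model genuinely needs more memory than the GPU has.

```python
import os


def configure_cuda_allocator(max_split_size_mb=128):
    """Set PYTORCH_CUDA_ALLOC_CONF unless it is already defined.

    Hypothetical helper; 128 is an arbitrary starting value, not a
    recommendation from the organizers.
    """
    os.environ.setdefault(
        "PYTORCH_CUDA_ALLOC_CONF", f"max_split_size_mb:{max_split_size_mb}"
    )
    return os.environ["PYTORCH_CUDA_ALLOC_CONF"]


# To be safe, call this at the very top of the entry script,
# before torch is imported and before any CUDA memory is allocated.
active_setting = configure_cuda_allocator()
print(active_setting)
```

If the error persists, reducing the patch or batch size and running inference under `torch.no_grad()` are the usual next things to try.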

Re: The algorithm failed on one or more cases  

  By: mattttteo on Sept. 9, 2023, 9:04 p.m.

Hey guys,

I got the same error and would be happy if someone could check it. Have I run into the time limit? Right now I convert my NIfTI files to MHA only at the very end; maybe I'll have to change that. The text says one prediction every 10 minutes, but not how exactly this is determined. Also, just to be sure: right now there is a time limit of 1 hour on all test submissions, right?

Thanks for the help, Matthias

 Last edited by: mattttteo on Sept. 9, 2023, 9:05 p.m., edited 1 time in total.

Re: The algorithm failed on one or more cases  

  By: Marcel.Frueh on Sept. 10, 2023, 11:04 a.m.

Hi Matthias,

2023-09-09T20:23:56.303000+00:00 Traceback (most recent call last):
2023-09-09T20:23:56.303000+00:00   File "/opt/app/sw_interactive_segmentation/src/test.py", line 176, in <module>
2023-09-09T20:23:56.303000+00:00     main()
2023-09-09T20:23:56.303000+00:00   File "/opt/app/sw_interactive_segmentation/src/test.py", line 172, in main
2023-09-09T20:23:56.303000+00:00     run(args)
2023-09-09T20:23:56.303000+00:00   File "/opt/app/sw_interactive_segmentation/src/test.py", line 82, in run
2023-09-09T20:23:56.303000+00:00     evaluator = get_test_evaluator(
2023-09-09T20:23:56.303000+00:00   File "/opt/app/sw_interactive_segmentation/src/sw_interactive_segmentation/api.py", line 365, in get_test_evaluator
2023-09-09T20:23:56.303000+00:00     init(args)
2023-09-09T20:23:56.303000+00:00   File "/opt/app/sw_interactive_segmentation/src/sw_interactive_segmentation/utils/helper.py", line 436, in wrapper
2023-09-09T20:23:56.303000+00:00     return f(*args, **kwargs)
2023-09-09T20:23:56.303000+00:00   File "/opt/app/sw_interactive_segmentation/src/sw_interactive_segmentation/api.py", line 804, in init
2023-09-09T20:23:56.303000+00:00     cp.random.seed(seed=args.seed)
2023-09-09T20:23:56.303000+00:00   File "/home/user/.local/lib/python3.10/site-packages/cupy/random/_generator.py", line 1255, in seed
2023-09-09T20:23:56.303000+00:00     get_random_state().seed(seed)
2023-09-09T20:23:56.303000+00:00   File "/home/user/.local/lib/python3.10/site-packages/cupy/random/_generator.py", line 1287, in get_random_state
2023-09-09T20:23:56.303000+00:00     rs = RandomState(seed)
2023-09-09T20:23:56.303000+00:00   File "/home/user/.local/lib/python3.10/site-packages/cupy/random/_generator.py", line 57, in __init__
2023-09-09T20:23:56.303000+00:00     self._generator = curand.createGenerator(method)
2023-09-09T20:23:56.303000+00:00   File "cupy_backends/cuda/libs/curand.pyx", line 95, in cupy_backends.cuda.libs.curand.createGenerator
2023-09-09T20:23:56.303000+00:00   File "cupy_backends/cuda/libs/curand.pyx", line 99, in cupy_backends.cuda.libs.curand.createGenerator
2023-09-09T20:23:56.303000+00:00   File "cupy_backends/cuda/libs/curand.pyx", line 88, in cupy_backends.cuda.libs.curand.check_status
2023-09-09T20:23:56.303000+00:00 cupy_backends.cuda.libs.curand.CURANDError: CURAND_STATUS_INITIALIZATION_FAILED

Re: The algorithm failed on one or more cases  

  By: ceilinghans on Sept. 12, 2023, 11:56 a.m.

Hi, my algorithm failed after I submitted it to the final test set. Do you have the error message?

Thanks, Ceiling

 Last edited by: ceilinghans on Sept. 12, 2023, 12:48 p.m., edited 1 time in total.

Re: The algorithm failed on one or more cases  

  By: Marcel.Frueh on Sept. 13, 2023, 5:35 a.m.

Hi!

Your error is: ValueError: operands could not be broadcast together with shapes (327,400,400) (196,256,256) (These values vary depending on the input data)
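
The two shapes in this message cannot be combined element-wise, which suggests (as an assumption, since the log does not say which array is which) that one array is on the original image grid and the other on a resampled grid. The sketch below spells out NumPy's broadcasting rule in plain Python to show why the operation fails; the function name `broadcastable` is hypothetical.

```python
from itertools import zip_longest


def broadcastable(shape_a, shape_b):
    """Return True if two shapes are compatible under NumPy's broadcasting rule.

    Dimensions are compared from the last axis backwards; each pair must be
    equal, or one of the two must be 1. Missing leading axes count as 1.
    """
    pairs = zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1)
    return all(a == b or a == 1 or b == 1 for a, b in pairs)


# Shapes taken from the error message above.
print(broadcastable((327, 400, 400), (196, 256, 256)))  # incompatible
print(broadcastable((327, 400, 400), (1, 400, 400)))    # compatible
```

A check like this right before the failing operation makes the mismatch visible locally; the actual fix is typically to resample the prediction back to the input image's shape before combining or saving it.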

Re: The algorithm failed on one or more cases  

  By: ceilinghans on Sept. 14, 2023, 7:06 a.m.

Thanks a lot.

Re: The algorithm failed on one or more cases  

  By: zaxos021 on Sept. 18, 2023, 10:29 a.m.

Hi, can we get our error message as well please?

Thank you so much in advance!

Re: The algorithm failed on one or more cases  

  By: Marcel.Frueh on Sept. 18, 2023, 12:42 p.m.

Hi!

2023-09-18T10:20:45.256000+00:00 Exception in thread Thread-5 (results_loop):
2023-09-18T10:20:45.256000+00:00 Traceback (most recent call last):
2023-09-18T10:20:45.256000+00:00   File "/usr/local/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
2023-09-18T10:20:45.256000+00:00     self.run()
2023-09-18T10:20:45.256000+00:00   File "/usr/local/lib/python3.10/threading.py", line 953, in run
2023-09-18T10:20:45.256000+00:00     self._target(*self._args, **self._kwargs)
2023-09-18T10:20:45.256000+00:00   File "/home/user/.local/lib/python3.10/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 92, in results_loop
2023-09-18T10:20:45.256000+00:00     raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the print"
2023-09-18T10:20:45.256000+00:00 RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message
2023-09-18T10:20:45.256000+00:00 Traceback (most recent call last):
2023-09-18T10:20:45.256000+00:00   File "/home/user/.local/bin/nnUNetv2_predict", line 8, in <module>
2023-09-18T10:20:45.256000+00:00     sys.exit(predict_entry_point())
2023-09-18T10:20:45.256000+00:00   File "/home/user/.local/lib/python3.10/site-packages/nnunetv2/inference/predict_from_raw_data.py", line 525, in predict_entry_point
2023-09-18T10:20:45.256000+00:00     predict_from_raw_data(args.i,
2023-09-18T10:20:45.256000+00:00   File "/home/user/.local/lib/python3.10/site-packages/nnunetv2/inference/predict_from_raw_data.py", line 236, in predict_from_raw_data
2023-09-18T10:20:45.256000+00:00     for preprocessed in mta:
2023-09-18T10:20:45.256000+00:00   File "/home/user/.local/lib/python3.10/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 204, in __next__
2023-09-18T10:20:45.256000+00:00     item = self.__get_next_item()
2023-09-18T10:20:45.256000+00:00   File "/home/user/.local/lib/python3.10/site-packages/batchgenerators/dataloading/multi_threaded_augmenter.py", line 189, in __get_next_item
2023-09-18T10:20:45.256000+00:00     raise RuntimeError("One or more background workers are no longer alive. Exiting. Please check the "
2023-09-18T10:20:45.256000+00:00 RuntimeError: One or more background workers are no longer alive. Exiting. Please check the print statements above for the actual error message
2023-09-18T10:20:46.257000+00:00 Traceback (most recent call last):
2023-09-18T10:20:46.257000+00:00   File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
2023-09-18T10:20:46.257000+00:00     return _run_code(code, main_globals, None,
2023-09-18T10:20:46.257000+00:00   File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
2023-09-18T10:20:46.257000+00:00     exec(code, run_globals)
2023-09-18T10:20:46.257000+00:00   File "/opt/app/process.py", line 90, in <module>
2023-09-18T10:20:46.257000+00:00     Lion().process()
2023-09-18T10:20:46.257000+00:00   File "/opt/app/process.py", line 82, in process
2023-09-18T10:20:46.257000+00:00     self.predict()
2023-09-18T10:20:46.257000+00:00   File "/opt/app/process.py", line 69, in predict
2023-09-18T10:20:46.257000+00:00     lion(TRACER_MODEL, self.lion_work_dir_input, self.lion_work_dir_output,ACCELERATOR)
2023-09-18T10:20:46.257000+00:00   File "/home/user/.local/lib/python3.10/site-packages/lionz/lionz.py", line 333, in lion
2023-09-18T10:20:46.257000+00:00     segmentation_file = predict_tumor(workflow_dir, model_name, output_dir, accelerator)
2023-09-18T10:20:46.257000+00:00   File "/home/user/.local/lib/python3.10/site-packages/lionz/predict.py", line 95, in predict_tumor
2023-09-18T10:20:46.257000+00:00     mask_path = get_files(output_dir, '.nii.gz')[0]
2023-09-18T10:20:46.257000+00:00 IndexError: list index out of range

 Last edited by: Marcel.Frueh on Sept. 19, 2023, 4:27 a.m., edited 2 times in total.
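
In the log above, the final IndexError looks like a symptom rather than the cause: the nnU-Net prediction step stopped when its background workers died, so no '.nii.gz' mask was written and indexing the empty file list failed. Workers dying is often related to running out of RAM, though the log does not confirm that here. A minimal sketch of a clearer check in your own wrapper code follows; `first_file_with_suffix` is a hypothetical helper, not a lionz or nnU-Net function.

```python
import tempfile
from pathlib import Path


def first_file_with_suffix(directory, suffix):
    """Return the first file in `directory` ending with `suffix`.

    Raises a descriptive error instead of a bare IndexError when no file
    exists, which usually means an earlier step failed silently.
    """
    matches = sorted(p for p in Path(directory).iterdir() if p.name.endswith(suffix))
    if not matches:
        raise FileNotFoundError(
            f"No '{suffix}' file found in {directory}; "
            "the prediction step probably failed before writing its output."
        )
    return matches[0]


# Small demonstration in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    try:
        first_file_with_suffix(tmp, ".nii.gz")
    except FileNotFoundError as err:
        message = str(err)
    (Path(tmp) / "mask.nii.gz").touch()
    found = first_file_with_suffix(tmp, ".nii.gz").name

print(message)
print(found)
```

Running the same check after the prediction call in a local container run should surface the real failure earlier than the platform log does.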

Re: The algorithm failed on one or more cases  

  By: shadab on Sept. 22, 2023, 4:46 a.m.

I just submitted my algorithm and got the same error (the algorithm failed on one or more cases). Will it be possible to get a more detailed error log at this stage?

Re: The algorithm failed on one or more cases  

  By: Marcel.Frueh on Sept. 22, 2023, 6:22 a.m.

2023-09-22T04:25:22.345000+00:00 Traceback (most recent call last):
2023-09-22T04:25:22.345000+00:00   File "/opt/conda/lib/python3.10/runpy.py", line 196, in _run_module_as_main
2023-09-22T04:25:22.345000+00:00     return _run_code(code, main_globals, None,
2023-09-22T04:25:22.345000+00:00   File "/opt/conda/lib/python3.10/runpy.py", line 86, in _run_code
2023-09-22T04:25:22.345000+00:00     exec(code, run_globals)
2023-09-22T04:25:22.345000+00:00   File "/opt/algorithm/process.py", line 5, in <module>
2023-09-22T04:25:22.345000+00:00     import monai_unet
2023-09-22T04:25:22.345000+00:00   File "/opt/algorithm/monai_unet.py", line 24, in <module>
2023-09-22T04:25:22.345000+00:00     from monai.metrics import DiceMetric, compute_meandice
2023-09-22T04:25:22.345000+00:00 ImportError: cannot import name 'compute_meandice' from 'monai.metrics' (/home/algorithm/.local/lib/python3.10/site-packages/monai/metrics/__init__.py)
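
This ImportError suggests that the MONAI version installed in the container differs from the one the code was written against. To my knowledge, newer MONAI releases replaced compute_meandice with compute_dice, but please verify that against the MONAI release notes. Pinning the MONAI version in the container's requirements is the simplest fix; alternatively, a version-tolerant import can be sketched as below, where `import_first_available` is a hypothetical helper demonstrated on the standard library.

```python
import importlib


def import_first_available(module_name, candidates):
    """Return the first attribute from `candidates` that `module_name` provides.

    Hypothetical helper for tolerating a function that was renamed
    between library versions.
    """
    module = importlib.import_module(module_name)
    for name in candidates:
        if hasattr(module, name):
            return getattr(module, name)
    raise ImportError(f"None of {candidates} found in module '{module_name}'")


# Demonstration with the standard library: the first name does not exist,
# so the helper falls back to the second one.
sqrt = import_first_available("math", ["square_root", "sqrt"])
print(sqrt(9.0))
```

Under the assumption stated above, the MONAI case would read `compute_dice = import_first_available("monai.metrics", ["compute_dice", "compute_meandice"])`. Since the metric is only needed for training or validation, removing the import from the inference script is another option.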

Re: The algorithm failed on one or more cases  

  By: ArMo on Sept. 22, 2023, 6:45 a.m.

Hi, could I get the error log of my last submission to the final test set, please? Thanks a lot!