Failed Submission

  By: MaKaNu on Aug. 14, 2024, 5:37 p.m.

Good Day,

Our submission failed with the following issue:

Failed The algorithm failed on one or more cases.

Can you provide deeper insights?

We monitored our runs while debugging, and running the Docker inference locally always stayed within the hardware limits.

Maximum RAM usage: ca. 12 GB
Maximum VRAM usage: 950 MB

For a few reasons, we are mostly CPU-bound.

Re: Failed Submission  

  By: imran.muet on Aug. 14, 2024, 11:44 p.m.

Good Day,

Upon reviewing the logs, we identified the following issue:

2024-08-14T16:59:28.075000+00:00 Traceback (most recent call last):
2024-08-14T16:59:28.075000+00:00   File "/opt/app/inference.py", line 154, in <module>
2024-08-14T16:59:28.075000+00:00     raise SystemExit(run())
2024-08-14T16:59:28.076000+00:00   File "/opt/app/inference.py", line 53, in run
2024-08-14T16:59:28.076000+00:00     os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'expandable_segments:True'
2024-08-14T16:59:28.076000+00:00 NameError: name 'os' is not defined

The error stems from a missing import statement for the os module in the inference.py script. This oversight causes the script to fail when attempting to set the environment variable PYTORCH_CUDA_ALLOC_CONF.

To resolve this, the os module needs to be imported at the beginning of the script:

import os
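A minimal sketch of what that looks like, based only on the line shown in the traceback. The function name configure_cuda_allocator is hypothetical and not taken from your script. Since the variable is read by PyTorch's CUDA allocator, setting it before torch is imported is the safe choice.

```python
import os  # must be in scope wherever os.environ is used


def configure_cuda_allocator():
    """Set the allocator option seen in the traceback and return its value.

    Hypothetical helper for illustration; call it before importing torch.
    """
    os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'expandable_segments:True'
    return os.environ['PYTORCH_CUDA_ALLOC_CONF']


if __name__ == "__main__":
    print(configure_cuda_allocator())
```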

Re: Failed Submission  

  By: MaKaNu on Aug. 15, 2024, 7:56 a.m.

But os was imported at the beginning of inference.py, or at least at the same position where the example repo places it.

Re: Failed Submission  

  By: MaKaNu on Aug. 15, 2024, 11:55 a.m.

We tested a bit further and updated test_run.sh as follows (our environment is not Docker-ready):

  • changed the image name
  • removed --gpus all

We also added the test directory with the test images to be able to run the test_run.sh script.

After that, the local test run worked fine. I will try to upload a new container with the changes mentioned.

Re: Failed Submission  

  By: imran.muet on Aug. 15, 2024, 3:36 p.m.

Since we don’t have access to your inference code, we can’t determine the exact cause of the issue. However, we can review the error log generated by your algorithm, which has already been shared above. According to the log, it appears that your inference file is not importing the os module.

Re: Failed Submission  

  By: MaKaNu on Aug. 15, 2024, 6:27 p.m.

We actually figured out why the os module was missing. For some reason, your inference script instead of ours was uploaded into the image. I have spent the whole day figuring out what is wrong with our new images, since the output is always empty...
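One way to catch this kind of mix-up earlier is to compare a checksum of the local inference.py with the copy that ended up in the image. A minimal sketch, assuming the file has already been copied out of a container created from the image (for example with docker create and docker cp); the function names and paths are hypothetical:

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of the file at `path`."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def same_contents(local_path, extracted_path):
    """Return True if both files are byte-for-byte identical."""
    return sha256_of(local_path) == sha256_of(extracted_path)
```

For example, same_contents("inference.py", "from_image/inference.py") should return True before the container is uploaded.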

Re: Failed Submission  

  By: imran.muet on Aug. 15, 2024, 7:36 p.m.

We are glad to know that you figured it out!