Question about the usage of GPU in docker

  By: mart.vanrijthoven on March 31, 2022, 10:26 a.m.

Dear Alex Lee,

Your submission failed due to a timeout. It seems that you are not using a GPU (i.e., you are running torch-cpu).

Please let us know if we can be of any assistance to solve this issue.

Best wishes, Mart

 Last edited by: mart.vanrijthoven on Aug. 15, 2023, 12:56 p.m., edited 2 times in total.

Re: Question about the usage of GPU in docker  

  By: aivis_alex on April 1, 2022, 6:57 a.m.

Hi, I'm Alex from team AIVIS.

As you mentioned, our method failed when run in Docker on the CPU, and we assumed the cause was the timeout. So we rebuilt the algorithm as a GPU version with pytorch-1.11.0_cuda_11.3. However, we got the error message below:

Error message:

File "/venv/lib/python3.8/site-packages/torch/cuda/__init__.py", line 216, in _lazy_init
    torch._C._cuda_init()
RuntimeError: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx

As far as we know, PyTorch supports CUDA up to version 11.3. Could you let us know the reason for this error?

Thanks

BR, Alex

Re: Submission failed, Alex lee (AIVIS)  

  By: mart.vanrijthoven on April 1, 2022, 8:37 a.m.

Dear Alex,

Which base Docker image are you using? You can also send me your Dockerfile via email, and I can look at what might be causing this error.

Best wishes, Mart

Re: Submission failed, Alex lee (AIVIS)  

  By: aivis_alex on April 1, 2022, 9:49 a.m.

I just sent our Docker files to your email address:

mart.vanrijthoven@radboudumc.nl

Thanks

BR, Alex

Re: Submission failed, Alex lee (AIVIS)  

  By: mart.vanrijthoven on April 1, 2022, 10:21 a.m.

Dear Alex,

Thanks for the files! It looks like you are building your Docker image with "FROM ubuntu:20.04", which is what the tiger-algorithm example provides. That base image does not include any GPU libraries.

You might want to change the first line of the Dockerfile from "FROM ubuntu:20.04" to "FROM nvidia/cuda:11.1-devel-ubuntu20.04". Or, if you want to reduce the size of the Docker image, you can also try "FROM nvidia/cuda:11.1-runtime-ubuntu20.04".
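For anyone who wants to check a Dockerfile for this before submitting, here is a minimal sketch in Python. The helper names base_images and uses_cuda_base are hypothetical and not part of any challenge tooling. The sketch only looks at the image named on the last FROM line, so it assumes a simple Dockerfile where the final stage does not refer to an earlier stage by alias.

```python
from typing import List


def base_images(dockerfile_text: str) -> List[str]:
    """Return the image named on each FROM line of a Dockerfile."""
    images = []
    for line in dockerfile_text.splitlines():
        parts = line.strip().split()
        if len(parts) >= 2 and parts[0].upper() == "FROM":
            # Skip option flags such as --platform=linux/amd64.
            args = [p for p in parts[1:] if not p.startswith("--")]
            if args:
                images.append(args[0])
    return images


def uses_cuda_base(dockerfile_text: str) -> bool:
    """True if the last FROM line names an nvidia/cuda image."""
    images = base_images(dockerfile_text)
    # The last FROM line determines the base of the final image.
    return bool(images) and images[-1].startswith("nvidia/cuda:")


if __name__ == "__main__":
    print(uses_cuda_base("FROM ubuntu:20.04\n"))
    print(uses_cuda_base("FROM nvidia/cuda:11.1-runtime-ubuntu20.04\n"))
```

This is only a text check on the Dockerfile. It does not confirm that a GPU is actually visible inside the running container.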

I hope this will solve your problem. Please let me know otherwise.
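After rebuilding the image, a quick check inside the container can show whether PyTorch sees a GPU before the full algorithm runs. The following is a minimal sketch. The function name gpu_status and its return strings are made up for illustration; torch.cuda.is_available() and torch.cuda.get_device_name() are the standard PyTorch calls.

```python
import importlib.util


def gpu_status() -> str:
    """Describe whether PyTorch can use a CUDA device in this environment."""
    if importlib.util.find_spec("torch") is None:
        return "torch-not-installed"

    import torch

    # is_available() returns False (rather than raising) when no NVIDIA
    # driver or device is visible to the process.
    if torch.cuda.is_available():
        return "cuda:" + torch.cuda.get_device_name(0)
    return "cpu-only"


if __name__ == "__main__":
    print(gpu_status())
```

If this prints "cpu-only" inside the container, the algorithm will fall back to the CPU and is likely to hit the timeout again.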

Best wishes, Mart

Re: Submission failed, Alex lee (AIVIS)  

  By: aivis_alex on April 1, 2022, 10:24 a.m.

Thanks!

We will try it and let you know the result.

Have a nice weekend :)