About test evaluation

  By: caixc on Oct. 9, 2022, 2:10 p.m.

In our solution, initialization takes a lot of time: loading libraries, building the model, and loading the checkpoint from disk. Does the Docker container repeat this preparation work for every single image during the testing phase, and is all of this time counted toward the runtime? Thanks in advance.

Re: About test evaluation  

  By: junma on Oct. 10, 2022, 4:24 a.m.

Hi,

"Does the Docker container repeat this preparation work for every single image during the testing phase, and is all of this time counted toward the runtime?"

It depends on what you pack inside your Docker image. The evaluation code starts counting time after the container starts.

Here is the evaluation code: https://neurips22-cellseg.grand-challenge.org/metrics/

BTW, most of the teams (about 80%) can finish the inference within the time tolerance.
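The per-case timing described above can be sketched as follows. This is a hypothetical illustration, not the challenge's actual evaluation script; the container name, mount path, and image list are placeholders:

```python
import subprocess
import time

def timed_run(cmd):
    """Run one command (e.g. a fresh 'docker run' per test image) and
    return its wall-clock time in seconds. Because a new container is
    started for each case, startup costs such as library imports and
    checkpoint loading are included in every measurement."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start

# Hypothetical per-case loop; names below are placeholders.
# for img in test_images:
#     seconds = timed_run(["docker", "run", "--rm",
#                          "-v", f"{img}:/workspace/inputs",
#                          "teamname_docker"])
```

Because the timer wraps the whole container run, there is no way for a submission to amortize its initialization across cases.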

Re: About test evaluation  

  By: caixc on Oct. 10, 2022, 12:10 p.m.

As the code shows, the Docker container is restarted for every image, which means imports, checkpoint loading, etc. are repeated every time. In many cases, the single statement "import torch" takes more than 3 s, let alone other operations and model inference.
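The import cost being described is easy to measure. Here is a small sketch; it uses a standard-library module in place of torch so it runs anywhere, and substituting "torch" would reproduce the observation above:

```python
import importlib
import time

def timed_import(module_name):
    """Return how long a single import takes, in seconds. On a cold
    start this includes loading the module's shared libraries, which is
    why heavy packages like torch can cost seconds per container launch."""
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start

# Substitute "torch" here to measure the cost discussed in the thread.
elapsed = timed_import("json")
```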

Re: About test evaluation  

  By: junma on Oct. 10, 2022, 3:02 p.m.

As mentioned on the evaluation page, the testing images are evaluated one by one, so the running time for each case can be obtained.

This is not ideal in real practice, but it is the best way in a challenge setting to obtain per-case running time.

The same strategy was used in the following challenges. The top solutions could give you some insights.

https://flare.grand-challenge.org/ https://flare22.grand-challenge.org/

"import torch" takes more than 3s This is not normal:-

Re: About test evaluation  

  By: caixc on Oct. 11, 2022, 3:31 a.m.

Thank you so much for your advice :)