PI-CAI Eval: extract lesion candidates!  

  By: svesal on Sept. 1, 2022, 3:55 p.m.

Dear all,

We have some confusion about our local cross-validation (CV) and how exactly the PI-CAI evaluation code works. Initially, we thought that we should only save our detection_maps (the model's prediction probabilities after softmax). However, in the process.py code, we saw that you apply "extract_lesion_candidates". From the code, we understood that you apply this function to remove some of the false positives (FPs).

Do we also need to apply this function to our prediction maps, similar to what you have done? Also, how does threshold="dynamic" work to assign a single probability value to the entire predicted lesion for correct evaluation?

I am sorry, but the evaluation part is not clear to us. I feel that adding a paragraph on how to correctly generate "detection maps" for your evaluation code would be great.

Best, Sulaiman

Re: PI-CAI Eval: extract lesion candidates!  

  By: anindo on Sept. 1, 2022, 10:28 p.m.

Hi Sulaiman,

As listed on the page where we define all tasks for AI (https://pi-cai.grand-challenge.org/AI/) and in the README of the picai_eval repo, we expect csPCa detection maps as one of the outputs of your submitted model, not the output of a segmentation model after softmax activation. We define detection maps as 3D maps (with the same spatial dimensions and resolution as the input T2W image) containing non-overlapping, non-connected csPCa lesion detections. All voxels constituting a given predicted lesion must share a single floating-point value between 0 and 1, representing that lesion's overall likelihood of harboring csPCa. In other words, while a softmax volume can have as many unique values as the number of voxels in that volume, the number of unique values in your detection map must be ≤ the number of csPCa lesions predicted in that detection map + 1 (for the value 0, representing the background). You can pass training scans through our trained baseline AI models to acquire examples of such detection maps. Alternatively, you can also refer to the "lesion candidates" block in Figure 1 of this recent publication from your team, which seems similar to our intended detection maps.
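As a minimal sketch of that constraint (assuming numpy and scipy are installed; the detection map file is a hypothetical placeholder), you could sanity-check a detection map like this:

```python
import numpy as np
from scipy import ndimage

# hypothetical detection map with the same spatial dimensions as the input T2W image
detection_map = np.load("detection_map.npy")  # placeholder path

# label the non-overlapping, non-connected lesion candidates
_, num_lesions = ndimage.label(detection_map > 0)

# all values must lie in [0, 1], with at most one unique value
# per lesion plus 0 for the background
unique_values = np.unique(detection_map)
assert unique_values.min() >= 0.0 and unique_values.max() <= 1.0
assert len(unique_values) <= num_lesions + 1
```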

We leave it entirely up to the participants how they would like to generate these csPCa detection maps. At our institution, and similarly for all baseline AI models shared in this challenge, we use a publicly available function titled "extract_lesion_candidates" to convert softmax predictions to detection maps. As documented in the picai_eval repo, you can click here to learn more about how this function can be applied during evaluation. Now, "extract_lesion_candidates" comes with a range of different threshold modes (e.g. "static", "dynamic") that can be used. Their differences are documented in the function's source code and under the function header. From our internal tests, we find that the "dynamic" mode is typically the most accurate. Click here to see a depiction of its working principle. Needless to say, participants can use this same function for their AI models, adapt it, or design something entirely different, as they best see fit.
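For illustration, here is a minimal sketch of how that conversion might look (assuming the report_guided_annotation package, which publishes "extract_lesion_candidates", and a placeholder softmax file; the first element of the returned tuple is the detection map):

```python
import numpy as np
from report_guided_annotation import extract_lesion_candidates

# placeholder: your model's softmax output for one case
softmax = np.load("softmax_prediction.npy")

# convert the softmax volume to a csPCa detection map;
# "dynamic" is the threshold mode we found most accurate in internal tests
detection_map = extract_lesion_candidates(softmax, threshold="dynamic")[0]
```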

For transparency, we require detection maps as your output (rather than softmax volumes) so that we can definitively evaluate object/lesion-level detection performance using PR and FROC curves. With softmax volumes, there is a lot of ambiguity in how this should be handled: e.g., what is the overall single likelihood of csPCa per predicted lesion, what are the spatial boundaries of each predicted lesion, and, in turn, what counts as an object-level hit (TP) or miss (FN) under our given hit criterion.
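To tie this together, here is a sketch of how detection maps feed into lesion-level evaluation (based on the usage shown in the picai_eval README; the input lists are placeholders, and the postprocessing hook lets the evaluator convert softmax volumes on the fly):

```python
import numpy as np
from picai_eval import evaluate
from report_guided_annotation import extract_lesion_candidates

# placeholders: replace with your own predictions and binary ground-truth labels
y_det = [np.load(f"softmax_case{i}.npy") for i in range(3)]
y_true = [np.load(f"label_case{i}.npy") for i in range(3)]

metrics = evaluate(
    y_det=y_det,
    y_true=y_true,
    # convert softmax volumes to detection maps during evaluation
    y_det_postprocess_func=lambda pred: extract_lesion_candidates(pred)[0],
)

print(metrics.AP)     # lesion-level Average Precision (from the PR curve)
print(metrics.auroc)  # patient-level AUROC
print(metrics.score)  # overall ranking score
```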

Hope this helps.


Re: PI-CAI Eval: extract lesion candidates!  

  By: joeran.bosma on Sept. 2, 2022, 9:42 a.m.

Hi Sulaiman,

To add to Anindo's description: if you're using the "extract_lesion_candidates" function, it will take the maximum confidence within each lesion candidate as that lesion's overall confidence. The nnDetection paper has a section on deriving this lesion confidence from the softmax values, under "nnU-Net as an Object Detection Baseline". They empirically choose the best of four options (max, mean, median, 95th percentile) using cross-validation. Changing this may improve your performance, but we leave that up to you to decide.
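As a rough sketch of what swapping that statistic could look like (plain numpy/scipy, not the actual picai code; the softmax volume and the 0.5 candidate threshold are placeholder assumptions):

```python
import numpy as np
from scipy import ndimage

softmax = np.load("softmax_prediction.npy")      # placeholder softmax volume
candidates, num = ndimage.label(softmax > 0.5)   # placeholder candidate extraction

# the four aggregation statistics compared in the nnDetection paper
aggregators = {
    "max": np.max,
    "mean": np.mean,
    "median": np.median,
    "p95": lambda v: np.percentile(v, 95),
}

detection_map = np.zeros_like(softmax)
for lesion_id in range(1, num + 1):
    mask = candidates == lesion_id
    # assign one confidence value to every voxel of this candidate
    detection_map[mask] = aggregators["max"](softmax[mask])
```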

Kind regards, Joeran
