Hi Sulaiman,
As listed on the page where we define all tasks for AI (https://pi-cai.grand-challenge.org/AI/) and in the README of the picai_eval repo, we expect csPCa detection maps as one of the outputs of your submitted model, not the raw output of a segmentation model after softmax activation. We define detection maps as 3D maps (with the same spatial dimensions and resolution as the input T2W image) containing non-overlapping, non-connected csPCa lesion detections. All voxels constituting a given predicted lesion must share a single floating-point value between 0 and 1, representing that lesion's overall likelihood of harboring csPCa. In other words, while a softmax volume can have as many unique values as the number of voxels in that volume, the number of unique values in your detection map must be ≤ the number of predicted csPCa lesions + 1 (for the value 0, representing the background). You can pass training scans through our trained baseline AI models to acquire examples of such detection maps. Alternatively, you can also refer to the "lesion candidates" block from Figure 1 of this recent publication from your team, which seems similar to our intended detection maps.
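To make this constraint concrete, here is a minimal sanity check you could run on a candidate detection map before submitting. This is an illustrative sketch, not part of picai_eval; the array name detection_map and the file path are hypothetical, and connected components are found here with scipy.ndimage.label.

```python
# Hypothetical sanity check for a detection map loaded as a NumPy array.
import numpy as np
from scipy import ndimage

detection_map = np.load("detection_map.npy")  # same grid as the input T2W image

# label connected components (i.e. the predicted lesions) among non-zero voxels
labels, num_lesions = ndimage.label(detection_map > 0)

# every voxel of a given lesion must carry one and the same confidence in (0, 1]
for lesion_id in range(1, num_lesions + 1):
    values = np.unique(detection_map[labels == lesion_id])
    assert values.size == 1, f"lesion {lesion_id} has multiple values"
    assert 0 < values[0] <= 1, f"lesion {lesion_id} confidence out of range"

# total unique values must be <= number of lesions + 1 (background = 0)
assert np.unique(detection_map).size <= num_lesions + 1
```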
We leave it entirely up to participants to decide how they would like to generate these csPCa detection maps. At our institution, and likewise for all baseline AI models shared in this challenge, we use a publicly available function titled "extract_lesion_candidates" to convert softmax predictions into detection maps. As documented in the picai_eval repo, you can click here to learn more about how this function can be applied during evaluation. Now, "extract_lesion_candidates" supports a range of threshold modes (e.g. "static", "dynamic"); their differences are documented in the function's source code, under the function header. From our internal tests, we find that the "dynamic" mode is typically the most accurate. Click here to see a depiction of its working principle. Needless to say, participants can use this same function for their AI models, adapt it, or design something entirely different as they see fit; a minimal usage sketch follows below.
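As a rough sketch of how this conversion could look in code: this assumes extract_lesion_candidates is imported from the publicly released report_guided_annotation package (as in our baseline AI models) and returns the detection map as the first element of a tuple; please verify the exact API against the version you install.

```python
# Minimal sketch, assuming the report_guided_annotation package;
# check the picai_eval / baseline repos for the current signature.
import SimpleITK as sitk
from report_guided_annotation import extract_lesion_candidates

# read a softmax prediction (same grid as the input T2W image)
softmax = sitk.GetArrayFromImage(sitk.ReadImage("softmax.nii.gz"))

# convert the softmax volume into a csPCa detection map
# using the "dynamic" threshold mode
detection_map = extract_lesion_candidates(softmax, threshold="dynamic")[0]
```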
For transparency: we require detection maps as your output (rather than softmax volumes) so that we can definitively evaluate object/lesion-level detection performance using PR and FROC curves. With softmax volumes, there is considerable ambiguity in how this should be handled, e.g. what is the overall single likelihood of csPCa per predicted lesion, what constitutes the spatial boundaries of each predicted lesion, and in turn, what constitutes an object-level hit (TP) or miss (FN) under our given hit criterion.
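Once your model emits detection maps, lesion-level evaluation becomes unambiguous. A hedged sketch, assuming the evaluate function and Metrics attributes as documented in the picai_eval README (verify against your installed version):

```python
# Hedged sketch of evaluation with picai_eval; attribute names
# follow the repo's README, but double-check your installed version.
from picai_eval import evaluate

# y_det: list of detection maps, y_true: list of binary ground-truth annotations
metrics = evaluate(y_det=[detection_map], y_true=[ground_truth])

print(metrics.AP)      # Average Precision, from the lesion-level PR curve
print(metrics.auroc)   # patient-level AUROC
print(metrics.score)   # overall ranking score
```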
Hope this helps.