Update on PUMA Challenge Scoring Issue
By: mschuiveling on March 13, 2025, 12:05 p.m.
Dear PUMA Challenge participants,
We've identified an issue with how the nuclei F1 score is calculated on the Grand Challenge platform. Because the evaluation container processes samples individually, the displayed F1 score is the average of the per-sample F1 scores rather than a single score computed from the total TP, FP, and FN across all samples. While the averaged score is still a valid metric, it is not the intended method of evaluation.
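To illustrate the difference between the two scores, here is a minimal sketch. It is not the actual evaluation code: the function names, the made-up counts, and the choice to return 0.0 for a sample with no TP, FP, or FN are all assumptions made for this example.

```python
# Illustrative only: hypothetical helper names and made-up counts,
# not the PUMA evaluation container's implementation.

def f1_from_counts(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN).

    Returning 0.0 when all counts are zero is an assumption of this sketch.
    """
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def averaged_f1(samples):
    """Mean of per-sample F1 scores (the score currently displayed)."""
    return sum(f1_from_counts(tp, fp, fn) for tp, fp, fn in samples) / len(samples)


def pooled_f1(samples):
    """Single F1 from TP, FP, and FN summed over all samples (the intended score)."""
    tp = sum(s[0] for s in samples)
    fp = sum(s[1] for s in samples)
    fn = sum(s[2] for s in samples)
    return f1_from_counts(tp, fp, fn)


# Each tuple is (TP, FP, FN) for one sample; the values are invented.
samples = [(90, 10, 10), (1, 4, 4)]

print(f"averaged F1: {averaged_f1(samples):.4f}")  # (0.9 + 0.2) / 2 = 0.55
print(f"pooled F1:   {pooled_f1(samples):.4f}")    # 182 / 210, about 0.8667
```

In this example the sample with few nuclei pulls the averaged score down much more than the pooled score, because averaging weights every sample equally while pooling weights every nucleus equally.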
To address this, we have decided on the following approach:
- We will use both the currently displayed averaged F1 score and the F1 score computed using the total TP, FP, and FN across all samples to determine the top three participating teams.
- The final results will be posted after the challenge deadline on Friday.
We are very sorry for the inconvenience and inconsistency this has led to. To ensure the integrity of the evaluation, we have also rechecked the rest of the evaluation process, which remains solid.
If you have any questions, please do not hesitate to contact us by replying to this post or by emailing m.schuiveling@umcutrecht.nl.
Kind regards,
On behalf of the PUMA challenge team,
Mark