F1 score calculation after aggregation on the challenge platform
By: Biototem on March 10, 2025, 3:53 p.m.
Hi,
I am referring to the description in another post of how the aggregated F1 score is calculated for each nuclei class, which states: "The number of samples is not considered in the calculation of the Macro F1 score for nuclei. We accumulate true positives, false positives, and false negatives per class across all samples and use these totals to compute the F1 score per class. The final Macro F1 score is then obtained by averaging across classes, without using the number of samples."
The results for the ten images in the preliminary test set (phase 1) also include the TP, FN, and FP counts for each image. We manually summed these counts per class across the ten images and computed the F1 score for each class from the totals. However, the per-class F1 scores we obtained this way differ from the ones the platform reports in the "aggregates" section.
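To make the comparison concrete, here is a minimal Python sketch of the aggregation as we understand it from the quoted description. It is not the platform's code. The function names, the class names "a" and "b", and the counts are made-up placeholders (not our actual results), and returning 0 for a class with no counts is our own assumption. The sketch also computes the mean of the per-image F1 scores for comparison, since that is one other way an aggregate could be formed and it generally gives a different number.

```python
from collections import defaultdict


def f1(tp, fp, fn):
    """F1 from raw counts: 2*TP / (2*TP + FP + FN)."""
    denom = 2 * tp + fp + fn
    # Assumption: return 0.0 when a class has no TP, FP, or FN at all.
    return 2 * tp / denom if denom else 0.0


def aggregated_f1(per_image):
    """Sum TP/FP/FN per class over all images, then compute F1 per class
    and the macro average over classes (no weighting by sample count)."""
    totals = defaultdict(lambda: {"tp": 0, "fp": 0, "fn": 0})
    for image in per_image:
        for cls, counts in image.items():
            for key in ("tp", "fp", "fn"):
                totals[cls][key] += counts[key]
    per_class = {cls: f1(**t) for cls, t in totals.items()}
    macro = sum(per_class.values()) / len(per_class)
    return per_class, macro


def mean_per_image_f1(per_image):
    """Alternative for comparison: F1 per image, then averaged per class."""
    scores = defaultdict(list)
    for image in per_image:
        for cls, counts in image.items():
            scores[cls].append(f1(**counts))
    return {cls: sum(v) / len(v) for cls, v in scores.items()}


# Placeholder counts for two hypothetical images and two hypothetical classes.
example = [
    {"a": {"tp": 8, "fp": 2, "fn": 0}, "b": {"tp": 5, "fp": 5, "fn": 5}},
    {"a": {"tp": 0, "fp": 0, "fn": 10}, "b": {"tp": 5, "fp": 5, "fn": 5}},
]

per_class, macro = aggregated_f1(example)
per_image_mean = mean_per_image_f1(example)
print("aggregated F1 per class:", per_class)
print("macro F1:", macro)
print("mean of per-image F1:", per_image_mean)
```

In this placeholder data, class "a" gets a different score under the two approaches while class "b" gets the same one, so a mismatch of this kind would show up only for classes whose counts vary between images.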
Could you please check whether we made an error in our calculations or misunderstood the calculation process? Thank you very much for your assistance.