Evaluation metrics adjusted

  By: sofisappia on July 24, 2024, 3:11 p.m.

Dear participants,

We have updated our evaluation metric from the standard Dice Similarity Coefficient (DSC) to a "soft" version (DSC_soft). This change addresses cases where no annotation is present in the ground truth for the predicted frame, but annotations exist in neighboring frames within the same sweep and within a 15-frame distance.

Key points of the new DSC_soft metric:

  1. If no annotation is found in the same frame as the predicted mask, the DSC is computed using the nearest annotated frame within:
    • The same sweep
    • A maximum distance of 15 frames
  2. The resulting DSC is then adjusted by a coefficient based on the distance between the current frame and the nearest annotated frame.
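The two key points above can be sketched in Python as follows. This is only an illustration, not the official evaluation code: the exact distance coefficient is defined in the evaluation guidelines and is not given in this post, so `distance_weight` below is a hypothetical linear decay. The helper names, the `annotations` dictionary layout, the score of 0 when no annotated frame is within range, and the handling of two empty masks are likewise assumptions.

```python
import numpy as np

MAX_FRAME_DISTANCE = 15  # maximum frame distance stated in the announcement


def dice(pred, gt):
    """Standard Dice Similarity Coefficient between two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    denom = pred.sum() + gt.sum()
    if denom == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return float(2.0 * np.logical_and(pred, gt).sum() / denom)


def distance_weight(distance, max_distance=MAX_FRAME_DISTANCE):
    """HYPOTHETICAL coefficient: linear decay with frame distance.

    The real coefficient is specified in the evaluation guidelines.
    """
    return max(0.0, 1.0 - distance / (max_distance + 1))


def dsc_soft(pred_mask, pred_frame, annotations, max_distance=MAX_FRAME_DISTANCE):
    """Soft DSC for one predicted frame.

    annotations: dict mapping frame index -> ground-truth mask, restricted
    to the same sweep as the predicted frame (assumed data layout).
    """
    # An annotation exists in the predicted frame: plain DSC.
    if pred_frame in annotations:
        return dice(pred_mask, annotations[pred_frame])

    # Otherwise look for annotated frames within the allowed distance.
    candidates = [f for f in annotations if abs(f - pred_frame) <= max_distance]
    if not candidates:
        return 0.0  # assumption: nothing within range scores zero

    nearest = min(candidates, key=lambda f: abs(f - pred_frame))
    distance = abs(nearest - pred_frame)
    return distance_weight(distance, max_distance) * dice(pred_mask, annotations[nearest])
```

With this sketch, a prediction on frame 10 compared against an identical mask annotated on frame 13 would be scored as the plain DSC scaled by `distance_weight(3)`, while an annotation on frame 30 would be ignored because it is more than 15 frames away.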

Due to this change, we have re-evaluated all submissions. As a result:

  • You may have received a notification about a new result for your algorithm submission.
  • You might notice shifts in the Preliminary Development Phase leaderboard.

We believe this modification will provide a more accurate assessment of segmentation performance, especially in cases with sparse annotations. For more detailed information on the DSC_soft metric, please refer to our evaluation guidelines.

If you have any questions or concerns about this update, please don't hesitate to contact us.

Re: Evaluation metrics adjusted  

  By: tanyaakumu on July 24, 2024, 4:12 p.m.

Hello,

Is the final ranking score updated with the new metrics? The previous ranking score was: 0.5(1-NAE) + 0.25(DSC) + 0.25(WFSS).

  1. Does this now change to: 0.5(1-NAE) + 0.25(DSC_soft) + 0.25(WFSS)?
  2. Is the NAE computation on the nearest annotated frame?

Thank you

Re: Evaluation metrics adjusted  

  By: sofisappia on July 24, 2024, 4:17 p.m.

Thank you for your question. Yes, the aggregated score is updated as you describe:

score = 0.5 (1 - NAE_AC) + 0.25 (DSC_soft) + 0.25 (WFSS)

This updated score metric is considered for the ranking.
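As a quick illustration of the formula above, the weighted sum can be written as a small Python function. The function and argument names are hypothetical, and the sketch assumes the three inputs have already been computed and aggregated elsewhere; it only shows how they are combined.

```python
def ranking_score(nae_ac, dsc_soft, wfss):
    """Aggregated score: 0.5 (1 - NAE_AC) + 0.25 (DSC_soft) + 0.25 (WFSS).

    NAE_AC is an error (lower is better), so it enters as (1 - NAE_AC);
    DSC_soft and WFSS enter directly (higher is better).
    """
    return 0.5 * (1.0 - nae_ac) + 0.25 * dsc_soft + 0.25 * wfss
```

For example, with the made-up values NAE_AC = 0.1, DSC_soft = 0.8 and WFSS = 0.6, the three terms are 0.45, 0.20 and 0.15, giving a score of 0.80.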

We did not change the computation of NAE_AC, because this metric already uses the mean circumference measurement for the sweep corresponding to the selected frame. It therefore already accounts for cases where annotations exist only in nearby frames of the same sweep, with no frame-distance restriction.