What is the issue about?

Thanks to @youngjinshin, we found a minor issue in how the F1 score is computed for the validation leaderboard. In summary, the evaluation code does not correctly sort the cell predictions by confidence score. You can find more detailed information about the issue in this GitHub link.
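To illustrate why the sorting order can matter, here is a minimal sketch of a detection-style F1 computation. It is not the actual evaluation code: the function name `f1_greedy`, the greedy nearest-neighbour matching rule, the `radius` parameter, and the toy coordinates are all assumptions made for illustration. The point is only that when predictions are matched to ground-truth cells one at a time, the order in which they are processed can change which matches are made.

```python
import math

def f1_greedy(preds, gts, radius, sort_by_confidence=True):
    """Hypothetical F1 sketch (not the challenge's evaluation code).

    preds: list of (x, y, confidence); gts: list of (x, y).
    Each prediction is matched to the nearest still-unmatched
    ground-truth point within `radius`, if there is one.
    """
    if sort_by_confidence:
        # Process the most confident predictions first.
        preds = sorted(preds, key=lambda p: p[2], reverse=True)

    matched = set()
    tp = 0
    for x, y, _ in preds:
        best, best_d = None, math.inf
        for i, (gx, gy) in enumerate(gts):
            if i in matched:
                continue
            d = math.hypot(x - gx, y - gy)
            if d <= radius and d < best_d:
                best, best_d = i, d
        if best is not None:
            matched.add(best)
            tp += 1

    fp = len(preds) - tp   # predictions left without a match
    fn = len(gts) - tp     # ground-truth points left without a match
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy data (made up): the low-confidence prediction is listed first.
gts = [(0, 0), (10, 0)]
preds = [(4, 0, 0.3), (-2, 0, 0.9)]

f1_sorted = f1_greedy(preds, gts, radius=7, sort_by_confidence=True)
f1_unsorted = f1_greedy(preds, gts, radius=7, sort_by_confidence=False)
```

In this toy case the unsorted run lets the low-confidence prediction claim the first ground-truth cell, leaving the high-confidence prediction unmatched, so the two runs give different F1 values. On real submissions the difference is far smaller, as the statistics in this post show.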

The error is negligible

We re-evaluated most of the available submissions locally and compared the results with the current leaderboard evaluation. The error in F1 score was 0.000037 ± 0.000038 (mean ± standard deviation), with a maximum error of 0.00015 (0.015%). The impact of this error is so small that it will not affect the overall leaderboard ranking.
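For participants who want to run a similar comparison on their own submissions, the sketch below shows one way to summarize the difference between two sets of scores. It assumes the error is the absolute difference between the two F1 values and uses the sample standard deviation; the post does not state which definitions were used. The function name `error_stats` and the score values are placeholders, not real leaderboard data.

```python
import statistics

def error_stats(leaderboard_scores, reevaluated_scores):
    """Return (mean, sample stdev, max) of the absolute score differences."""
    errors = [abs(a - b) for a, b in zip(leaderboard_scores, reevaluated_scores)]
    return statistics.mean(errors), statistics.stdev(errors), max(errors)

# Placeholder values for illustration only.
leaderboard_scores = [0.70, 0.65, 0.60]
reevaluated_scores = [0.70, 0.66, 0.62]

mean_err, std_err, max_err = error_stats(leaderboard_scores, reevaluated_scores)
print(f"{mean_err:.6f} +- {std_err:.6f} (mean +- stdev), max {max_err:.6f}")
```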

Considering the above-mentioned reasons and the fact that we are nearing the end of the validation stage, we have decided not to re-evaluate the entire leaderboard with the new evaluation.

Please note that the validation score serves as a reference for participants to select their best model and has no impact on the final score.

What is next?

We understand that this situation may not be ideal for participants, and we apologize for any inconvenience caused. To address this, we are committed to the following actions:

  • We will fix the error for the final test stage. This is crucial because only a single submission will determine the final results.
  • Although the error is marginal, you are welcome to contact oncology-ai-research@lunit.io if you would like us to re-evaluate any of your submissions.

Additional information

It is important to clarify that this issue did not result in any failed submissions, and it is nearly impossible for it to cause one.