Clarification on baseline results in Survival leaderboard
By: f.ciompi on April 13, 2022, 3:15 p.m.
Dear participants,
We have just updated the text of the Evaluation section as follows:
OLD text: In leaderboard 2, we will publish the C-index of a baseline model where only predefined clinical variables are included (age, morphology subtype, grade, molecular subtype, stage, surgery, adjuvant therapy). In order to be eligible for an award, models trained with TILs scores added to other clinical variables will need to have a c-index higher than the one of the baseline model.
NEW text: In leaderboard 2, the C-index of a baseline survival model where only predefined clinical variables are included (age, morphology subtype, grade, molecular subtype, stage, surgery, adjuvant therapy) is 0.63 [CI: 0.42, 0.82]. In order to be eligible for an award, models trained with TILs scores added to other clinical variables will need to have a C-index higher than the one of the baseline survival model. Additionally, we also report on L2 the performance of a regression model trained with all aforementioned clinical variables plus the TILs score produced by the TIGER baseline algorithm, which results in a C-index of 0.70 [CI: 0.51, 0.87].
We made this change because we published the results of the TIGER baseline algorithm (developed by Cyril de Kock) on the survival leaderboard and realized that the term "baseline" could cause confusion. The new text therefore makes an explicit distinction between the "baseline survival model" (the prediction model trained in L2 using only clinical variables) and the "TIGER baseline algorithm" (the algorithm that produces the TILs score, which was also used as the baseline method for the detection and segmentation tasks in leaderboard 1).
We hope it is now clear that the baseline result for the survival leaderboard is a C-index of 0.63, obtained by building a prediction model solely using clinical variables, and that the C-index of 0.70 is obtained by training a prediction model using the same clinical variables, plus the TILs score predicted by the TIGER baseline algorithm.
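For readers less familiar with the metric, here is a minimal sketch of how a C-index (Harrell's concordance index) can be computed from follow-up times, event indicators, and predicted risk scores. It is an illustration only and not the challenge's evaluation code: the function name `concordance_index`, its arguments, and the toy data are assumptions made for this example, and it does not reproduce the confidence intervals quoted above, whose computation is not described in this post.

```python
from typing import Sequence


def concordance_index(
    times: Sequence[float],
    events: Sequence[bool],
    risks: Sequence[float],
) -> float:
    """Harrell's C-index (illustrative sketch, hypothetical helper).

    A pair (i, j) is comparable when subject i had an observed event and a
    strictly shorter follow-up time than subject j. The pair is concordant
    when the model gave subject i the higher risk; ties in risk count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs in the data")
    return concordant / comparable


# Toy data (made up): higher risk should mean shorter survival.
toy_times = [1.0, 2.0, 3.0, 4.0]
toy_events = [True, True, True, True]
perfect = concordance_index(toy_times, toy_events, [4.0, 3.0, 2.0, 1.0])
uninformative = concordance_index(toy_times, toy_events, [1.0, 1.0, 1.0, 1.0])
```

A value of 1.0 means every comparable pair is ranked correctly and 0.5 corresponds to an uninformative model, which is the scale on which the 0.63 and 0.70 figures should be read.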
Regards, Francesco