Leaderboard1 evaluation

  By: krlee999 on Feb. 11, 2022, 3:45 a.m.

Dear organizers,

I have two questions about evaluation. First, what does "ts_dice" mean in the baseline example? Is it the Dice index of the predicted lymphocytes? Second, are we supposed to count lymphocytes outside the tumor border as true positives? I am asking because, according to the recommendations of the "TILs Working Group", the TIL score doesn't count lymphocytes outside the big tumor border, and I was wondering whether I should make predictions patch-wise or consider the WSI globally.

Thank you

Re: Leaderboard1 evaluation  

  By: crunch on Feb. 11, 2022, 9:49 a.m.

Hi,

ts_dice is the average of two Dice scores evaluated on the segmentation: invasive tumor vs. rest and stroma vs. rest.

For the TIL scoring, indeed only the lymphocytes in the tumor bulk matter, but leaderboard 1 evaluates lymphocyte detection performance in general; therefore detections are expected wherever the tissue mask is 1.

Re: Leaderboard1 evaluation  

  By: zq1992 on Feb. 12, 2022, 4:56 a.m.

Dear Organizers,

When testing the segmentation and detection algorithms in phase one, will you provide the mask of the tumor bulk? Or do we have to determine the tumor bulk ourselves?

Regards, Chu

Re: Leaderboard1 evaluation  

  By: crunch on Feb. 12, 2022, 8:12 a.m.

Hi, tumor bulk masks for the leaderboards will not be provided. All slides come with tissue masks that mark where segmentation and detection results are expected. In leaderboard 1 those are just small ROIs, so detecting the tumor bulk is neither necessary nor possible (when only looking at the tissue where mask==1). The tissue masks in leaderboard 2 will cover all (foreground) tissue, of which the tumor bulk is only a part. There, determining the tumor bulk will be necessary to compute a good TIL score.

 Last edited by: crunch on Aug. 15, 2023, 12:55 p.m., edited 1 time in total.

Re: Leaderboard1 evaluation  

  By: a.tsakiroglou on Feb. 15, 2022, 2:32 p.m.

Dear organizers

Thank you for the replies above. Could you please also clarify:

  • Will the entire WSI be available to us in Leaderboard 1, or just some ROI image crops? If the entire WSI is provided, then it could be possible to determine the approximate tumour bulk area and identify whether the ROIs fall within it. Is this assumption correct? In the comment above it is mentioned that finding the tumour bulk is not possible, which has me a bit confused.

  • Are the relevant ROIs in the test set denoted by giving us a tissue mask where only these ROI areas are set to 1? Does that mean that a tissue mask for the entire WSI, showing which areas are white background, will not be available for the test set in Leaderboard 1?

  • When looking at just ROI images, we will not always know how close we are to the tumour. Are you assuming that healthy stroma and tumour-associated stroma are different enough to discriminate between them without knowing the tumour location? Or should we assume that all ROIs in the Leaderboard 1 test set fall within the tumour bulk area, and therefore all stroma in the ROIs is tumour-associated?

Thanks, Anna Maria

Re: Leaderboard1 evaluation  

  By: crunch on Feb. 15, 2022, 7:27 p.m.

Hi, in both leaderboards the complete whole slide image is provided. The provided tissue masks for leaderboard 1 will only highlight small ROIs. What I meant was that it would not be possible to determine the bulk by looking only at the ROIs. For leaderboard 2 the masks will cover the complete tissue of the slide. In leaderboard 1 only the segmentation and detection are evaluated, and in leaderboard 2 only the TIL score. The algorithm has to work with both leaderboards; therefore it has to produce segmentation and detection output everywhere the tissue mask is 1 and also provide the TIL score.

I hope that answered your questions!

Here are the details of the evaluation process (now with pictures! :-)): https://tiger.grand-challenge.org/Evaluation/
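The requirement above (outputs expected wherever the tissue mask is 1) can be sketched as a post-processing step. This is only an illustration under stated assumptions, not the organizers' code: the function name `restrict_to_mask`, the `(x, y, probability)` detection tuples in pixel coordinates, and the choice of label 0 outside the mask are all hypothetical.

```python
import numpy as np

def restrict_to_mask(segmentation, detections, tissue_mask):
    """Keep segmentation labels and detections only where tissue_mask == 1.

    segmentation: 2D integer label array (assumed same shape as tissue_mask)
    detections:   list of (x, y, probability) tuples in pixel coordinates
                  (this tuple layout is an assumption for illustration)
    tissue_mask:  2D array, 1 inside the evaluated region, 0 elsewhere
    """
    mask = tissue_mask.astype(bool)

    # Outside the mask, write label 0 (assumed here to mean "unlabeled").
    seg = np.where(mask, segmentation, 0)

    # Drop detections that fall outside the image or outside the mask.
    height, width = mask.shape
    kept = [
        (x, y, p)
        for (x, y, p) in detections
        if 0 <= x < width and 0 <= y < height and mask[y, x]
    ]
    return seg, kept


mask = np.array([[1, 0], [1, 1]])
seg = np.array([[2, 1], [6, 1]])
dets = [(0, 0, 0.9), (1, 0, 0.8), (1, 1, 0.7), (5, 5, 0.6)]
seg_out, dets_out = restrict_to_mask(seg, dets, mask)
```

In this toy case the label at the masked-out pixel is reset to 0, and only the detections at (0, 0) and (1, 1) survive. The same function would apply unchanged to the small ROI masks of leaderboard 1 and the full-tissue masks of leaderboard 2.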

 Last edited by: crunch on Aug. 15, 2023, 12:55 p.m., edited 1 time in total.

Re: Leaderboard1 evaluation  

  By: Lengzhuo on March 10, 2022, 7:10 a.m.

Dear organizer, I want to ask one question about the evaluation of the segmentation task. The Dice score only considers stroma and invasive tumor. Does that mean I can treat tumor-associated stroma and inflamed stroma as one category (so that all stroma output labels are either 2 or 6) and treat all the other categories (unlabeled, in-situ tumor, healthy glands, necrosis not in-situ, and rest) as zeros? Or do I have to distinguish between tumor-associated stroma and inflamed stroma? Thank you!

Re: Leaderboard1 evaluation  

  By: crunch on March 10, 2022, 7:37 a.m.

Hi, yes, tumor-associated stroma (2) and inflamed stroma (6) are treated as one class when computing the Dice score, so it's (2,6) vs. all others and (1) vs. all others.
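As a rough illustration of the grouping described above, the following sketch computes a Dice score for (1) vs. all others and for (2,6) vs. all others, then averages the two. It is not the official evaluation code. The function names, the optional mask argument, and the convention of returning 1.0 when a class is absent from both prediction and reference are assumptions, and how scores are aggregated across ROIs is not specified in this thread.

```python
import numpy as np

TUMOR_LABELS = (1,)     # invasive tumor
STROMA_LABELS = (2, 6)  # tumor-associated stroma and inflamed stroma, merged

def binary_dice(pred, truth, labels, mask=None):
    """Dice score for 'labels vs. all others' on two label arrays."""
    p = np.isin(pred, labels)
    t = np.isin(truth, labels)
    if mask is not None:
        # Only score pixels where the tissue mask is set.
        m = mask.astype(bool)
        p, t = p[m], t[m]
    denom = p.sum() + t.sum()
    if denom == 0:
        # Assumption: class absent in both arrays counts as a perfect match.
        return 1.0
    return 2.0 * np.logical_and(p, t).sum() / denom

def ts_dice(pred, truth, mask=None):
    """Average of the tumor-vs-rest and stroma-vs-rest Dice scores."""
    tumor = binary_dice(pred, truth, TUMOR_LABELS, mask)
    stroma = binary_dice(pred, truth, STROMA_LABELS, mask)
    return 0.5 * (tumor + stroma)


truth = np.array([[1, 1, 2], [6, 0, 3]])
pred = np.array([[1, 0, 6], [2, 0, 3]])
score = ts_dice(pred, truth)
```

In this toy example the prediction swaps labels 2 and 6, which costs nothing because both count as stroma. One missed tumor pixel gives a tumor Dice of 2/3, so the average is 5/6. The organizers' evaluation code linked in this thread remains the authoritative reference.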

Re: Leaderboard1 evaluation  

  By: crunch on March 15, 2022, 1:16 p.m.

Here is a trimmed version of the segmentation evaluation code (updated): https://github.com/DIAGNijmegen/pathology-tiger-algorithm-example/blob/main/evaluations/eval_utils.py

 Last edited by: crunch on Aug. 15, 2023, 12:56 p.m., edited 1 time in total.

Re: Leaderboard1 evaluation  

  By: kaczmarj on April 3, 2022, 1:39 p.m.

I found the segmentation evaluation here https://github.com/DIAGNijmegen/pathology-tiger-algorithm-example/blob/4b77d78642af0f89e3a6cd6c183cc02a85cb3572/evaluations/eval_utils.py#L93-L110.

The previous link did not work for me (https://github.com/DIAGNijmegen/pathology-tiger-algorithm-example/blob/main/evaluations/eval_segm.py).

Re: Leaderboard1 evaluation  

  By: crunch on April 4, 2022, 6:22 a.m.

Hi,

Yes, the file was renamed to https://github.com/DIAGNijmegen/pathology-tiger-algorithm-example/blob/main/evaluations/eval_utils.py