Some Doubts Regarding the Dataset

  By: aditya.vartak on May 24, 2024, 4:36 p.m.

Hi, I have a few doubts regarding the dataset in this competition.

Doubts regarding the images: the image metadata doesn't contain the following information:

  1. Base Magnification (Magnification of image at Level 0)
  2. Pixel-to-mm ratio for image
  3. Scanner used for scanning the image

Can this information be provided for the images?

Another question is about the problem definition. The data section states: "Your task is to estimate the time to biochemical recurrence (in years, as a continuous variable e.g. 1.23 years or 17.68 years)."

Does that mean we are only concerned with predicting time-to-event for Biochemical Recurrence events? If so, since the time-to-event for non-Biochemical Recurrence cases corresponds to the time of last follow-up, should we also predict the time-to-event for those cases, or should they be discarded during inference?

Re: Some Doubts Regarding the Dataset  

  By: KhrystynaFaryna on May 25, 2024, 2:13 p.m.

Hi, thank you for your questions.

Regarding the image metadata (base magnification at level 0, pixel-to-mm ratio, and scanner used):

  1. The images are provided with a maximum magnification of 0.25 microns per pixel.
  2. To reduce image size, we removed intermediate resolutions, so the image pyramids contain the following resolutions: 0.25, 1.0, 4.0, 16.0, ... microns per pixel.
  3. These images were scanned with a 3DHISTECH digital scanner.
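As a rough illustration of how these numbers relate, here is a minimal sketch that derives the per-level spacing and a pixels-per-mm figure. It assumes a constant downsample factor of 4 between levels, which is what the listed values suggest but is not confirmed for every file; the function names are hypothetical, and the values should be checked against the actual image files.

```python
def level_mpp(base_mpp: float = 0.25, factor: float = 4.0, levels: int = 4) -> list[float]:
    """Microns per pixel at each pyramid level.

    Assumes a constant downsample factor between consecutive levels
    (an assumption inferred from the listed 0.25, 1.0, 4.0, 16.0 values).
    """
    return [base_mpp * factor ** i for i in range(levels)]


def pixels_per_mm(mpp: float) -> float:
    """Convert microns per pixel to pixels per millimetre (1 mm = 1000 microns)."""
    return 1000.0 / mpp


def closest_level(target_mpp: float, mpps: list[float]) -> int:
    """Index of the pyramid level whose spacing is nearest to target_mpp."""
    return min(range(len(mpps)), key=lambda i: abs(mpps[i] - target_mpp))


mpps = level_mpp()
print(mpps)                      # spacing per level, in microns per pixel
print(pixels_per_mm(mpps[0]))    # pixels per mm at level 0
print(closest_level(0.5, mpps))  # level nearest to 0.5 microns per pixel
```

Because intermediate resolutions were removed, a target such as 0.5 microns per pixel has no exact level; one option is to read the nearest finer level and resize.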

Regarding the problem definition, and whether we are only concerned with predicting time-to-event for BCR events:

Yes, your understanding is correct. The primary task is to estimate the time to biochemical recurrence (BCR) in years, which means we are specifically focused on predicting the time-to-event for BCR events only.

For patients who do not experience a BCR event, the time-to-event corresponds to the time of the last follow-up, and these instances are considered censored data. In survival analysis, censored data indicates that the event of interest (BCR in this case) has not occurred by the time of the last follow-up.

During inference, you should include all patients in your model, both those who have experienced BCR and those who have not (censored). However, the prediction should be focused on estimating the time to BCR. The non-BCR events (censored data) provide valuable information about the patients who have not experienced recurrence up to their last follow-up and help in accurately modeling the survival function.

So, to clarify:

  • Include both BCR and non-BCR (censored) data in your model training and inference.

  • The model should predict the time to BCR for all patients, understanding that for censored data, the prediction is an estimate beyond their last follow-up.
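To make the role of censored patients concrete, here is a minimal sketch of a concordance index (Harrell's C-index) computed on predicted times. This thread does not state how submissions are scored, so the choice of metric is an assumption made purely for illustration, and the function name and example values are hypothetical. The sketch is simplified: it skips pairs with tied observed times.

```python
def concordance_index(times, events, predicted_times):
    """Simplified Harrell's C-index for predicted time-to-event values.

    times: observed time (time to BCR, or time of last follow-up if censored)
    events: True if BCR was observed, False if the patient is censored
    predicted_times: the model's predicted time to BCR

    A pair (i, j) is comparable only when the patient with the shorter
    observed time actually had the event; a censored patient can therefore
    only serve as the longer-surviving member of a pair.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if predicted_times[i] < predicted_times[j]:
                    concordant += 1.0
                elif predicted_times[i] == predicted_times[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: two BCR events and two censored patients.
times = [1.23, 5.0, 17.68, 3.0]
events = [True, False, True, False]
predicted = [2.0, 9.0, 15.0, 6.0]
print(concordance_index(times, events, predicted))
```

In this toy data the censored patients still contribute: each one forms a comparable pair with the patient who recurred at 1.23 years, which is why discarding them would throw away usable information.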

Re: Some Doubts Regarding the Dataset  

  By: aditya.vartak on May 27, 2024, 3:40 a.m.

Hi, thank you for the detailed answers. The reply states that "The images are provided with a maximum magnification of 0.25 microns per pixel". Does 0.25 microns per pixel correspond to a 20x or a 40x objective lens?