Dear Participants,

We’re excited to share several key updates with you!

✅ The minimal baseline is now available here: minimal baseline
✅ The evaluation code is available here: evaluation code

As promised, we’re releasing additional training data for all three tasks of the CHIMERA Challenge!
To support participants with limited compute resources, we’ve also included pre-computed feature embeddings extracted from pathology slides using the UNI model.

Please find the detailed updates per task below:


🔹 Task 1: Prostate Cancer – Biochemical Recurrence Prediction

✨ 1. Expanded Training Set

We’ve added 39 new training cases, bringing the total to 95. Each case includes:

| Data Type | Format/Details |
|---|---|
| Histopathology Slides | Packed TIFF files (.tif) |
| Tissue Masks | Provided with TIFFs |
| T2-Weighted MRI | Image file (.mha) |
| Apparent Diffusion Coefficient (ADC) | Image file (.mha) |
| Diffusion-Weighted Imaging (DWI) | Image file (.mha) |
| MRI Segmentation Mask | Image file (.mha) |
| Clinical Data | JSON (.json) |

🔬 2. Histopathology Data Updates

  • Packed TIFF Files:
    • Training case histopathology images are now provided as packed TIFF files.
    • This means for each patient, you'll find multiple .tif files.
    • Each .tif file contains either 1 or 2 pathology slides.
  • Gleason Grade Filtering:
    • The packed TIFF files have been curated to only include tissue regions reflecting the Gleason grades specified in the corresponding JSON clinical data file.
    • Consequently, pathology tissue with lower Gleason patterns or no Gleason patterns has been excluded from these TIFF files. This filtering is intended to reduce compute and will be applied in the same way to the evaluation and test sets.

🧠 3. Pre-computed Feature Embeddings

  • We're excited to provide feature embeddings extracted using the UNI model from Mahmood Lab.
  • These embeddings are:
    • Extracted at a patch level from the histopathology slides.
    • Stored in .pt files (PyTorch tensor files).
  • Coordinates Provided:
    • Alongside the embeddings, you'll find a coordinates folder.
    • This folder contains .npy files, which store the coordinates of each patch used for the feature embedding extraction, allowing you to map embeddings back to their spatial locations on the slides.
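To illustrate how the embeddings and coordinates described above might be matched up, here is a minimal sketch. It assumes that each .pt file has a .npy file with the same file stem, that the .pt file holds a single tensor with one row per patch, and that the .npy file holds one coordinate row per patch. The folder names and helper names are hypothetical; please check them against the actual release.

```python
from pathlib import Path


def pair_embeddings_with_coords(features_dir, coords_dir):
    """Match each .pt file to the .npy file sharing its stem (assumed naming)."""
    coords_by_stem = {p.stem: p for p in Path(coords_dir).glob("*.npy")}
    pairs = []
    for pt_path in sorted(Path(features_dir).glob("*.pt")):
        npy_path = coords_by_stem.get(pt_path.stem)
        if npy_path is not None:
            pairs.append((pt_path, npy_path))
    return pairs


def load_pair(pt_path, npy_path):
    """Load one slide's embeddings and patch coordinates.

    Assumes the .pt file stores one tensor with a row per patch and the
    .npy file stores one coordinate row per patch.
    """
    # Imported here so the pairing helper runs without PyTorch/NumPy installed.
    import numpy as np
    import torch

    embeddings = torch.load(pt_path, map_location="cpu")
    coords = np.load(npy_path)
    if len(embeddings) != len(coords):
        raise ValueError(
            f"{pt_path.name}: {len(embeddings)} embeddings vs {len(coords)} coordinates"
        )
    return embeddings, coords
```

With a layout like this, `pair_embeddings_with_coords("features", "coordinates")` would return the file pairs, and row `i` of the embeddings would correspond to row `i` of the coordinates, provided the assumptions above hold.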


🔹 Task 2: Bladder Cancer - BRS Subtype Prediction in High-Risk NMIBC

We have added a new training set, clinical data aligned with the evaluation/test settings, pre-computed feature embeddings, and an overall quality control of the whole slide images (WSI).

✨ 1. Expanded Training Set

+50 new training cases, now totaling 182. Each case includes:

| Data Type | Format/Details |
|---|---|
| Histopathology Slides | TIFF (_HE.tif) |
| Tissue Masks | TIFF (_HE_mask.tif) |
| Clinical Data | JSON (_CD.json) |

Note: As several participants pointed out, a few of the previously provided histopathology slides were not usable. In this release, those slides have been fixed and re-released. Refer to the Task 2 section for more information.
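As a convenience, the file suffixes listed above can be used to group the downloaded files per case and to spot cases with a missing file. The sketch below is only an illustration: it assumes that the part of the file name before the suffix is the case identifier and that each case has one file of each kind, neither of which is confirmed here. The function names are hypothetical.

```python
from collections import defaultdict
from pathlib import Path

# Suffixes from the table above; the text before the suffix is
# assumed (not confirmed) to be the case identifier.
SUFFIXES = {
    "_HE_mask.tif": "mask",
    "_HE.tif": "slide",
    "_CD.json": "clinical",
}


def group_case_files(data_dir):
    """Return {case_id: {kind: path}} for every recognised file under data_dir."""
    cases = defaultdict(dict)
    for path in sorted(Path(data_dir).rglob("*")):
        if not path.is_file():
            continue
        for suffix, kind in SUFFIXES.items():
            if path.name.endswith(suffix):
                case_id = path.name[: -len(suffix)]
                cases[case_id][kind] = path
                break
    return dict(cases)


def incomplete_cases(cases):
    """List case IDs that are missing at least one expected file kind."""
    expected = set(SUFFIXES.values())
    return sorted(cid for cid, files in cases.items() if set(files) != expected)
```

Running `incomplete_cases(group_case_files("task2_data"))` on the download folder (the folder name is a placeholder) would list any case where a slide, mask, or clinical file was not found.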

🧠 2. Pre-computed Feature Embeddings

  • Extracted at 0.25 mpp resolution with 224×224 patch size
  • .pt embedding files + corresponding .npy spatial coordinate files

This is extra material, not part of validation and test, and can be used for training your model.
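For orientation, the stated extraction settings imply a physical patch size that can be worked out directly: 224 pixels × 0.25 µm per pixel = 56 µm per patch edge. The short sketch below shows this arithmetic and how a pixel coordinate at the extraction resolution could be rescaled to another resolution. The helper names are hypothetical, and it assumes the stored coordinates are expressed in pixels at 0.25 mpp, which is not confirmed here.

```python
MPP = 0.25        # microns per pixel, as stated above
PATCH_SIZE = 224  # patch edge length in pixels, as stated above


def patch_edge_microns(patch_size=PATCH_SIZE, mpp=MPP):
    """Physical edge length of one patch in microns."""
    return patch_size * mpp


def to_target_mpp(coord_px, target_mpp, source_mpp=MPP):
    """Rescale a pixel coordinate from the extraction resolution to another one."""
    return coord_px * source_mpp / target_mpp


print(patch_edge_microns())      # 56.0
print(to_target_mpp(8960, 2.0))  # 1120.0
```

The second call shows that a patch starting at pixel 8960 at 0.25 mpp would start at pixel 1120 on an image resampled to 2.0 mpp.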

🧪 3. Quality Control of WSI done by our pathologist

Annotations are listed in task2_quality_control.csv and include:

| Feature | Description |
|---|---|
| Cohort | Sample origin: Cohort A or B (see reference) |
| Tumor Type | Indicates if tumor is Primary or Recurrent |
| WSI Quality | Pathologist-rated image quality: Good or Poor |
| Holes | Yes: holes (cored/punched) present on slide; No: no holes visible |
| Tumor | Yes: tumor present on slide; No: no tumor visible |

---

🔹 Task 3: Bladder Cancer – Recurrence Prediction

✨ What's New

We have added a new training set, clinical data aligned with the evaluation/test settings, pre-computed feature embeddings, and an overall quality control of the whole slide images (WSI).

✨ 1. Expanded Training Set

+50 new cases, now totaling 176. Each case includes:

| Data Type | Format/Details |
|---|---|
| Histopathology Slides | TIFF (_HE.tif) |
| Tissue Masks | TIFF (_HE_mask.tif) |
| Clinical Data | JSON (_CD.json) |
| RNA Data | JSON (_RNA.json) |

Note: As several participants pointed out, a few of the previously provided histopathology slides were not usable. In this release, those slides have been fixed and re-released. The newly added RNA-seq data was sequenced using a different protocol than the previously provided data. Refer to the Task 3 section for more information.

🧠 2. Pre-computed Feature Embeddings

  • Extracted at 0.25 mpp resolution with 224×224 patch size
  • .pt embedding files + corresponding .npy spatial coordinate files

This is extra material, not part of validation and test, and can be used for training your model.
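If you want to use the spatial layout of the patches, one possible approach is to turn each patch coordinate into a grid cell so that neighbouring patches can be looked up. The sketch below is an illustration under assumptions that are not confirmed here: coordinates are taken to be top-left (x, y) pixel positions at the extraction resolution, and patches are taken to lie on a regular, non-overlapping grid with a stride equal to the 224-pixel patch size. The function names are hypothetical.

```python
PATCH_SIZE = 224  # pixels at the extraction resolution, as stated above


def grid_index(x, y, patch_size=PATCH_SIZE):
    """Convert a patch's top-left pixel coordinate to a (row, col) grid cell."""
    return y // patch_size, x // patch_size


def build_grid_lookup(coords, patch_size=PATCH_SIZE):
    """Map each (row, col) cell to the index of the patch stored there."""
    return {grid_index(x, y, patch_size): i for i, (x, y) in enumerate(coords)}


def neighbours(lookup, row, col):
    """Indices of the patches in the four cells adjacent to (row, col), if present."""
    cells = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [lookup[c] for c in cells if c in lookup]
```

The returned indices can then be used to select the matching rows of the embedding tensor, for example to pool features over a local neighbourhood.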

🧪 3. Quality Control of WSI done by our pathologist

Annotations are listed in task3_quality_control.csv and include:

| Feature | Description |
|---|---|
| Cohort | Sample origin: Cohort A or B (see reference) |
| Tumor Type | Indicates if tumor is Primary or Recurrent |
| WSI Quality | Pathologist-rated image quality: Good or Poor |
| Holes | Yes: holes (cored/punched) present on slide; No: no holes visible |
| Tumor | Yes: tumor present on slide; No: no tumor visible |
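These annotations can be used, for example, to restrict training to slides rated Good that contain tumor. The sketch below shows one way this could be done with the standard library. It assumes that the CSV column headers match the feature names listed above and that the file has a case identifier column; the name "Case ID" is a placeholder, so please check the actual header of the file.

```python
import csv


def select_usable_cases(csv_path, id_column="Case ID"):
    """Return IDs of cases rated Good quality with tumor present.

    The column names are assumptions based on the feature list above.
    """
    usable = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for row in csv.DictReader(handle):
            good_quality = row["WSI Quality"].strip().lower() == "good"
            has_tumor = row["Tumor"].strip().lower() == "yes"
            if good_quality and has_tumor:
                usable.append(row[id_column])
    return usable
```

A call such as `select_usable_cases("task3_quality_control.csv")` would then give the list of cases to keep. Whether to also exclude slides with holes is a modelling decision left to each team.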

📥 Download Instructions

Visit the CHIMERA Challenge page and scroll to the bottom of each task section to download the latest data and materials. Don't forget to register your team!

Happy training and good luck!

— The CHIMERA Challenge Organizers