Can AI predict breast cancer recurrence via automated quantification of tumor-infiltrating lymphocytes?

Published 6 March 2022

On January 11, 2022, we opened TIGER, the first challenge on fully automated assessment of tumor-infiltrating lymphocytes (TILs) in breast cancer, which will run until the end of April 2022. Anybody can participate in TIGER: academics, people from industry, computer vision engineers, researchers in medical image analysis, and others, either individually or as part of a team. Together with our participants, we aim to find the best AI-based solutions for automating the assessment of the TILs and for producing a “TILs score” that can predict the progression and therapy responsiveness of breast cancer. To that end, we released a set of 390 annotated breast cancer histopathology slides stained with hematoxylin and eosin (H&E, the most common staining procedure in histopathology), on which the algorithms will be trained. The best-performing algorithms will be awarded a total of $13,000 in AWS Credits.

The TILs and the TIGER

Lymphocytes and plasma cells are part of the immune system, defending our body from viruses and cancer: together, they represent the TILs. The importance of the TILs in the context of breast cancer has been known since the early 20th century, when it was reported [1] that “patients with glandular and with local lymphocytic infiltration lived 146 percent longer than patients with glandular involvement without lymphocytic infiltration. As a general fact, they give clue to the defensive mechanism of the body against malignant neoplasms”. However, the TILs were then largely forgotten for almost a century. Recently, the immuno-oncology scientific community has reported the relevance of the TILs as predictive biomarkers (i.e., saying something about the chances of responding to cancer treatment and the future disease progression) and prognostic biomarkers (i.e., saying something about cancer recurrence and patient survival). In TIGER, we place AI at the core of TILs biomarker development and validation and focus on two specific types of breast cancer: triple-negative and Her2-positive breast cancers.

Why TNBC and Her2-positive breast cancer?

Since 2021, breast cancer has been the most common form of cancer worldwide, accounting for 12% of all newly diagnosed cancers [2]. Among women, it is the leading cause of cancer-related death. But not all breast cancers are the same. Depending on the type of cancer, different factors have to be considered when defining the treatment strategy, with consequences for the prognosis of patients. On the molecular level, the breast cancer literature distinguishes between four main subtypes based on the expression of two distinct cell receptors used in clinical practice: the Hormone Receptor (HR, which is positive if either the Estrogen Receptor (ER) or the Progesterone Receptor (PR) is positive) and the human epidermal growth factor receptor 2 (Her2):

  • Luminal A (HR-positive, Her2-negative)

  • Luminal B (HR-positive, Her2-positive)

  • Her2 enriched (HR-negative, Her2-positive)

  • Triple-Negative (HR-negative, Her2-negative)

The clinical focus of the TIGER challenge is on Her2-positive breast cancers (i.e., Luminal B and Her2 enriched, regardless of their HR status) and triple-negative breast cancers (TNBC, negative for all receptors) because these are, irrespective of therapy, the ones with the worst prognosis.
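The subtype rules listed above can be written down as a small lookup. The following is only an illustrative sketch: the function names molecular_subtype and in_tiger_focus are hypothetical and not part of any TIGER tooling, and the receptor statuses are assumed to be already available as simple true/false values.

```python
def molecular_subtype(er_positive: bool, pr_positive: bool, her2_positive: bool) -> str:
    """Map receptor status to one of the four subtypes listed in the text.

    Hypothetical helper for illustration only.
    """
    # HR is positive if either ER or PR is positive.
    hr_positive = er_positive or pr_positive
    if hr_positive:
        return "Luminal B" if her2_positive else "Luminal A"
    return "Her2 enriched" if her2_positive else "Triple-Negative"


def in_tiger_focus(er_positive: bool, pr_positive: bool, her2_positive: bool) -> bool:
    """True for Her2-positive cases (any HR status) and for triple-negative cases."""
    hr_positive = er_positive or pr_positive
    return her2_positive or not hr_positive


if __name__ == "__main__":
    # An ER-positive, Her2-negative case is Luminal A and outside the TIGER focus.
    print(molecular_subtype(True, False, False), in_tiger_focus(True, False, False))
```

Only Luminal A falls outside the clinical focus of the challenge under these rules.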

How to score the TILs?

The TILs are present in multiple compartments of breast cancer tissue, including the tumor and the surrounding connective tissue, also called “stroma”. Patients can differ in the morphology, density, and spatial arrangement of the TILs, features that pathologists can analyze by recognizing different tissue compartments and detecting the TILs via visual inspection. The question is then how to convert these observations into a biomarker, a single number that can give information about the prognosis of the patient. In recent years, several approaches have been proposed to derive TILs-based biomarkers. Within the context of breast cancer, the most well-known approach is probably the one proposed by the International Immuno-Oncology Working Group, which computes a TILs score by visually estimating the percentage of tumor-associated stroma area that is covered by the TILs. Another TILs-based biomarker was proposed by Galon et al. under the name of Immunoscore. The Immunoscore considers two measurements, namely the density of the TILs at the tumor invasive margin (i.e., the “border” of the tumor region) and at the center of the tumor region, and combines them into a single biomarker. Other approaches to designing a TILs score are of course possible; some examples can be found in these papers [3-5]. In TIGER, we do not set any rule or constraint on how the TILs score has to be computed. We give space to the creativity of the participants to come up with a biomarker that is based on spatial quantification of the TILs!
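To make the “percentage of stroma area covered by the TILs” idea concrete, here is a minimal pixel-based sketch. It is an assumption-laden analogue of the visual estimate, not the Working Group's procedure and not a required TIGER method: it assumes two binary masks of equal size (one for tumor-associated stroma, one for TILs), and the function name stromal_tils_score is hypothetical.

```python
def stromal_tils_score(stroma_mask, tils_mask):
    """Percentage (0-100) of stroma pixels that are also marked as TILs.

    Both masks are assumed to be equally sized 2D sequences of 0/1 values.
    Hypothetical helper for illustration only.
    """
    stroma_pixels = 0
    covered_pixels = 0
    for stroma_row, tils_row in zip(stroma_mask, tils_mask):
        for is_stroma, is_til in zip(stroma_row, tils_row):
            if is_stroma:
                stroma_pixels += 1
                if is_til:
                    covered_pixels += 1
    if stroma_pixels == 0:
        return 0.0  # no stroma found; the score is defined as 0 in this sketch
    return 100.0 * covered_pixels / stroma_pixels


if __name__ == "__main__":
    stroma = [[1, 1, 1, 1],
              [1, 1, 1, 1],
              [0, 0, 0, 0],
              [0, 0, 0, 0]]
    tils = [[1, 1, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 0],
            [0, 0, 0, 1]]  # the last TIL pixel lies outside the stroma
    print(stromal_tils_score(stroma, tils))  # 2 of 8 stroma pixels -> 25.0
```

TIL pixels outside the stroma are ignored here by design; a score in the spirit of the Immunoscore would instead combine densities measured in different regions.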

Two leaderboards, secret test data

In TIGER, we have released training data containing manual annotations of tissue compartments and TILs, as well as a set of whole-slide images with their corresponding TILs score visually assessed by a pathologist. Participants in the TIGER challenge will have to use these training data to develop a single algorithm that performs two main tasks: 1) analysis of the tissue architecture, and 2) computation of a TILs score. To evaluate an algorithm’s performance on these two tasks, we will run it on secret test images, independent of the training data, that are not directly accessible to the participants (see Data section for details), and use the results to generate two leaderboards.

The first leaderboard (L1) will evaluate the “computer vision” performance of algorithms at segmenting several tissue compartments, such as tumor and tumor-associated stroma, and at detecting the TILs (i.e., lymphocytes and plasma cells). Therefore, L1 will compute both detection (via FROC analysis) and segmentation (via Dice score) performance and rank algorithms based on a combination of the two.

The second leaderboard (L2) will evaluate the “prognostic value” of the produced TILs score, i.e., how well the TILs score can help predict the recurrence of cancer in addition to available clinical variables. Therefore, L2 will rank algorithms based on the C-index of a multivariate Cox regression model trained with predefined clinical variables and the produced TILs score.

Algorithms’ performance in both L1 and L2 will be evaluated automatically by the grand-challenge platform based on hidden test sets. For the entire duration of the challenge, L1 and L2 will be updated after each new submission using two “experimental” test sets, one for each leaderboard. At the end of the challenge, two new “final” test sets (different from the experimental ones) will be used, one for L1 and one for L2, to compute the final results of the challenge.
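For readers less familiar with the metrics named above, the sketch below shows textbook versions of the Dice score and of the concordance index (C-index). It is not the evaluation code of the grand-challenge platform: the function names are hypothetical, tied event times are simply skipped, and the risk values are assumed to come from an already fitted model such as a Cox regression (fitting the model and the FROC analysis are not covered here).

```python
from itertools import combinations


def dice_score(pred, truth):
    """Dice overlap between two flat binary masks of equal length."""
    pred = [bool(p) for p in pred]
    truth = [bool(t) for t in truth]
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Two empty masks are treated as a perfect match in this sketch.
    return 1.0 if total == 0 else 2.0 * intersection / total


def concordance_index(times, events, risks):
    """Fraction of comparable patient pairs ranked correctly by predicted risk.

    times  : follow-up time per patient
    events : 1 if recurrence was observed, 0 if the patient was censored
    risks  : predicted risk per patient (higher means earlier recurrence expected)
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        if times[i] == times[j]:
            continue  # simplification: pairs with tied times are skipped
        first, second = (i, j) if times[i] < times[j] else (j, i)
        if not events[first]:
            continue  # the earlier patient was censored, so the pair is not comparable
        comparable += 1
        if risks[first] > risks[second]:
            concordant += 1.0
        elif risks[first] == risks[second]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


if __name__ == "__main__":
    print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
    print(concordance_index([2, 4, 6], [1, 1, 0], [0.9, 0.5, 0.1]))
```

A C-index of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so a TILs score adds prognostic value if including it raises the C-index over the clinical variables alone.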

Teamwork

TIGER is rooted in a multi-year ongoing collaboration between the Diagnostic Image Analysis Group of Radboud University Medical Center and the International Immuno-Oncology Working Group, with support from the Breast Cancer Research Foundation, the Institut Jules Bordet, and the BCSS, NuCLS, and ExaMode projects. Amazon Web Services (AWS) sponsors the challenge by providing computational power to evaluate the submissions. In addition, AWS will award $13,000 in AWS Credits to the winning teams of both L1 and L2.

Impact

For many years, TILs have been considered a powerful, low-cost biomarker with a potentially large impact on stratifying patients for therapy. However, optimizing TIL assessment against clinically relevant endpoints and offering a reproducible assay has been challenging. TIGER may go a long way toward turning TIL assessment into a clinically usable biomarker. The winning algorithms will continue to live on grand-challenge.org, publicly available to be used around the world. The code for training and inference of the winning algorithms will also be made publicly available. Furthermore, the leaderboards will be re-opened after the challenge for future research. We plan to publish the findings of TIGER in a peer-reviewed article, written in collaboration with the best-performing teams of the challenge.

How to join

Interested in participating in TIGER? Or eager to learn more? Visit https://tiger.grand-challenge.org/

References
  1. Sistrunk WE, MacCarty WC. “Life expectancy following radical amputation for carcinoma of the breast: a clinical and pathologic study of 218 cases”. Ann Surg. 1922;75:61–69.

  2. https://www.cancer.gov/types/common-cancers (accessed 9 February 2022).

  3. M. Amgad et al., "Joint Region and Nucleus Segmentation for Characterization of Tumor Infiltrating Lymphocytes in Breast Cancer", Proc SPIE Int Soc Opt Eng. 2019 Feb; 10956: 109560M.

  4. K. AbdulJabbar et al., "Geospatial immune variability illuminates differential evolution of lung adenocarcinoma", Nature Medicine, 26, 1054–1062 (2020).

  5. J. Saltz et al., "Spatial Organization and Molecular Correlation of Tumor-Infiltrating Lymphocytes Using Deep Learning on Pathology Images", Cell Rep. 2018 Apr 3;23(1):181-193.e7.