On our platform, we provide tools for researchers to organize challenges in medical image analysis.

Challenges facilitate a fair and objective comparison of a set of machine learning (ML) solutions to a defined clinical problem. Such a comparison requires a representative set of test images on which the ML algorithms are run, and evaluation standards against which those algorithms' results are measured. Our platform provides a scalable, fast, and intuitive way to host such challenges.

Specifically, we offer the following tools:

  • An easy way to create a challenge site and to add and edit pages, wiki-style
  • Registration mechanisms for participants
  • Secure ways for organizers to provide challenge data to participants and for participants to upload results
  • Mechanisms for participants to submit Algorithms as Docker containers
  • Automated evaluations of uploaded results or Algorithms
  • Automated leaderboard management, including ways to tabulate, sort, and visualize the results
  • The possibility to define multiple leaderboards, corresponding to different Phases for your Challenge
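To illustrate the Algorithm-submission workflow listed above, a submitted container typically reads cases from an input directory and writes predictions to an output directory, where the platform's evaluation picks them up. The paths, file extensions, and output layout below are assumptions for this sketch, not the platform's actual container interface:

```python
import json
from pathlib import Path

# Hypothetical I/O locations; the real interface is defined by the
# platform's documentation for Algorithm containers.
INPUT_DIR = Path("/input")
OUTPUT_DIR = Path("/output")


def predict(image_path: Path) -> float:
    """Placeholder model: a real algorithm would load the image here
    and run inference on it."""
    return 0.5


def main(input_dir: Path = INPUT_DIR, output_dir: Path = OUTPUT_DIR) -> Path:
    """Run the (dummy) model on every case and write one JSON result file."""
    output_dir.mkdir(parents=True, exist_ok=True)
    # One prediction per input case, keyed by filename.
    results = {p.name: predict(p) for p in sorted(input_dir.glob("*.mha"))}
    result_file = output_dir / "predictions.json"
    result_file.write_text(json.dumps(results, indent=2))
    return result_file
```

Packaged in a Docker image with this script as its entrypoint, the same container can later be re-run on new data, which is what keeps winning algorithms usable after a challenge ends.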

As a user, you can either participate in a Challenge or create your own Challenge.

Types of Challenges

Challenges can be categorized into several types depending on which parts of the training and test data, and their labels, are publicly available:

| Type | Training data & labels | Test data | Test labels | Participant's Artefact | Provided by Challenge Creators |
|------|------------------------|-----------|-------------|------------------------|--------------------------------|
| 0    | Open                   | Open      | Open        | Metrics                | —                              |
| 1    | Open                   | Open      | Closed      | Predictions            | Evaluation Method              |
| 2    | Open                   | Closed    | Closed      | Inference Algorithm    | Test data                      |
| 3    | Closed                 | Closed    | Closed      | Training Algorithm     | Training data                  |
  • LUNA16 is an example of a Type 0 Challenge, where the entire dataset, including the labels, was a large subset of the publicly available LIDC-IDRI dataset. The participants were therefore asked to submit their predictions using 10-fold cross-validation.
  • DSB2017 is an example of a Type 1 Challenge, where participants were expected to submit the predictions of a publicly available test set whose labels were hidden.
  • AIROGS and CoNIC 2022 are examples of Type 2 Challenges, where participants were expected to submit Docker containers as Challenge submissions. These algorithm containers were then automatically evaluated on a hidden test set by the infrastructure of our platform.
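The type taxonomy above can be restated as a small lookup: given which parts of the data are open, the type determines what participants must submit. The class and function names below are illustrative, not part of the platform:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChallengeType:
    """One row of the challenge-type table."""
    training_open: bool     # training data & labels public?
    test_data_open: bool    # test images public?
    test_labels_open: bool  # test labels public?
    artefact: str           # what participants submit


CHALLENGE_TYPES = {
    0: ChallengeType(True, True, True, "metrics"),
    1: ChallengeType(True, True, False, "predictions"),
    2: ChallengeType(True, False, False, "inference algorithm"),
    3: ChallengeType(False, False, False, "training algorithm"),
}


def classify(training_open: bool, test_data_open: bool, test_labels_open: bool) -> int:
    """Return the challenge type matching a data-availability pattern."""
    for type_id, spec in CHALLENGE_TYPES.items():
        if (spec.training_open, spec.test_data_open, spec.test_labels_open) == (
            training_open,
            test_data_open,
            test_labels_open,
        ):
            return type_id
    raise ValueError("no matching challenge type")
```

For example, a challenge with open training data but a fully hidden test set classifies as Type 2, whose participants submit an inference algorithm rather than predictions.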

Type 2 Challenges offer unique opportunities for the winning algorithms to remain reproducible, accessible, and live online on our platform beyond the life of a Challenge.

💡 Our platform does not yet support Type 3 Challenges; this functionality is on our roadmap. Some Type 2 challenges, such as NODE21 and STOIC2021, include a closed phase at the end in which the challenge organizers train algorithms, using code provided by the top-performing participants, on secret additional training data. This additional training takes place outside the platform and will inform how we best set up Type 3 challenges within our platform.

For a more in-depth explanation of what a challenge is and why it is useful, watch James Meakin's talk from one of our internal workshops: