Challenge setup

In a modern challenge, both the test data and the test labels are hidden. Participants submit an algorithm as their solution to the challenge. This algorithm is then run on the Grand Challenge platform against the hidden test set, which the challenge admins must upload as an archive. The results that the algorithm produces are evaluated using a custom evaluation method provided by the challenge admins. The evaluation produces a set of metrics, which are displayed on the leaderboard and used to rank submissions on specific criteria. See below for details on the underlying compute infrastructure.
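To make the evaluation step more concrete, here is a minimal sketch of what an evaluation method might do: compare each prediction with its hidden ground truth, aggregate the per-case scores, and produce a metrics document from which submissions could be ranked. The Dice metric, the function names, and the JSON layout are illustrative assumptions, not the platform's required format; the example evaluation method in your challenge pack remains the proper starting point.

```python
import json
from statistics import mean


def dice(pred, truth):
    """Dice overlap between two equally sized binary masks (flat lists of 0/1)."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same length")
    intersection = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * intersection / total


def evaluate(predictions, ground_truth):
    """Score every case and aggregate (hypothetical metrics layout)."""
    per_case = {
        case: dice(predictions[case], ground_truth[case])
        for case in ground_truth
    }
    return {
        "case": per_case,
        "aggregates": {"dice_mean": mean(per_case.values())},
    }


def rank(submissions):
    """Order submission names by mean Dice, best first."""
    return sorted(
        submissions,
        key=lambda name: submissions[name]["aggregates"]["dice_mean"],
        reverse=True,
    )


# Toy data standing in for the hidden labels and two submissions.
truth = {"case_1": [1, 1, 0, 0], "case_2": [0, 1, 1, 0]}
team_a = evaluate({"case_1": [1, 1, 0, 0], "case_2": [0, 1, 0, 0]}, truth)
team_b = evaluate({"case_1": [1, 0, 0, 0], "case_2": [0, 0, 0, 1]}, truth)

metrics_json = json.dumps(team_a, indent=2)  # what an evaluation might emit
leaderboard = rank({"team_a": team_a, "team_b": team_b})
```

In this sketch a prediction that is missing for a case raises a `KeyError`; a real evaluation method would have to decide how to score incomplete submissions.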

In the simplest, standard case, a challenge has one task and is carried out in two phases. The first phase is usually a preliminary phase where participants familiarize themselves with the algorithm submission system and test their algorithms on a small subset of images. From experience, we know that it takes participants a few attempts to get their algorithm containers right, so it is important and strongly recommended to have such a preliminary sanity-check phase. The second phase is the final test phase, often with a single submission policy, which evaluates the submitted algorithms on a larger test set. You could also think of the two phases as a qualification and a final phase, where you use the qualification phase to select participants for the second, final test phase, as was done by STOIC.

Set-up steps

To set up your algorithm submission challenge after your challenge has been accepted, you as a challenge organizer need to take the following steps:
  1. Define the input and output interfaces that the algorithms submitted to each of your phases take and produce. Check for suitable existing interfaces here and inform the support team which interfaces need to be configured for which phase of your challenge. If no suitable interfaces exist, the support team will create new interfaces for you. If you are unfamiliar with the concept of interfaces, please have a look here first.
  2. After the interfaces have been chosen, the support team will create a challenge pack GitHub repository for you with an example algorithm, an example evaluation method, and an archive upload script. You can find an example of a challenge pack here. The support team will also create an archive for each of your algorithm submission phases and share the links to those with you. You can then proceed to upload your secret test data to those archives. If your algorithms take a single image as input, it might be easiest to upload the data through our UI on the archive page itself. If your algorithms take complex inputs (e.g. an image together with a segmentation mask, or some metadata), we advise using our API client for uploading (the challenge pack contains an upload script for this purpose). Note that you upload only the secret test data to the archive, not the public training data and not the ground truth.
  3. With the data and the basic settings in place, you can then start working on an example baseline algorithm container as well as the evaluation container. You should take the example algorithm and evaluation containers in the challenge pack provided to you as a starting point.
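For complex inputs, it helps to group the files that belong to one archive item before handing them to the upload script. The sketch below pairs each image with its segmentation mask by file name. The directory layout, the `_mask` suffix, and the interface slugs are hypothetical, and the actual upload call is deliberately left to the script in your challenge pack.

```python
from pathlib import Path

# Hypothetical interface slugs; use the ones configured for your phase.
IMAGE_SLUG = "ct-image"
MASK_SLUG = "lung-mask"


def build_cases(image_names, mask_names, mask_suffix="_mask"):
    """Pair each image with its mask; fail loudly if a mask is missing."""
    masks = {Path(name).stem: name for name in mask_names}
    cases = []
    for image in sorted(image_names):
        key = Path(image).stem + mask_suffix
        if key not in masks:
            raise FileNotFoundError(f"no mask found for {image}")
        # One dictionary per archive item: interface slug -> file.
        cases.append({IMAGE_SLUG: image, MASK_SLUG: masks[key]})
    return cases


cases = build_cases(
    ["test/scan_002.mha", "test/scan_001.mha"],
    ["masks/scan_001_mask.mha", "masks/scan_002_mask.mha"],
)
```

Checking the pairing before uploading means a missing mask is caught locally rather than after a partial upload.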
In parallel to the above steps, be sure to take a look at our tips for organizing a challenge page and our frequently asked questions section. It might also help to take a look at previous challenges. Good examples are MIDOG, Airogs, CONIC, Tiger and Node21. MIDOG and Tiger have good example videos that show how the process works from a participant's perspective.

Compute infrastructure

If you host a challenge on our platform, all algorithm and evaluation containers are run on our AWS infrastructure, where storage and compute scale elastically on demand. The algorithms that participants submit to your challenge are run on each image in the archive that you linked to the respective phase. Container images are run on either a g4dn.xlarge instance (Nvidia T4, 16GB GPU memory, 4 CPUs, 16GB CPU memory) or a g4dn.2xlarge instance (Nvidia T4, 16GB GPU memory, 8 CPUs, 32GB CPU memory), depending on what the participant selects for their algorithm image. To prevent exfiltration of the test set, participants do not get access to the internet or to the logs. You as a challenge admin get access to the results and logs of each algorithm so that you can help your participants if their submissions fail.
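The two instance types above have the same GPU and differ only in CPU count and CPU memory, so the choice mostly comes down to host-side requirements. As an illustration, the following sketch encodes the specifications quoted above and picks the smallest instance that satisfies an algorithm's needs. The function name and the selection logic are assumptions made for this example, not platform behavior, and the memory actually available inside a container may be lower than the instance totals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instance:
    name: str
    gpu_memory_gb: int
    cpus: int
    cpu_memory_gb: int


# Specifications as quoted in the text above, smallest first.
INSTANCES = (
    Instance("g4dn.xlarge", gpu_memory_gb=16, cpus=4, cpu_memory_gb=16),
    Instance("g4dn.2xlarge", gpu_memory_gb=16, cpus=8, cpu_memory_gb=32),
)


def smallest_fitting(cpus_needed, cpu_memory_gb_needed, gpu_memory_gb_needed=0):
    """Return the first (smallest) instance meeting all requirements, or None."""
    for inst in INSTANCES:
        if (
            inst.cpus >= cpus_needed
            and inst.cpu_memory_gb >= cpu_memory_gb_needed
            and inst.gpu_memory_gb >= gpu_memory_gb_needed
        ):
            return inst
    return None


# An algorithm needing 4 CPUs but 24GB of CPU memory does not fit the smaller type.
choice = smallest_fitting(cpus_needed=4, cpu_memory_gb_needed=24)
```

A return value of `None` signals that the algorithm's requirements exceed both options and the algorithm itself would need to be slimmed down.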