Create an Algorithm container

Published 11 Feb. 2021

This post takes you through the steps to serve your algorithm in a Docker container compatible with the requirements of grand-challenge.org. In particular, we will focus on creating a Docker container to perform inference with your algorithm on test data.

Wrapping your algorithm in a Docker container ensures that the environment needed to run it is packaged along with the algorithm itself. This is the fastest and most reliable way to make an algorithm reproducible.

Table of contents

  1. Prerequisites
  2. Creating an Algorithm container
  3. Building and testing the container
  4. Exporting the container
  5. Video tutorial
  6. Flexible inputs and outputs
  7. Contact

1. Prerequisites

πŸ’‘ Windows tip: We highly recommend installing Windows Subsystem for Linux (WSL) so that you can work with Docker in a Linux environment within Windows. Make sure to install WSL 2 by following the instructions on the same page. In this tutorial, we used WSL 2 with Ubuntu 18.04 LTS. Note that the basic version of WSL 2 does not come with GPU support; please watch the official tutorial by Microsoft on installing WSL 2 with GPU support. The alternative is to work directly in Ubuntu, or any other flavor of Linux.


2. Creating an Algorithm container

In this post, we will build an Algorithm container for a U-Net that segments retinal blood vessels from the DRIVE Challenge. We used evalutils to create the Algorithm container for the DRIVE challenge, and our scripts can be found here: https://github.com/DIAGNijmegen/drive-vessels-unet/tree/master/vesselSegmentor

The image below shows the output of a very simple U-Net that segments vessels.

To start the process, let's clone the repository that contains the weights from a pre-trained model and the Python scripts to run inference on a new fundus image.

$ git clone https://github.com/DIAGNijmegen/drive-vessels-unet.git

2.1 Create base repository using evalutils

Evalutils provides methods to wrap your algorithm in Docker containers. Just execute the following command in a terminal of your choice:

$ evalutils init algorithm VesselSegmentationContainer

Here, VesselSegmentationContainer is the custom name we have given our algorithm container for this walkthrough. This will create a templated repository with a Dockerfile.

You will then be asked to choose your algorithm type and computational requirements: the kind of algorithm, the number of CPUs and GPUs it needs, and how much CPU RAM and GPU memory it requires. The remaining questions can be answered with the defaults.

Evalutils automatically generates the scripts for your container files, along with commands for building, testing, and exporting the algorithm container.

VesselSegmentationContainer/
β”œβ”€β”€ Dockerfile
β”œβ”€β”€ README.md
β”œβ”€β”€ build.bat
β”œβ”€β”€ build.sh
β”œβ”€β”€ export.bat
β”œβ”€β”€ export.sh
β”œβ”€β”€ process.py
β”œβ”€β”€ requirements.txt
β”œβ”€β”€ test
β”‚   β”œβ”€β”€ 1.0.000.000000.0.00.0.0000000000.0000.0000000000.000.mhd
β”‚   β”œβ”€β”€ 1.0.000.000000.0.00.0.0000000000.0000.0000000000.000.zraw
β”‚   └── expected_output.json
β”œβ”€β”€ test.bat
└── test.sh

πŸ“ŒNOTE: The .bat files are designed for Windows, while the .sh files are designed for Linux users.

2.2 Bring your own Algorithm in process.py

The next step is to edit process.py. This is the file where you will extend the Algorithm class of evalutils and implement your inference algorithm. In this file, a new class, VesselSegmentationContainer, has been created for you, and it is instantiated and run with:

if __name__ == "__main__":
    VesselSegmentationContainer().process()

You can build on top of this template and write your algorithm into the repository. The default algorithm generated by evalutils does simple binary thresholding of the input image. You'll have to replace that with your custom algorithm.
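For orientation, the templated predict step is essentially a binary threshold. The snippet below reproduces that behaviour with NumPy rather than SimpleITK, purely as an illustration of what you will be replacing with your own model inference (the threshold value here is chosen for illustration):

```python
import numpy as np

def default_predict(image: np.ndarray, lower_threshold: int = 2) -> np.ndarray:
    # Mimic the templated algorithm: everything above the threshold
    # becomes foreground (1), everything else background (0).
    return (image > lower_threshold).astype(np.uint8)

# A toy 2x3 "image" standing in for a fundus photograph
image = np.array([[0, 1, 2], [3, 4, 5]])
mask = default_predict(image)
print(mask.tolist())  # [[0, 0, 0], [1, 1, 1]]
```

In your own process.py, this thresholding step is where you load your trained model and run it on the input image instead.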

For example, you can see how we have edited process.py for the DRIVE Algorithm based on inference.py in this version of process.py.

2.3 Configuring the Dockerfile

Ensure that you use the right base image in your Dockerfile. For our U-Net, we will build our container on top of the official PyTorch Docker image, which takes care of installing PyTorch with the necessary CUDA environment inside the container. If you're using TensorFlow, build on the official TensorFlow base image instead. You can browse Docker Hub to find your preferred base image. The base image is specified in the first line of your Dockerfile:

FROM pytorch/pytorch

πŸ”© Copying your model weights into the Docker

Ensure that you copy all the files needed to run your scripts, including the model weights, into /opt/algorithm. This can be configured in the Dockerfile using the COPY command. If your model weights are stored in a "checkpoints" folder, first copy that folder into the repository that contains the Dockerfile, then add this line to the Dockerfile so that the checkpoints are copied into the container when the image is built.

COPY --chown=algorithm:algorithm checkpoints/ /opt/algorithm/checkpoints/
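Putting these pieces together, the relevant parts of the Dockerfile end up looking roughly like this. The exact template varies between evalutils versions, so treat this as a sketch rather than the generated file; the file names match our example:

```dockerfile
FROM pytorch/pytorch

# The evalutils template runs the algorithm as a non-root user
RUN groupadd -r algorithm && useradd -m --no-log-init -r -g algorithm algorithm
RUN mkdir -p /opt/algorithm /input /output \
    && chown algorithm:algorithm /opt/algorithm /input /output
USER algorithm
WORKDIR /opt/algorithm

# Install the pinned dependencies first so this layer is cached
COPY --chown=algorithm:algorithm requirements.txt /opt/algorithm/
RUN python -m pip install --user -r requirements.txt

# Copy the inference script and the model weights into the container
COPY --chown=algorithm:algorithm process.py /opt/algorithm/
COPY --chown=algorithm:algorithm checkpoints/ /opt/algorithm/checkpoints/

ENTRYPOINT ["python", "-m", "process"]
```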

πŸ“Configuring requirements.txt

Ensure that all of the dependencies with their versions are specified in requirements.txt as shown in the example below:

evalutils==0.2.4
scikit-learn==0.20.2
scipy==1.2.1
monai==0.4.0
scikit-image==0.18.1

Note that we haven't included torch, as it already comes with the PyTorch base image we specified in our Dockerfile in the previous step.

3. πŸ”¨Building and πŸ§ͺtesting the container

Once your scripts are ready, you can build the container by calling ./build.sh if you're in a Linux environment, or build.bat if you're using a Windows environment like Command Prompt or Anaconda Prompt.
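Under the hood, build.sh is a thin wrapper around docker build. The generated script may differ between evalutils versions, but it is roughly equivalent to:

```shell
#!/usr/bin/env bash
# Build the container image from the directory that holds the Dockerfile.
# Docker image tags must be lowercase, so the tag is derived from the
# lowercased folder name.
docker build -t vesselsegmentationcontainer "$(dirname "$0")"
```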

⚠️ Automated testing is still limited in its functionality. Evalutils does not provide functions for creating a test suite yet, but you can extend the generated repository to build your own test suite if necessary.

Place your test images and an expected_output.json file in the test/ folder. In our case, we have replaced the default .mhd and .zraw files with 1000.0.mha.

test
β”œβ”€β”€ 1000.0.mha
└── expected_output.json  

We also modified our expected_output.json as shown below:

[
    {
        "outputs": [
            {
                "type": "metaio_image",
                "filename": "1000.0.mha"
            }
        ],
        "inputs": [
            {
                "type": "metaio_image",
                "filename": "1000.0.mha"
            }
        ],
        "error_messages": []
    }
]

To test your container, run ./test.sh if you're in a Linux environment, or test.bat if you're in Windows. If the test was successful, the script will report that the container's output matches expected_output.json.

⚠️ Docker Desktop for Windows does not have GPU support. You need a bleeding-edge version of Windows 10 and WSL 2 to test your Docker container locally in GPU mode. Note that this does not compromise your building and exporting processes; it only hampers local testing. You can avoid the issue by testing in CPU mode.

πŸ’‘ Our modifications from the templated repository generated by evalutils can be found here: https://github.com/DIAGNijmegen/drive-vessels-unet/tree/master/vesselSegmentor


4. Exporting the container

To export the container, run ./export.sh if you're in a Linux environment, or export.bat if you're in Windows, and wait a few minutes while the container image is saved and compressed into a single file.
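Conceptually, the export script serializes the built image and compresses it into one uploadable archive. It does something like the following (the exact script contents and file name may differ between evalutils versions):

```shell
#!/usr/bin/env bash
# Serialize the built image and gzip it into a single file for upload.
docker save vesselsegmentationcontainer | gzip -c > VesselSegmentationContainer.tar.gz
```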

Uploading to grand-challenge.org

To create an Algorithm on grand-challenge.org, go to this page and click + Add a new algorithm to create your algorithm page. If you're unable to see this button, please write to us.

Create the basic page by answering the questionnaire about your algorithm. Once the page is created, you can upload the .tar.gz file by clicking Containers β†’ Upload a Container.


5. Video tutorial

The video below provides a walkthrough of the entire process for creating an algorithm container for the DRIVE Challenge using evalutils.

Clone the repository containing the U-Net that segments retinal blood vessels from the DRIVE Challenge and then watch the video.

$ git clone https://github.com/DIAGNijmegen/drive-vessels-unet.git

6. Flexible inputs and outputs

Evalutils currently supports the creation of only three types of algorithms out of the box: classification, segmentation, and detection. Likewise, support for flexible inputs and outputs is not built in. However, it is entirely possible to take the automatically generated repository for, say, a classification algorithm and adapt it to a registration problem.

It is also possible to customize the repository to support flexible inputs and outputs. For example, you may want to register a moving image to a fixed image, in which case you need two inputs. Or you may want to write both a binary segmentation file and an image file containing the predicted probabilities. This can be done by inheriting from the evalutils Algorithm class and overriding its process function. We are planning to add this feature to evalutils soon; please contact us if you need support in the meantime.
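As a structural sketch of what such an override does, the function below reads two inputs and writes two outputs for a hypothetical registration task. The file names are invented for illustration, and plain byte I/O stands in for the SimpleITK image handling a real evalutils algorithm would use:

```python
import json
import tempfile
from pathlib import Path

def process(input_dir: Path, output_dir: Path) -> dict:
    # Read two inputs: a fixed and a moving image (file names are
    # hypothetical; a real algorithm would override Algorithm.process()
    # and load the images with SimpleITK instead).
    fixed = (input_dir / "fixed_image.mha").read_bytes()
    moving = (input_dir / "moving_image.mha").read_bytes()

    # ... run the actual registration here; this sketch just passes
    # the moving image through unchanged ...
    warped = moving

    # Write two outputs: the warped image and the transform parameters.
    (output_dir / "warped_image.mha").write_bytes(warped)
    transform = {"translation": [0.0, 0.0]}  # placeholder result
    (output_dir / "transform.json").write_text(json.dumps(transform))
    return transform

# Exercise the sketch with dummy files in temporary directories
with tempfile.TemporaryDirectory() as inp, tempfile.TemporaryDirectory() as out:
    inp, out = Path(inp), Path(out)
    (inp / "fixed_image.mha").write_bytes(b"fixed")
    (inp / "moving_image.mha").write_bytes(b"moving")
    result = process(inp, out)
    written = sorted(p.name for p in out.iterdir())
```

On grand-challenge.org, the inputs arrive under /input and the outputs are expected under /output; the directory arguments above make the sketch self-contained for local experimentation.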


7. Contact

For contact and support, please email kiranvaidhya.venkadesh@radboudumc.nl and cc support@grand-challenge.org.

Icons made by Freepik from www.flaticon.com