Using the API for algorithms¶
In this tutorial we will focus on how to interact with algorithms through the API. Specifically, we will show you how to:
- Upload input to an algorithm for inference
- Download inference results from an algorithm
- Upload multiple inputs to an algorithm for inference
- Download the results of an algorithm that produces multiple outputs
Remember that you need to request access prior to using an algorithm. You do not need to request permission if you are using your own algorithm.
If you haven't installed gcapi yet, follow the instructions here.
Import necessary libraries
```python
import gcapi
from pathlib import Path
from tqdm import tqdm
import SimpleITK as sitk
import numpy as np
```
Authenticate to Grand Challenge using your personal API token.
```python
# authorise with your personal token
my_personal_gcapi_token = 'my-personal-gcapi-token'
client = gcapi.Client(token=my_personal_gcapi_token)
```
Upload input to an algorithm for inference¶
In this section, we will use Pulmonary Lobe Segmentation by Weiyi Xie. This algorithm segments the pulmonary lobes of a given chest CT scan using a contextual two-stage U-Net architecture. We will use anonymized example chest CT scans from coronacases.org.
First, we will retrieve the algorithm and inspect what inputs it expects.
```python
# retrieve the algorithm, providing a slug
algorithm_1 = client.algorithms.detail(slug="pulmonary-lobe-segmentation")
# explore which inputs the algorithm expects
algorithm_1["inputs"]
```
Next, we will submit the inputs to the algorithm one by one. Grand Challenge creates a job instance for each set of inputs. To create a job instance, use the following command:
```python
job = client.run_external_job(
    algorithm="slug-of-the-algorithm",
    inputs={
        "interface": [file]
    }
)
```
`algorithm` expects the slug of the algorithm you want to use as a string, and `inputs` expects a dictionary with interface slugs as keys and the corresponding input file paths/urls as values.
⚠️ Be aware that with gcapi version 0.5.0 the input file path/url needs to be placed into a list.
```python
# get the paths to the input files
files = ["io/case01.mha", "io/case02.mha"]

jobs = []
# submit a job for each file in your file list
for file in files:
    job = client.run_external_job(
        algorithm="pulmonary-lobe-segmentation",
        inputs={
            "generic-medical-image": [Path(file)]
        }
    )
    jobs.append(job)
```
After starting the algorithm jobs, we can inspect their status:
```python
jobs = [client.algorithm_jobs.detail(job["pk"]) for job in jobs]
print([job["status"] for job in jobs])
```
After all of your jobs have ended with the status 'Succeeded', you can download the results.
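If you want a script to wait for this automatically, a minimal polling sketch like the one below works. It assumes that 'Succeeded', 'Failed' and 'Cancelled' are the terminal job statuses, and the 30-second interval is an arbitrary choice:

```python
import time

# poll the job statuses until every job has reached a terminal state
# (assumes "Succeeded", "Failed" and "Cancelled" are terminal)
while True:
    jobs = [client.algorithm_jobs.detail(job["pk"]) for job in jobs]
    statuses = [job["status"] for job in jobs]
    if all(s in ("Succeeded", "Failed", "Cancelled") for s in statuses):
        break
    time.sleep(30)
print(statuses)
```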
You can also run the algorithm on an existing Archive on Grand Challenge (if you have been granted access to it). For an example of how to do that, go to section 4 of this tab.
Download inference results from an algorithm¶
```python
# loop through input files
for job, input_fname in tqdm(zip(jobs, files)):
    # loop through job outputs
    for output in job["outputs"]:
        # check whether the output contains an image
        if output["image"] is not None:
            # get image details
            image_details = client(url=output["image"])
            for file in image_details["files"]:
                # create the output filename
                output_file = Path(input_fname.replace(".mha", "_lobes.mha"))
                if output_file.suffix != ".mha":
                    raise ValueError("Output file needs to have .mha extension")
                output_file.parent.mkdir(parents=True, exist_ok=True)
                with output_file.open("wb") as fp:
                    # get the image from its url and write it to disk
                    response = client(
                        url=file["file"], follow_redirects=True
                    ).content
                    fp.write(response)
```
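As a quick sanity check, you can read one of the downloaded segmentations back with SimpleITK (imported at the top of this tutorial) and look at the labels it contains; the filename below assumes the `_lobes` naming scheme used in the loop above:

```python
# sanity check on one downloaded segmentation
lobes_image = sitk.ReadImage("io/case01_lobes.mha")
lobes_array = sitk.GetArrayFromImage(lobes_image)
print("image size:", lobes_image.GetSize())
print("labels present:", np.unique(lobes_array))
```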
Upload multiple inputs to an algorithm for inference¶
In this section we will take a look at how to upload multiple inputs to an algorithm on Grand Challenge. As an example we will use Alessa Hering's Deep Learning-Based Lung Registration algorithm.
This algorithm requires the following inputs:
- fixed image (CT)
- fixed mask (lungs segmentation)
- moving image (CT)
- moving mask (lungs segmentation)
In this case, all inputs are images. Bear in mind that other input types are possible, see Interfaces for an overview of existing interfaces. We will use the scans from the previous section as well as the algorithm output of the previous algorithm (lung lobes segmentation) in this section.
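If you prefer to explore the available interfaces programmatically, a sketch like the following should work; note that the `components/interfaces/` endpoint and the `kind` field are assumptions based on the public Grand Challenge API, so double-check against the Interfaces overview page:

```python
# a minimal sketch, assuming the public components/interfaces endpoint
# and its "slug"/"kind" fields (verify against the Interfaces overview)
response = client(url="https://grand-challenge.org/api/v1/components/interfaces/")
for interface in response["results"]:
    print(interface["slug"], "->", interface["kind"])
```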
First, we have to binarize the lobe masks and create lung masks.
```python
# provide paths of the lobe segmentations
lobes = [
    "io/case01_lobes.mha",
    "io/case02_lobes.mha",
]
# loop through the files
for lobe_file in lobes:
    # read the image with sitk
    lobe = sitk.ReadImage(lobe_file)
    origin = lobe.GetOrigin()
    spacing = lobe.GetSpacing()
    direction = lobe.GetDirection()
    lobe = sitk.GetArrayFromImage(lobe)
    # binarize
    lobe[lobe >= 1] = 1
    lungs = lobe.astype(np.uint8)
    lungs = sitk.GetImageFromArray(lungs)
    lungs.SetOrigin(origin)
    lungs.SetSpacing(spacing)
    lungs.SetDirection(direction)
    # write the lung mask to a new file
    # (replacing "_lobes" with "_lungs" in the filename)
    sitk.WriteImage(lungs, lobe_file.replace("_lobes", "_lungs"), True)
```
We can retrieve the algorithm, just like we did before:
```python
# retrieve the algorithm
algorithm_2 = client.algorithms.detail(
    slug="deep-learning-based-ct-lung-registration")
# as a reminder, you can inspect the algorithm object
# to understand what kind of inputs it requires
algorithm_2["inputs"]
```
Now we are ready to start a new algorithm job with the required inputs.
```python
# create a job
registration_job = client.run_external_job(
    algorithm="deep-learning-based-ct-lung-registration",
    inputs={
        "fixed-image": [Path("io/case01.mha")],
        "moving-image": [Path("io/case02.mha")],
        "fixed-mask": [Path("io/case01_lungs.mha")],
        "moving-mask": [Path("io/case02_lungs.mha")],
    }
)
```
Once the job has been started, we can inspect its status.
```python
registration_job = client.algorithm_jobs.detail(registration_job["pk"])
registration_job["status"]
```
When the job has finished running and ended with the status 'Succeeded', you can download the result(s).
```python
# loop through the outputs
for output in registration_job["outputs"]:
    # get image details
    image_details = client(url=output["image"])
    output_slug = output["interface"]["slug"]
    for file in image_details["files"]:
        output_file = Path(f"{output_slug}.mha")
        output_file.parent.mkdir(parents=True, exist_ok=True)
        with output_file.open("wb") as fp:
            fp.write(client(url=file["file"], follow_redirects=True).content)
```
⚠️ Note that both of these algorithms wrote `.mha` files as outputs. For algorithms that produce other kinds of outputs, you can loop through the outputs of a successful job and look under `"interface"`, which will tell you what kind of outputs you will have to download.
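For instance, a minimal sketch along those lines, reusing the registration job from above (the exact fields on an output may vary, so treat this as illustrative):

```python
# list the interface slug and kind of every output of a finished job
for output in registration_job["outputs"]:
    interface = output["interface"]
    print(interface["slug"], "->", interface.get("kind", "unknown"))
```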
Download the results of an algorithm that produces multiple outputs¶
In this section we will focus on how to download results from an algorithm that produces multiple outputs. We will use the algorithm for pulmonary lobe segmentation of Covid-19 cases. This algorithm outputs the segmentation for a particular input as well as a "screenshot" of a middle slice for rapid inspection of algorithm performance.
We again start with retrieving the algorithm and inspecting its inputs.
```python
# retrieve the algorithm, providing a slug
algorithm_4 = client.algorithms.detail(
    slug="pulmonary-lobe-segmentation-for-covid-19-ct-scans")
# explore which inputs the algorithm expects
algorithm_4["inputs"]
```
This time, we will use images from an existing archive to pass to the algorithm as inputs.
```python
# name of the archive
archive_name = "coronacases.org"
# save paths on your machine
output_archive_dir = 'output_scans'
outputarchivedir_screenshots = 'output_screenshots'

archives = client(url="https://grand-challenge.org/api/v1/archives/")["results"]
corona_archive = None
for archive in archives:
    if archive["name"] == archive_name:
        corona_archive = archive
        break
if corona_archive is None:
    raise Exception("archive not found on GC")

# extract image urls
params = {
    'archive': corona_archive["pk"],
}
response = client(url="https://grand-challenge.org/api/v1/cases/images/",
                  params=params)
urls = []
for r in response['results']:
    urls.append(r['api_url'])
```
Now we can submit the images from the archive to the algorithm:
```python
jobs = []
# submit a job for each of the first two images in the archive
for url in urls[:2]:
    job = client.run_external_job(
        algorithm="pulmonary-lobe-segmentation-for-covid-19-ct-scans",
        inputs={
            "ct-image": url
        }
    )
    jobs.append(job)
```
Let's check the status of the jobs.
jobs = [client.algorithm_jobs.detail(job["pk"]) for job in jobs] print([job["status"] for job in jobs])
If the jobs ended with the status 'Succeeded', we can proceed to downloading the results. In this part, we will go through a scenario where we no longer have the job objects at hand and instead look up the jobs for a particular algorithm and its inputs.
```python
# get the algorithm, providing the slug
algorithm = "pulmonary-lobe-segmentation-for-covid-19-ct-scans"
algorithm_details = client(path="algorithms/", params={"slug": algorithm})
# extract details
algorithm_details = algorithm_details["results"][0]
algorithm_uuid = algorithm_details["pk"]

# define dictionaries for image uuid mappings
images_mapping = {}
images_mapping_scans = {}

# get the desired archive
archives = client(url="https://grand-challenge.org/api/v1/archives/")["results"]
archive_name = 'coronacases.org'
target_archive = None
# loop through archives and select the one with the name you are looking for
# (e.g., 'coronacases.org')
for archive in archives:
    if archive["name"] == archive_name:
        target_archive = archive
        break
```
We have generated a set of outputs for a set of inputs, so we now need to find out which output corresponds to which input. This can be done via the unique identifier (uuid) that each image in an archive has. Here we create a mapping between input image names and uuids, and collect the uuids into a list.
```python
# get uuids in the archive
done = False
iteration = 0
image_uuids = []
# create a mapping between image uuids and input names
while not done:
    iteration += 1
    if iteration == 1:
        # get information about the images in the archive from the GC API
        params = {'archive': target_archive['pk']}
        response = client(url="https://grand-challenge.org/api/v1/cases/images/",
                          params=params)
    else:
        # get information about the images on the next page
        response = client(url=response["next"])
    images = response['results']
    for image in images:
        # create a mapping for image uuids
        uuid = image['pk']
        images_mapping[uuid] = image['name'] + "_" + uuid
        images_mapping_scans[uuid] = image['name']
        image_uuids += [uuid]
    if response["next"] is None:
        # stop if there is no next page left
        done = True
```
Now, we will loop through the uuids and collect the algorithm job details corresponding to each unique image identifier.
```python
# get algorithm results for the images
output_image_files = []
screenshot_files = []

# make sure the output directories exist
Path(output_archive_dir).mkdir(parents=True, exist_ok=True)
Path(outputarchivedir_screenshots).mkdir(parents=True, exist_ok=True)

# loop through the uuids
for image_uuid in image_uuids:
    params = {'algorithm_image__algorithm': algorithm_uuid,
              'input_image': image_uuid}
    # get the job details corresponding to a particular uuid and algorithm
    results = client.algorithm_jobs.iterate_all(params)
    # iterate through the results
    for result in results:
        # iterate through the outputs
        for output in result['outputs']:
            # go over the different interfaces and write the corresponding output
            if output["interface"]["slug"] == "pulmonary-lobes":
                image = client(url=output['image'])
                for file in image["files"]:
                    if file['image_type'] == "MHD":
                        new_file = file['file']
                        output_image_files += [new_file]
                        dest_path_mha = Path(
                            output_archive_dir,
                            images_mapping_scans[image_uuid]
                        )
                        with open(dest_path_mha, 'wb') as f1:
                            response_1 = client(url=new_file,
                                                follow_redirects=True)
                            f1.write(response_1.content)
            if output["interface"]["slug"] == "pulmonary-lobes-screenshot":
                image = client(url=output['image'])
                for file in image["files"]:
                    if file['image_type'] == "TIFF":
                        new_file = file['file']
                        screenshot_files += [new_file]
                        dest_path = Path(
                            outputarchivedir_screenshots,
                            images_mapping[image_uuid]
                        ).with_suffix('.tif')
                        with open(dest_path, 'wb') as f1:
                            response_2 = client(url=new_file,
                                                follow_redirects=True)
                            f1.write(response_2.content)
```
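As a final check, you can verify how many files were written; this simply reuses the lists collected in the loop above:

```python
# verify how many outputs were downloaded
print(f"downloaded {len(output_image_files)} segmentations and "
      f"{len(screenshot_files)} screenshots")
```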