GC-API for Algorithms
In this tutorial we will focus on how to interact with algorithms through the API.
Specifically, we will show you how to:
- Upload input to an algorithm for inference
- Download inference results from an algorithm
- Upload multiple inputs to an algorithm for inference
- Download the results of an algorithm that produces multiple outputs
Remember that you need to request access prior to using an algorithm. You do not need to request permission if you are using your own algorithm.
If you haven't installed gcapi yet, follow the instructions here.
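A typical installation from PyPI looks like this (assuming a standard Python environment; use a virtual environment if you prefer):
pip install gcapi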
Import necessary libraries
import gcapi
from pathlib import Path
from tqdm import tqdm
import SimpleITK as sitk
import numpy as np
import os
Authenticate to Grand Challenge using your personal API token.
# authorise with your personal token
my_personal_GC_API_token = 'my-personal-api-token'
client = gcapi.Client(token=my_personal_GC_API_token)
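As an optional hardening step, you can read the token from an environment variable instead of hard-coding it in your script. The variable name GC_API_TOKEN below is just an example, not something gcapi looks for itself:
# safer: read the token from an environment variable (the name is your choice)
my_personal_GC_API_token = os.environ["GC_API_TOKEN"]
client = gcapi.Client(token=my_personal_GC_API_token)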
1. Upload input to an algorithm for inference
In this section, we will use Pulmonary Lobe Segmentation by Weiyi Xie. This algorithm segments pulmonary lobes of a given chest CT scan. The algorithm uses a contextual two-stage U-Net architecture. We will use example chest CT scans from coronacases.org. They are anonymized.
First, we will retrieve the algorithm and inspect what inputs it expects.
# retrieve the algorithm, providing a slug
algorithm_1 = client.algorithms.detail(slug="pulmonary-lobe-segmentation")
# explore which inputs the algorithm expects
algorithm_1["inputs"]
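The inputs entry is a list of interface definitions. Printing just the slugs is usually enough, since those slugs are the keys you will need in the inputs dictionary later on (a short sketch; inspect the full objects if you need more detail):
# print the slug and kind of each input interface
for component_interface in algorithm_1["inputs"]:
    print(component_interface["slug"], "-", component_interface["kind"])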
Next, we will submit the inputs to the algorithm one by one.
Grand Challenge creates a job instance for each set of inputs. To create a job instance use the following command:
job = client.run_external_job(
    algorithm="slug-of-the-algorithm",
    inputs={"interface": [file]}
)
algorithm expects the slug of the algorithm you want to use as a string, and inputs expects a dictionary with interface slugs as keys and the corresponding input file paths/URLs as values.
⚠️ Be aware that with gcapi version 0.5.0 the input file path/URL needs to be placed inside a list.
# get the path to the files
files = ["io/case01.mha", "io/case02.mha"]
jobs = []
# submit a job for each file in your file list
for file in files:
    job = client.run_external_job(
        algorithm="pulmonary-lobe-segmentation",
        inputs={
            "generic-medical-image": [Path(file)]
        }
    )
    jobs.append(job)
After starting the algorithm jobs, we can inspect their status:
jobs = [client.algorithm_jobs.detail(job["pk"]) for job in jobs]
print([job["status"] for job in jobs])
After all of your jobs have ended with the status 'Succeeded', you can download the results.
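Rather than re-running the status check by hand, you can poll until every job reaches a terminal state. The helper below is a minimal sketch; the set of terminal status strings is an assumption based on the 'Succeeded' status above, so adjust it to the statuses your jobs actually report:
import time

def wait_for_jobs(client, jobs, interval=30):
    # poll the job statuses until every job has finished (sketch)
    while True:
        jobs = [client.algorithm_jobs.detail(job["pk"]) for job in jobs]
        statuses = [job["status"] for job in jobs]
        print(statuses)
        # assumed terminal states; verify against your own job output
        if all(s in ("Succeeded", "Failed", "Cancelled") for s in statuses):
            return jobs
        time.sleep(interval)

jobs = wait_for_jobs(client, jobs)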
You can also run the Algorithm on an existing Archive on Grand Challenge (if you have been granted access to it). For an example of how to do that, see section 4 below.
2. Download inference results from an algorithm
Each succeeded job lists its outputs. Below, we save each lobe segmentation next to its corresponding input file:
# loop through the input files and their jobs
for job, input_fname in tqdm(zip(jobs, files)):
    # loop through the job outputs
    for output in job["outputs"]:
        # check whether the output contains an image
        if output["image"] is not None:
            # get the image details
            image_details = client(url=output["image"])
            for file in image_details["files"]:
                # create the output filename
                output_file = Path(input_fname.replace(".mha", "_lobes.mha"))
                if output_file.suffix != ".mha":
                    raise ValueError("Output file needs to have .mha extension")
                output_file.parent.mkdir(parents=True, exist_ok=True)
                with output_file.open("wb") as fp:
                    # fetch the image from the url and write it to disk
                    response = client(url=file["file"], follow_redirects=True).content
                    fp.write(response)
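As a quick sanity check, you can load a downloaded segmentation with SimpleITK (imported above) and inspect its label values; you should typically see the background label plus one integer label per lobe:
# read a downloaded segmentation and list its unique labels
seg = sitk.ReadImage("io/case01_lobes.mha")
print(np.unique(sitk.GetArrayViewFromImage(seg)))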
3. Upload multiple inputs to an algorithm for inference
In this section we will take a look at how to upload multiple inputs to an algorithm on Grand Challenge. As an example we will use Alessa Hering's Deep Learning-Based Lung Registration algorithm.
This algorithm requires the following inputs:
- fixed image (CT)
- fixed mask (lungs segmentation)
- moving image (CT)
- moving mask (lungs segmentation)
In this case, all inputs are images, but bear in mind that other input types are possible; see Interfaces for an overview of the existing interfaces. We will reuse the scans from the previous section, as well as the lobe segmentations produced by the previous algorithm.
First, we have to binarize the lobe masks and create lung masks.
# provide the paths of the lobe segmentations
lobes = [
    "io/case01_lobes.mha",
    "io/case02_lobes.mha",
]
# loop through the files
for lobe_file in lobes:
    # read the image with SimpleITK
    lobe = sitk.ReadImage(lobe_file)
    origin, spacing, direction = lobe.GetOrigin(), lobe.GetSpacing(), lobe.GetDirection()
    lobe = sitk.GetArrayFromImage(lobe)
    # binarize: every lobe label becomes foreground
    lobe[lobe >= 1] = 1
    lungs = sitk.GetImageFromArray(lobe.astype(np.uint8))
    lungs.SetOrigin(origin)
    lungs.SetSpacing(spacing)
    lungs.SetDirection(direction)
    # write the lung mask next to the lobe file
    sitk.WriteImage(lungs, lobe_file.replace("_lobes", "_lungs"), True)
We can retrieve the algorithm, just like we did before:
# retrieve the algorithm
algorithm_2 = client.algorithms.detail(slug="deep-learning-based-ct-lung-registration")
# as a reminder, you can inspect the algorithm object to understand what kind of inputs it requires
algorithm_2["inputs"]
Now we are ready to start a new algorithm job with the required inputs.
# create a registration job with all four inputs
registration_job = client.run_external_job(
    algorithm="deep-learning-based-ct-lung-registration",
    inputs={
        "fixed-image": [Path("io/case01.mha")],
        "moving-image": [Path("io/case02.mha")],
        "fixed-mask": [Path("io/case01_lungs.mha")],
        "moving-mask": [Path("io/case02_lungs.mha")],
    }
)
Once the job has been started, we can inspect its status.
registration_job = client.algorithm_jobs.detail(registration_job["pk"])
registration_job["status"]
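If you defined the wait_for_jobs helper from section 1, you can reuse it here instead of re-running the status check by hand:
registration_job = wait_for_jobs(client, [registration_job])[0]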
When the job has finished running and ended with the status 'Succeeded', you can download the result(s).
# loop through the outputs
for output in registration_job["outputs"]:
    # get the image details
    image_details = client(url=output["image"])
    output_slug = output["interface"]["slug"]
    for file in image_details["files"]:
        output_file = Path(f"{output_slug}.mha")
        output_file.parent.mkdir(parents=True, exist_ok=True)
        with output_file.open("wb") as fp:
            fp.write(client(url=file["file"], follow_redirects=True).content)
⚠️ Note that both of these algorithms wrote .mha files as outputs. For algorithms that produce different output types, you can loop through the outputs of a successful job and inspect "interface", which tells you what kind of outputs you have to download.
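For example, the following short loop (a sketch) prints the interface slug and kind for every output of the registration job:
# list the interface slug and kind of every job output
for output in registration_job["outputs"]:
    print(output["interface"]["slug"], "-", output["interface"]["kind"])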
4. Download the results of an algorithm that produces multiple outputs
In this section we will focus on how to download results from an algorithm that produces multiple outputs. We will use the algorithm for pulmonary lobe segmentation of COVID-19 CT scans. This algorithm outputs a lobe segmentation for a given input, as well as a "screenshot" of a middle slice for rapid inspection of algorithm performance.
We again start with retrieving the algorithm and inspecting its inputs.
# retrieve the algorithm, providing a slug
algorithm_4 = client.algorithms.detail(slug="pulmonary-lobe-segmentation-for-covid-19-ct-scans")
# explore which inputs the algorithm expects
algorithm_4["inputs"]
This time, we will use images from an existing archive to pass to the algorithm as inputs.
# name of the archive
archive_name = "coronacases.org"
# local output directories; create them so the downloads below do not fail
output_archive_dir = "output_scans"
output_screenshots_dir = "output_screenshots"
os.makedirs(output_archive_dir, exist_ok=True)
os.makedirs(output_screenshots_dir, exist_ok=True)
archives = client(url="https://grand-challenge.org/api/v1/archives/")["results"]
corona_archive = None
for archive in archives:
    if archive["name"] == archive_name:
        corona_archive = archive
        break
if corona_archive is None:
    raise Exception("archive not found on GC")
# extract the image urls from the archive
params = {
    "archive": corona_archive["pk"],
}
response = client(url="https://grand-challenge.org/api/v1/cases/images/", params=params)
urls = []
for r in response["results"]:
    urls.append(r["api_url"])
Now we can submit the images from the archive to the algorithm:
jobs = []
# submit a job for each image url
for url in urls[:2]:
    job = client.run_external_job(
        algorithm="pulmonary-lobe-segmentation-for-covid-19-ct-scans",
        inputs={
            "ct-image": url
        }
    )
    jobs.append(job)
Let's check the status of the jobs.
jobs = [client.algorithm_jobs.detail(job["pk"]) for job in jobs]
print([job["status"] for job in jobs])
If the job status is 'Succeeded', we can proceed to downloading the results. In this part, we will go through a scenario where we no longer have the details of the particular jobs at hand, and instead look them up via the algorithm and its input images.
# get algorithm providing the slug
algorithm = "pulmonary-lobe-segmentation-for-covid-19-ct-scans"
algorithm_details = client(path="algorithms/", params={"slug": algorithm})
# extract details
algorithm_details = algorithm_details["results"][0]
algorithm_uuid = algorithm_details["pk"]
# Define dictionaries for image uuid mappings
images_mapping = {}
images_mapping_scans = {}
# get the desired archive
archives = client(url="https://grand-challenge.org/api/v1/archives/")["results"]
archive_name = 'coronacases.org'
target_archive = None
# loop through the archives and select the one named 'coronacases.org'
for archive in archives:
    if archive["name"] == archive_name:
        target_archive = archive
        break
if target_archive is None:
    raise Exception("archive not found on GC")
We have generated a set of outputs for a set of inputs, and now need to match each output to its input. Every image on Grand Challenge has a unique identifier (uuid), so we create a mapping between input image names and uuids, and collect the uuids into a list.
# get the uuids of the images in the archive, page by page
done = False
iteration = 0
image_uuids = []
# create a mapping between image uuids and input names
while not done:
    iteration += 1
    if iteration == 1:
        # get information about the images in the archive from the GC API
        params = {"archive": target_archive["pk"]}
        response = client(url="https://grand-challenge.org/api/v1/cases/images/", params=params)
    else:
        # get information about the images on the next page
        response = client(url=response["next"])
    images = response["results"]
    for image in images:
        # map the image uuid to readable output names
        uuid = image["pk"]
        images_mapping[uuid] = image["name"] + "_" + uuid
        images_mapping_scans[uuid] = image["name"]
        image_uuids.append(uuid)
    if response["next"] is None:
        # stop when there is no next page left
        done = True
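Depending on your gcapi version, the client may also expose a paginated iterator that replaces the manual while loop above. This is a sketch under that assumption; check whether client.images.iterate_all exists in your installation:
# alternative: let gcapi handle pagination (assumes client.images.iterate_all exists)
for image in client.images.iterate_all(params={"archive": target_archive["pk"]}):
    uuid = image["pk"]
    images_mapping[uuid] = image["name"] + "_" + uuid
    images_mapping_scans[uuid] = image["name"]
    image_uuids.append(uuid)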
Now, we will loop through the uuids and collect the algorithm job details corresponding to each unique image identifier.
# collect the algorithm results for each image
output_image_files = []
screenshot_files = []
counter = 0
# loop through the uuids
for image_uuid in image_uuids:
    params = {"algorithm_image__algorithm": algorithm_uuid, "input_image": image_uuid}
    # get the job details corresponding to this uuid and algorithm
    results = client.algorithm_jobs.iterate_all(params)
    # iterate through the results
    for result in results:
        counter += 1
        # iterate through the outputs
        for output in result["outputs"]:
            # go over the different interfaces and write the corresponding output
            if output["interface"]["slug"] == "pulmonary-lobes":
                image = client(url=output["image"])
                for file in image["files"]:
                    if file["image_type"] == "MHD":
                        new_file = file["file"]
                        output_image_files.append(new_file)
                        dest_path_mha = os.path.join(output_archive_dir, images_mapping_scans[image_uuid])
                        with open(dest_path_mha, "wb") as f1:
                            response_1 = client(url=new_file, follow_redirects=True)
                            f1.write(response_1.content)
            if output["interface"]["slug"] == "pulmonary-lobes-screenshot":
                image = client(url=output["image"])
                for file in image["files"]:
                    if file["image_type"] == "TIFF":
                        new_file = file["file"]
                        screenshot_files.append(new_file)
                        dest_path = os.path.join(output_screenshots_dir, images_mapping[image_uuid] + ".tif")
                        with open(dest_path, "wb") as f1:
                            response_2 = client(url=new_file, follow_redirects=True)
                            f1.write(response_2.content)