GC-API for Archives



In this tutorial we will focus on archives and go over how to upload and download cases from an archive on our platform.

Remember that you need to request access prior to using a particular archive. You do not need to request permission if you are using your own archive.

If you haven't installed gcapi yet, follow the instructions here.

Import necessary libraries:

import gcapi
import os
from pathlib import Path


Authenticate to Grand Challenge using your personal API token.

# authorize with your personal token
token = 'my-personal-api-token'
client = gcapi.Client(token=token)


Downloading cases from Archives on Grand Challenge

In this part we will download cases from the coronacases.org dataset on Grand Challenge.

First, let's search for the archive by its name (not its slug!):

# name of the archive 
archive_name = "coronacases.org"
# save path on your machine (raw string, so backslashes are not read as escape sequences)
output_archive_dir = r'path\to\where\to\save\the\data\to\your\machine'

# iterate over all pages of results, not just the first one
corona_archive = None
for archive in client.archives.iterate_all():
    if archive["name"] == archive_name:
        corona_archive = archive
        break
if corona_archive is None:
    raise Exception("archive not found on GC")


To download cases, do the following:

# Get information about all images in the archive from the GC API (all result pages)
images = list(client.images.iterate_all(params={'archive': corona_archive['pk']}))

# Create image mapping, from image URL to original image mha name
images_mapping = {}
for image in images:
    # get name of mha image from GC API
    images_mapping[image["files"][0]["file"]] = image["name"]

print("Downloading {0} images..".format(len(images_mapping)))

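# make sure the output directory exists before writing to it
os.makedirs(output_archive_dir, exist_ok=True)

counter = 0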
for file, name in images_mapping.items():
    response = client(url=file, follow_redirects=True)
    response.raise_for_status()
    with open(os.path.join(output_archive_dir, name), 'wb') as f1:
        f1.write(response.content)
    counter += 1
    print(counter)


Uploading cases to an archive and editing archive items on Grand Challenge

Images that you upload to an archive on Grand Challenge are stored as archive items. In the simplest case, an archive item consists of just one medical image. Archive items, however, also allow you to store metadata or additional images, like an overlay, along with each image. An archive item could, for example, consist of a medical image and a segmentation map for the image, a specific disease probability score as metadata, or a combination of those.

In this section, we will go over how to upload images to an archive, as well as how to edit the resulting archive items to add metadata or additional files to them.

Uploading images to an archive on Grand Challenge

First, prepare the list of files for each image you want to upload.

files = [f.resolve() for f in Path("/path/to/files").iterdir()]


To upload these files to an archive, you need to provide the slug of the archive you want to upload to. You can find the slug in the URL of the archive. For example, if you would like to upload to the archive at https://grand-challenge.org/archives/radboudcovid/, you would provide "radboudcovid" as the slug. Note that the slug is case-sensitive.

You can also optionally provide an interface slug for your images. If you do not provide an interface slug, all images will be uploaded and stored as generic medical images. We recommend using specific interfaces whenever possible. For a list of possible interfaces, go here. If your desired interface does not yet exist, please email support@grand-challenge.org with a title and description to add it to the list. In the example below, we upload and store the files as CT images (i.e., with the ct-image interface).

session = client.upload_cases(files=files, archive="radboudcovid", interface="ct-image")


⚠️ Use one session per group of files that constitute an image: that way the conversion of the files can happen in parallel.
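
As a rough sketch of what "one session per image" could look like, the helper below groups files by their stem and starts one upload per group. The grouping rule (files that share a stem, such as scan1.mhd and scan1.zraw, belong to the same image) and both helper names are assumptions made for illustration; only the upload_cases call with its files, archive and interface arguments is taken from this tutorial.

```python
from collections import defaultdict
from pathlib import Path


def group_files_by_image(files):
    """Group paths that share a stem (assumed to belong to one image)."""
    groups = defaultdict(list)
    for path in sorted(Path(f) for f in files):
        groups[path.stem].append(path)
    return dict(groups)


def upload_one_session_per_image(client, files, archive, interface):
    """Start one upload session per group of files; return the sessions."""
    sessions = []
    for image_files in group_files_by_image(files).values():
        sessions.append(
            client.upload_cases(
                files=image_files, archive=archive, interface=interface
            )
        )
    return sessions


# Hypothetical usage with the client and files defined earlier:
# sessions = upload_one_session_per_image(
#     client, files, archive="radboudcovid", interface="ct-image"
# )
```

If your files follow a different naming scheme, adapt the grouping rule accordingly.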

The upload_cases call starts a session that converts your files and, once the conversion has succeeded, adds the standardized images to the selected archive as separate archive items. You can refresh the session object with

session = client(url=session["api_url"])


and check the session status with

session["status"]
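
Refreshing and checking the session by hand can be wrapped in a small polling loop. The sketch below is one possible way to do that, not part of gcapi: wait_for_session is a hypothetical helper, and the status strings "Succeeded", "Failed" and "Cancelled" are assumptions about what the API reports, so check them against the values you actually see in session["status"].

```python
import time


def wait_for_session(
    fetch_session,
    poll_seconds=5.0,
    timeout_seconds=600.0,
    done_statuses=("Succeeded", "Failed", "Cancelled"),  # assumed values
):
    """Call fetch_session() until it reports a final status or times out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        session = fetch_session()
        if session["status"] in done_statuses:
            return session
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "session still has status {!r}".format(session["status"])
            )
        time.sleep(poll_seconds)


# Hypothetical usage with the client and session from above:
# api_url = session["api_url"]
# session = wait_for_session(lambda: client(url=api_url))
```

The helper takes a callable rather than the client itself, so it only relies on the refresh call shown above.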


Once the session has completed successfully, you can retrieve the created archive items with:

archive = next(client.archives.iterate_all(params={"slug": "radboudcovid"}))
items = list(client.archive_items.iterate_all(params={"archive": archive["pk"]}))


For each archive item in the list of items, you will get the following information:

  { 
      "pk": "...", # the pk (primary key field) of the archive item
      "archive":"https://gc.localhost/api/v1/archives/.../", # the url to the archive that the item belongs to
       # a list of all values attached to the archive item
       # in this case each item has exactly one value, the CT image you uploaded in the previous step
      "values":[ 
         { 
            "interface":{ 
               "title":"CT Image", # the interface kind specified during image upload
               "description":"Any CT image", 
               "slug":"ct-image", 
               "kind":"Image", 
               "pk":..., 
               "default_value":"None", 
               "super_kind":"Image", 
               "relative_path":"images" 
            }, 
            "value":"None", 
            "file":"None", 
            "image":{ 
               "pk":"...", 
               "name":"filename.mha" 
            }, 
            "pk": ... 
         }, 
      ] 
   }
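
To make the structure above easier to work with, you could index the values of an archive item by their interface slug. The two helpers below are hypothetical conveniences, not gcapi functions; they rely only on the "values", "interface", "slug", "image" and "name" keys shown in the example above.

```python
def values_by_slug(item):
    """Map each interface slug of an archive item to its value entry."""
    return {value["interface"]["slug"]: value for value in item["values"]}


def image_name(item, slug):
    """Return the image name stored under the given slug, or None."""
    value = values_by_slug(item).get(slug)
    if value is None or not isinstance(value.get("image"), dict):
        return None
    return value["image"]["name"]


# Hypothetical usage with the items retrieved above:
# for item in items:
#     print(item["pk"], image_name(item, "ct-image"))
```

Because an archive item can hold only one value per interface, the slug works as a dictionary key here.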


Adding additional images to an existing archive item

You can now add additional images, for example an overlay, to each of the archive items you created in the previous step. To add an overlay to the first archive item, do the following:

client.upload_cases(
    files=[Path("/path/to/overlay/image/file")], 
    archive_item=items[0]["pk"], 
    interface="generic-overlay"
)


⚠️ Note that this time you have to specify an interface. The interface you specify needs to be different from the ones already attached to the archive item, because an archive item can only have one of each interface type attached to it. In the example here, you could not add another ct-image, for instance. We will use the generic overlay interface, but we again encourage you to use specific interfaces whenever possible.

The updated archive item will then look like this:

  { 
      "pk": "...", 
      "archive":"https://gc.localhost/api/v1/archives/.../", 
       # the CT image you added in the first step
      "values":[ 
         { 
            "interface":{ 
               "title":"CT Image",
               "description":"Any CT image", 
               "slug":"ct-image", 
               "kind":"Image", 
               "pk":..., 
               "default_value":"None", 
               "super_kind":"Image", 
               "relative_path":"images" 
            }, 
            "value":"None", 
            "file":"None", 
            "image":{ 
               "pk":"...", 
               "name":"filename.mha" 
            }, 
            "pk": ... 
         }, 
        # the overlay you added in the second step
         { 
            "interface":{ 
               "title":"Generic Overlay", 
               "description":"", 
               "slug":"generic-overlay", 
               "kind":"HeatMap", 
               "pk": ..., 
               "default_value":"None", 
               "super_kind":"Image", 
               "relative_path":"images" 
            }, 
            "value":"None", 
            "file":"None", 
            "image":{ 
               "pk":"...", 
               "name":"overlay_filename.mha" 
            }, 
            "pk": ... 
         },  
      ] 
   }


Adding non-image files or other metadata to an existing archive item

In addition to images, you can also upload non-image files, like a PDF or a CSV file, or metadata to accompany the medical images in your archive.

To add, for example, a PDF report and a lung volume value to the first archive item, do the following:

client.update_archive_item( 
    archive_item_pk=items[0]['pk'], 
    values={ 
        "report": [Path("/path/to/pdf/report")], 
        "lung-volume": 1.9, 
    }, 
)


Both report and lung-volume correspond to existing interface slugs. If your desired interface does not yet exist, please email support@grand-challenge.org with a title and description to add it to the list.

⚠️ Note that you do not need to specify the already existing CT image and overlay values when updating the item. Only specify values that you would like to add.

Overwriting existing values of an archive item

You can also update the values of an archive item. Simply provide the slug of the interface you want to change and the new value or file:

client.update_archive_item( 
    archive_item_pk=items[0]['pk'],  
    values={  
        "ct-image": [Path("/path/to/new/ct/image")],  
    },  
)


The above code example will detach the previously uploaded CT image from the archive item and replace it with the newly provided file. ⚠️ The previous CT image will then no longer be part of your archive, so only do this if you are sure you want to overwrite the file.

Changing the interface type of uploaded images

Finally, for images, it is also possible to change the interface type. For example, you might want to classify the overlay attached to your image more specifically, as a tumor likelihood map rather than a generic overlay. To do so, provide the API URL of the previously uploaded overlay image and the new interface slug you would like to use:

# retrieve all items from your archive
archive = next(client.archives.iterate_all(params={"slug": "radboudcovid"}))
items = list(client.archive_items.iterate_all(params={"archive": archive["pk"]}))

# retrieve the pks of the generic overlay images together with their item pk
item_pks_to_image_pks = {
    item["pk"]: value["image"]["pk"]
    for item in items
    for value in item["values"]
    if value["interface"]["slug"] == "generic-overlay"
}

# retrieve the overlay image of a specified archive item
image = client.images.detail(pk=item_pks_to_image_pks[items[0]["pk"]])

# update the archive item
client.update_archive_item( 
    archive_item_pk=items[0]['pk'],  
    values={  
        "tumor-likelihood-map": image["api_url"],   
    },  
)


The archive item will then no longer contain a generic-overlay value, but instead a tumor-likelihood-map value.

⚠️ Note that this is currently only possible for images, but not for other file types (pdf, csv) or metadata values.
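
If you want to apply the same change to every archive item rather than just the first one, the steps above can be wrapped in a loop. The function below is a hypothetical helper written for this tutorial, not part of gcapi; it uses only the client.images.detail and client.update_archive_item calls shown above and assumes each item has the structure shown earlier.

```python
def change_image_interface(client, items, old_slug, new_slug):
    """Re-attach images stored under old_slug to new_slug.

    Returns the pks of the archive items that were updated.
    """
    updated = []
    for item in items:
        for value in item["values"]:
            if value["interface"]["slug"] != old_slug:
                continue
            if not isinstance(value.get("image"), dict):
                continue  # only image values can change interface
            image = client.images.detail(pk=value["image"]["pk"])
            client.update_archive_item(
                archive_item_pk=item["pk"],
                values={new_slug: image["api_url"]},
            )
            updated.append(item["pk"])
    return updated


# Hypothetical usage with the client and items from above:
# change_image_interface(
#     client, items, "generic-overlay", "tumor-likelihood-map"
# )
```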

Deleting archive items

Deleting archive items is not possible through the API. If you would like to delete archive items, please send an email to support@grand-challenge.org with the slug of your archive and a list of the archive item pks you would like to have removed.