GC-API for Archives
In this tutorial we will focus on archives and go over how to upload and download cases from an archive on our platform.
Remember that you need to request access prior to using a particular archive. You do not need to request permission if you are using your own archive.
If you haven't installed gcapi yet, follow the instructions here.
Import necessary libraries:
import gcapi
import os
from pathlib import Path
Authenticate to Grand Challenge using your personal API token.
# authorize with your personal token
token = 'my-personal-api-token'
client = gcapi.Client(token=token)
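A token written directly into a script is easy to leak when the script is shared. As a minimal sketch, the token could be read from an environment variable instead. The variable name `GRAND_CHALLENGE_API_TOKEN` and the helper `read_token` are hypothetical choices for illustration and are not part of gcapi.

```python
import os


def read_token(env_var="GRAND_CHALLENGE_API_TOKEN"):
    """Return the API token stored in the given environment variable.

    The variable name is an illustrative assumption, not a gcapi convention.
    """
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your personal API token."
        )
    return token
```

The returned string can then be passed to the client exactly as above, for example `gcapi.Client(token=read_token())`.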
Downloading cases from Archives on Grand Challenge
In this part we will download cases from the coronacases.org dataset on Grand Challenge.
First let's search for the archive by its slug:
# slug of the archive
archive_slug = "coronacasesorg"
# save path on your machine
output_archive_dir = r'path\to\where\to\save\the\data\to\your\machine'  # raw string, so backslashes are not read as escape sequences
archive = client.archives.detail(slug=archive_slug)
# This returns a dictionary with some information about the archive
{
    'pk': 'f6190e90-d432-4c7d-9e38-cfeda3a3ff73',
    'name': 'coronacases.org',
    'title': 'coronacases.org',
    'algorithms': [],
    'logo': 'https://rumc-gcorg-p-public.s3.amazonaws.com/logos/archive/f6190e90-d432-4c7d-9e38-cfeda3a3ff73/Screenshot_20200609_223804.x20.jpeg',
    'description': '10 CT scans from the website https://coronacases.org/',
    'api_url': 'https://grand-challenge.org/api/v1/archives/f6190e90-d432-4c7d-9e38-cfeda3a3ff73/',
    'url': 'https://grand-challenge.org/archives/coronacasesorg/'
}
To download cases, do the following:
# Get information about images in archive from GC API
response = client(
    url="https://grand-challenge.org/api/v1/cases/images/",
    params={'archive': archive['pk']}
)
images = response['results']
# Download images
for image in images:
    client.images.download(
        files=image['files'],
        filename=os.path.join(output_archive_dir, image['name'])
    )
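The request above reads only the `results` of a single response, so a large archive may need more than one request. The following is a minimal sketch of collecting all pages. It assumes that each response carries a `next` key holding the URL of the following page (or nothing on the last page). That key, the helper name `collect_all_results`, and the `fetch` callable are assumptions for illustration and are not taken from the gcapi documentation.

```python
def collect_all_results(fetch, url, params=None):
    """Follow 'next' links until every page of results has been read.

    `fetch` is any callable taking (url, params) and returning one decoded
    page as a dict. The 'next' key is an assumption about the page format.
    """
    results = []
    while url:
        page = fetch(url, params)
        results.extend(page["results"])
        url = page.get("next")
        params = None  # the 'next' URL is assumed to already carry the query
    return results
```

With the client from this tutorial, `fetch` could be a thin wrapper such as `lambda url, params: client(url=url, params=params)`. Treat that wrapper as an untested assumption and check it against your gcapi version.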
Uploading cases to an archive and editing archive items on Grand Challenge
Images that you upload to an archive on Grand Challenge are stored as archive items. In the simplest case, an archive item consists of just one medical image. Archive items, however, also allow you to store metadata or additional images, such as an overlay, along with each image. An archive item could, for example, consist of a medical image and a segmentation map for that image, a disease probability score stored as metadata, or a combination of these.
In this section, we will go over how to upload images to an archive, as well as how to edit the resulting archive items to add metadata or additional files to them. As a preliminary note, there are multiple ways of uploading data to archives. Which way you choose will depend on what you want to upload.
Uploading images to an archive on Grand Challenge
If your archive only contains image files and each case consists of a single image, follow these steps.
First, prepare the list of files for each image you want to upload.
files = [f.resolve() for f in Path("/path/to/files").iterdir()]
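When an image consists of several files (for example a series stored as one file per slice), it helps to keep one list of files per image. The sketch below assumes a hypothetical layout in which each image has its own sub-directory. The helper name `group_files_by_directory` is illustrative only.

```python
from collections import defaultdict
from pathlib import Path


def group_files_by_directory(root):
    """Map every directory under `root` that holds files to its sorted files.

    Assumes a hypothetical layout with one sub-directory per image.
    """
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            # Files that share a parent directory belong to the same image
            groups[path.parent].append(path.resolve())
    return dict(groups)
```

Each list in the returned mapping corresponds to one image and can be handled separately in the upload step.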
To upload these files to an archive, you need to provide the slug of the archive you want to upload to. You can find the slug in the URL of the archive. For example, if you would like to upload to the archive at https://grand-challenge.org/archives/radboudcovid/, you would provide "radboudcovid" as the slug. Note that the slug is case-sensitive.
You can also optionally provide an interface slug for your images. If you do not provide an interface slug, all images will be uploaded and stored as generic medical images. We recommend using specific interfaces whenever possible. For Type 2 challenge archives, the interface must correspond to the interface that has been configured as input for the challenge algorithms. For a list of possible interfaces, go here. If your desired interface does not yet exist, please email support@grand-challenge.org with a title and description to add it to the list. In the example below, we upload and store the files as CT images (i.e., with the ct-image interface).
session = client.upload_cases(files=files, archive=archive_slug, interface="ct-image")
⚠️ Use one session per group of files that constitute an image: that way the conversion of the files can happen in parallel.
The above command starts a session that converts your files, and then adds the standardized images to the selected archive as separate archive items once it has succeeded. You can refresh the session object with
session = client(url=session["api_url"])
and check the session status with
session["status"]
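Because the conversion runs in the background, the refresh and status check usually have to be repeated until the session finishes. The following is a minimal polling sketch. The helper `wait_for_session` is hypothetical, and the status names in `done_statuses` are assumptions, so check them against the values your sessions actually report.

```python
import time


def wait_for_session(
    refresh,
    done_statuses=("Succeeded", "Failed", "Cancelled"),
    interval=5.0,
    timeout=600.0,
):
    """Call `refresh()` until the session reports a terminal status.

    `refresh` is any callable returning the current session dict.
    The default status names are assumptions, not documented values.
    """
    deadline = time.monotonic() + timeout
    while True:
        session = refresh()
        if session["status"] in done_statuses:
            return session
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Session still {session['status']!r} after {timeout} seconds"
            )
        time.sleep(interval)
```

With the client from this tutorial, `refresh` could be `lambda: client(url=api_url)`, where `api_url` holds the value of `session["api_url"]`.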
Once the session has completed successfully, you can retrieve the created archive items with:
archive = client.archives.detail(slug=archive_slug)
items = list(client.archive_items.iterate_all(params={"archive": archive["pk"]}))
For each archive item in the list of items, you will get the following information:
{
    "pk": "...",
    # the pk (primary key field) of the archive item
    "archive": "https://grand-challenge.org/api/v1/archives/.../",
    # the url to the archive that the item belongs to
    "values": [
        # a list of all values attached to the archive item
        # in this case each item has exactly one value,
        # the CT image you uploaded in the previous step
        {
            "interface": {
                "title": "CT Image",  # the interface kind specified during image upload
                "description": "Any CT image",
                "slug": "ct-image",
                "kind": "Image",
                "pk": ...,
                "default_value": "None",
                "super_kind": "Image",
                "relative_path": "images"
            },
            "value": "None",
            "file": "None",
            "image": {
                "pk": "...",
                "name": "filename.mha"
            },
            "pk": ...
        },
    ]
}
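Since every value carries its interface, an archive item can be indexed by interface slug to look up a particular image or metadata value. The helpers below are a small sketch that relies only on the dictionary structure shown above. The names `values_by_slug` and `image_names` are illustrative and not part of gcapi.

```python
def values_by_slug(item):
    """Return {interface slug: value entry} for one archive item dict."""
    return {value["interface"]["slug"]: value for value in item["values"]}


def image_names(item):
    """Return {interface slug: image name} for the image values of an item."""
    return {
        slug: value["image"]["name"]
        for slug, value in values_by_slug(item).items()
        # Non-image values carry no image dictionary and are skipped
        if isinstance(value.get("image"), dict)
    }
```

For an item from the list retrieved above, `image_names(items[0])` would map each image interface slug to the name of the attached image.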
Adding additional images to an existing archive item
If you already have archive items in your archive, but wish to add for example an overlay to each of the archive items, do the following:
# add a single overlay image to the first archive item from your archive
client.upload_cases(
    files=[Path("/path/to/overlay/image/file")],
    archive_item=items[0]["pk"],  # see the previous step for how to retrieve your existing archive items
    interface="generic-overlay"
)
⚠️ Note that this time you have to specify an interface. The interface you specify needs to be different from the ones already attached to the archive item, because an archive item can only have one of each interface type attached to it. In the example here, you could not add another ct-image, for instance. We will use the generic overlay interface, but we again encourage you to use specific interfaces whenever possible.
The updated archive item will then look like this:
{
    "pk": "...",
    "archive": "https://grand-challenge.org/api/v1/archives/.../",
    "values": [
        # the CT image you added in the first step
        {
            "interface": {
                "title": "CT Image",
                "description": "Any CT image",
                "slug": "ct-image",
                "kind": "Image",
                "pk": ...,
                "default_value": "None",
                "super_kind": "Image",
                "relative_path": "images"
            },
            "value": "None",
            "file": "None",
            "image": {
                "pk": "...",
                "name": "filename.mha"
            },
            "pk": ...
        },
        # the overlay you added in the second step
        {
            "interface": {
                "title": "Generic Overlay",
                "description": "",
                "slug": "generic-overlay",
                "kind": "HeatMap",
                "pk": ...,
                "default_value": "None",
                "super_kind": "Image",
                "relative_path": "images"
            },
            "value": "None",
            "file": "None",
            "image": {
                "pk": "...",
                "name": "overlay_filename.mha"
            },
            "pk": ...
        },
    ]
}
Adding non-image files or other metadata to an existing archive item
In addition to images, you can also upload non-image files, such as a PDF or a CSV file, or metadata to accompany the medical images in your archive.
To add, for example, a PDF report and a lung volume value to the first archive item, do the following:
client.update_archive_item(
    archive_item_pk=items[0]['pk'],
    values={
        "report": [Path("/path/to/pdf/report")],
        "lung-volume": 1.9,
    },
)
Both report and lung-volume correspond to existing interface slugs. If your desired interface does not yet exist, please email support@grand-challenge.org with a title and description to add it to the list.
⚠️ Note that you do not need to specify the already existing CT image and overlay values when updating the item. Only specify values that you would like to add.
Overwriting existing values (i.e. images or metadata) of an archive item
You can also update the values of an archive item. Simply provide the slug of the interface you want to change and the new value or file:
client.update_archive_item(
    archive_item_pk=items[0]['pk'],
    values={
        "ct-image": [Path("/path/to/new/ct/image")],
    },
)
The above code example will detach the previously uploaded CT image from the archive item and replace it with the newly provided file.
⚠️ The previous CT image will then no longer be part of your archive, so only do this if you are sure you want to overwrite the file.
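Because an update silently replaces any value whose interface is already attached, it can help to check beforehand which entries would be added and which would overwrite existing data. The helper below is a minimal sketch that only inspects the archive item dictionary. The name `split_new_and_overwritten` is hypothetical and not part of gcapi.

```python
def split_new_and_overwritten(item, values):
    """Split `values` into entries that are new and entries that replace data.

    `item` is an archive item dict as returned by the API and `values` is
    the mapping you intend to pass to `update_archive_item`.
    """
    attached = {value["interface"]["slug"] for value in item["values"]}
    new = {slug: v for slug, v in values.items() if slug not in attached}
    overwritten = {slug: v for slug, v in values.items() if slug in attached}
    return new, overwritten
```

If the second returned mapping is not empty, the corresponding existing values would be replaced, so it is worth reviewing it before sending the update.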
Special file formats and complex archive items
Instead of first uploading a set of images to an archive and then adding a secondary image or some metadata to each of the created archive items as described in the steps above, you can also create an empty archive item first and then attach all linked files per item in one go.
This is also currently the only way to upload special file types that are not yet supported by the Grand Challenge UI, such as .json, .mp4 or .obj files.
ct = Path("/path/to/ct-image.mha")
overlay = Path("/path/to/overlay.mha")
archive = client.archives.detail(slug=archive_slug)
archive_url = archive["api_url"]
# Archive items link together related files/images in an archive
# (e.g. those that belong to the same case). First create a new archive item:
ai = client.archive_items.create(archive=archive_url, values=[])
# Upload the two image files to the newly created archive item, each with their own
# interface (in this case a ct-image and a generic-overlay)
client.update_archive_item(
    archive_item_pk=ai["pk"],
    values={
        "ct-image": [ct],
        "generic-overlay": [overlay],
    }
)