Using the API for archives¶
In this tutorial we will focus on archives and go over how to upload and download cases from an archive on our platform.
Remember that you need to request access prior to using a particular archive. You do not need to request permission if you are using your own archive.
If you haven't installed gcapi yet, follow the instructions here.
Import necessary libraries:
import gcapi
from pathlib import Path
Authenticate to Grand Challenge using your personal API token.
# Authorize with your personal token.
token = 'my-personal-api-token'
client = gcapi.Client(token=token)
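To keep the token out of scripts and version control, you could read it from an environment variable instead of hard-coding it. The following is a minimal sketch; the variable name GRAND_CHALLENGE_API_TOKEN and the helper read_api_token are hypothetical names chosen for this example and are not part of gcapi.

```python
import os


def read_api_token(env=None, var_name="GRAND_CHALLENGE_API_TOKEN"):
    """Return the API token stored in an environment variable.

    `var_name` is a hypothetical name chosen for this example.
    """
    env = os.environ if env is None else env
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set the {var_name} environment variable first.")
    return token


# Usage (requires gcapi and a valid token):
# client = gcapi.Client(token=read_api_token())
```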
Downloading cases from Archives on Grand Challenge¶
In this part we will download cases from the coronacases.org dataset on Grand Challenge.
First let's search for the archive by its slug:
# slug of the archive
archive_slug = "coronacasesorg"

# save path on your machine (raw string, so backslashes are not read as escape sequences)
output_archive_dir = r'path\to\where\to\save\the\data\on\your\machine'

archive = client.archives.detail(slug=archive_slug)

# This returns an object with some information about the archive.
Archive(
    pk='526e7795-95ff-4b0e-9171-28d03f5b7bf0',
    title='Archive',
    logo='https://public.grand-challenge-user-content.org/logos/archive/526e7795-95ff-4b0e-9171-28d03f5b7bf0/foo.x20.jpeg',
    description='Test archive',
    api_url='https://grand-challenge.org/api/v1/archives//526e7795-95ff-4b0e-9171-28d03f5b7ba0/',
    url='https://grand-challenge.org/api/v1/archives/archives/archive/'
)
To download cases, do the following:
# Get information about images in the archive from the API
images = client.images.iterate_all(
    params={'archive': archive.pk}
)

# Download images
for image in images:
    client.images.download(
        files=image.files,
        filename=Path(output_archive_dir, image.file)
    )
Uploading cases to an archive and editing archive items¶
Images that you upload to an archive on Grand Challenge are stored as archive items. In the simplest case, an archive item consists of just one medical image. Archive items, however, also allow you to store metadata or additional images, like an overlay, along with each image. An archive item could, for example, consist of a medical image and a segmentation map for the image, or a specific disease probability score as metadata or a combination of those.
In this section, we will go over how to upload images to an archive, as well as how to edit the resulting archive items to add metadata or additional files to them. As a preliminary note, there are multiple ways of uploading data to archives. Which way you choose will depend on what you want to upload.
Uploading images to an archive on Grand Challenge¶
If your archive contains only image files and each case consists of a single image, follow these steps.
First, prepare the list of files for each image you want to upload.
files = [f.resolve() for f in Path("/path/to/files").iterdir() if f.is_file()]
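A directory often contains files that are not images, such as hidden system files or notes. As a minimal sketch, the listing above could be narrowed to a set of file extensions. The helper name list_image_files and the default extensions are assumptions made for this example; adjust them to the formats you actually upload.

```python
from pathlib import Path


def list_image_files(directory, suffixes=(".mha", ".mhd", ".tif", ".tiff")):
    """Return the resolved paths of visible files whose name ends in one of `suffixes`.

    The default suffixes are an assumption for this example.
    """
    wanted = tuple(s.lower() for s in suffixes)
    return sorted(
        f.resolve()
        for f in Path(directory).iterdir()
        if f.is_file()
        and not f.name.startswith(".")
        and f.name.lower().endswith(wanted)
    )


# Usage:
# files = list_image_files("/path/to/files")
```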
To upload these files to an archive, you need to provide the slug of the archive you want to upload to. You can find the slug in the URL of the archive. For example, to upload to the archive at https://grand-challenge.org/archives/radboudcovid/, you would provide "radboudcovid" as the slug. Note that the slug is case-sensitive.
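If you prefer to copy the archive URL from your browser, the slug can be extracted programmatically. This is a small sketch using only the standard library; the helper name archive_slug_from_url is hypothetical, and it assumes the URL follows the /archives/<slug>/ pattern shown above.

```python
from urllib.parse import urlparse


def archive_slug_from_url(url):
    """Extract the slug from an archive page URL of the form .../archives/<slug>/."""
    parts = [p for p in urlparse(url).path.split("/") if p]
    if len(parts) < 2 or parts[0] != "archives":
        raise ValueError(f"Not an archive URL: {url!r}")
    # The slug is case-sensitive, so it is returned unchanged.
    return parts[1]


# Usage:
# archive_slug = archive_slug_from_url("https://grand-challenge.org/archives/radboudcovid/")
```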
You also need to provide an interface slug for your images (note that this refers to the slug of the chosen socket). For challenge archives, the sockets need to correspond to the sockets that have been configured as input for the challenge algorithms. For a list of possible sockets, go here. If your desired socket does not yet exist, please email support@grand-challenge.org with a title and description to add it to the list. In the example below, we upload and store the files as CT images (i.e., with the ct-image socket).
client.add_cases_to_archive(archive=archive_slug, values=[{"ct-image": files}])
The above command starts a process that converts your files and, once it has succeeded, adds the standardized images to the selected archive as separate archive items.
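If you want the grouping of files into cases to be explicit, you could build one entry per file before uploading. The sketch below assumes that each dictionary in values describes one archive item; this has not been checked against a specific gcapi version, so verify it against the version you have installed. The helper name one_case_per_file is hypothetical.

```python
def one_case_per_file(files, socket_slug="ct-image"):
    """Build one {socket_slug: [file]} entry per file.

    Assumption: each dictionary in the resulting list describes one archive item.
    """
    return [{socket_slug: [f]} for f in files]


# Usage (requires an authenticated gcapi client):
# client.add_cases_to_archive(archive=archive_slug, values=one_case_per_file(files))
```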
Once the process has completed successfully, you can retrieve the created archive items with:
archive = client.archives.detail(slug=archive_slug)
items = list(client.archive_items.iterate_all(params={"archive": archive.pk}))
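Because the conversion runs in the background, the new archive items may not be listed immediately. One possible approach is to repeat the query until the expected number of items is present. The helper below is a generic polling sketch and not part of gcapi; the name wait_for, the timeout, and the interval are assumptions, and comparing against len(files) only makes sense if the archive was empty before the upload.

```python
import time


def wait_for(fetch, is_ready, timeout=600.0, interval=10.0,
             sleep=time.sleep, clock=time.monotonic):
    """Call `fetch` repeatedly until `is_ready(result)` is true or `timeout` seconds pass."""
    deadline = clock() + timeout
    while True:
        result = fetch()
        if is_ready(result):
            return result
        if clock() >= deadline:
            raise TimeoutError("Condition not met within the timeout.")
        sleep(interval)


# Usage (requires an authenticated gcapi client):
# items = wait_for(
#     fetch=lambda: list(client.archive_items.iterate_all(params={"archive": archive.pk})),
#     is_ready=lambda found: len(found) >= len(files),
# )
```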
For each archive item in the list of items, you will get the following information as an ArchiveItem:
ArchiveItem(
    # the pk (primary key field) of the archive item
    pk='7cea484b-34a1-45ea-88ab-bec1d4ed3d9c',
    # the api url to the archive that the item belongs to
    archive='https://grand-challenge.org/api/v1/archives/526e7795-95ff-4b0e-9171-28d03f5b7ba0/',
    # a list of all values attached to the archive item
    # in this case each item has exactly one value,
    # the CT image you uploaded in the previous step
    values=[
        HyperlinkedComponentInterfaceValue(
            interface=ComponentInterface(
                title="CT Image",
                description='',
                slug='ct-image',
                kind='Image',
                pk=1,
                default_value=None,
                super_kind='Image',
                relative_path='images/ct-image',
                overlay_segments=[],
                look_up_table=None
            ),
            value=None,
            file=None,
            image='https://grand-challenge.org/api/v1/cases/images/fe1543ff-7c00-4c77-838d-9fd35996f319/',
            pk=2
        )
    ],
    hanging_protocol=None,
    optional_hanging_protocols=[],
    view_content={},
    title=''
)
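To see at a glance what is attached to an item, the nested structure above can be flattened into a mapping from socket slug to content. The sketch below relies only on the attribute names shown in the output above (values, interface.slug, image, file, value). The helper name values_by_slug is hypothetical, and the example uses a stand-in object rather than a real API response.

```python
from types import SimpleNamespace


def values_by_slug(item):
    """Map each socket slug on an archive item to its image URL, file URL, or plain value."""
    result = {}
    for v in item.values:
        if v.image is not None:
            result[v.interface.slug] = v.image
        elif v.file is not None:
            result[v.interface.slug] = v.file
        else:
            result[v.interface.slug] = v.value
    return result


# Stand-in object mimicking the ArchiveItem structure (not a real API response).
example_item = SimpleNamespace(
    values=[
        SimpleNamespace(
            interface=SimpleNamespace(slug="ct-image"),
            image="https://grand-challenge.org/api/v1/cases/images/fe1543ff-7c00-4c77-838d-9fd35996f319/",
            file=None,
            value=None,
        )
    ]
)
print(values_by_slug(example_item))
```

With real data you would call values_by_slug(items[0]) on an item retrieved in the previous step.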
Adding additional images to an existing archive item¶
If you already have archive items in your archive, but wish to add for example an overlay to each of the archive items, do the following:
# add a single overlay image to the first archive item from your archive
client.update_archive_item(
    archive_item_pk=items[0].pk,  # see the previous step for how to retrieve your existing archive items
    values={"generic-overlay": Path("/path/to/overlay/image/file")},
)
⚠️ Note that this time you have to specify a socket. The socket (i.e. interface) you specify needs to be different from the ones already attached to the archive item, because an archive item can only have one of each socket type attached to it. In the example here, you could not add another ct-image, for instance. We will use the generic overlay socket, but we encourage you to use specific sockets whenever possible.
Adding non-image files or other metadata to an existing archive item¶
Besides images, you can also upload non-image files, like a PDF or a CSV file, or metadata information to accompany the medical images in your archive.
To add, for example, a PDF report and a lung volume value to the first archive item, do the following:
client.update_archive_item(
    archive_item_pk=items[0].pk,
    values={
        "report": [Path("/path/to/pdf/report")],
        "lung-volume": 1.9,
    },
)
Both report and lung-volume correspond to existing socket slugs. If your desired socket does not yet exist, please email support@grand-challenge.org with a title and description to add it to the list.
💡 Note that you do not need to specify the already existing CT image and overlay values when updating the item. Only specify values that you would like to add.
Overwriting existing values (i.e. images or metadata) of an archive item¶
You can also update the values of an archive item. Simply provide the slug of the socket you want to change and the new value or file:
client.update_archive_item(
    archive_item_pk=items[0].pk,
    values={
        "ct-image": Path("/path/to/new/ct/image"),
    },
)
The above code example will detach the previously uploaded CT image from the archive item and replace it with the newly provided file.
⚠️ The previous CT image will then no longer be part of your archive, so only do this if you are sure you want to overwrite the file.
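Since an update with an already attached slug replaces the existing value, it can help to check beforehand which entries would overwrite something. The following sketch does that on plain data; the helper name split_additions_and_overwrites is hypothetical, and existing_slugs is assumed to be the collection of socket slugs currently attached to the item.

```python
def split_additions_and_overwrites(existing_slugs, new_values):
    """Split `new_values` into entries that add a socket and entries that replace one."""
    existing = set(existing_slugs)
    additions = {slug: v for slug, v in new_values.items() if slug not in existing}
    overwrites = {slug: v for slug, v in new_values.items() if slug in existing}
    return additions, overwrites


# Usage:
# existing = [v.interface.slug for v in items[0].values]
# additions, overwrites = split_additions_and_overwrites(existing, {"ct-image": new_ct, "lung-volume": 1.9})
# if overwrites:
#     print("These sockets would be replaced:", sorted(overwrites))
```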
Special file formats and complex archive items¶
Instead of first uploading a set of images to an archive and then adding a secondary image or some metadata to each of the created archive items as described in the steps above, you can also create an empty archive item first and then attach all linked files per item in one go.
ct = Path("/path/to/ct-image.mha")
overlay = Path("/path/to/overlay.mha")

archive = client.archives.detail(slug=archive_slug)
archive_url = archive.api_url

# Archive items link together related files/images in an archive
# (e.g. those that belong to the same case). First create a new archive item:
archive_item = client.archive_items.create(archive=archive_url, values=[])

# Upload the two image files to the newly created archive item, each with its own
# socket (called interface for backwards compatibility; in this case a ct-image and a generic-overlay)
client.update_archive_item(
    archive_item_pk=archive_item.pk,
    values={
        "ct-image": [ct],
        "generic-overlay": [overlay],
    }
)
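For more than one case, the same pattern can be applied in a loop. The sketch below pairs CT images and overlays by file name and builds one values dictionary per case. It assumes, purely for illustration, that both directories contain .mha files with matching names; the helper name pair_cases and the directory layout are hypothetical.

```python
from pathlib import Path


def pair_cases(ct_dir, overlay_dir, pattern="*.mha"):
    """Pair each CT image with the overlay that has the same file stem.

    Assumption: matching files share a name, e.g. ct/case1.mha and overlay/case1.mha.
    """
    overlays = {p.stem: p for p in Path(overlay_dir).glob(pattern)}
    cases = []
    for ct in sorted(Path(ct_dir).glob(pattern)):
        overlay = overlays.get(ct.stem)
        if overlay is None:
            raise FileNotFoundError(f"No overlay found for {ct.name}")
        cases.append({"ct-image": [ct], "generic-overlay": [overlay]})
    return cases


# Usage (requires an authenticated gcapi client and the archive object from above):
# for values in pair_cases("/path/to/ct", "/path/to/overlays"):
#     item = client.archive_items.create(archive=archive.api_url, values=[])
#     client.update_archive_item(archive_item_pk=item.pk, values=values)
```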