Hi, interesting question!
To give you a bit of context: when we framed the challenge, we focused on (vision) foundation models that output a 1D embedding per image, which is the case for the vast majority of models. This also lets us control the total size of the file containing the features. If your model instead produces 2D embeddings, you can consider the following workaround:
If the output height and width are consistent across all image inputs, you can flatten the 2D tensor into 1D before saving, and simply reshape it back to the original 2D shape in your adaptor after loading the (flattened) features from disk.
It would look like this:
import torch.nn as nn

foundation_model_output_flattened = ...  # 1D tensor of size embedding_size * (H//4) * (W//4), loaded from disk
# Recover the original 2D spatial layout, then map embedding channels to per-pixel class scores with a 1x1 conv
foundation_model_output_2d = foundation_model_output_flattened.reshape(embedding_size, H // 4, W // 4)
predictions = nn.Conv2d(embedding_size, NUMBER_CLASSES, kernel_size=1, padding=0)(foundation_model_output_2d)
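For completeness, the saving side is just the reverse: flatten the spatial grid into 1D before writing the features to disk. Here is a minimal sketch, assuming the embedding is a PyTorch tensor and that you store features with torch.save; the variable names and the output path are placeholders, and the actual serialization format depends on your pipeline.

import torch

# Hypothetical 2D embedding produced by the foundation model: (embedding_size, H//4, W//4)
embedding_2d = torch.randn(embedding_size, H // 4, W // 4)

# Flatten to 1D so it matches the expected per-image feature format
embedding_flat = embedding_2d.reshape(-1)

# Save to disk; swap in whatever serialization your pipeline actually uses
torch.save(embedding_flat, "features/image_0001.pt")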
We strongly encourage participants to implement their own adaptor logic! Instructions for doing so are available in the evaluation toolkit repository: the README.md explains how to submit a PR with your custom adaptor.
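To make that concrete, below is a small sketch of what such adaptor logic could look like as a PyTorch module wrapping the reshape + 1x1 convolution from the snippet above. This is purely illustrative: the class name, constructor arguments, and the way an adaptor plugs into the toolkit are assumptions on our side, so please follow the README.md for the actual interface expected in a PR.

import torch.nn as nn

class Conv2DAdaptor(nn.Module):
    # Illustrative adaptor: reshapes flattened features and applies a 1x1 convolution head.
    def __init__(self, embedding_size, num_classes, grid_h, grid_w):
        super().__init__()
        self.embedding_size = embedding_size
        self.grid_h = grid_h  # e.g. H // 4
        self.grid_w = grid_w  # e.g. W // 4
        self.head = nn.Conv2d(embedding_size, num_classes, kernel_size=1, padding=0)

    def forward(self, flat_features):
        # flat_features: (batch, embedding_size * grid_h * grid_w), as loaded from disk
        x = flat_features.reshape(-1, self.embedding_size, self.grid_h, self.grid_w)
        return self.head(x)  # (batch, num_classes, grid_h, grid_w)

An instance of such a module would then be trained on the downstream task like any other torch module.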
Alternatively, we could consider extending the accepted feature format to support 2D grids natively. However, this would require internal discussion and may not be feasible in the short term. For now, we recommend using the flattening approach, which should be sufficient in most cases.
Let us know how this works out for you!