Share Your Public Data¶
We recommend using either Zenodo or the AWS Open Data Registry to make your training data publicly available to participants.
Zenodo¶
Zenodo is an open-source platform that:
- Promotes Open Science
- Is free to use
- Assigns a DOI to every dataset (improving traceability)
Note, however, that Zenodo limits each record (dataset upload) to 50 GB.
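Before choosing a platform, it can help to check whether your dataset fits within this limit. The following is a minimal sketch, not an official tool: the function names are hypothetical, and it assumes that "50 GB" means 50 × 10^9 bytes (decimal gigabytes), which is the more conservative reading.

```python
import os

# Assumption: the 50 GB limit is interpreted as decimal gigabytes.
ZENODO_LIMIT_BYTES = 50 * 10**9


def dataset_size_bytes(root: str) -> int:
    """Sum the sizes of all regular files below `root` (symlinks are skipped)."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.isfile(path) and not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def fits_size_limit(root: str, limit: int = ZENODO_LIMIT_BYTES) -> bool:
    """Return True if the dataset under `root` is no larger than `limit` bytes."""
    return dataset_size_bytes(root) <= limit
```

If `fits_size_limit("path/to/training_data")` returns `False`, consider the AWS option described below or splitting the data. Compressing the data before upload changes the result, so run the check on the files you actually intend to publish.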
👉 Add your dataset to our Grand Challenge Zenodo Community if you use this platform.
AWS Open Data Registry¶
An alternative option is the AWS Open Data Registry, which offers:
- No size restrictions, unlike Zenodo
- Direct access via AWS S3
- Free data downloads
📝 Note: Applications are reviewed quarterly, so submit early.
To add your dataset, follow the instructions on GitHub.
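To illustrate what direct S3 access could look like for participants, here is a minimal sketch using only the Python standard library. It assumes the bucket allows anonymous reads; the bucket name, object key, and region in the usage note are hypothetical placeholders, not real resources.

```python
import shutil
import urllib.parse
import urllib.request


def public_s3_url(bucket: str, key: str, region: str = "us-east-1") -> str:
    """Build the virtual-hosted-style HTTPS URL for an object in a public bucket."""
    # Percent-encode the key but keep "/" so the folder structure is preserved.
    quoted_key = urllib.parse.quote(key, safe="/")
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quoted_key}"


def download_public_object(
    bucket: str, key: str, destination: str, region: str = "us-east-1"
) -> None:
    """Stream one publicly readable object to a local file (no credentials used)."""
    url = public_s3_url(bucket, key, region)
    with urllib.request.urlopen(url) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)
```

For example, `download_public_object("example-challenge-data", "train/case_001.mha", "case_001.mha", region="eu-west-1")` would fetch a single file from a hypothetical bucket. Participants who have the AWS CLI installed can typically achieve the same with `aws s3 cp --no-sign-request`, which also handles whole folders.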
Choosing a Data License¶
Lastly, it's important to select a suitable license for the public training dataset used in your challenge. We recommend using a permissive Creative Commons (CC) license to ensure that participants can freely access and use the data for training and evaluation.
✅ Recommended: CC BY¶
We prefer the CC BY license, which:
- Allows others to distribute, remix, adapt, and build upon the data — even commercially
- Requires only that they give appropriate credit
- Maximizes usability and reduces ambiguity for participants
This license strikes the best balance between openness and attribution and ensures that participation in your challenge remains straightforward and unrestricted.
⚠️ Avoid: CC BY-NC-ND¶
We advise against using restrictive licenses such as CC BY-NC-ND, which:
- Prohibits commercial use
- Disallows distribution of adapted or modified versions of the data
These restrictions may create legal uncertainty for participants. For example, the model weights produced during training might be interpreted as a derivative of the dataset. Under a CC BY-NC-ND license, sharing or publishing such models could then be considered a violation. That would effectively prevent participants from training and submitting models, which defeats the purpose of the challenge.
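If you record the license in machine-readable dataset metadata, a small check can flag the restrictive clauses discussed above. This is an illustrative sketch only: it assumes the license is written as an SPDX-style identifier such as `CC-BY-4.0` or `CC-BY-NC-ND-4.0`, and the function name is hypothetical. It is not legal advice and does not replace reading the license text.

```python
def restrictive_clauses(spdx_id: str) -> list[str]:
    """Return the NC / ND clauses found in an SPDX-style Creative Commons identifier."""
    # "CC-BY-NC-ND-4.0" -> ["CC", "BY", "NC", "ND", "4.0"]
    parts = spdx_id.upper().split("-")
    # NC = NonCommercial, ND = NoDerivatives
    return [clause for clause in ("NC", "ND") if clause in parts]
```

An empty result, as for `CC-BY-4.0`, means that neither the non-commercial nor the no-derivatives restriction applies. A non-empty result, as for `CC-BY-NC-ND-4.0`, indicates a license that this guide advises against.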
💬 Need Help Deciding?¶
If using a CC BY license is not possible for your dataset, or if you would like to discuss licensing options, we are happy to assist. Please contact us at:
⚠️ Important: Your test data must be uploaded to Grand Challenge and should not be shared publicly.