DRAGON RoBERTa Large Domain-specific V2



About

Creator:
Image Version: 41922d65-d1c8-45ea-bb5c-6b714b4568df
Last updated: Feb. 17, 2025, 2:10 p.m.

Interfaces

This algorithm implements the following input-output combination:

Inputs:
    NLP Task Configuration
    NLP Training Dataset
    NLP Validation Dataset
    NLP Test Dataset

Outputs:
    NLP Predictions Dataset

Challenge Performance

Date          | Challenge    | Phase      | Rank
Feb. 17, 2025 | DRAGON       | Synthetic  | 9
Feb. 17, 2025 | DRAGON       | Validation | 1
Feb. 18, 2025 | debug-dragon | Task 025   | 1

Model Facts

Summary

This algorithm is an adaptation of the DRAGON baseline (version 0.2.1) with the pretrained foundation model joeranbosma/dragon-roberta-large-domain-specific. It is used for the pretraining experiment in the DRAGON manuscript [1], following the pre-specified statistical analysis plan [2]. See dragon.grand-challenge.org/manuscript for the latest information on the DRAGON manuscript. Please cite the manuscript [1] when using this model.

[1] J. S. Bosma, K. Dercksen, L. Builtjes, R. André, C. Roest, S. J. Fransen, C. R. Noordman, M. Navarro-Padilla, J. Lefkes, N. Alves, M. J. J. de Grauw, L. van Eekelen, J. M. A. Spronck, M. Schuurmans, A. Saha, J. J. Twilt, W. Aswolinskiy, W. Hendrix, B. de Wilde, D. Geijs, J. Veltman, D. Yakar, M. de Rooij, F. Ciompi, A. Hering, J. Geerdink, H. Huisman, DRAGON Consortium. The DRAGON Benchmark for Clinical NLP. Under review.

[2] J. S. Bosma, K. Dercksen, L. Builtjes, R. André, C. Roest, S. J. Fransen, C. R. Noordman, M. Navarro-Padilla, J. Lefkes, N. Alves, M. J. J. de Grauw, L. van Eekelen, J. M. A. Spronck, M. Schuurmans, A. Saha, J. J. Twilt, W. Aswolinskiy, W. Hendrix, B. de Wilde, D. Geijs, J. Veltman, D. Yakar, M. de Rooij, F. Ciompi, A. Hering, J. Geerdink, H. Huisman, DRAGON Consortium (2024). DRAGON Statistical Analysis Plan (v1.0). Zenodo. https://doi.org/10.5281/zenodo.10374512

Mechanism

For details on the pretrained foundation model, see its model card on Hugging Face: joeranbosma/dragon-roberta-large-domain-specific.
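As a minimal sketch (assuming the Hugging Face transformers library, which hosts the model), the foundation model can be loaded and run as follows; the input text is a placeholder:

# Minimal sketch: load the pretrained foundation model from the
# Hugging Face Hub and encode a placeholder report snippet.
from transformers import AutoModel, AutoTokenizer

model_name = "joeranbosma/dragon-roberta-large-domain-specific"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

inputs = tokenizer("Example radiology report text.", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch, sequence length, hidden size)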

The following settings were used in the DRAGON baseline:

model_name = "joeranbosma/dragon-roberta-large-domain-specific"  # pretrained foundation model
per_device_train_batch_size = 1  # with gradient accumulation below: effective batch size 8
gradient_accumulation_steps = 8
gradient_checkpointing = False
max_seq_length = 512  # maximum number of tokens per input sequence
learning_rate = 1e-05
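
The DRAGON baseline applies these settings internally. As a hedged sketch (not the baseline implementation itself), this is roughly how the same values map onto the Hugging Face transformers Trainer API; the sequence-classification head, output directory, and dataset handling are illustrative assumptions, since the baseline selects the task head per task:

# Hypothetical sketch of the listed settings in Hugging Face transformers.
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

model_name = "joeranbosma/dragon-roberta-large-domain-specific"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)  # task head is an assumption

def tokenize(batch):
    # max_seq_length = 512 takes effect at tokenization time
    return tokenizer(batch["text"], truncation=True, max_length=512)

training_args = TrainingArguments(
    output_dir="checkpoints",  # placeholder output directory
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    gradient_checkpointing=False,
    learning_rate=1e-5,
)

trainer = Trainer(model=model, args=training_args)
# trainer.train() would additionally require train/eval datasets
# mapped through tokenize(...).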

Validation and Performance

N/A.

Uses and Directions

This algorithm was developed for research purposes only.

Warnings

You should anonymize your reports before uploading them to Grand Challenge.

Common Error Messages

Some log messages are incorrectly classified as warnings, so each successful algorithm job will still report "Succeeded, with warnings". This warning can typically be ignored.

Information on this algorithm has been provided by the Algorithm Editors, following the Model Facts label guidelines from Sendak, M.P., Gao, M., Brajer, N. et al. Presenting machine learning model information to clinical end users with model facts labels. npj Digit. Med. 3, 41 (2020). https://doi.org/10.1038/s41746-020-0253-3