DRAGON CLTL MedRoBERTa.nl



About

Creator:
Image Version: e3c010fc-ec5e-48fa-a56c-e05da85085a8
Last updated: June 20, 2024, 12:24 p.m.

Interfaces

This algorithm implements all of the following input-output combinations:

Combination 1
    Inputs:
        NLP Task Configuration
        NLP Training Dataset
        NLP Validation Dataset
        NLP Test Dataset
    Outputs:
        NLP Predictions Dataset

Challenge Performance

Date           Challenge   Phase        Rank
June 20, 2024  DRAGON      Synthetic    33
July 2, 2024   DRAGON      Validation   20

Model Facts

Summary

This algorithm is an adaptation of the DRAGON baseline (version 0.2.1), using the pretrained foundation model CLTL/MedRoBERTa.nl. It is used for the pretraining experiment in the DRAGON manuscript [1], according to the pre-specified statistical analysis plan [2]. See dragon.grand-challenge.org/manuscript for the latest information on the DRAGON manuscript. Please cite the manuscript [1] when using this model.

[1] J. S. Bosma, K. Dercksen, L. Builtjes, R. André, C. Roest, S. J. Fransen, C. R. Noordman, M. Navarro-Padilla, J. Lefkes, N. Alves, M. J. J. de Grauw, L. van Eekelen, J. M. A. Spronck, M. Schuurmans, A. Saha, J. J. Twilt, W. Aswolinskiy, W. Hendrix, B. de Wilde, D. Geijs, J. Veltman, D. Yakar, M. de Rooij, F. Ciompi, A. Hering, J. Geerdink, H. Huisman, DRAGON Consortium. The DRAGON Benchmark for Clinical NLP. Under review.

[2] J. S. Bosma, K. Dercksen, L. Builtjes, R. André, C. Roest, S. J. Fransen, C. R. Noordman, M. Navarro-Padilla, J. Lefkes, N. Alves, M. J. J. de Grauw, L. van Eekelen, J. M. A. Spronck, M. Schuurmans, A. Saha, J. J. Twilt, W. Aswolinskiy, W. Hendrix, B. de Wilde, D. Geijs, J. Veltman, D. Yakar, M. de Rooij, F. Ciompi, A. Hering, J. Geerdink, H. Huisman, DRAGON Consortium (2024). DRAGON Statistical Analysis Plan (v1.0). Zenodo. https://doi.org/10.5281/zenodo.10374512

Mechanism

For details on the pretrained foundation model, see its model card on Hugging Face: CLTL/MedRoBERTa.nl.
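As a minimal illustration (not part of the DRAGON baseline code itself), the pretrained weights and tokenizer can be loaded from the Hugging Face Hub with the transformers library; the repository name is the only identifier taken from this page:

from transformers import AutoModelForMaskedLM, AutoTokenizer

# Load the Dutch medical RoBERTa model published by CLTL on the Hugging Face Hub.
# MedRoBERTa.nl ships with a masked-language-modelling head; the DRAGON baseline
# replaces this head with a task-specific head during fine-tuning.
tokenizer = AutoTokenizer.from_pretrained("CLTL/MedRoBERTa.nl")
model = AutoModelForMaskedLM.from_pretrained("CLTL/MedRoBERTa.nl")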

The following settings were used in the DRAGON baseline:

self.model_name = "CLTL/MedRoBERTa.nl"  # pretrained foundation model from the Hugging Face Hub
self.per_device_train_batch_size = 4    # training samples per device per step
self.gradient_accumulation_steps = 2    # effective batch size of 8 per device
self.gradient_checkpointing = False     # trade memory for speed: checkpointing disabled
self.max_seq_length = 512               # maximum number of tokens per input sequence
self.learning_rate = 1e-05
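For readers unfamiliar with the DRAGON baseline internals, the sketch below shows roughly how these settings would map onto Hugging Face TrainingArguments. The output directory and the tokenization call are illustrative assumptions, not taken from the baseline code.

from transformers import TrainingArguments

# Approximate translation of the settings above into Hugging Face TrainingArguments.
# The actual DRAGON baseline wraps these values in its own training code.
training_args = TrainingArguments(
    output_dir="./checkpoints",        # illustrative output location
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,     # effective batch size of 8 per device
    gradient_checkpointing=False,
    learning_rate=1e-5,
)

# max_seq_length is applied when tokenizing the reports rather than via TrainingArguments:
# encoded = tokenizer(texts, truncation=True, max_length=512)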

Validation and Performance

N/A.

Uses and Directions

This algorithm was developed for research purposes only.

Warnings

You should anonymize your reports before uploading them to Grand Challenge.

Common Error Messages

Some log messages are incorrectly classified as warnings, so even successful algorithm jobs will report "Succeeded, with warnings". These warnings can typically be ignored.

Information on this algorithm has been provided by the Algorithm Editors, following the Model Facts label guidelines from Sendak, M.P., Gao, M., Brajer, N. et al. Presenting machine learning model information to clinical end users with model facts labels. npj Digit. Med. 3, 41 (2020). https://doi.org/10.1038/s41746-020-0253-3