DRAGON BERT Base Domain-specific

About
Interfaces
This algorithm implements a single input-output combination; see the algorithm's page on Grand Challenge for the exact input and output interface specifications.
Challenge Performance
Date | Challenge | Phase | Rank
---|---|---|---
May 10, 2024 | DRAGON | Synthetic | 14
June 16, 2024 | DRAGON | Test | 8
June 21, 2024 | DRAGON | Validation | 10
Model Facts
Summary
This algorithm is an adaptation of the DRAGON baseline (version 0.2.1), using the pretrained foundation model joeranbosma/dragon-bert-base-domain-specific. It was used for the pretraining experiment in the DRAGON manuscript [1], following the pre-specified statistical analysis plan [2]. See dragon.grand-challenge.org/manuscript for the latest information on the DRAGON manuscript. Please cite the manuscript [1] when using this model.
[1] J. S. Bosma, K. Dercksen, L. Builtjes, R. André, C. Roest, S. J. Fransen, C. R. Noordman, M. Navarro-Padilla, J. Lefkes, N. Alves, M. J. J. de Grauw, L. van Eekelen, J. M. A. Spronck, M. Schuurmans, A. Saha, J. J. Twilt, W. Aswolinskiy, W. Hendrix, B. de Wilde, D. Geijs, J. Veltman, D. Yakar, M. de Rooij, F. Ciompi, A. Hering, J. Geerdink, H. Huisman, DRAGON Consortium. The DRAGON Benchmark for Clinical NLP. Under review.
[2] J. S. Bosma, K. Dercksen, L. Builtjes, R. André, C. Roest, S. J. Fransen, C. R. Noordman, M. Navarro-Padilla, J. Lefkes, N. Alves, M. J. J. de Grauw, L. van Eekelen, J. M. A. Spronck, M. Schuurmans, A. Saha, J. J. Twilt, W. Aswolinskiy, W. Hendrix, B. de Wilde, D. Geijs, J. Veltman, D. Yakar, M. de Rooij, F. Ciompi, A. Hering, J. Geerdink, H. Huisman, DRAGON Consortium (2024). DRAGON Statistical Analysis Plan (v1.0). Zenodo. https://doi.org/10.5281/zenodo.10374512
Mechanism
For details on the pretrained foundation model, see its model card on HuggingFace: joeranbosma/dragon-bert-base-domain-specific.
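As a quick orientation, the encoder can be loaded directly with the HuggingFace transformers library. This is a minimal sketch (the report text is a placeholder); for downstream DRAGON tasks, a task-specific head would typically be added for fine-tuning:

```python
from transformers import AutoModel, AutoTokenizer

# Load the pretrained domain-specific BERT encoder and its tokenizer from HuggingFace.
model_name = "joeranbosma/dragon-bert-base-domain-specific"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

# Encode an example report, truncated to the 512-token limit used by the baseline.
inputs = tokenizer(
    "Example clinical report text.",  # placeholder input
    truncation=True,
    max_length=512,
    return_tensors="pt",
)
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)  # (batch_size, num_tokens, hidden_size)
```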
The following settings were used in the DRAGON baseline:
```
model_name = "joeranbosma/dragon-bert-base-domain-specific"
per_device_train_batch_size = 4
gradient_accumulation_steps = 2
gradient_checkpointing = False
max_seq_length = 512
learning_rate = 1e-05
```
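For illustration only, these settings map onto HuggingFace `TrainingArguments` roughly as follows. This is a sketch under the assumption of a standard transformers training setup, not the baseline's actual training code, and `output_dir` is a placeholder:

```python
from transformers import TrainingArguments

# Hypothetical mapping of the baseline settings onto HuggingFace TrainingArguments.
training_args = TrainingArguments(
    output_dir="./dragon-bert-finetuned",  # placeholder output path
    per_device_train_batch_size=4,
    gradient_accumulation_steps=2,         # effective batch size of 8 per device
    gradient_checkpointing=False,
    learning_rate=1e-05,
)

# Note: max_seq_length is not a TrainingArguments field; it is applied at
# tokenization time, e.g. tokenizer(text, truncation=True, max_length=512).
```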
Validation and Performance
This model was tested on the 28 tasks in the DRAGON Benchmark for Clinical NLP. The full test leaderboard, including this model's performance, is available on dragon.grand-challenge.org.
Uses and Directions
This algorithm was developed for research purposes only.
Warnings
You should anonymize your reports before uploading them to Grand Challenge.
Common Error Messages
Some logs are incorrectly flagged as warnings, so each successful algorithm job will still report "Succeeded, with warnings". This warning can typically be ignored.
Information on this algorithm has been provided by the Algorithm Editors, following the Model Facts label guidelines from: Sendak, M.P., Gao, M., Brajer, N. et al. Presenting machine learning model information to clinical end users with model facts labels. npj Digit. Med. 3, 41 (2020). https://doi.org/10.1038/s41746-020-0253-3