DRAGON BERT Base General-domain


About

Creator:
Image Version: 41ac9009-66bd-44e7-9611-06b8607259b7
Last updated: May 5, 2024, 11:15 a.m.

Interfaces

This algorithm implements all of the following input-output combinations:

Inputs:
  • NLP Task Configuration (Anything)
  • NLP Training Dataset (Anything)
  • NLP Validation Dataset (Anything)
  • NLP Test Dataset (Anything)

Outputs:
  • NLP Predictions Dataset (Anything)

Challenge Performance

Date           Challenge  Phase       Rank
May 7, 2024    DRAGON     Synthetic   11
June 16, 2024  DRAGON     Validation  8
June 16, 2024  DRAGON     Test        6

Model Facts

Summary

This algorithm is an adaptation of the DRAGON baseline (version 0.2.1) with the pretrained foundation model GroNLP/bert-base-dutch-cased. It is used for the pretraining experiment in the DRAGON manuscript [1], according to the pre-specified statistical analysis plan [2]. See dragon.grand-challenge.org/manuscript for the latest information on the DRAGON manuscript, and please cite the manuscript [1] when using this model.

[1] J. S. Bosma, K. Dercksen, L. Builtjes, R. André, C. Roest, S. J. Fransen, C. R. Noordman, M. Navarro-Padilla, J. Lefkes, N. Alves, M. J. J. de Grauw, L. van Eekelen, J. M. A. Spronck, M. Schuurmans, A. Saha, J. J. Twilt, W. Aswolinskiy, W. Hendrix, B. de Wilde, D. Geijs, J. Veltman, D. Yakar, M. de Rooij, F. Ciompi, A. Hering, J. Geerdink, H. Huisman, DRAGON Consortium. The DRAGON Benchmark for Clinical NLP. Under review.

[2] J. S. Bosma, K. Dercksen, L. Builtjes, R. André, C. Roest, S. J. Fransen, C. R. Noordman, M. Navarro-Padilla, J. Lefkes, N. Alves, M. J. J. de Grauw, L. van Eekelen, J. M. A. Spronck, M. Schuurmans, A. Saha, J. J. Twilt, W. Aswolinskiy, W. Hendrix, B. de Wilde, D. Geijs, J. Veltman, D. Yakar, M. de Rooij, F. Ciompi, A. Hering, J. Geerdink, H. Huisman, DRAGON Consortium (2024). DRAGON Statistical Analysis Plan (v1.0). Zenodo. https://doi.org/10.5281/zenodo.10374512

Mechanism

For details on the pretrained foundation model, see GroNLP/bert-base-dutch-cased on Hugging Face.
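
As a minimal sketch (not the DRAGON baseline code itself), the foundation model can be loaded with the Hugging Face transformers library. The sequence-classification head and num_labels below are illustrative assumptions; the appropriate head depends on which benchmark task is being fine-tuned:

    # Minimal sketch: load the pretrained Dutch BERT foundation model.
    # Assumes transformers and torch are installed. The classification head
    # is randomly initialized until fine-tuned; num_labels=2 is illustrative.
    from transformers import AutoTokenizer, AutoModelForSequenceClassification

    model_name = "GroNLP/bert-base-dutch-cased"
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

    # Tokenize an example Dutch report sentence ("No evidence of malignancy."),
    # truncated to 512 tokens as in the baseline settings below.
    inputs = tokenizer("Geen aanwijzingen voor maligniteit.",
                       truncation=True, max_length=512, return_tensors="pt")
    logits = model(**inputs).logits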

The following settings were used in the DRAGON baseline:

    model_name = "GroNLP/bert-base-dutch-cased"
    per_device_train_batch_size = 4
    gradient_accumulation_steps = 2
    gradient_checkpointing = False
    max_seq_length = 512
    learning_rate = 1e-05
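
For orientation, these settings map onto the Hugging Face Trainer API roughly as sketched below. This is an illustration under assumptions, not the baseline's actual training script, and output_dir is a hypothetical path:

    # Sketch: how the listed settings would translate to
    # transformers.TrainingArguments; illustrative only.
    from transformers import TrainingArguments

    training_args = TrainingArguments(
        output_dir="./output",          # hypothetical path
        per_device_train_batch_size=4,
        gradient_accumulation_steps=2,  # effective batch size of 8 per device
        gradient_checkpointing=False,
        learning_rate=1e-5,
    )
    # max_seq_length = 512 is applied at tokenization time, e.g.
    # tokenizer(text, truncation=True, max_length=512).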
    

Validation and Performance

This model was tested on the 28 tasks in the DRAGON benchmark for clinical NLP. The full test leaderboard and the per-task performance of this model are available on the DRAGON challenge website (dragon.grand-challenge.org).

Uses and Directions

This algorithm was developed for research purposes only.

Warnings

You should anonymize your reports before uploading them to Grand Challenge.

Common Error Messages

Some log messages are incorrectly classified as warnings, so every successful algorithm job will still report "Succeeded, with warnings". This warning can typically be ignored.

Information on this algorithm has been provided by the Algorithm Editors, following the Model Facts label guidelines from Sendak, M. P., Gao, M., Brajer, N. et al. Presenting machine learning model information to clinical end users with model facts labels. npj Digit. Med. 3, 41 (2020). https://doi.org/10.1038/s41746-020-0253-3