Balaitous


About

Creators:
Contact email:
Version: 59a5ba2f-dce4-4117-bad8-e19f6199e964
Last updated: April 8, 2022, 4:33 p.m.
Inputs:
  • CT Image  (Any CT image)
Outputs:
  • Probability COVID-19  (Probability of a positive RT-PCR result)
  • Probability Severe COVID-19  (Probability of death or intubation at 1-month follow up)

Challenge Performance

Date            Challenge   Phase                             Rank
Jan. 30, 2022   STOIC2021   Qualification                     5
April 9, 2022   STOIC2021   Qualification (last submission)   1

Model Facts

Summary

Balaitous is an updated version of the AI-severity model described in Lassau et al., 2021. It has been open sourced on GitHub.

Given an input CT scan, Balaitous outputs a probability of COVID-19 and a probability of a severe outcome, defined as intubation or death within one month.

It was trained on 2,000 patients from the public STOIC database and achieved the best performance on a hold-out validation dataset of 800 patients during the qualification phase of the STOIC-2021 challenge (see leaderboard).

Mechanism

The processing steps of Balaitous are as follows:

  • The scan is resampled to a voxel spacing of (1.5 mm, 1.5 mm, 5 mm) and reshaped to (224, 224, Z)
  • A lung segmentation mask is obtained using a 2D U-Net (source)
  • The scan is cropped to the slices containing the lungs
  • A first feature extractor is applied to get a first vector X_full
  • The lung mask is applied to the image (only lungs are now visible)
  • A second feature extractor is applied to get a second vector X_lung
  • For the severe outcome, two logistic regressions are applied to [X_full, age, sex] and [X_lung, age, sex], and the two probabilities are averaged
  • For the COVID outcome, two logistic regressions are applied to X_full and X_lung, and the two probabilities are averaged
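
The steps above can be sketched in plain Python. This is a minimal illustration, not the released implementation: the function names (resampled_shape, lung_slice_range, averaged_probability) and the representation of each logistic regression as a (weights, bias) pair are assumptions made for the example. The segmentation network and the feature extractors are left out, and because the way the scan reaches its final (224, 224, Z) shape is not specified here, that step is omitted too.

```python
import math

TARGET_SPACING = (1.5, 1.5, 5.0)  # mm, taken from the first processing step


def resampled_shape(shape, spacing, target=TARGET_SPACING):
    """Grid size after resampling to the target spacing, keeping the physical extent."""
    return tuple(max(1, round(n * s / t)) for n, s, t in zip(shape, spacing, target))


def lung_slice_range(mask_voxels_per_slice):
    """First and last (inclusive) slice indices whose lung mask is non-empty."""
    idx = [z for z, count in enumerate(mask_voxels_per_slice) if count > 0]
    if not idx:
        raise ValueError("lung mask is empty")
    return idx[0], idx[-1]


def logistic_probability(features, weights, bias):
    """Probability given by one logistic regression: sigmoid(w . x + b)."""
    if len(features) != len(weights):
        raise ValueError("feature and weight lengths differ")
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def averaged_probability(x_full, x_lung, head_full, head_lung, covariates=()):
    """Average the probabilities of the two heads.

    Each head is a hypothetical (weights, bias) pair. For the severe outcome,
    pass covariates=(age, sex); for the COVID outcome, leave it empty.
    """
    p_full = logistic_probability(list(x_full) + list(covariates), *head_full)
    p_lung = logistic_probability(list(x_lung) + list(covariates), *head_lung)
    return 0.5 * (p_full + p_lung)
```

For example, under these assumptions a 512 x 512 x 300 scan with a spacing of (0.75 mm, 0.75 mm, 1 mm) would be resampled to a 256 x 256 x 60 grid before the final reshape.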

The first feature extractor is a ViT-L model pretrained on ImageNet-22k using iBOT (source) and finetuned for 35 epochs on 165k CT slices (4k patients from 7 public datasets). The second feature extractor is the same ViT-L model without finetuning. Model weights can be found on Zenodo.

Only the four logistic regressions were trained on the STOIC database, and only COVID-positive patients were used to train the two logistic regressions that predict severity.
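
One way to picture this split is to assemble the training set of each of the four logistic regressions from a list of patient records. The sketch below is only an illustration: the record keys (x_full, x_lung, age, sex, covid, severe), the head names, and the numeric encoding of sex are hypothetical and do not come from the released code.

```python
def build_training_sets(patients):
    """Return {head_name: (rows, labels)} for the four logistic regressions.

    COVID heads use every patient and the image features only.
    Severity heads use COVID-positive patients only, with age and sex appended.
    """
    positives = [p for p in patients if p["covid"] == 1]
    sets = {}
    for key in ("x_full", "x_lung"):
        sets[f"covid_{key}"] = (
            [list(p[key]) for p in patients],
            [p["covid"] for p in patients],
        )
        sets[f"severe_{key}"] = (
            [list(p[key]) + [p["age"], p["sex"]] for p in positives],
            [p["severe"] for p in positives],
        )
    return sets
```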

Note: hyperparameters and feature extractors were chosen based on cross-validation results on the public STOIC database (n=2,000 patients). Using the finetuned iBOT model on the plain image instead of the ImageNet model brought only modest performance gains.

Validation and Performance

The ROC-AUC scores (in %) of Balaitous are:

                    AUC severity    AUC covid
Training - X_full   79.01 ± 2.63    80.65 ± 2.16
Training - X_lung   79.00 ± 3.30    82.63 ± 1.99
Training            80.36 ± 2.80    82.98 ± 2.01
Validation          80.44           83.22

There were n=2,000 patients in the training dataset (n=1,205 COVID positive) and around n=800 patients in the validation dataset.

Performance on the training dataset is computed using a stratified 4x8-fold cross-validation scheme. As in the STOIC-2021 challenge, the AUC for the severity prediction task is computed only among COVID-positive patients.
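
The restriction of the severity AUC to COVID-positive patients can be illustrated with a small pure-Python sketch. It computes the ROC-AUC as the fraction of (positive, negative) pairs that are ranked correctly, with ties counted as one half. The function names and the 0/1 label encoding are assumptions for the example; this is not the challenge's evaluation code.

```python
def roc_auc(labels, scores):
    """ROC-AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute an AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def severity_auc(covid_labels, severe_labels, severe_scores):
    """Severity AUC restricted to patients whose COVID label is 1."""
    keep = [i for i, c in enumerate(covid_labels) if c == 1]
    return roc_auc(
        [severe_labels[i] for i in keep],
        [severe_scores[i] for i in keep],
    )
```

In this sketch, COVID-negative patients are dropped before any pair is formed, so their severity scores have no effect on the result.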

Uses and Directions

This algorithm was developed for research purposes only.

Warnings

Common Error Messages

Information on this algorithm has been provided by the Algorithm Editors, following the Model Facts labels guidelines from Sendak, M.P., Gao, M., Brajer, N. et al. Presenting machine learning model information to clinical end users with model facts labels. npj Digit. Med. 3, 41 (2020). doi:10.1038/s41746-020-0253-3