Balaitous

About

Editors:

simon.j

LuukBoulogne

Contact email:

simon.jegou.ia@gmail.com

Image Version:

59a5ba2f-dce4-4117-bad8-e19f6199e964 — April 8, 2022

Summary

Balaitous is an updated version of the AI-severity model described in Lassau et al., 2021. It has been open sourced on GitHub.

Given an input CT scan, Balaitous outputs a probability of COVID-19 (a positive RT-PCR result) and a probability of severe outcome, defined as intubation or death within one month.

It was trained on 2,000 patients from the public STOIC database and achieved the best performance on a hold-out validation dataset of 800 patients during the qualification phase of the STOIC-2021 challenge (see leaderboard).

Mechanism

Balaitous processes a scan in the following steps:

The scan is resized to a pixel spacing of (1.5 mm, 1.5 mm, 5 mm) and reshaped to a shape of (224, 224, Z)
A lung segmentation mask is obtained using a 2D U-Net (source)
The scan is cropped to the slices containing the lungs
A first feature extractor is applied to get a first vector X_full
The lung mask is applied to the image (only lungs are now visible)
A second feature extractor is applied to get a second vector X_lung
For the severe outcome, two logistic regressions are applied to [X_full, age, sex] and [X_lung, age, sex], and the two probabilities are averaged
For the COVID outcome, two logistic regressions are applied to X_full and X_lung, and the two probabilities are averaged
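
The cropping and masking steps can be illustrated with a minimal sketch. It is not taken from the Balaitous code base: the function names, the `[z][y][x]` nested-list layout, and the `background` fill value are all assumptions made for illustration, and the lung mask is taken as given rather than produced by a U-Net.

```python
from typing import List, Tuple

# Hypothetical layout: volume[z][y][x], one 2D axial slice per z index.
Volume = List[List[List[float]]]


def lung_slice_range(mask: Volume) -> Tuple[int, int]:
    """Return the first and last slice indices that contain lung pixels."""
    hits = [z for z, sl in enumerate(mask) if any(v > 0 for row in sl for v in row)]
    if not hits:
        raise ValueError("mask contains no lung pixels")
    return hits[0], hits[-1]


def crop_to_lungs(scan: Volume, mask: Volume) -> Tuple[Volume, Volume]:
    """Keep only the slices between the first and last lung slice (inclusive)."""
    first, last = lung_slice_range(mask)
    return scan[first:last + 1], mask[first:last + 1]


def apply_mask(scan: Volume, mask: Volume, background: float = 0.0) -> Volume:
    """Replace every pixel outside the lungs with `background` (assumed value)."""
    return [
        [[v if m > 0 else background for v, m in zip(row, mrow)]
         for row, mrow in zip(sl, msl)]
        for sl, msl in zip(scan, mask)
    ]


# Toy volume: 4 slices of 2x2 pixels; only slices 1 and 2 contain lung.
scan = [[[1, 2], [3, 4]], [[5, 6], [7, 8]], [[9, 10], [11, 12]], [[13, 14], [15, 16]]]
mask = [[[0, 0], [0, 0]], [[0, 1], [0, 0]], [[1, 1], [0, 0]], [[0, 0], [0, 0]]]

cropped_scan, cropped_mask = crop_to_lungs(scan, mask)
lung_only = apply_mask(cropped_scan, cropped_mask)
```

In this sketch `cropped_scan` would feed the first feature extractor and `lung_only` the second one.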

The first feature extractor is a ViT-L model pretrained on ImageNet-22k using iBOT (source) and finetuned for 35 epochs on 165k CT slices (4k patients from 7 public datasets). The second feature extractor is the same ViT-L model without finetuning. Model weights can be found on Zenodo.

Only the 4 logistic regressions were trained on the STOIC database, and only COVID positive patients were used to train the 2 logistic regressions for the prediction of severity.
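
The way the four logistic regressions combine into two outputs can be sketched as below. This is an illustration under stated assumptions, not the released implementation: the function names, the `(weights, bias)` model representation, the numeric encoding of age and sex, and all numeric values are hypothetical.

```python
import math
from typing import Sequence, Tuple

# Hypothetical representation of a fitted logistic regression: (weights, bias).
Model = Tuple[Sequence[float], float]


def logistic_probability(features: Sequence[float], model: Model) -> float:
    """Probability from a linear score passed through a numerically stable sigmoid."""
    weights, bias = model
    if len(weights) != len(features):
        raise ValueError("feature and weight lengths differ")
    z = bias + sum(w * x for w, x in zip(weights, features))
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def covid_probability(x_full, x_lung, model_full: Model, model_lung: Model) -> float:
    """Average the two COVID regressions, one per feature vector."""
    p_full = logistic_probability(list(x_full), model_full)
    p_lung = logistic_probability(list(x_lung), model_lung)
    return (p_full + p_lung) / 2.0


def severity_probability(x_full, x_lung, age, sex,
                         model_full: Model, model_lung: Model) -> float:
    """Average the two severity regressions; age and sex are appended to each vector."""
    p_full = logistic_probability(list(x_full) + [age, sex], model_full)
    p_lung = logistic_probability(list(x_lung) + [age, sex], model_lung)
    return (p_full + p_lung) / 2.0


# Toy 2-dimensional feature vectors and made-up coefficients.
x_full, x_lung = [1.0, 2.0], [0.5, -1.0]
zero_model = ([0.0, 0.0, 0.0, 0.0], 0.0)            # always predicts 0.5
biased_model = ([0.0, 0.0, 0.0, 0.0], math.log(3))  # always predicts 0.75
p_severe = severity_probability(x_full, x_lung, 60.0, 1.0, zero_model, biased_model)
```

Averaging probabilities keeps the two regressions independent, so each one can be fitted and cross-validated separately before being combined.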

Note: hyperparameters and feature extractors were chosen based on cross-validation results on the public STOIC database (n=2,000 patients). Using the finetuned iBOT model on the plain image, instead of the ImageNet-pretrained model without finetuning, brought only modest performance gains.

Interfaces

This algorithm implements all of the following input-output combinations:

	Name	Slug	Description	Kind	Location	Example value
Input	CT Image	`ct-image`	Any CT image	Image	Read from `/input/images/ct/<uuid>.mha` or `/input/images/ct/<uuid>.tif`	
Output	Probability COVID-19	`probability-covid-19`	Probability of a positive RT-PCR result	Float	Write to `/output/probability-covid-19.json`	42.0
Output	Probability Severe COVID-19	`probability-severe-covid-19`	Probability of death or intubation at 1-month follow up	Float	Write to `/output/probability-severe-covid-19.json`	42.0
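
As an illustration of the output side of this interface, the sketch below writes one float per JSON file, named after the output slug. The helper name `write_probability` and its `output_dir` parameter are hypothetical; only the slugs and the `/output` location come from the interface description above.

```python
import json
from pathlib import Path
from typing import Union


def write_probability(output_dir: Union[str, Path], slug: str, value: float) -> Path:
    """Write a single float as the whole content of <output_dir>/<slug>.json."""
    path = Path(output_dir) / f"{slug}.json"
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w") as f:
        json.dump(float(value), f)
    return path
```

Inside the container, a call such as `write_probability("/output", "probability-covid-19", p_covid)` would produce the first output file, and the same call with the slug `probability-severe-covid-19` would produce the second.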

Validation and Performance

The ROC-AUC performances (in %) of Balaitous are:

	AUC severity	AUC covid
Training - X_full	79.01 ± 2.63	80.65 ± 2.16
Training - X_lung	79.00 ± 3.30	82.63 ± 1.99
Training	80.36 ± 2.80	82.98 ± 2.01
Validation	80.44	83.22

The training dataset contained 2,000 patients (1,205 COVID positive) and the validation dataset around 800 patients.

Performance on the training dataset is computed using a stratified 4x8-fold cross-validation scheme. Following the STOIC-2021 challenge, the AUC for the severity prediction task is computed only among COVID positive patients.
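
Restricting the severity AUC to COVID positive patients can be sketched as follows. This is an illustrative pure-Python version, not the challenge's evaluation code: the function names and toy data are assumptions, and the AUC is computed by direct pair counting (the Mann-Whitney formulation), which is adequate for a few thousand patients.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs ranked correctly; ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def severity_auc(covid: Sequence[int], severe: Sequence[int],
                 severe_scores: Sequence[float]) -> float:
    """Severity AUC computed only among patients with covid == 1."""
    keep = [i for i, c in enumerate(covid) if c == 1]
    return roc_auc([severe[i] for i in keep], [severe_scores[i] for i in keep])


# Toy data: six patients, the last two are COVID negative.
covid = [1, 1, 1, 1, 0, 0]
severe = [1, 0, 1, 0, 0, 0]
scores = [0.9, 0.2, 0.4, 0.5, 0.95, 0.1]

auc_positive_only = severity_auc(covid, severe, scores)
auc_all_patients = roc_auc(severe, scores)
```

On this toy data the two values differ, which shows why the evaluation population has to be stated alongside any reported severity AUC.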

Challenge Performance

Date	Challenge	Phase	Rank
Jan. 30, 2022	STOIC2021	Qualification	5
April 9, 2022	STOIC2021	Qualification (last submission)	1

Uses and Directions

This algorithm was developed for research purposes only.

Warnings

Left empty by the Algorithm Editors

Common Error Messages

Left empty by the Algorithm Editors

Information on this algorithm has been provided by the Algorithm Editors, following the Model Facts labels guidelines from Sendak, M.P., Gao, M., Brajer, N. et al. Presenting machine learning model information to clinical end users with model facts labels. npj Digit. Med. 3, 41 (2020). 10.1038/s41746-020-0253-3