Updated calculation of evaluation metrics
By: evihuijben on July 3, 2023, 1:42 p.m.
Dear participant,
A participant alerted us to an error in our evaluation pipeline for calculating PSNR and SSIM. Both metrics use a population-wide data range, which was originally set to [-1000, 2000], but many scans contain values outside this range. We have redefined the data range as [-1024, 3000], and for the calculation of PSNR and SSIM both the CT and sCT scans are now clipped to this range.
The updated script for calculating these metrics can be found on our GitHub.
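For readers who want to check their own numbers locally, here is a minimal sketch of PSNR with clipping to a fixed, population-wide data range. It is an illustration only and not the organizers' script: the function name `clipped_psnr` is hypothetical, the metric is computed over the whole array with no masking, and NumPy is assumed to be available.

```python
import numpy as np

# Population-wide data range quoted in the announcement.
RANGE_MIN, RANGE_MAX = -1024.0, 3000.0


def clipped_psnr(ct, sct, lo=RANGE_MIN, hi=RANGE_MAX):
    """PSNR between a CT and an sCT after clipping both to [lo, hi].

    Hypothetical helper: computed over the full array, without any mask.
    """
    # Clip both images so out-of-range values cannot inflate the error.
    ct = np.clip(np.asarray(ct, dtype=np.float64), lo, hi)
    sct = np.clip(np.asarray(sct, dtype=np.float64), lo, hi)

    mse = np.mean((ct - sct) ** 2)
    if mse == 0:
        return float("inf")  # identical images after clipping

    # The peak signal is the width of the fixed data range, not the image's own range.
    data_range = hi - lo
    return float(10.0 * np.log10(data_range ** 2 / mse))
```

The same idea should carry over to SSIM: clip both scans first, then pass the fixed range width (here 4024) as the data range, for example through the `data_range` argument of `skimage.metrics.structural_similarity` if scikit-image is used.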
All validation submissions for tasks 1 and 2 made before this bug was fixed have already been re-evaluated. We analyzed the change in ranking position for each submission: on average, positions changed by 0.29 places for task 1 and 0.27 places for task 2. The maximum shift was 3 places, and this occurred only once. All shifts of more than one place occurred in the bottom half of the leaderboard.
Best,
Evi & SynthRAD Organizers