PatientSex values

  By: simon.j on Dec. 28, 2021, 10:12 p.m.

Hello,

Looking at the PatientSex key in the MetaData of the 2,000 mha files I found:

- 1050 M (males)
- 792 F (females)
- 119 O, 1 A and 38 missing values

What do "O" and "A" stand for? Should we treat them as missing values too?

Thanks
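For anyone who wants to reproduce a tally like the one above, here is a minimal sketch. It assumes SimpleITK is installed and that PatientSex is exposed as a header metadata key; the function names and the directory argument are hypothetical and not part of the challenge code. An absent key and an empty string are both counted as missing, which is also an assumption.

```python
from collections import Counter
from pathlib import Path


def read_patient_sex(mha_path):
    """Return the PatientSex header value of one .mha file, or None if absent."""
    import SimpleITK as sitk  # imported lazily; only needed when reading files

    reader = sitk.ImageFileReader()
    reader.SetFileName(str(mha_path))
    reader.ReadImageInformation()  # parses the header without loading voxel data
    if not reader.HasMetaDataKey("PatientSex"):
        return None
    # Assumption: an empty or whitespace-only value also means "missing".
    return reader.GetMetaData("PatientSex").strip() or None


def tally(values):
    """Count values, grouping None under the label 'missing'."""
    return Counter("missing" if v is None else v for v in values)


def tally_directory(data_dir):
    """Tally PatientSex over every .mha file in a (hypothetical) directory."""
    paths = sorted(Path(data_dir).glob("*.mha"))
    return tally(read_patient_sex(p) for p in paths)
```

Calling `tally_directory("path/to/mha")` returns a `Counter` keyed by the raw labels plus `"missing"`.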

Re: PatientSex values  

  By: LuukBoulogne on Dec. 29, 2021, 9:47 a.m.

Hi Simon,

Thank you for your question. We will look into this together with AP-HP, who we are organizing the challenge with. This could take until after the holidays. For now the O, A and missing labels can be considered as missing.
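As a minimal sketch of this interim rule, the snippet below keeps M and F and maps everything else (O, A, empty or absent) to `None`. The function name and the choice of `None` as the missing marker are assumptions made for illustration only.

```python
# Labels that are kept as-is; anything else is treated as missing for now.
VALID_SEX = {"M", "F"}


def normalize_patient_sex(raw):
    """Return 'M' or 'F', or None when the value is O, A, empty or absent."""
    if raw is None:
        return None
    value = raw.strip()
    return value if value in VALID_SEX else None
```

For example, `normalize_patient_sex("O")` and `normalize_patient_sex(None)` both give `None`, while `normalize_patient_sex("F")` gives `"F"`.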

Re: PatientSex values  

  By: miriamelia on Jan. 19, 2022, 1:48 p.m.

Hi,

Link to data.csv, overview of full training data set

Also, this list of the 38 noisy samples might help:

```
Processing samples
NOISY SAMPLES: 6902 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 9644 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 634 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 3617 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 7488 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 3418 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 5959 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 2372 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 3622 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 2375 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 7924 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 4022 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 5665 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 8734 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 9645 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 7673 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 7733 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 6871 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 1175 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 7613 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 234 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 9320 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 3386 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 5427 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 2313 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 2817 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 2465 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 1731 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 10212 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 7537 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 827 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 9285 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 5542 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 3584 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 5187 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 8325 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 1467 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')
NOISY SAMPLES: 9433 ('ITK_InputFilterName', 'ITK_original_direction', 'ITK_original_spacing', 'PatientAge', 'PatientID', 'PatientName', 'SliceThickness')

      PatientID  PatientAge PatientSex  ...  shape_resampled_z  probCOVID  probSevere
0           343          75          M  ...                227          0           0
1          1610          35          M  ...                246          0           0
2          2364          45          M  ...                231          1           0
3          4288          85          F  ...                220          1           1
4          5419          85          M  ...                238          1           0
...         ...         ...        ...  ...                ...        ...         ...
1957       8853          75          M  ...                289          1           0
1958       7582          75          M  ...                227          1           0
1959       2909          45          M  ...                211          0           0
1960       8736          75          F  ...                218          0           0
1961       9176          85          F  ...                218          1           0

[1962 rows x 14 columns]

Number of samples: 1962

Shape org x - Min: 124 Max: 1199 Median: 433.5
Shape org y - Min: 512 Max: 512 Median: 512.0
Shape org z - Min: 512 Max: 512 Median: 512.0
Shape resampled x - Min: 72 Max: 226 Median: 121.0
Shape resampled y - Min: 146 Max: 320 Median: 237.5
Shape resampled z - Min: 146 Max: 320 Median: 237.5

Spacing x - Min: 0.44921875 Max: 0.988 Median: 0.7333984375
Spacing y - Min: 0.44921875 Max: 0.988 Median: 0.7333984375
Spacing z - Min: 0.3 Max: 2.5 Median: 0.7999999999999998

PatientAge - Min: 35 Max: 85 Median: 65.0
PatientSex - Labels: ['A' 'F' 'M' 'O'] Frequency: [ 1 792 1050 119]

COVID-Severity - Labels: [0 1] Frequency: [1666 296]
COVID-PCR - Labels: [0 1] Frequency: [ 780 1182]
```

 Last edited by: miriamelia on Aug. 15, 2023, 12:55 p.m., edited 2 times in total.

Re: PatientSex values  

  By: LuukBoulogne on Feb. 10, 2022, 6:24 p.m.

The 119 O, 1 A and 38 missing values for PatientSex have now been updated in the S3 bucket containing the public training set. See the corresponding announcement for more information.