Some images have different labels between G1 and G2, and there are no labels provided by G3 for these images

Hi masterawadh,

Thank you for your question. Grader 3 was only involved if graders 1 and 2 did not agree on the main label (referable glaucoma or no referable glaucoma). There was no adjudication when graders 1 and 2 provided different labels for the glaucomatous features.

How you handle those cases during training is up to the participants. During testing, this is handled by the modified Hamming distance. Here is a brief example with just 4 labels for the glaucomatous features. Suppose the graders' labels are G1 = [0 0 1 1] and G2 = [0 0 0 1], and the algorithm provided the labels AI = [0 1 0 1]. The first AI label is correct. The second is wrong. Since the graders disagree on the third label, that one is ignored. The fourth label is correct again. So we get [correct incorrect ignored correct]. The Hamming distance is then 1, which is normalized by the number of valid labels (three), resulting in a modified Hamming distance of 1/3 ≈ 0.33.
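
For anyone who wants to check the arithmetic, the example above can be reproduced with a few lines of Python. This is only a minimal sketch of the calculation as described in this post, not the challenge's official evaluation code. The function name `modified_hamming_distance` is made up for illustration, and returning `None` when the graders disagree on every label is an assumption, since that case is not covered above.

```python
def modified_hamming_distance(g1, g2, pred):
    """Fraction of wrong predictions among the labels on which both graders agree.

    Hypothetical helper for illustration; not the official evaluation script.
    """
    if not (len(g1) == len(g2) == len(pred)):
        raise ValueError("all label lists must have the same length")

    # Keep only the positions where grader 1 and grader 2 gave the same label.
    valid = [(a, p) for a, b, p in zip(g1, g2, pred) if a == b]

    # Assumption: if the graders disagree everywhere, nothing can be scored.
    if not valid:
        return None

    # Count the mismatches between the agreed label and the prediction.
    errors = sum(1 for agreed, p in valid if agreed != p)
    return errors / len(valid)


g1 = [0, 0, 1, 1]
g2 = [0, 0, 0, 1]
ai = [0, 1, 0, 1]

distance = modified_hamming_distance(g1, g2, ai)
print(round(distance, 2))  # 0.33
```

The third label is dropped before counting, so it affects neither the number of errors nor the normalization.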

Let me know if you need more help with this!

Best, Koen
