Grand Challenge

Further Clarification on O4

By: ahmed.mahooqi on Feb. 15, 2022, 7:27 a.m.

Dear Organizers,

Can you please provide us with further clarification regarding the output O4?

Last edited by: ahmed.mahooqi on Aug. 15, 2023, 12:55 p.m., edited 1 time in total.

Re: Further Clarification on O4

By: coendevente on Feb. 15, 2022, 9:37 a.m.

This is the description of output O4:

"a non-thresholded scalar value that is positively correlated with the likelihood for ungradability (e.g. the entropy of a probability vector produced by a machine learning model or the variance of an ensemble)"

It is a measure of ungradability: the higher the O4 output, the more likely your solution deems the case ungradable. It is similar to a likelihood of ungradability, but O4 does not have to lie in the interval [0, 1]. You can also read the code showing how we use O4 (a.k.a. multiple-referable-ungradability-scores) during evaluation here.
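As a rough illustration of the "variance of an ensemble" option mentioned in the description, here is a minimal sketch. The function name `ungradability_score` and the example probabilities are hypothetical and not part of the challenge code; the only assumption is that each ensemble member outputs one probability per case.

```python
from statistics import pvariance


def ungradability_score(ensemble_probs):
    """Hypothetical O4: population variance of the ensemble members' probabilities.

    Higher disagreement between members gives a higher score. The value is
    continuous and is not thresholded or binarized in any way.
    """
    return pvariance(ensemble_probs)


# Members that agree produce a low score; members that disagree produce a high one.
agree = ungradability_score([0.90, 0.92, 0.88, 0.91])
disagree = ungradability_score([0.10, 0.95, 0.40, 0.70])
print(agree < disagree)
```

Only the ordering of the scores matters here: a case the ensemble disagrees on should receive a larger O4 than a case it agrees on.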

Re: Further Clarification on O4

By: ahmed.mahooqi on Feb. 15, 2022, 10:49 a.m.

Thanks for the clarification. I'm still a little confused about the "non-thresholded" part, since the entropy value will always be a continuous value between 0 and 1.

Re: Further Clarification on O4

By: coendevente on Feb. 15, 2022, 11:50 a.m.

Instead of "non-thresholded", we could also have said "non-binary" / "non-binarized". We used this term because O3 is actually thresholded/binarized (likely a thresholded value of O4, but that is not a requirement).

Indeed, the Shannon entropy of a binary probability vector (computed with a base-2 logarithm) will give you a continuous value between 0 and 1, but we gave entropy just as an example. If you calculate O4 in another way that results in other bounds, that is also allowed.
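To make the point about bounds concrete, here is a minimal sketch, not taken from the evaluation code. The function name `shannon_entropy_bits`, the example probabilities, and the threshold value are all hypothetical. It shows that base-2 entropy peaks at 1 for a two-class vector but reaches 2 for a four-class vector, and how a binarized O3 could be derived from a continuous O4.

```python
import math


def shannon_entropy_bits(probs):
    """Shannon entropy in bits; zero-probability entries contribute nothing."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Two classes: the maximum is 1 bit, reached at [0.5, 0.5].
binary_max = shannon_entropy_bits([0.5, 0.5])

# Four classes: the maximum is 2 bits, so the score is no longer within [0, 1].
four_class_max = shannon_entropy_bits([0.25, 0.25, 0.25, 0.25])

# O4 is the continuous score; O3 could be a thresholded version of it.
# The threshold below is an arbitrary illustrative value, not a challenge setting.
HYPOTHETICAL_THRESHOLD = 0.8
o4 = shannon_entropy_bits([0.7, 0.3])
o3 = o4 > HYPOTHETICAL_THRESHOLD

print(binary_max, four_class_max, o3)
```

Any other score works equally well as O4, as long as higher values mean "more likely ungradable".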

Re: Further Clarification on O4

By: ahmed.mahooqi on Feb. 16, 2022, 9:01 a.m.

Thanks for the clarification. Makes sense now!