Dear organization team,

In the training set for the category 1 detection task, each video sample has only a single ground-truth label. Tool presence can differ from frame to frame, so this single video-level label does not fully capture it. Will the ground-truth labels of the final test set for the detection task have the same form as those of the training set? Or will you provide frame-wise tool presence labels for finer-grained evaluation on the final test set?

Please excuse the many questions. Best