Task 1. details
By: ryo-hachiuma on July 9, 2022, 2:44 p.m.
Hi, I have a question about the first task.
This category will require the teams to train multi-label fully supervised models. The model should classify all tools present within each frame of the video clips in the test set by training on the tool presence labels provided in the training set.
According to the description of the first task, it is conducted in a fully supervised setting. However, the training set provides only a ground-truth label for each entire video, whereas the task is to predict the tool presence label for each frame. To confirm my understanding: is the task actually a weakly supervised multi-label classification problem, since labels are given only at the video level?
Thank you in advance!